Sharpen the potential of your next Voice Analytics AI project

Hey Google, turn on the light PLEASE!

You may well ask why I say “please” to what is essentially just a bunch of 1s and 0s. Well, my theory is this: if AI (Artificial Intelligence) ever becomes self-aware on this planet, I want to ensure it holds me in good standing while respecting my need to breathe fresh air and exist!

So, is this just another blog about Artificial Intelligence? No. It is focussed on what Artificial Intelligence needs for the successful analysis of phone calls made and received by your organization. More specifically, we discuss the fundamental requirement of Voice Analytics, the one that gives it the best chance of delivering accurate transcription and sentiment analysis for both customers and staff conversing by phone.

That fundamental requirement is clean voice recording, which starts with a clean voice itself. Our definition of “clean voice recording” is a recording that represents the closest possible copy of the original voices active on any given call. Many factors affect how close you can get to a “clean voice recording”, and a good number of them are often overlooked.

More importantly, we need to look at the challenges of passing voice cleanly through the myriad voice and data networks. From the point of view of a contact centre owner or manager, each challenge can be classified into one of two groups:

  1. Challenges outside your control – which you should always consider and remain aware of
  2. Challenges inside your control – particularly when considering current and future architecture of your voice network

Pretty much everything outside of your own corporate network can be deemed “outside of your control”, or can it? To look at this in more detail, let’s take the scenario of an inbound call to a contact centre, made from a mobile network to a toll-free number that points to a cloud-based contact centre. The voice quality challenges experienced along the way are numerous, as the caller may:

  a. not be in a good mobile coverage area
  b. roam in and out of good coverage during the call
  c. roam out of coverage, ending the call abruptly
  d. remain in a good coverage area for the duration of the call, but the mobile cell may become congested, at which point the network provider may dynamically introduce various levels of compression, degrading the quality for all or part(s) of the call
  e. dial a toll-free number that points to a different carrier, which may or may not introduce compression of its own, further degrading the quality of the call

The Toll-Free Number Network will ultimately route the call across the PSTN (Public Switched Telephone Network), which is typically land-based and does not use compression. It will present the call to a public network gateway of the contact centre, such as an SBC (Session Border Controller) in the case of SIP (Session Initiation Protocol) trunks. The SBC(s), also known as Edge Appliance(s), may be hosted in your own Data Centre or by the Contact Centre Vendor, Integrator or Carrier. Either way, the call is routed there by mapping the Toll-Free number to the DID (Direct Inward Dial) number allocated to the Carrier-provided voice trunk. From this point the call is under the control of the contact centre routing logic, and may route to an agent who uses one of the following:

  f. hard/physical SIP phone
  g. soft SIP client running on a PC
  h. WebRTC (Web Real-Time Communication) browser-based client

The agent will typically pair that client with one of the following headsets:

  i. Fixed headset – USB or analogue
  j. Bluetooth wireless headset – which introduces compression and may suffer further degradation due to local radio saturation and/or distance should the agent roam from the desk
  k. DECT (Digital Enhanced Cordless Telecommunications) headset – while providing better radio range and object penetration, DECT can suffer the same effects as its cousin Bluetooth and also introduces a form of compression

For all these clients, the connectivity from the trunk to the agent may or may not use compression. For this exercise, let’s assume that your own internal network is well engineered, with no appreciable delay, jitter or compression, and that the agents remain on the network with direct connectivity to the cloud contact centre provider, meaning they do not connect over the open (uncontrolled) internet. If they do, there is another challenge point to consider. It’s also worth noting that some WebRTC clients only support the Opus codec, which also introduces compression.
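One way to see which codecs a client is prepared to use is to look at the SDP it offers during call setup. The following is only a rough sketch: the SDP fragment is hypothetical, the payload number 111 for Opus is just an example (dynamic payload numbers vary by client), and treating the G.711 variants PCMU and PCMA as the "no added compression" baseline is an assumption made for this illustration.

```python
import re

# Assumption for illustration: G.711 variants count as the telephony baseline,
# anything else offered is flagged as adding compression.
BASELINE_CODECS = {"PCMU", "PCMA"}

# Hypothetical SDP fragment; real offers contain many more lines.
EXAMPLE_SDP = """\
m=audio 49170 RTP/AVP 111 0 8
a=rtpmap:111 opus/48000/2
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
"""

# Matches lines such as "a=rtpmap:0 PCMU/8000".
RTPMAP = re.compile(r"^a=rtpmap:(\d+) ([^/\s]+)/(\d+)", re.MULTILINE)


def offered_codecs(sdp):
    """Return (payload_type, codec_name, clock_rate_hz) for each rtpmap line."""
    return [(int(pt), name, int(rate)) for pt, name, rate in RTPMAP.findall(sdp)]


def compressed_codecs(sdp):
    """Names of offered codecs that fall outside the baseline set."""
    return [name for _, name, _ in offered_codecs(sdp)
            if name.upper() not in BASELINE_CODECS]


for pt, name, rate in offered_codecs(EXAMPLE_SDP):
    label = "baseline" if name.upper() in BASELINE_CODECS else "adds compression"
    print(f"payload {pt}: {name} at {rate} Hz ({label})")
```

If a client only ever offers codecs outside the baseline set, that is a prompt to ask the vendor what else it can negotiate.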

Now the call is complete and hopefully recorded in the cloud. Have you considered that:

  l. many cloud-based Contact Centre providers, to keep storage costs to a minimum, default to further compressing the voice before storing it? In many cases this can be reconfigured to eliminate the compression.
  m. your recording may be mono and not stereo? A dual-channel recording, which keeps caller and agent on separate channels, will improve the performance of external 3rd party speech/voice analytics tools. Otherwise, those tools may need to deploy a speech recognition layer that must be trained to differentiate agent voices from caller voices, which may introduce further inaccuracies.
  n. external 3rd party voice analytics tools will usually extract your recordings via an open API (application programming interface) provided by the cloud-based contact centre vendor? These APIs often provide a choice of file formats for the recordings being retrieved and, yes, some of those formats will introduce further compression of the voice.
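A quick sanity check on a retrieved recording can answer the mono-versus-stereo question before it reaches an analytics tool. Here is a minimal sketch using Python's standard `wave` module. The helper names `describe_recording` and `write_silent_wav` are made up for this example, and the silent test file simply stands in for a real call recording.

```python
import os
import tempfile
import wave


def describe_recording(path):
    """Report basic properties of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        if channels == 1:
            layout = "mono"
        elif channels == 2:
            layout = "stereo"
        else:
            layout = "multi-channel"
        return {
            "channels": channels,
            "layout": layout,
            "sample_rate_hz": wav.getframerate(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def write_silent_wav(path, channels, sample_rate_hz=8000, seconds=1):
    """Create a silent 16-bit file so the example runs without real call data."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate_hz)
        wav.writeframes(b"\x00\x00" * channels * sample_rate_hz * seconds)


with tempfile.TemporaryDirectory() as folder:
    for channels in (1, 2):
        path = os.path.join(folder, f"call_{channels}ch.wav")
        write_silent_wav(path, channels)
        print(describe_recording(path))
```

The `wave` module only reads uncompressed PCM, so a `wave.Error` on a real recording is itself a hint that the file may have been exported in a compressed format.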

So why all this concern about the introduction of compression? These days, all voice calls start as acoustic energy that is converted to electrical energy, which can be considered an analogue waveform, and that waveform is immediately digitized at the handset, headset, or PC. If good equipment is used, this means the “Clean Voice” is converted into a bunch of 1s and 0s which can be reconstructed at the other end to replicate that “Clean Voice”. However, when compression is introduced, some of those 1s and 0s are simply discarded and never transmitted, reducing the chance of a quality reproduction of the original “Clean Voice” signal. Granted, good compression algorithms with predictive intelligence will arguably recreate the original signal with little to no degradation noticeable to the human ear. However, with original data being discarded, there will always be some small degree of degradation. Add together the small degradations along the path of your typical call, and the cumulative degradation becomes noticeable. The trick is to keep it all to a minimum.
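The cumulative effect can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the three stages (a 12-bit capture, a simplified mu-law hop and a 6-bit "storage" step) are made-up stand-ins for real codecs, the mu-law round trip is not bit-exact G.711, and a 440 Hz test tone stands in for speech. It measures the signal-to-noise ratio against the original after each lossy stage.

```python
import math

MU = 255.0  # companding constant used by mu-law telephony codecs


def uniform_stage(sample, bits):
    """Round a sample in [-1, 1] to a uniform grid defined by the bit depth."""
    half_levels = 2 ** (bits - 1)
    clipped = max(-1.0, min(1.0, sample))
    return round(clipped * half_levels) / half_levels


def mu_law_stage(sample, bits=8):
    """Simplified mu-law round trip: compress, quantise, expand."""
    sign = 1.0 if sample >= 0 else -1.0
    compressed = sign * math.log1p(MU * abs(sample)) / math.log1p(MU)
    quantised = uniform_stage(compressed, bits)
    q_sign = 1.0 if quantised >= 0 else -1.0
    return q_sign * math.expm1(abs(quantised) * math.log1p(MU)) / MU


def snr_db(original, degraded):
    """Signal-to-noise ratio of the degraded copy, in decibels."""
    signal = sum(s * s for s in original)
    noise = sum((s - d) ** 2 for s, d in zip(original, degraded))
    return 10 * math.log10(signal / noise)


# Hypothetical call path: each entry is (label, lossy function).
STAGES = [
    ("12-bit capture", lambda s: uniform_stage(s, 12)),
    ("mu-law network hop", mu_law_stage),
    ("6-bit storage", lambda s: uniform_stage(s, 6)),
]


def cascade(signal, stages):
    """Apply stages in order, recording SNR versus the original after each."""
    results, current = [], list(signal)
    for label, stage in stages:
        current = [stage(s) for s in current]
        results.append((label, snr_db(signal, current)))
    return results


tone = [0.8 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
for label, snr in cascade(tone, STAGES):
    print(f"after {label}: {snr:.1f} dB SNR")
```

Each stage only ever sees the output of the one before it, so information lost early in the path cannot be recovered later.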

I now bring you back to the initial two points, “challenges outside and within your control”. First, we need to accept that those outside of our control are here to stay for the foreseeable future, and we must remain aware of them. Yet how many of the listed challenges can we influence or control? The answer is quite simple: 10 of the 14, roughly 70%, are within your power to influence or eliminate. Points a, b, c and d (the mobile coverage and congestion issues) are outside of your control, whereas e, f, g, h, i, j, k, l, m and n (the toll-free carrier, agent client, headset and recording choices) are definitely within your power to change, simply by asking the right questions and making the right architectural decisions when designing and administering your cloud-based Contact Centre or Unified Communications platform. An example of this is point “e”: while customers of carriers have no control over whether or not compression is used within the Toll-Free network, you do have the right to ask for a Toll-Free Number Product that does not compress across their network. It’s also worth noting that compression technologies were introduced for two main reasons only: to reduce the need for expensive bandwidth, and to reduce the need for expensive storage capacity, both of which have rapidly become cheap commodities in recent times.
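To put the storage argument into numbers, here is a back-of-the-envelope sketch. It compares the nominal bitrates of G.711 (64 kbit/s) and G.729 (8 kbit/s), which are not named in this article and are used purely as familiar examples of an uncompressed and a compressed telephony codec. File container overhead is ignored, and the function name is made up for this illustration.

```python
# Nominal codec bitrates in kilobits per second, per channel.
CODEC_BITRATES_KBPS = {"G.711": 64, "G.729": 8}


def storage_mb_per_hour(bitrate_kbps, channels=1):
    """Megabytes needed to store one hour of audio at a constant bitrate."""
    bits = bitrate_kbps * 1000 * 3600 * channels
    return bits / 8 / 1_000_000


for codec, kbps in CODEC_BITRATES_KBPS.items():
    mono = storage_mb_per_hour(kbps)
    stereo = storage_mb_per_hour(kbps, channels=2)
    print(f"{codec}: {mono:.1f} MB/hour mono, {stereo:.1f} MB/hour stereo")
```

Under these assumptions, an hour of uncompressed stereo recording comes to a little under 60 MB, which is the figure to weigh against current storage prices when deciding whether compression is still worth its cost in quality.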

Address these voice challenges, keep compression to a minimum, and you will be giving your Artificial Intelligence-based Voice Analytics project the best fighting chance of success.

Connect with me, Rob, and let’s have an informative and friendly 30-minute catch-up to discuss your AI and analytics needs, challenges and goals!
