Voice Quality: Enhancing Voice Signals to Achieve Pure and Natural Speech over the IP Network

Voice quality is highly subjective and difficult to measure, yet everyone knows immediately when the VoIP application they are using has poor voice quality. Good voice quality is an inert feature that does not call attention to itself unless something is wrong. We have a certain expectation of voice quality on our calls, and have difficulty dealing with anything less than perfect.

The human voice has several features, including pitch, volume, quality and resonance, which are used to convey information. Each person's voice is unique, and through it they can project strength, emotion, power, uncertainty, knowledge, and empathy, among other things. Preserving the quality of real-time spoken language over an IP network is critical to retaining all the nuances of human speech.

The importance of voice quality is well understood in the telecom community. Unfortunately, the factors affecting voice in a VoIP network are many and varied. The principal problems faced by VoIP network calls are echo, packet loss, latency and jitter.

Echo is arguably the worst type of impairment that can be encountered during a telephone conversation. Echo is so disruptive to the natural cadence of speech that it can render the talker speechless. We hear our own voice as a combination of the sound waves from our vocal cords and lower-frequency vibrations in our bones. In an echo, in addition to the delay, we hear only the air-conducted voice without the bone-conducted component, so it arrives with a different frequency balance, which makes the effect more pronounced. The exact reason isn’t known, but it’s suspected that the brain becomes confused when trying to synchronize what it’s saying with what it’s hearing, triggering the brain to “correct” itself. This in turn can cause a person to stop talking, or at a minimum disrupt their train of thought.

Acoustic echo is generated when the far-end speech signal played out of the near-end loudspeaker is coupled back into the microphone, either directly or indirectly by reflection. The far-end talker then hears themselves after a perceptible delay.

HD AEC Simplified block diagram

Adaptive Digital’s High Definition Acoustic Echo Canceller (HD AEC) is exceptional at handling echo. HD AEC delivers superior voice clarity and true full duplex performance under a wide set of challenging acoustic environments. It can eliminate the acoustic echo in difficult conditions such as unbalanced speech levels, close speaker to mic proximity, background noise, double talk, echoic reflective room surfaces, and echo path changes.

The HD AEC algorithm integrates Noise Reduction and Automatic Gain Control (AGC), as well as anti-howling, adaptive filtering, nonlinear processing, and double-talk detection. HD AEC adapts to changes in the acoustic path (including gain/loss changes), and automatically adjusts for unknown bulk delay. Additionally, its configurable parameters are tunable, which gives developers more control when designing for harsh environments.

Adaptive Digital’s HD AEC cancels the echo that occurs between the speaker output and the microphone input. The adaptive filter estimates the echo and subtracts it from the TxIn signal to form the residual signal.
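The estimate-and-subtract step described above can be illustrated with a textbook normalized LMS (NLMS) adaptive filter. This is a minimal sketch, not Adaptive Digital's HD AEC implementation: the function name, tap count, step size and toy echo path are all assumptions chosen for illustration, and it omits double-talk detection, nonlinear processing and noise reduction.

```python
import random

def nlms_echo_cancel(rx, tx_in, taps=8, mu=0.5, eps=1e-8):
    """Return the residual signal after subtracting the estimated echo.

    rx    -- far-end samples played out of the loudspeaker (reference)
    tx_in -- microphone samples containing the echo (TxIn)
    """
    w = [0.0] * taps          # adaptive filter coefficients (echo path estimate)
    history = [0.0] * taps    # most recent rx samples, newest first
    residual = []
    for x, d in zip(rx, tx_in):
        history = [x] + history[:-1]
        echo_est = sum(wi * hi for wi, hi in zip(w, history))
        e = d - echo_est      # residual = TxIn minus estimated echo
        energy = sum(h * h for h in history) + eps
        step = mu * e / energy
        # Move the coefficients toward values that shrink the residual.
        w = [wi + step * hi for wi, hi in zip(w, history)]
        residual.append(e)
    return residual

def power(sig):
    """Mean squared value of a signal."""
    return sum(s * s for s in sig) / len(sig)

# Hypothetical toy echo path: the mic hears rx delayed and attenuated.
random.seed(0)
rx = [random.uniform(-1.0, 1.0) for _ in range(4000)]
echo_path = [0.0, 0.6, 0.0, -0.3]
tx_in = [sum(echo_path[k] * rx[n - k] for k in range(len(echo_path)) if n - k >= 0)
         for n in range(len(rx))]
residual = nlms_echo_cancel(rx, tx_in)
```

In this noise-free toy case the residual power falls far below the echo power once the filter has converged. A real room has a much longer echo path, near-end speech and noise, which is why a production canceller needs the additional blocks listed earlier.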

The residual signal is fed to the noise reduction block. This noise reduction block removes background noise and therefore improves the signal to noise ratio of the transmit signal.

The adaptive filter works in conjunction with the bulk delay monitor, which monitors and adjusts bulk delay in situations where the bulk delay is unknown due to non-deterministic audio drivers.
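One common way to estimate an unknown bulk delay is to search for the lag at which the microphone signal correlates most strongly with the loudspeaker reference. The sketch below shows that generic cross-correlation approach; it is an assumption for illustration, not a description of how Adaptive Digital's bulk delay monitor works, and the function name and parameters are hypothetical.

```python
import random

def estimate_bulk_delay(rx, tx_in, max_delay):
    """Return the lag (in samples) at which tx_in best matches rx.

    Assumes the echo path has positive gain, so the true delay shows up
    as the largest positive cross-correlation value.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_delay + 1):
        score = sum(rx[n - lag] * tx_in[n] for n in range(lag, len(tx_in)))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Hypothetical test signal: the echo is the reference delayed by 37 samples.
random.seed(1)
rx = [random.uniform(-1.0, 1.0) for _ in range(2000)]
true_delay = 37
tx_in = [0.0] * true_delay + [0.5 * s for s in rx[:-true_delay]]
estimated = estimate_bulk_delay(rx, tx_in, max_delay=100)
```

Once the bulk delay is known, the reference signal can be delayed by that amount so the adaptive filter only has to model the remaining, much shorter, acoustic path.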

Latency in Voice over IP communication refers to the period of delay between the spoken message (data packet(s)) and the listener’s ear receiving it. The effects of latency on voice quality include echo, synchronization issues between speech and data, distortion, garbled or choppy audio, drop-outs, and overlaps in conversation. Causes of latency include packet loss and jitter.

The human ear is very good at handling short gaps; a latency of 20 ms is normal for IP calls, and 150 ms is barely noticeable and therefore acceptable. VoIP providers target 150 ms as the maximum one-way latency a voice call can withstand while still maintaining acceptable voice quality. When this latency threshold is exceeded, the voice call increasingly degrades, to the point where normal conversation is difficult at a latency of 300 ms. End-to-end delay above 500 ms can make normal conversation impossible. When packets arrive outside this upper limit, they are discarded or ignored, causing what amounts to packet loss. Adaptive Digital combats packet loss with its robust Packet Loss Concealment (PLC) algorithms, part of its Voice Quality Enhancement (VQE) Suite.
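To make the idea of concealment concrete, here is a minimal sketch of one of the simplest PLC strategies: replay the last good frame at reduced volume, fading further with each consecutive loss. This is a generic technique shown for illustration only and is not Adaptive Digital's PLC algorithm; the function name, the use of None to mark a lost frame, and the decay factor are assumptions.

```python
def conceal_losses(frames, frame_len, decay=0.5):
    """Replace lost frames (None) with a fading copy of the last good frame."""
    output = []
    last = [0.0] * frame_len   # nothing to repeat before the first good frame
    gain = 1.0
    for frame in frames:
        if frame is None:
            gain *= decay      # fade further on each consecutive loss
            output.append([gain * s for s in last])
        else:
            last, gain = frame, 1.0
            output.append(frame)
    return output

# Two consecutive frames are lost (or arrived too late to be played).
frames = [[1.0, -1.0], None, None, [0.5, 0.5]]
concealed = conceal_losses(frames, frame_len=2)
# concealed == [[1.0, -1.0], [0.5, -0.5], [0.25, -0.25], [0.5, 0.5]]
```

Fading toward silence avoids the buzzing artifact that comes from repeating the same frame at full level during a long burst of loss.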

In a VoIP network, voice signals are compressed and framed as packets. These voice packets encounter several impediments (network congestion, improper queuing, or configuration errors) while traveling from the talker to the destination, in some cases creating a lag or additional space between packets. The Adaptive Jitter Buffer (AJB) algorithm regulates the flow between the incoming packet stream and the voice decoder, providing the user with better audio quality.
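The adaptive part of a jitter buffer can be sketched as follows: track how much packet transit time varies, and size the buffer as a multiple of that variation. The running jitter estimate below uses the interarrival jitter formula from RFC 3550; everything else (the function name, the multiplier of 4, and the 20 ms and 150 ms bounds, the latter borrowed from the latency target mentioned earlier) is an assumption for illustration and not Adaptive Digital's AJB algorithm.

```python
def playout_delays(send_ms, recv_ms, min_delay=20.0, max_delay=150.0, factor=4.0):
    """Return the target buffer delay (ms) after each packet arrival."""
    jitter = 0.0
    prev_transit = None
    delays = []
    for sent, received in zip(send_ms, recv_ms):
        transit = received - sent
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            jitter += (d - jitter) / 16.0   # RFC 3550 running jitter estimate
        prev_transit = transit
        # Buffer a multiple of the jitter, clamped to the allowed range.
        delays.append(min(max_delay, max(min_delay, factor * jitter)))
    return delays

# Hypothetical trace: 20 ms packets, steady network for 100 packets,
# then every other packet is delayed by an extra 40 ms.
send = [20 * i for i in range(200)]
recv = [s + 50 + (40 if i >= 100 and i % 2 else 0) for i, s in enumerate(send)]
delays = playout_delays(send, recv)
```

With this trace the target delay stays at the 20 ms floor while the network is steady and grows to the 150 ms ceiling once the jitter appears, trading extra latency for fewer late packets.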

Illustrates Jitter/Adaptive Jitter Buffer

Other problems affecting voice quality: Background Noise, Audio feedback, double-talk.

The Noise Reduction algorithm reduces background noise in speech signals when the background sound intensity is high. The algorithm is designed to reduce the noise while retaining the voice signal, and therefore improves the signal-to-noise ratio of the transmit signal.

Adaptive Digital’s Voice Quality Enhancement (VQE) algorithms dramatically improve the quality and clarity of speech communications. Adaptive Digital meets the VQE challenge by delivering fielded, scalable, state-of-the-art voice enhancement algorithms and solutions, flexible configuration options, and real-world experience, enabling exceptional voice call performance in each user’s environment.

  • High Definition Acoustic Echo Canceller (HD AEC): Multi-Mic and Single-Mic, Stereo Acoustic EC (SAEC)
  • Noise Reduction (NR) featuring noise, siren, howling and feedback cancellation, Noise Suppression (NS)
  • Packet/Line Echo Canceller (LEC)
  • Automatic Gain Control (AGC)
  • Packet Loss Concealment (PLC)
  • Acoustic Beamformer
  • Adaptive Jitter Buffer (AJB)
  • Comfort Noise Generation
  • Dual Mic Noise Cancellation
  • Robust Bulk Delay Finder

The VQE software is available as individual algorithms, or as turnkey solutions for TI processors on many device platforms for a vast range of communication applications.

The key to providing a VQE solution is to combine the correct algorithms in such a way as to create impairment-free communication that maximizes speech quality. Adaptive Digital will provide consultation services or customize a solution to fit your specific requirements.

Good hardware/software design is also key to an exceptional product. A poor enclosure design can negate the cancellation performance of the very best acoustic echo canceller. Adaptive Digital provides design consultation at the application concept level.

Applications for the Adaptive Digital Voice Quality Solutions include doorbells and intercoms, hand-held and desktop devices, wearables, multiple-microphone video conferencing systems, soft phones, IP cameras, automobile cabins, USB headsets, voice command and control applications (voice-enabled smart speakers, voice-enabled smart gateways) and security, to name a few.

Acoustic Echo

Acoustic Echo Cancellation Implemented

About Adaptive Digital Technologies, Inc.
Adaptive Digital is an industry-leading provider of voice technology. Adaptive Digital’s products include high definition acoustic echo cancellation (HD AEC), HD stereo acoustic echo cancellation, packet, line and network echo cancellation, Voice Engine with SIP, high-density conference engines, speech compression, telephony, and voice-quality algorithms across many platforms. Adaptive Digital’s customers include Cisco Systems Inc., Foxconn, General Dynamics, Harris, Motorola, Northrop Grumman, Sonus, Starleaf, Texas Instruments, Windstream, and Yealink. Adaptive Digital is a member of the Texas Instruments’ Design Network.

Contact Information

Adaptive Digital Technologies, Inc.

525 Plymouth Road
Suite 316
Plymouth Meeting, PA, 19462

tele: 610-825-0182 x120
toll-free: 1-800-340-2066
