Intelligibility Robustness of Five Speech Codec Modes in Frame-Erasure and Background-Noise Environments
Frame erasures and background noise are two factors that can interact with speech coding to reduce speech intelligibility and thus impair mission-critical public safety voice communications. We conducted two tests of intelligibility in the face of these factors. The tests covered five adaptive multi-rate (AMR) and enhanced voice services (EVS) speech coding modes, each using a bit rate near 13 kb/s. Two EVS Channel Aware (CA) modes were included. Both tests used the Modified Rhyme Test (MRT) protocol, and together they comprise over 150,000 trials. The first test used frame erasures targeted at critical consonants for maximum sensitivity, and the second used frame erasures generated at random by a two-state Gilbert-Elliott model. With this large number of MRT trials, we found that the CA codec modes offer small but statistically significant speech intelligibility improvements in numerous frame-erasure environments.
Keywords: noise; speech coding; speech quality; modified rhyme test (MRT); packet loss; speech intelligibility; frame erasures; AMR; EVS; channel aware; frame loss
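For readers unfamiliar with random frame-erasure generation, the following is a minimal sketch of a two-state Markov erasure process of the kind the second test describes. It assumes the simplest parametrization, in which every frame is erased while the process is in the "bad" state and none is erased in the "good" state. The function names and the transition probabilities used here are hypothetical illustrations, not values or code from the study.

```python
import random


def simulate_erasures(n_frames, p_good_to_bad, p_bad_to_good, seed=None):
    """Return a list of booleans, True where a frame is erased.

    Assumption: a two-state Markov chain that starts in the good state;
    frames are erased only while the chain is in the bad state.
    """
    rng = random.Random(seed)
    bad = False
    erased = []
    for _ in range(n_frames):
        # Update the channel state first, then mark the current frame.
        if bad:
            if rng.random() < p_bad_to_good:
                bad = False
        else:
            if rng.random() < p_good_to_bad:
                bad = True
        erased.append(bad)
    return erased


def expected_erasure_rate(p_good_to_bad, p_bad_to_good):
    """Stationary probability of the bad state, i.e. the long-run erasure rate."""
    return p_good_to_bad / (p_good_to_bad + p_bad_to_good)


def mean_burst_length(p_bad_to_good):
    """Average number of consecutive erased frames (geometric sojourn time)."""
    return 1.0 / p_bad_to_good


if __name__ == "__main__":
    # Hypothetical parameters chosen only to illustrate the model.
    pattern = simulate_erasures(10_000, 0.05, 0.45, seed=1)
    print("observed erasure rate:", sum(pattern) / len(pattern))
    print("expected erasure rate:", expected_erasure_rate(0.05, 0.45))
    print("mean burst length:", mean_burst_length(0.45))
```

The two transition probabilities jointly set the long-run erasure rate and the burstiness, which is why a model like this can represent a range of frame-erasure environments with only two parameters.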