
Chapter 11. General Audio Coding

by Jürgen Herre and Heiko Purnhagen

Keywords: natural audio coding, audio coding tools, advanced audio coding, AAC, TwinVQ, T/F coding, transform coding, PNS, LTP, low-delay audio coding, parametric audio coding, HILN, BSAC, scalable audio coding, error resilience, error robustness

This chapter introduces the concepts and tools behind the MPEG-4 General Audio coding technology, that is, the coding algorithms within the MPEG-4 natural audio framework that are not targeted at specific types of audio signals but aim at the faithful reproduction of all types of input audio signals. This implies that encoding has to be done in a flexible way rather than relying on a specific source model.

Originally, the term general audio coding was created to refer to MPEG-4 audio coders based on the coding of spectral components derived from an analysis filterbank, the so-called time/frequency (T/F) coders. In a wider sense, the MPEG-4 parametric audio coder also can be considered a general audio coder, as it aims at the good reproduction of arbitrary input audio signals by means of a flexible decomposition of the input signal into distinct sound components. This chapter will cover both technologies, T/F and parametric coders, in terms of their basic concepts and the actual specification in the context of the MPEG-4 Audio standard [MPEG4-3].

Starting with T/F-based coding, the basic concepts underpinning this type of coder will be discussed briefly. Because most parts of the MPEG-4 T/F coder are based on MPEG-2 advanced audio coding (AAC) technology, this coder will be explained. Building on the MPEG-2 AAC technology, MPEG-4 defines a number of extensions to enhance compression performance (perceptual noise substitution, long-term prediction) and enable operation at extremely low bit rates (TwinVQ), very low delays (low-delay AAC), and under error-prone transmission conditions (error-resilience tools).
The principles and algorithms involved in these extensions will be described one by one.

One major novel functionality that was not supported by any previous MPEG Audio standard is bit-rate scalability. This aspect will be covered in a separate section by discussing the basic principles behind scalability and describing the two different approaches MPEG-4 Audio offers for realizing this concept: large-step scalable audio coding and bit-sliced arithmetic coding. Finally, an introduction to parametric audio coding techniques and a section on the MPEG-4 Harmonic and Individual Lines plus Noise (HILN) parametric audio coder will conclude the description of MPEG-4 general audio coding technology.

Introduction to Time/Frequency Audio Coding

The term T/F coder was chosen in MPEG-4 to refer to coders that adhere to the traditional paradigm of perceptual audio coding by coding a spectral (frequency domain) representation of the input signal rather than the time domain signal itself. This type of coding technology has made tremendous progress in the past 10 years and has become the coder type of choice for music distribution in broadcasting, over the Internet, and on other media. This may be explained by the fact that the T/F coder framework combines both redundancy reduction and exploitation of the potential provided by irrelevancy removal. Coding of a spectral representation is an efficient way of exploiting linear