MPEG-4 Part 3


MPEG-4 Part 3 (formally ISO/IEC 14496-3) is the third part of the ISO/IEC MPEG-4 international standard. It specifies audio coding methods.

Bifurcation in the AAC technical standard

Advanced Audio Coding (AAC) in MPEG-4 Part 3 was enhanced relative to the previous standard, MPEG-2 Part 7, to provide better sound quality at a given encoding bitrate.

Any differences between Part 3 and Part 7 are expected to be resolved by the ISO standards body in the near future, to avoid future bitstream incompatibilities. Because the standard is new, no player or codec incompatibilities are known at present.

MPEG-4 Part 3 includes several AAC variants and related coding tools:

  • Low Complexity Advanced Audio Coding (LC-AAC)
  • High-Efficiency Advanced Audio Coding (HE-AAC)
  • Scalable Sample Rate Advanced Audio Coding (AAC-SSR)
  • Bit Sliced Arithmetic Coding (BSAC)
  • Long Term Predictor (LTP)

HE-AAC

HE-AAC is an extension of AAC that uses Spectral Band Replication (SBR) and, in its second version, Parametric Stereo (PS). It is designed to increase coding efficiency at low bitrates by representing part of the audio signal parametrically.

AAC-SSR

AAC Scalable Sample Rate (AAC-SSR) was introduced to the MPEG-4 standard by Sony. The audio signal is first split into 4 bands using a 4-band polyphase quadrature filter (PQF) bank. Each of these 4 bands is then further split using MDCTs with a size k of 32 or 256 samples. This is similar to normal MPEG-4 AAC, which applies MDCTs with a size k of 128 or 1024 directly to the audio signal.

The advantage of this technique is that short block switching can be done separately for every PQF band. High frequencies can therefore be encoded using short blocks to enhance temporal resolution, while low frequencies can still be encoded with high spectral resolution. However, because of aliasing between the 4 PQF bands, coding efficiency around the band edges at fs/8, 2 * fs/8 and 3 * fs/8 is worse than that of normal MPEG-4 AAC.
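The relationship between the two MDCT size pairs can be checked with a little arithmetic. The following is a minimal sketch, not part of the standard: the function names are hypothetical, and it assumes the PQF bank is critically sampled, so that each of the 4 bands runs at one quarter of the input sample rate.

```python
def mdct_resolution(sample_rate_hz, bands, k):
    """Return (bin width in Hz, block duration in ms) for an MDCT of size k
    applied to one of `bands` equal-width bands.

    Assumption: the filter bank is critically sampled, so each band is
    decimated by `bands`. Use bands=1 for an MDCT on the full signal.
    """
    band_width_hz = sample_rate_hz / 2 / bands   # equal share of 0..fs/2
    band_rate_hz = sample_rate_hz / bands        # sample rate inside one band
    bin_width_hz = band_width_hz / k             # spectral resolution
    block_ms = 1000.0 * k / band_rate_hz         # temporal resolution
    return bin_width_hz, block_ms


def pqf_band_edges(sample_rate_hz, bands=4):
    """Return the interior band edges, i.e. (1, 2, 3) * fs/8 for 4 bands."""
    return [i * sample_rate_hz / (2 * bands) for i in range(1, bands)]


if __name__ == "__main__":
    fs = 48000
    print("normal AAC, long :", mdct_resolution(fs, 1, 1024))
    print("AAC-SSR,    long :", mdct_resolution(fs, 4, 256))
    print("normal AAC, short:", mdct_resolution(fs, 1, 128))
    print("AAC-SSR,    short:", mdct_resolution(fs, 4, 32))
    print("PQF band edges   :", pqf_band_edges(fs))
```

Under these assumptions, 4 bands of 256 coefficients give the same 1024 coefficients per block as normal AAC, so the spectral and temporal resolution of long and short blocks match. What differs is that AAC-SSR can choose the block length per band, and that the band edges (6, 12 and 18 kHz at a 48 kHz sample rate) are where the aliasing penalty applies.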

MPEG-4 AAC-SSR is very similar to ATRAC and ATRAC-3.

Why AAC-SSR was introduced

The idea behind AAC-SSR was not only the advantage listed above, but also the possibility of reducing the data rate by removing 1, 2 or 3 of the upper PQF bands. A very simple bitstream splitter can remove these bands and thus reduce the bitrate and sample rate.

Example:

  • 4 subbands: bitrate = 128 kbit/s, sample rate = 48 kHz, f_lowpass = 20 kHz
  • 3 subbands: bitrate ~ 120 kbit/s, sample rate = 48 kHz, f_lowpass = 18 kHz
  • 2 subbands: bitrate ~ 100 kbit/s, sample rate = 24 kHz, f_lowpass = 12 kHz
  • 1 subband: bitrate ~ 65 kbit/s, sample rate = 12 kHz, f_lowpass = 6 kHz

Note: although removing bands in this way is possible, the resulting quality is much worse than is typical for the resulting bitrate. Normal 64 kbit/s AAC, for example, achieves a bandwidth of 14-16 kHz by using intensity stereo and reduced noise-to-mask ratios (NMRs). This degrades audible quality less than transmitting a 6 kHz bandwidth with perfect quality.
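The band-removal arithmetic behind the example can be sketched as follows. This is an illustrative calculation only: the function name is hypothetical, it gives the theoretical upper limit on bandwidth and the lowest sample rate that could carry it, and it does not model bitrates. An encoder may choose a lower cutoff (the example lists 20 kHz rather than 24 kHz for 4 subbands), and a decoder may keep a higher output rate (the example lists 48 kHz for 3 subbands).

```python
def after_band_removal(sample_rate_hz, kept_bands, total_bands=4):
    """Return (maximum audio bandwidth in Hz, minimum sample rate in Hz)
    when only the lowest `kept_bands` of `total_bands` PQF bands are kept."""
    if not 1 <= kept_bands <= total_bands:
        raise ValueError("kept_bands must be between 1 and total_bands")
    band_width_hz = sample_rate_hz / (2 * total_bands)  # fs/8 for 4 bands
    max_bandwidth_hz = kept_bands * band_width_hz
    min_sample_rate_hz = 2 * max_bandwidth_hz           # Nyquist limit
    return max_bandwidth_hz, min_sample_rate_hz


if __name__ == "__main__":
    for kept in (4, 3, 2, 1):
        bandwidth, rate = after_band_removal(48000, kept)
        print(f"{kept} subband(s): bandwidth <= {bandwidth / 1000:g} kHz, "
              f"sample rate >= {rate / 1000:g} kHz")
```

For a 48 kHz source this reproduces the 18, 12 and 6 kHz low-pass figures and the 24 and 12 kHz sample rates listed above for 3, 2 and 1 subbands.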

BSAC

Bit Sliced Arithmetic Coding (BSAC) is an MPEG-4 standard (ISO/IEC 14496-3 subpart 4) for scalable audio coding. BSAC replaces the noiseless coding stage of AAC with an alternative scheme; the rest of the processing is identical to AAC. The resulting scalability allows for nearly transparent sound quality at 64 kbit/s and graceful degradation at lower bitrates. BSAC operates in the range of 16 kbit/s to 64 kbit/s and performs best between 40 kbit/s and 64 kbit/s. The AAC-BSAC codec is used in Digital Multimedia Broadcasting (DMB) applications.
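The general idea of bit slicing, which is what makes this kind of graceful degradation possible, can be illustrated with a toy example. This is not the BSAC bitstream format: the arithmetic coding stage is omitted entirely, the function names are hypothetical, and the coefficients are made-up non-negative integers. The sketch only shows that when values are sent most significant bit plane first, cutting the stream short yields a coarser version of every value instead of losing some values outright.

```python
def bit_planes(values, num_bits):
    """Split non-negative integers into bit planes, most significant first."""
    return [[(v >> b) & 1 for v in values]
            for b in range(num_bits - 1, -1, -1)]


def reconstruct(planes, num_bits):
    """Rebuild values from the bit planes received so far.

    Bit planes that were not received are treated as zero.
    """
    values = [0] * len(planes[0]) if planes else []
    for i, plane in enumerate(planes):
        shift = num_bits - 1 - i
        values = [v | (bit << shift) for v, bit in zip(values, plane)]
    return values


if __name__ == "__main__":
    coeffs = [13, 6, 3, 0]          # made-up quantized values, 4 bits each
    planes = bit_planes(coeffs, 4)
    for received in range(1, 5):
        print(received, "plane(s):", reconstruct(planes[:received], 4))
```

Each additional plane refines all coefficients at once, so a receiver that gets only part of the data still decodes an approximation of the whole spectrum.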
