AAC Encoder on TriMedia TM-1300

Abstract
The project's main goal was to implement an advanced audio encoder on a DSP platform.
Many audio coding standards are in wide use, and the most popular is MPEG-1 Audio Layer III, better known as MP3. We decided to focus on an emerging audio format and therefore selected AAC.
The selected DSP platform is the TriMedia TM-1300, an advanced digital signal processor designed mainly for demanding multimedia tasks.

    Project goals:

  • Write an AAC encoder in ANSI C
  • Optimize the code so that it runs on a 32-bit platform
  • Port the working code to the TM-1300
  • Achieve near “real-time” performance

Background
In recent years, research conducted by several organizations has contributed to dramatic advances in audio compression. AAC (Advanced Audio Coding) is a combination of state-of-the-art technologies for high-quality multichannel audio coding from four organizations: AT&T Corp., Dolby Laboratories, Fraunhofer Institute for Integrated Circuits (Fraunhofer IIS), and Sony Corporation. AAC has been standardized under the joint direction of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), as part of the MPEG-2 specification.

The Philips TriMedia, nicknamed the “media processor”, is designed as a versatile processor for video, audio and images. Its main purpose is to handle multiple media streams efficiently. The heart of the processor is a Very Long Instruction Word (VLIW) core, which offers high speed and multiple simultaneous operations while keeping cost down by avoiding on-chip scheduling logic. The scheduling is moved into an optimizing compiler, which schedules the code at compile time and so becomes in effect an integral part of the processor. The compiler has to be bought only once, whereas on-chip scheduling logic would have to be paid for with every chip used. In addition, the TriMedia has several on-chip I/O peripherals that let it cope with multiple media. Last but not least is the TriMedia co-processor, which operates in parallel with the VLIW core and can handle demanding operations.

The problem
The migration from analogue to digital communications has necessitated the development of high-quality compression techniques for telephony speech, wide-band audio, video and data signals. The most silicon-efficient implementations of the various compression engines will usually be based on ASIC designs. However, many multimedia applications require the integration of a broad range of functions such as demodulation, channel decoding and system control. For these applications, general-purpose DSP devices are attractive because they reduce development costs and time-scales dramatically. State-of-the-art DSP devices are capable of more than 300 MIPS (Million Instructions Per Second), which, in conjunction with simple off-core acceleration hardware for key functions, is ample for such applications. The main drawback of fixed-point DSP architectures is that they predominantly have a data path only 32 bits wide. This is a problem particularly for wide-band, high-quality audio compression standards, where data paths of 40 bits or more are typically required. In addition, the lack of floating-point capabilities can degrade sound quality because of insufficient processing precision.

The solution
To achieve the lowest computational complexity, it is important to match the processing algorithm to the DSP architecture.
In our case, the following adjustments were made to optimize the code:

1) An efficient integer-arithmetic implementation of the FFT that takes good advantage of the VLIW architecture of the TriMedia DSP.

2) Splitting each 64-bit floating-point variable into two 32-bit fixed-point words, and adapting the logic (using shift operations) so that it still matches the AAC flowchart.

3) Creating an optimized makefile so that the compiler's scheduler works efficiently and the generated assembly code makes the best use of the parallel processing units.
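
The project's actual fixed-point code is not reproduced here. As a rough illustration of the second point, the sketch below shows one way a 64-bit intermediate result can be carried in two 32-bit words and realigned with shift operations. It assumes a Q31 number format (values in [-1, 1) scaled by 2^31) and C99 fixed-width types; the names word_pair, mul32x32 and q31_mul are hypothetical and do not come from the original encoder.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit unsigned value held in two 32-bit words (hypothetical type). */
typedef struct {
    uint32_t hi; /* upper 32 bits */
    uint32_t lo; /* lower 32 bits */
} word_pair;

/* Full 32x32 -> 64-bit unsigned product using only 32-bit operations.
 * Each operand is split into 16-bit halves; the four partial products
 * are recombined with shifts and masks. */
static word_pair mul32x32(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    uint32_t ll = a_lo * b_lo;
    uint32_t lh = a_lo * b_hi;
    uint32_t hl = a_hi * b_lo;
    uint32_t hh = a_hi * b_hi;

    /* Everything that lands in bits 16..31 of the product, plus its carry. */
    uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);

    word_pair r;
    r.lo = (ll & 0xFFFFu) | (mid << 16);
    r.hi = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
    return r;
}

/* Q31 multiply (assumed format). The magnitude is truncated toward zero,
 * and the single overflow case, -1 * -1, saturates to INT32_MAX. */
static int32_t q31_mul(int32_t a, int32_t b)
{
    int negative = (a < 0) != (b < 0);
    uint32_t ua = (a < 0) ? (0u - (uint32_t)a) : (uint32_t)a;
    uint32_t ub = (b < 0) ? (0u - (uint32_t)b) : (uint32_t)b;
    word_pair p = mul32x32(ua, ub);

    /* Shift the 64-bit product right by 31 across the word boundary. */
    uint32_t q = (p.hi << 1) | (p.lo >> 31);

    /* Invariant: the shifted magnitude never exceeds 2^31. */
    assert(q <= 0x80000000u);
    if (q > 0x7FFFFFFFu) {
        q = 0x7FFFFFFFu;
    }
    return negative ? -(int32_t)q : (int32_t)q;
}
```

For example, q31_mul(0x40000000, 0x40000000) multiplies 0.5 by 0.5 and returns 0x20000000, which is 0.25 in Q31. Keeping the full product in two words before shifting is what preserves the low-order bits that a single 32-bit multiply would discard.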

Tools

  • PC workstation, Windows 2000 OS
  • MS Visual C++, version 6.0
  • TriMedia TM-1300 IREF board
  • 166 MHz Philips TriMedia TM-1300 VLIW processor
  • Two channel audio I/O and microphone input
  • TriMedia Software Development Environment (SDE)

Conclusions

  1. The streams tested on the IREF board produced the same results as the original code running on the PC (Pentium III, 400 MHz).
  2. No compiler is perfect: it cannot optimize code as well as a skilled programmer can by hand. Writing at the machine-language (assembler) level is a must in order to achieve real-time performance.
  3. The VLD feature of the TriMedia is not efficient enough. The stalls caused by frequent memory accesses are costly; it is therefore highly recommended to build a clever software algorithm for the Huffman coding stage rather than use the co-processor.
  4. According to the profiler, the most demanding function is the “Filter Bank” process. To achieve real-time encoding, it is highly recommended to write this function in assembler.
  5. Further improvement can be achieved by reading the whole raw file into a buffer and writing the output into another buffer. Read/write operations to the hard disk are expensive in terms of cycles.
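
The buffering suggested in the last conclusion can be sketched with the standard C library alone. The example below is a minimal sketch, not the project's code: the helper names read_whole_file and write_whole_file are hypothetical, and it assumes the raw input fits in memory. The idea is to touch the disk once on input and once on output, so that the encoder loop itself works only on memory buffers.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read an entire file into a heap buffer (hypothetical helper).
 * Returns NULL on failure; on success the caller must free() the buffer. */
static unsigned char *read_whole_file(const char *path, size_t *out_size)
{
    FILE *fp;
    long size;
    unsigned char *buf;

    assert(path != NULL && out_size != NULL);
    fp = fopen(path, "rb");
    if (fp == NULL) {
        return NULL;
    }
    /* Determine the file size by seeking to the end and back. */
    if (fseek(fp, 0L, SEEK_END) != 0 || (size = ftell(fp)) < 0 ||
        fseek(fp, 0L, SEEK_SET) != 0) {
        fclose(fp);
        return NULL;
    }
    /* One extra byte avoids a zero-sized allocation for empty files. */
    buf = (unsigned char *)malloc((size_t)size + 1);
    if (buf != NULL && fread(buf, 1, (size_t)size, fp) != (size_t)size) {
        free(buf);
        buf = NULL;
    }
    fclose(fp);
    if (buf != NULL) {
        *out_size = (size_t)size;
    }
    return buf;
}

/* Write a complete output buffer in a single call (hypothetical helper).
 * Returns 0 on success and -1 on failure. */
static int write_whole_file(const char *path, const unsigned char *buf,
                            size_t size)
{
    FILE *fp;
    size_t written;

    assert(path != NULL && buf != NULL);
    fp = fopen(path, "wb");
    if (fp == NULL) {
        return -1;
    }
    written = fwrite(buf, 1, size, fp);
    if (fclose(fp) != 0 || written != size) {
        return -1;
    }
    return 0;
}
```

An encoder built this way would call read_whole_file once before the frame loop, append encoded frames to an in-memory output buffer, and call write_whole_file once at the end.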

Acknowledgment
We would like to express our gratitude to the Ollendorff Minerva Center for supporting this project, to Philips Semiconductors for supplying us with the MDS TriMedia evaluation board and software tools, and to all the people who helped us throughout the work.

Collaboration:

Philips Semiconductors