< retour aux publications

Compilation Methods for the Address Calculation Units of Embedded Processor Systems

Auteur(s) : Cl. B. Liem, P. Paulin, A. A. Jerraya

Journal : Design Automation for Embedded Systems

Volume : 2

Issue : 1

Pages : 71-77

Doi : 10.1023/A:1008862510877

An essential component of today''s embedded system is an instruction-set processor running real-time software. All variations of these core components contain at least the minimum data-flow processing capabilities, while a certain class contain specialized units for highly data-intensive operations for Digital Signal Processing (DSP). For the required level of memory interaction, the parallel executing Address Calculation Unit (ACU) is often used to tune the architecture to the memory access characteristics of the application. The design of the ACU is performance critical. In today''s typical design flow, this design task is somewhat driven by intuition as the transformation from application algorithm to architecture is complex and the exploration space is immense. Automatic utilities to aid the designer are essential; however, the key compilation techniques which map high-level language constructs onto addressing units have lagged far behind the emergence of these units. This paper presents a new retargetable approach and prototype tool for the analysis of array references and traversals for efficient use of ACUs. In addition to being an enhancement to existing compiler systems, the ArrSyn utility may be used as an aid to architecture exploration. A simple specification of the addressing resources and basic operations drives the available transformations and allows the designer to quickly evaluate the effects on speed and code size of his/her algorithm. Thus, the designer can tune the design of the ACU toward the application constraints. ArrSyn has been successfully used together with a C compiler developed for a VLIW architecture for an MPEG audio decoding application. The combination of these methods with the C compiler showed on average a 39% speedup and 29% code size reduction for a representative set of DSP benchmarks.