SPRINT: A Tool to Generate Concurrent Transaction-Level Models from Sequential Code
EURASIP Journal on Advances in Signal Processing volume 2007, Article number: 075373 (2007)
A high-level concurrent model such as a SystemC transaction-level model can provide early feedback during the exploration of implementation alternatives for state-of-the-art signal processing applications like video codecs on a multiprocessor platform. However, the creation of such a model starting from sequential code is a time-consuming and error-prone task. It is typically done only once, if at all, for a given design. This lack of exploration of the design space often leads to a suboptimal implementation. To support our systematic C-based design flow, we have developed a tool to generate a concurrent SystemC transaction-level model for user-selected task boundaries. Using this tool, different parallelization alternatives have been evaluated during the design of an MPEG-4 simple profile encoder and an embedded zero-tree coder. Generation plus evaluation of an alternative was possible in less than six minutes. This is fast enough to allow extensive exploration of the design space.