Multicore Parallel Implementation of 2D-FFT Based on TMS320C6678 DSP

Scientific Journal of Information Engineering June 2015, Volume 5, Issue 3, PP.61-66

Wende Wu 1,2,#, Zhiyong Xu 1
1. Institute of Optics and Electronics of Chinese Academy of Sciences, Chengdu 610209, China
2. University of Chinese Academy of Sciences, Beijing 100039, China
# Email: wuwende2008@126.com

Abstract
After studying the characteristics of different multicore DSP programming models and of the two-dimensional FFT (2D-FFT), we propose a multicore parallel scheme for the 2D-FFT and implement it on the TMS320C6678 DSP. The scheme makes full use of the parallel computing capability of the multicore DSP and improves the efficiency of the 2D-FFT. It offers a useful reference for implementing image processing algorithms based on the 2D-FFT.
Keywords: Multicore DSP; Parallel Programming; 2D-FFT; Inter-Processor Communication

1. INTRODUCTION
The 2D-FFT is a basic algorithm that is widely used in image processing. Because images are large and multidimensional, the 2D-FFT is computationally complex and time-consuming, and it severely limits the efficiency of image processing algorithms. Platforms consisting of multiple DSPs and FPGAs have been used to meet the real-time requirements of image processing algorithms based on the 2D-FFT [1],[2]. However, multiple DSPs increase the power consumption and volume of a platform, both of which are tightly constrained in embedded systems. Since Texas Instruments (TI) introduced the high-performance multicore DSP TMS320C6678 in 2010, applying it to image processing platforms has become a trend, but making full use of the parallel computing capability of the multicore DSP and improving the efficiency of the 2D-FFT remain open problems. By studying the characteristics of the 2D-FFT and the C6678, we propose a multicore parallel algorithm based on data division for the 2D-FFT. Experimental results show that, when the image size is appropriate, the multicore parallel algorithm achieves a good speedup ratio and parallel efficiency on the C6678 DSP. This paper provides a reference for the multicore parallel implementation of image processing algorithms based on the 2D-FFT.
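As an illustration of what data division can mean for the 2D-FFT, the following Python sketch assumes the standard row-column decomposition: the rows are split into contiguous blocks, each block is transformed independently, the intermediate result is transposed, and the same row pass is applied again. This is only a sketch under stated assumptions, not the implementation evaluated in this paper. All function names are hypothetical, a naive DFT stands in for an optimized FFT kernel, and Python threads stand in for DSP cores (they show the partitioning, not a real speedup).

```python
import cmath
from concurrent.futures import ThreadPoolExecutor


def dft_1d(x):
    """Naive O(n^2) 1D DFT; a stand-in for an optimized FFT kernel."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]


def transform_rows(rows):
    """Work of one (hypothetical) core: 1D DFT of every row in its block."""
    return [dft_1d(row) for row in rows]


def split_rows(matrix, n_workers):
    """Data division: cut the rows into contiguous blocks, one per worker."""
    size = (len(matrix) + n_workers - 1) // n_workers
    return [matrix[i:i + size] for i in range(0, len(matrix), size)]


def parallel_row_pass(matrix, n_workers):
    """Transform all rows, with each block handled by a separate worker."""
    blocks = split_rows(matrix, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(transform_rows, blocks))  # order is preserved
    return [row for block in results for row in block]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def dft_2d_parallel(image, n_workers=4):
    """Row pass, transpose, row pass again (i.e. column pass), transpose back."""
    rows_done = parallel_row_pass(image, n_workers)
    cols_done = parallel_row_pass(transpose(rows_done), n_workers)
    return transpose(cols_done)


def dft_2d_direct(image):
    """Direct evaluation of the 2D DFT definition, used only as a check."""
    m, n = len(image), len(image[0])
    return [[sum(image[r][c] * cmath.exp(-2j * cmath.pi * (u * r / m + v * c / n))
                 for r in range(m) for c in range(n))
             for v in range(n)]
            for u in range(m)]


if __name__ == "__main__":
    img = [[float((3 * r + 5 * c) % 7) for c in range(8)] for r in range(4)]
    par = dft_2d_parallel(img, n_workers=4)
    ref = dft_2d_direct(img)
    err = max(abs(par[u][v] - ref[u][v]) for u in range(4) for v in range(8))
    print("max abs difference:", err)
```

In this sketch the row blocks are independent within each pass, so the only point at which the workers must exchange data is the transpose between the two passes.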

2. MULTICORE PARALLEL PROCESSING MODEL
Multicore DSPs mainly support three parallel programming models: the Master/Slave Processing Model, the Data Flow Processing Model and the OpenMP Fork-Join Model.

The Master/Slave Processing Model, shown in Fig.1, represents centralized control with distributed execution. A master core is responsible for scheduling the various threads of execution, which can be allocated to any available core for processing. It must also deliver any data required by a thread to the slave core. Applications that fit this model inherently consist of many small independent threads that fit easily within the processing resources of a single core[3].

The Data Flow model, shown in Fig.2, represents distributed control and execution. Each core processes a block of data using various algorithms, and the data is then passed to another core for further processing. The initial core is often connected to an input interface that supplies the initial data for processing from either a sensor or an FPGA. Scheduling is triggered by data availability. Applications that fit the Data Flow model often contain large and computationally complex components that depend on each other and may not fit on a single core[3].

OpenMP is an Application Programming Interface (API) for developing multi-threaded applications in C/C++ or
