Poster Paper Proc. of Int. Conf. on Advances in Computer Engineering 2011

Asynchronous Implementation of Split-radix FFT Algorithm Using FPGA

Shobith Thomas Jacob1, R. Ramesh2 and S. Malarvizhi3

1 Department of Electronics and Communication Engineering, S.R.M University, Tamil Nadu, 603 203, India. Email: stjcec@yahoo.com
2 Department of Electronics and Communication Engineering, S.R.M University, Tamil Nadu, 603 203, India. Email: raammesh1976@yahoo.co.in
3 Department of Electronics and Communication Engineering, S.R.M University, Tamil Nadu, 603 203, India. Email: hod.ece@ktr.srmuniv.ac.in

Abstract—The Fast Fourier Transform (FFT) has a variety of applications, particularly in communication engineering. The split-radix FFT (SRFFT) is one of many algorithms for computing the FFT. In this paper, we propose a completely asynchronous design of the split-radix FFT processor core with no glitches at the output. It reduces the system latency drastically. Because the design is asynchronous, the processor needs no separate data-load stage: the output can be read while the new inputs are being loaded, which improves the speed of the device. Further, we present an algorithm to reduce the number of ROMs needed to store the twiddle factors.

Index Terms—FFT, SRFFT, asynchronous system, CORDIC algorithm.

I. INTRODUCTION

The invention of the Fast Fourier Transform has brought a giant leap in the performance of modern communication systems. The Fast Fourier Transform (FFT) is one of the most important algorithms in signal processing and communications and is used in orthogonal frequency division multiplexing (OFDM) systems (Ref. [1]). As fast versions of the DFT and its inverse, the FFT and the IFFT are important analysis methods in digital signal spectrum analysis (Ref. [2]). Most existing designs are synchronous, so the speed of the processor core is limited by the clock input. The speed of the input and output sections of the device also has a major effect on the clock rate. In this paper, we present an asynchronous split-radix FFT core whose speed is limited only by the intrinsic delay of the device, so that it can be used in real-time applications without any glitches at the output. The system latency is significantly reduced by this design. The paper also uses an algorithm that reduces the number of ROMs for storing the twiddle factors by exploiting certain properties of complex numbers. This algorithm remains valid even for very complex systems.

II. COMMONLY USED ALGORITHMS

Since the Cooley-Tukey algorithm was proposed (Ref. [3]), many new algorithms have been put forward. Radix-2, radix-4, radix-8, split-radix and mixed-radix are among the most common. The first three use a single radix throughout the calculation of the FFT. The split-radix algorithm uses multiple radices within a single stage, while the mixed-radix algorithm uses different radices in different stages of the computation. It can be advantageously used to compute multiple internal stages in parallel. Research shows that the split-radix algorithm approaches the theoretical minimum number of multiplications (Ref. [4]). The Winograd algorithm is yet another class of algorithm. However, FFTs are more modular, which is an advantage in hardware implementations, especially in very large scale integration (Ref. [5]). No major work has been undertaken on asynchronous FPGA implementation of the split-radix algorithm.
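The ROM reduction mentioned in the abstract relies on properties of complex numbers. The sketch below is not the paper's scheme; it only illustrates one well-known property of this kind, assuming twiddle factors W_N^k = exp(-j2πk/N). Only the first quarter of the factors is stored, and the rest are rebuilt from W_N^(k + qN/4) = (-j)^q W_N^k. The names `quarter_table` and `twiddle` are hypothetical, chosen for illustration.

```python
import cmath


def quarter_table(n):
    """Store W_n^k = exp(-2j*pi*k/n) for k = 0 .. n/4 - 1 only (n a multiple of 4)."""
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 4)]


def twiddle(table, n, k):
    """Rebuild W_n^k for any integer k from the quarter table.

    Uses W_n^(k + q*n/4) = (-j)^q * W_n^k. Multiplying by -j only swaps
    the real and imaginary parts and negates one of them, so no
    multiplier is needed for this step.
    """
    q, r = divmod(k % n, n // 4)
    w = table[r]
    for _ in range(q):
        w = complex(w.imag, -w.real)  # w * (-j)
    return w


if __name__ == "__main__":
    n = 16
    table = quarter_table(n)
    # Four stored entries stand in for all sixteen twiddle factors.
    print(len(table), twiddle(table, n, 5))
```

In this model the table holds N/4 entries instead of N; whether the paper's algorithm achieves the same or a larger reduction is not stated in this section.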

III. THE SPLIT-RADIX FFT

This algorithm for computing the FFT separates the input sequence into odd- and even-indexed samples. For a DFT of length N = 2^m (where m is any natural number), the even- and odd-indexed output frequencies are given in Ref. [6], which specifies the general equations in terms of X(2r), X(4r + 1) and X(4r + 3). For a 16-point FFT, more specific equations are used in Ref. [2]. The three main equations used in Ref. [2] for calculating subsequent stages are as below:

The radix-4 algorithm is used to compute the odd-indexed outputs. It is given by the following equations:
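For reference, the general split-radix decomposition in terms of X(2r), X(4r + 1) and X(4r + 3) can be written in the usual textbook notation as follows. The notation is an assumption here, with W_N = exp(-j2π/N); these are the general forms, not the 16-point-specific equations of Ref. [2]. The even-indexed outputs X(2r) run over r = 0, 1, ..., (N/2)-1.

```latex
\begin{align}
X(2r)   &= \sum_{n=0}^{N/2-1} \Bigl[ x(n) + x\bigl(n+\tfrac{N}{2}\bigr) \Bigr] W_{N/2}^{rn} \\
X(4r+1) &= \sum_{n=0}^{N/4-1} \Bigl\{ \Bigl[ x(n) - x\bigl(n+\tfrac{N}{2}\bigr) \Bigr]
           - j \Bigl[ x\bigl(n+\tfrac{N}{4}\bigr) - x\bigl(n+\tfrac{3N}{4}\bigr) \Bigr] \Bigr\}
           W_N^{n}\, W_{N/4}^{rn} \\
X(4r+3) &= \sum_{n=0}^{N/4-1} \Bigl\{ \Bigl[ x(n) - x\bigl(n+\tfrac{N}{2}\bigr) \Bigr]
           + j \Bigl[ x\bigl(n+\tfrac{N}{4}\bigr) - x\bigl(n+\tfrac{3N}{4}\bigr) \Bigr] \Bigr\}
           W_N^{3n}\, W_{N/4}^{rn}
\end{align}
```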

where r = 0, 1, 2, ..., (N/4)-1. This algorithm uses an L-section butterfly structure as given in Ref. [2]. A detailed description of how this equation is derived and transformed in subsequent stages is presented in Ref. [2]. The general signal flow graph is shown in Fig. 1.
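The decomposition described in this section can be sketched in software as follows. This is a minimal recursive Python model, assuming the standard decimation-in-frequency split-radix formulation with W_N = exp(-j2π/N). It is not the asynchronous FPGA design of the paper, and the function name `srfft` is hypothetical.

```python
import cmath


def srfft(x):
    """Recursive split-radix FFT sketch; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    if n == 2:
        return [x[0] + x[1], x[0] - x[1]]
    h, q = n // 2, n // 4

    # Even-indexed outputs X(2r): one half-length transform
    # of x(n) + x(n + N/2).
    even = srfft([x[i] + x[i + h] for i in range(h)])

    # Odd-indexed outputs: two quarter-length transforms.
    a = [x[i] - x[i + h] for i in range(q)]          # x(n) - x(n + N/2)
    b = [x[i + q] - x[i + q + h] for i in range(q)]  # x(n + N/4) - x(n + 3N/4)
    w = [cmath.exp(-2j * cmath.pi * i / n) for i in range(q)]  # W_N^n
    odd1 = srfft([(a[i] - 1j * b[i]) * w[i] for i in range(q)])       # X(4r + 1)
    odd3 = srfft([(a[i] + 1j * b[i]) * w[i] ** 3 for i in range(q)])  # X(4r + 3)

    # Interleave the three partial results into natural output order.
    out = [0j] * n
    out[0::2] = even
    out[1::4] = odd1
    out[3::4] = odd3
    return out


if __name__ == "__main__":
    samples = [complex(i, 0) for i in range(16)]  # arbitrary 16-point input
    for k, value in enumerate(srfft(samples)):
        print(k, value)
```

The even half is handled with a radix-2 style split and the odd half with a radix-4 style split, which is what gives the butterfly its L shape. In hardware the multiplications by ±j reduce to swapping real and imaginary parts, so only the W_N^n and W_N^(3n) products need multipliers.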