International Journal of Research in Advent Technology, Vol.2, No.6, June 2014 E-ISSN: 2321-9637
Implementation of LNS Adder/Subtractor in MAC Unit
Kanika Chugh1, Dinesh Kumar Verma2, Nitin Tiwari3
P.D.M College of Engineering1, 2, DKOP Labs Pvt Ltd, Noida, India3
Email: kanika.218@gmail.com1, er.dineshverma@gmail.com2, nitin.vd025@gmail.com3
Abstract- This paper presents a technique for low-power addition/subtraction in the logarithmic number system (LNS). First, the impact of partitioning the look-up tables (LUTs) required for addition/subtraction on complexity and performance is studied. MATLAB is used to calculate the values of the LNS addition and subtraction functions, which are then stored in the partitioned LUTs. The LNS adder/subtractor so obtained is then used in an LNS MAC, and the binary MAC and the LNS MAC are compared on the basis of their synthesis reports. The results are obtained from Verilog code written in Xilinx ISE (version 13.2) and implemented on an FPGA.
Index Terms- LNS adder/subtractor, LUT, FPGA
1. INTRODUCTION
The logarithmic number system (LNS) offers an efficient data representation that is used in various special-purpose VLSI processors. LNS reduces the basic arithmetic operations of multiplication, division, roots and powers to binary addition, subtraction, and right and left shifts, respectively. Furthermore, LNS leaves the designer free to choose the logarithmic base. Digital filters are commonly implemented using either fixed-point or floating-point arithmetic. The dynamic range of digital filters implemented with fixed-point arithmetic is limited. Floating-point arithmetic can increase the dynamic range at the expense of speed and accuracy. To overcome this difficulty, the use of LNS for digital signal processing applications was proposed. With LNS, multiplication and division reduce to simple addition and subtraction, providing the ability to support high-speed arithmetic over a wide dynamic range. Another advantage of LNS is its uniform geometric error characteristics across the entire range of values, leading to better precision than that of a floating-point representation having the same word length. It must be emphasized that this system cannot replace conventional arithmetic units in general-purpose computers; rather, it is intended to enhance the implementation of special-purpose processors for specialized applications (e.g., pattern recognition, digital image enhancement, radar processing, speech filtering). The sign/logarithm representation of a number consists of the sign of the number appended to the logarithm of the absolute value of the number (scaled to avoid negative logarithms). This representation thus avoids the classic problem with logarithmic number systems: the inability to represent negative numbers.
The benefits of LNS come at a cost: addition and subtraction are awkward to perform in LNS, and complex look-up tables (LUTs) or other approximation circuitry are needed. For shorter word lengths, simple LUT-based techniques suffice, whereas longer word lengths require more elaborate techniques. Several authors have proposed solutions to reduce the complexity of the awkward LNS operations. Mahalingam and Ranganathan [1] improve Mitchell’s algorithm in terms of the accuracy of the logarithmic operations, while Johansson et al. [2] use a method based on sums of bit products to implement the basic logarithmic functions. Arnold et al. [3] suggest the use of cotransformations for the reduction of the LUT. Dimitrov et al. [4] have proposed an extension of LNS in which several bases are used. In this context, and to address the complicated conversion to LNS that is required, Muscedere et al. [5] have studied techniques for converting binary to a multidigit multidimensional LNS by using LUTs. Very recently, Ismail and Coleman [6] presented a cotransformation procedure and an improved interpolation method that reduce the size of the LUTs to an extent that allows their easy synthesis in logic. Fu et al. [7] deal with LNS arithmetic optimizations on FPGAs. Arnold and Collange [8] propose complex LNS as a generalization of LNS, which represents complex values in log-polar form.
2. LNS BASICS
The basic idea in LNS is to represent data by their logarithms. Since the logarithm of a negative number is not real, signed numbers are represented in LNS by keeping the sign information in a separate bit $s_X$, which is used in combination with the logarithm of the magnitude of the number. Furthermore, the logarithm of zero is not a finite number, so an additional single-bit flag $z_X$ is used to denote that the number is zero. Let X denote the number and x the logarithm of its absolute value |X|. $X_{lns}$ is a triplet comprising the zero bit, the sign bit and x. In LNS, a number X is represented as the triplet
$$X_{lns} = (z_X, s_X, x) \tag{1}$$
where $z_X$ is asserted when X is zero, $s_X$ is the sign of X, and $x = \log_b(|X|)$ if X is not zero, with b being the base of the logarithm, also called the base of the representation. Due to the basic properties of the logarithm, the multiplication of $X_{lns}$ and $Y_{lns}$ reduces to the computation of the triplet $Z_{lns}$,
$$Z_{lns} = (z_Z, s_Z, z) \tag{2}$$
where $z_Z = z_X \text{ OR } z_Y$, $s_Z = s_X \text{ XOR } s_Y$, and $z = x + y$.
Similarly, division reduces to binary subtraction. The derivation of the logarithm a of the sum A of two triplets is more involved, as it relies on the computation of
$$a = \max\{x, y\} + \log_b(1 + b^{-|x-y|}) \tag{3}$$
$$a = \max\{x, y\} + \Phi_a(d) \tag{4}$$
where
$$\Phi_a(d) = \log_b(1 + b^{-d}), \quad d = |x - y| \tag{5}$$
Similarly, the derivation of the difference of two numbers requires the computation of
$$c = \max\{x, y\} + \log_b(1 - b^{-|x-y|}) \tag{6}$$
$$c = \max\{x, y\} + \Phi_s(d) \tag{7}$$
where
$$\Phi_s(d) = \log_b(1 - b^{-d}), \quad d = |x - y| \tag{8}$$
The basic organisation of an adder/subtractor is shown in Fig. 1. Two subtractions are performed in parallel, followed by a multiplexer which selects d according to the rule
$$s_1 = x - y \tag{9}$$
$$s_2 = y - x \tag{10}$$
$$d = |x - y| = \begin{cases} s_1, & s_1 > 0 \\ s_2, & \text{otherwise} \end{cases} \tag{11}$$
The sign of either (9) or (10) serves as the select signal for the multiplexer. The same signal is used to select the maximum of x and y required for the computation of (4) and (7). The complexity of LNS circuitry arises from the fact that the values of the functions $\Phi_a$ and $\Phi_s$ must be computed by the LNS addition/subtraction hardware for all required values of d. There are two main approaches to evaluating these functions: hardware implementation of an approximation algorithm, or offline precomputation and storage of all required values in an LUT. The former approach is generally adopted for high-precision applications, while the latter is generally preferable for smaller word lengths, i.e., in low-precision applications where the size of the required LUT is moderate.
Fig. 1: The organisation of an LNS adder/subtractor.
3. LOW POWER DESIGN OF LNS CIRCUITS
The basic organisation of an LNS adder/subtractor is shown in Fig. 1. First, the differences x-y and y-x are computed in parallel, and the sign of either of them is used as a control signal to select the maximum of x and y and to provide |x-y| to the LUT collection that implements the functions $\log_b(1 \pm b^{-d})$ of (3) and (6). Power savings can be sought by partitioning each LUT into smaller LUTs, as shown in Fig. 1. During operation only one of the sub-LUTs is activated, thus achieving power savings. Complexity reduction in LNS processors by partitioning of the LUTs has been successfully applied [9]. Here we focus on combinational-logic implementation of the LUTs, instead of memory-based implementation. The organization of the LNS adder/subtractor comprises four LUTs, grouped in two pairs, as shown in Fig. 2. The upper pair corresponds to the function required for LNS addition, i.e., operands having the same sign, while the lower pair is used for LNS subtraction. One sub-LUT of one pair is activated for an operation, depending on the signs of the operands and on the value of the most significant bit (MSB) $d_n$ of $d = |x-y|$ defined in (3) and (6). The use of $d_n$ as the selection bit simplifies the total LUT complexity. The signal s is computed as the exclusive-or of the signs of the inputs, as shown in Fig. 2.
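As an illustration of the arithmetic in equations (1) to (11), the following Python sketch models the triplet representation, LNS multiplication, and LNS addition/subtraction in floating point. It is a behavioural sketch only, not the Verilog design of this paper: the base b = 2, the names `Lns`, `encode`, `decode`, `lns_mul` and `lns_add`, and the direct evaluation of the two functions (instead of LUT look-ups) are all assumptions made for the example.

```python
import math
from typing import NamedTuple

BASE = 2.0  # assumed base b; the paper leaves the base as a design choice


class Lns(NamedTuple):
    zero: bool   # z_X: asserted when the value is zero
    sign: bool   # s_X: True for a negative value
    log: float   # x = log_b(|X|); ignored when zero is asserted


def encode(value: float) -> Lns:
    """Convert a real number to the triplet of eq. (1)."""
    if value == 0.0:
        return Lns(True, False, 0.0)
    return Lns(False, value < 0.0, math.log(abs(value), BASE))


def decode(n: Lns) -> float:
    """Convert a triplet back to a real number."""
    if n.zero:
        return 0.0
    magnitude = BASE ** n.log
    return -magnitude if n.sign else magnitude


def lns_mul(p: Lns, q: Lns) -> Lns:
    """Eq. (2): OR the zero flags, XOR the signs, add the logarithms."""
    zero = p.zero or q.zero
    return Lns(zero, p.sign != q.sign, 0.0 if zero else p.log + q.log)


def phi_add(d: float) -> float:
    """Eq. (5)."""
    return math.log(1.0 + BASE ** (-d), BASE)


def phi_sub(d: float) -> float:
    """Eq. (8); only valid for d > 0."""
    return math.log(1.0 - BASE ** (-d), BASE)


def lns_add(p: Lns, q: Lns) -> Lns:
    """Signed addition: eq. (4) for equal signs, eq. (7) otherwise."""
    if p.zero:
        return q
    if q.zero:
        return p
    s1 = p.log - q.log               # eq. (9)
    s2 = q.log - p.log               # eq. (10)
    d = s1 if s1 > 0 else s2         # eq. (11)
    larger = p if s1 > 0 else q      # same select signal picks max{x, y}
    if p.sign == q.sign:
        return Lns(False, p.sign, larger.log + phi_add(d))
    if d == 0.0:
        # Equal magnitudes with opposite signs cancel exactly.
        return Lns(True, False, 0.0)
    return Lns(False, larger.sign, larger.log + phi_sub(d))
```

For example, `decode(lns_add(encode(5.0), encode(-3.0)))` evaluates eq. (7) with d = log2(5/3) and returns a value close to 2. A hardware implementation would replace `phi_add` and `phi_sub` with look-ups on a quantised d.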
A) LNS addition/subtraction with D flip-flops
An implementation of LNS addition/subtraction is studied, based on the use of D flip-flops (DFFs) with clock and reset instead of level-sensitive latches. A significant reduction in power consumption is achieved when using D flip-flops instead of latches. The advantage of adopting flip-flop-based selection is that one or more MSBs can be used to partition the LUTs. Fig. 2 summarizes the organization of the D flip-flop-based architecture. The clock and reset signals are not shown for clarity. It is concluded that the MSB-based architecture creates simpler sub-LUTs. Since the use of the MSB for LUT selection is not efficient in a latch-based design, a solution based on D flip-flops is preferable.
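The MSB-based partitioning described in this section can be illustrated with a small software model. The sketch below precomputes four sub-tables (addition and subtraction, each split by the MSB of d) and selects exactly one of them per look-up. The base 2, the 8-bit width of d, the 4 fractional bits, and the names `build_tables` and `lookup` are assumptions chosen for the example; they are not the parameters of the design reported here.

```python
import math
from typing import Dict, List, Optional, Tuple

BASE = 2.0               # assumed logarithm base
FRAC_BITS = 4            # assumed number of fractional bits in d
D_BITS = 8               # assumed width of d; bit D_BITS-1 plays the role of d_n
HALF = 1 << (D_BITS - 1)
SCALE = 1 << FRAC_BITS

Table = List[Optional[int]]


def phi(d_code: int, subtract: bool) -> Optional[int]:
    """Quantised log_b(1 +/- b^-d) for a fixed-point code of d."""
    d = d_code / SCALE
    arg = 1.0 - BASE ** (-d) if subtract else 1.0 + BASE ** (-d)
    if arg <= 0.0:
        # d = 0 in subtraction: the result is exactly zero and would be
        # signalled through the zero flag rather than stored in a table.
        return None
    return round(math.log(arg, BASE) * SCALE)


def build_tables() -> Dict[Tuple[bool, int], Table]:
    """Build four sub-LUTs keyed by (subtract, MSB of d)."""
    tables: Dict[Tuple[bool, int], Table] = {}
    for subtract in (False, True):
        for msb in (0, 1):
            tables[(subtract, msb)] = [
                phi(msb * HALF + low, subtract) for low in range(HALF)
            ]
    return tables


def lookup(tables: Dict[Tuple[bool, int], Table],
           d_code: int, subtract: bool) -> Optional[int]:
    """Read one value; only the sub-LUT selected by (subtract, MSB) is touched."""
    msb = d_code >> (D_BITS - 1)
    low = d_code & (HALF - 1)
    return tables[(subtract, msb)][low]
```

With these assumed widths, every entry of both upper-half sub-tables (MSB = 1, i.e. d of 8 or more) rounds to zero, which gives an intuition for why selecting on the MSB can lead to simpler sub-LUTs.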
Fig. 2: DFF organisation using the MSB for LUT selection.
4. MATLAB IMPLEMENTATION
MATLAB (version 7.10.0.499) was used to calculate the outputs of equations (5) and (8) used in the LNS design; these values are then read by the Verilog code and stored in the LUTs. MATLAB generated four text files containing the values to be stored in the addition or subtraction LUTs, according to the operation being performed.
5. LNS MAC
MAC stands for multiply-accumulate: a multiplier is followed by an adder and an accumulator register that stores the result. The output of the register is fed back to one input of the adder, so that on each clock cycle the output of the multiplier is added to the register contents. The basic structure of a single MAC unit is shown in Fig. 3; it comprises a multiplier, an adder and a delay unit D implemented as a register. The LNS equivalent of the single MAC architecture is given in Fig. 4, where the binary multiplier has been replaced by an adder and the binary adder is mapped to an LNS adder/subtractor. The LNS adder/subtractor is augmented with saturation circuits and exploits the zero flag to avoid unnecessary activation of the LUT partitions and so further reduce power dissipation. The employed LNS MAC uses the MSB-based architecture for LUT partitioning and DFFs for address latching. A word length of 12 bits was selected for the LNS MAC. The LNS MAC was found to be more efficient than its binary counterpart.
Fig. 3: Organisation of the single MAC architecture.
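The data flow of the LNS MAC can be sketched in software as follows. The multiplier becomes an addition of logarithms, the accumulation becomes an LNS addition/subtraction, the zero flag bypasses the addition/subtraction function entirely, and the logarithm is clamped to model saturation. This is a floating-point sketch under stated assumptions: the base 2, the saturation bounds `LOG_MIN` and `LOG_MAX`, and all function names are hypothetical and do not describe the 12-bit design of this paper.

```python
import math
from typing import Iterable, Tuple

BASE = 2.0        # assumed logarithm base
LOG_MAX = 31.0    # assumed upper saturation bound of the logarithm
LOG_MIN = -32.0   # assumed lower saturation bound of the logarithm

Lns = Tuple[bool, bool, float]   # (zero flag, sign bit, logarithm)
ZERO: Lns = (True, False, 0.0)


def saturate(log: float) -> float:
    """Clamp the logarithm to the representable range."""
    return max(LOG_MIN, min(LOG_MAX, log))


def encode(value: float) -> Lns:
    if value == 0.0:
        return ZERO
    return (False, value < 0.0, saturate(math.log(abs(value), BASE)))


def decode(n: Lns) -> float:
    zero, sign, log = n
    if zero:
        return 0.0
    return -(BASE ** log) if sign else BASE ** log


def lns_mul(p: Lns, q: Lns) -> Lns:
    """The binary multiplier is replaced by an adder on the logarithms."""
    if p[0] or q[0]:
        return ZERO
    return (False, p[1] != q[1], saturate(p[2] + q[2]))


def lns_add(p: Lns, q: Lns) -> Lns:
    """The binary adder is replaced by an LNS adder/subtractor."""
    if p[0]:
        return q            # zero flag set: no function evaluation needed
    if q[0]:
        return p
    larger, smaller = (p, q) if p[2] > q[2] else (q, p)
    d = larger[2] - smaller[2]
    if p[1] == q[1]:
        corr = math.log(1.0 + BASE ** (-d), BASE)
    elif d == 0.0:
        return ZERO         # equal magnitudes, opposite signs
    else:
        corr = math.log(1.0 - BASE ** (-d), BASE)
    return (False, larger[1], saturate(larger[2] + corr))


def lns_mac(acc: Lns, a: Lns, b: Lns) -> Lns:
    """One MAC step: acc <- acc + a * b, entirely in the log domain."""
    return lns_add(acc, lns_mul(a, b))


def dot(xs: Iterable[float], ys: Iterable[float]) -> float:
    """Accumulate a dot product with repeated MAC steps."""
    acc = ZERO
    for x, y in zip(xs, ys):
        acc = lns_mac(acc, encode(x), encode(y))
    return decode(acc)
```

As a usage example, `dot([1.0, 2.0, -3.0], [4.0, 0.0, 0.5])` accumulates 4, then skips the zero product through the zero flag, then subtracts 1.5, giving a value close to 2.5.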
Fig. 4: LNS MAC unit.
6. SYNTHESIS
The LNS addition/subtraction design was implemented on an FPGA using the Xilinx tool, which provided the design summary, including the area and delay reports. Simulation, also carried out in Xilinx, produced results consistent with the intended behaviour of the design.
Table 1:-Area report of BINARY MAC
Table 2:-Delay summary of BINARY MAC
Table 3:- Power report of BINARY MAC.
Table 4:-Area report of LNS MAC
Table 5:-Delay summary of LNS MAC
Table 6:- Power report of LNS MAC.
7. CONCLUSION
By exploiting an optimal selection of the LNS representation combined with proper LUT partitioning (using D flip-flops), substantial savings in power dissipation are obtained at no performance penalty, as shown by the simulation results. Retiming is employed to avoid unnecessary switching activity due to unbalanced delay paths, while LUT partitioning is employed to create parts in the circuit that can be activated independently. The design techniques and performance analysis of LNS MAC units presented in this paper indicate that LNS can offer a viable solution for low-power signal processing systems with moderate word lengths.
REFERENCES
[1] V. Mahalingam and N. Ranganathan, “Improving Accuracy in Mitchell’s Logarithmic Multiplication using Operand Decomposition,” IEEE Trans. Computers, vol. 55, no. 12, pp. 1523-1535, Dec. 2006.
[2] K. Johansson, O. Gustafsson, and L. Wanhammar, “Implementation of Elementary Functions for Logarithmic Number Systems,” IET Computers and Digital Techniques, vol. 2, no. 4, pp. 295-304, http://link.aip.org/link/?CDT/2/295/1, 2008.
[3] M.G. Arnold, T.A. Bailey, J.R. Cowles, and M.D. Winkel, “Arithmetic Co-Transformations in the Real and Complex Logarithmic Number Systems,” IEEE Trans. Computers, vol. 47, no. 7, pp. 777-786, July 1998.
[4] V.S. Dimitrov, G.A. Jullien, and W.C. Miller, “Theory and Applications of the Double-Base Number System,” IEEE Trans. Computers, vol. 48, no. 10, pp. 1098-1106, Oct. 1999.
[5] R. Muscedere, V. Dimitrov, G. Jullien, and W. Miller, “Efficient Techniques for Binary-to-Multidigit Multidimensional Logarithmic Number System Conversion using Range-Addressable Look-Up Tables,” IEEE Trans. Computers, vol. 54, no. 3, pp. 257-271, Mar. 2005.
[6] R.C. Ismail and J.N. Coleman, “ROM-less LNS,” Proc. IEEE Symp. Computer Arithmetic, pp. 43-51, 2011.
[7] H. Fu, O. Mencer, and W. Luk, “FPGA Designs with Optimized Logarithmic Arithmetic,” IEEE Trans. Computers, vol. 59, no. 7, pp. 1000-1006, July 2010.
[8] M. Arnold and S. Collange, “A Real/Complex Logarithmic Number System ALU,” IEEE Trans. Computers, vol. 60, no. 2, pp. 202-213, Feb. 2011.
[9] F.J. Taylor, R. Gill, J. Joseph, and J. Radke, “A 20 bit Logarithmic Number System processor,” IEEE Trans. Computers, vol. 37, no. 5, pp. 190-199, Feb. 1988.
[10] C. Efstathiou, H.T. Vergos, and D. Nikolos, “Modulo 2^n + 1 adder design using select-prefix blocks,” IEEE Trans. Computers, vol. 52, no. 11, Nov. 2003.