# EFFECTIVE DA-BASED RECONFIGURABLE FIR DIGITAL FILTER FOR FPGA AND ASIC

Gadi Haritha<sup>1</sup>, K.Yuvaraj<sup>2</sup>

<sup>1</sup>PG Scholar, Dept. of VLSI, SV College of Engineering, Tirupati, India, <u>harithar8@gmail.com</u> <sup>2</sup>Asst. Prof., Dept. of ECE, SV College of Engineering, Tirupati, India, <u>yuvaraj.k@gmail.com</u>

### ABSTRACT



This paper presents efficient distributed arithmetic (DA)-based approaches for high-throughput reconfigurable implementation of finite impulse response (FIR) filters whose filter coefficients change during runtime. Conventionally, for reconfigurable DA-based implementation of FIR filters, the lookup tables (LUTs) are required to be implemented in RAM, and the RAM-based LUT is found to be costly for ASIC implementation. Therefore, a shared-LUT design is proposed to realize the DA computation. Instead of using separate registers to store the possible results of partial inner products for DA processing of different bit positions, registers are shared by the DA units for bit slices of different weightage. The proposed design has nearly 68% and 58% less area-delay product and 78% and 59% less energy per sample than the DA-based systolic structure and the carry-save adder (CSA)-based structure, respectively, for the ASIC implementation. A distributed-RAM-based design is also proposed for the field-programmable gate array (FPGA) implementation of the reconfigurable FIR filter, which supports up to 91 MHz input sampling frequency and offers 54% and 29% fewer slices than the systolic structure and the CSA-based structure, respectively, when implemented in the Xilinx Virtex-5 FPGA device (XC5VSX95T-1FF1136).

Index Terms—Circuit optimization, distributed arithmetic (DA), finite-impulse response (FIR) filter, reconfigurable implementation

### I. INTRODUCTION

Finite impulse response (FIR) digital filters are widely used because of their key role in digital signal processing (DSP) applications. With the advancement of very large scale integration (VLSI) technology, and as DSP has become increasingly popular over the years, the high-speed realization of FIR filters with low power consumption has become much more demanding. Several attempts have, therefore, been made to develop dedicated and reconfigurable architectures for the realization of FIR filters on application-specific integrated circuit (ASIC) and field-programmable gate array (FPGA) platforms. Systolic designs represent an attractive architectural paradigm for efficient hardware implementation of computation-intensive DSP applications.

The basic operations required for DA-based computation of an inner product are a sequence of look-up table (LUT) accesses followed by shift-accumulation operations on the LUT output. In FIR filtering, one of the convolving sequences is derived from the input samples while the other sequence is derived from the fixed impulse response coefficients of the filter. This behavior of the FIR filter makes it possible to use a DA-based technique for memory-based realization. DA-based designs of least mean square (LMS) adaptive filters have also been reported, using a decomposition of the DA-based FIR computation and subsequent memory decomposition. All these structures, however, are not suitable for implementation of the FIR filter in systolic hardware, since the partial products available from the partitioned memory modules are summed together by a network of output adders.
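The LUT-access and shift-accumulate steps described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `build_lut` and `da_inner_product` are hypothetical, the samples are assumed to be unsigned integers (sign handling is omitted), and bit slices are processed LSB first with integer weights rather than the fractional weights used in hardware descriptions.

```python
def build_lut(coeffs):
    """Precompute all 2**M partial sums of the M coefficients.

    Bit k of the address selects whether coeffs[k] is included in the sum.
    """
    m = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(1 << m)]


def da_inner_product(coeffs, samples, bits):
    """Inner product of coeffs and unsigned `bits`-bit samples via DA.

    Assumption: samples are non-negative integers below 2**bits.
    """
    lut = build_lut(coeffs)
    acc = 0
    for l in range(bits):                    # one bit slice per step, LSB first
        addr = 0
        for k, x in enumerate(samples):      # gather bit l of every sample
            addr |= ((x >> l) & 1) << k
        acc += lut[addr] << l                # LUT access, then shift-accumulate
    return acc


coeffs = [3, -1, 4, 2]
samples = [5, 9, 2, 7]
print(da_inner_product(coeffs, samples, bits=4))
```

No multiplier appears in the loop: each step is one table read and one shifted addition, which is the property that makes DA attractive for memory-based realization.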

A reconfigurable finite impulse response (FIR) filter whose filter coefficients change dynamically during runtime plays an important role in software-defined radio systems, multichannel filters, and digital up/down converters. However, the well-known multiple constant multiplication-based technique, which is widely used for the implementation of FIR filters, cannot be used when the filter coefficients change dynamically. On the other hand, a general multiplier-based structure requires a large chip area and consequently limits the maximum possible order of the filter that can be realized for high-throughput applications. The distributed arithmetic (DA)-based technique has gained substantial popularity in recent years for its high-throughput processing capability and increased regularity, which result in cost-effective and area-time-efficient computing structures. The main operations required for DA-based computation are a sequence of lookup table (LUT) accesses followed by shift-accumulation operations on the LUT output.

The conventional DA implementation of an FIR filter assumes that the impulse response coefficients are fixed, and this behavior makes it possible to use ROM-based LUTs. The memory requirement for DA-based implementation of FIR filters, however, increases exponentially with the filter order. To eliminate the problem of such a large memory requirement, systolic decomposition techniques are suggested for DA-based implementation of long-length convolutions and FIR filters of large orders. For a reconfigurable DA-based FIR filter whose filter coefficients change dynamically, we need to use rewritable RAM-based LUTs instead of ROM-based LUTs. Another approach is to store the coefficients in the analog domain by using serial digital-to-analog converters, resulting in a mixed-signal architecture. We also find quite a few works on DA-based implementation of adaptive filters, where the coefficients change at every cycle. In this brief, we present efficient schemes for the optimized shared-LUT implementation of reconfigurable FIR filters using the DA technique, where LUTs are shared by the DA units for bit slices of different weightage. In addition, the filter coefficients can be changed dynamically at runtime with a very small reconfiguration latency.
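For reference, one standard way to write the decomposition that the later sections rely on is sketched below. It assumes unsigned $L$-bit fractional samples (sign handling of the most significant bit is omitted) and a filter length $N = PM$. The numbered equations cited later in the text are not reproduced in this section, so the correspondence with them is an assumption.

$$y(n) = \sum_{k=0}^{N-1} h(k)\, x(n-k), \qquad x(n-k) = \sum_{l=0}^{L-1} 2^{-l}\, [x(n-k)]_l$$

Substituting the bit-level expansion and splitting the index as $k = m + pM$ gives

$$y(n) = \sum_{l=0}^{L-1} 2^{-l} \left( \sum_{p=0}^{P-1} S_{l,p} \right), \qquad S_{l,p} = \sum_{m=0}^{M-1} h(m + pM)\, [x(n-m-pM)]_l$$

Each $S_{l,p}$ can take only $2^M$ distinct values, so $P$ LUTs of $2^M$ words each replace a single LUT of $2^N$ words. This is how the decomposition removes the exponential growth of memory with the filter order.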

### **II. EXISTING SYSTEMS**

The systolic decomposition scheme is found to offer a flexible choice of the address length of the look-up tables (LUTs) for DA-based computation, which allows a suitable area-time trade-off to be chosen. It is observed that using smaller address lengths for the DA-based computing units reduces the memory size but increases the adder complexity and the latency. For efficient DA-based realization of FIR filters of different orders, the flexible linear systolic design is implemented on a Xilinx Virtex-E XCV2000E FPGA using a hybrid combination of Handel-C and parameterizable VHDL cores. Various key performance metrics such as number of slices, maximum usable frequency, dynamic power consumption, energy density, and energy throughput are estimated for different filter orders and address lengths.
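The memory-versus-adder trade-off described above can be made concrete with a small counting sketch. The helper `da_cost` is hypothetical and uses a simplified cost model that is an assumption, not the paper's: an N-tap filter is split into P = N/M sections, each section has one LUT of 2^M words, and P - 1 additions combine the section outputs.

```python
def da_cost(n_taps, addr_len):
    """Return (lut_words, adders) for an N-tap filter with address length M.

    Simplified model (assumption): P = N / M sections, one 2**M-word LUT per
    section, and P - 1 additions to combine the P section outputs.
    """
    if n_taps % addr_len:
        raise ValueError("addr_len must divide n_taps")
    sections = n_taps // addr_len
    lut_words = sections * (1 << addr_len)
    adders = sections - 1
    return lut_words, adders


# Sweep the address length for a 16-tap filter.
for m in (1, 2, 4, 8, 16):
    words, adders = da_cost(16, m)
    print(f"M={m:2d}  LUT words={words:6d}  adders={adders:2d}")
```

Under this model, going from M = 16 to M = 4 for a 16-tap filter cuts the LUT storage from 65536 words to 64 words at the cost of three additions, which is the direction of the trade-off the text describes.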

### **DA-based systolic structure**

A systolic system consists of a set of interconnected cells, each capable of performing some simple operation. Because simple, regular communication and control structures have substantial advantages over complicated ones in design and implementation, cells in a systolic system are typically interconnected to form a systolic array or a systolic tree. Information in a systolic system flows between cells in a pipelined fashion, and communication with the outside world occurs only at the "boundary cells." For example, in a systolic array, only those cells on the array boundaries may be I/O ports for the system. The basic principle of a systolic architecture, and of a systolic array in particular, is to replace a single processing element (PE) with an array of PEs or cells. Being able to use each input data item a number of times (and thus achieving high computation throughput with only modest memory bandwidth) is one of the advantages of the systolic approach. Systolic architectures have several attractive features such as simplicity, regularity, and modularity of structure. In addition, they possess significant potential to yield a high throughput rate by exploiting a high level of concurrency using pipelining, parallel processing, or both.

#### A. 1-D Systolic Array for FIR Filters

The dependence graph (DG) for computation of the FIR filter output according to (9) is shown in Fig. 1. It consists of $L$ rows, where each row consists of $P$ nodes of type node-A and one boundary node-B. The functions of node-A and node-B are depicted in Figs. 1(b) and 1(c), respectively. A bit-vector $(\mathbf{b}_n)_{l,p}$ consisting of a sequence of $M$ bits [derived from the $l$-th bit of the elements of the input sequence as given in (9)] is fed to the node-A on the $(l + 1)$-th row and $(p + 1)$-th column. The node uses the $M$ bits of the input bit-vector as the address for an LUT and reads the content stored at the location specified by that address.



Fig. 1. The DG for DA-based implementation of FIR filter. (a) The DG. (b) Function of node A. (c) Function of node B.



Fig. 2. The 1-D array for DA-based implementation of FIR filter. (a) The linear systolic array. (b) Function of PE. (c) Function of output cell.  $\Delta$  stands for a unit delay

#### B. 2-D Systolic Structure for FIR Filters

For high-throughput implementation of FIR filters, each node of the DG of Fig. 1 can be assigned to a PE exclusively to obtain a 2-D systolic array of $L$ rows and $(P + 1)$ columns, as shown in Fig. 3. Each row of the structure consists of $P$ PEs and a shift-add (SA) cell. The computation of all the subsequent values of the filter output may also be given by similar DGs, and the computation of corresponding nodes of all such DGs may be folded onto the same structure.



*Fig. 3. The 2-D array for FIR filter. (a) The 2-D systolic array. (b) Function of PE. (c) Function of SA cell.*  $\Delta$  *stands for unit delay* 

### III. PROPOSED RECONFIGURABLE DA-BASED FIR FILTER FOR ASIC IMPLEMENTATION

The proposed structure of the DA-based FIR filter for ASIC implementation is shown in Fig. 4. The input samples $\{x(n)\}$ arriving at every sampling instant are fed to a serial-in parallel-out shift register (SIPOSR) of size $N$. The SIPOSR decomposes the $N$ most recent samples into $P$ vectors $\mathbf{b}_p$ of length $M$ for $p = 0, 1, \ldots, P - 1$ and feeds them to $P$ reconfigurable partial product generators (RPPGs) to calculate the partial products. The structure of the proposed RPPG is depicted in Fig. 5 for $M = 2$. For high-throughput implementation, the RPPG generates $L$ partial products corresponding to $L$ bit slices in parallel, using an LUT composed of a single register bank of $2^M - 1$ registers and $L$ MUXes of size $2^M : 1$. In the proposed structure, we reduce the storage consumption by sharing each LUT across the $L$ bit slices. A register array is preferred for this purpose over a memory-based LUT so that the LUT contents can be accessed simultaneously. In addition, the contents of the register-based LUT can be updated in parallel, in fewer cycles than a memory-based LUT, to implement the desired FIR filter. The width of each register in the LUT is $(W + \lceil \log_2 M \rceil)$ bits, where $W$ is the word length of the filter coefficient. The inputs of the MUXes are $0$, $h(2p)$, $h(2p + 1)$, and $h(2p) + h(2p + 1)$, and the two-bit digit $b_{l,p}$ is fed to MUX $l$ for $0 \le l \le L - 1$ as a control word. The pipeline adder tree (PAT) requires $P - 1$ adders in $\lceil \log_2 P \rceil$ stages and the pipeline shift-add tree (PSAT) requires



Fig.4. Proposed structure of the high-throughput DA-based FIR filter for ASIC implementation. RPPG stands for reconfigurable partial product generator
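The shared-LUT idea for $M = 2$ can be sketched behaviorally in Python. This is an illustration under stated assumptions, not the authors' design: the names `rppg_m2` and `fir_output` are hypothetical, samples are treated as unsigned integers with LSB-first integer weights, and the adder trees (the PAT and PSAT named in the text) are modeled as plain sums without pipelining.

```python
def rppg_m2(h0, h1, x0, x1, bits):
    """One RPPG for M = 2: a single shared bank read by `bits` MUXes.

    The bank holds 0 (hard-wired), h0, h1 and h0 + h1, i.e. 2**M - 1 stored
    values. MUX l is controlled by the two-bit digit built from bit l of the
    two samples, so all bit slices reuse the same bank.
    """
    bank = (0, h0, h1, h0 + h1)
    return [bank[((x0 >> l) & 1) | (((x1 >> l) & 1) << 1)]
            for l in range(bits)]


def fir_output(h, window, bits):
    """One output sample, with window[k] = x(n - k) as unsigned integers."""
    assert len(h) == len(window) and len(h) % 2 == 0
    partial = [rppg_m2(h[2 * p], h[2 * p + 1],
                       window[2 * p], window[2 * p + 1], bits)
               for p in range(len(h) // 2)]
    # Adder-tree step: add the P partial products of each bit slice.
    slice_sums = [sum(s[l] for s in partial) for l in range(bits)]
    # Shift-add step: weight each slice sum by its bit position and add.
    return sum(v << l for l, v in enumerate(slice_sums))


h = [1, -2, 3, 4]
window = [7, 0, 15, 8]
print(fir_output(h, window, bits=4))
```

Reconfiguring the filter in this model only means rebuilding each `bank` from the new coefficient pair, which mirrors why the register-based LUT can be updated in few cycles.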

### IV. PROPOSED RECONFIGURABLE DA-BASED FIR FILTER FOR FPGA IMPLEMENTATION

FPGA technology has grown tremendously from dedicated hardware into a heterogeneous system, and it is now considered a popular choice in communication base stations rather than being just a prototype platform. The proposed reconfigurable FIR filter may also be implemented as part of a complete system on an FPGA. Therefore, here we propose a reconfigurable DA-based FIR filter for FPGA implementation.

The architecture suggested in Section III for high-throughput implementation of the DA-based FIR filter is not suitable for FPGA implementation. The structure in Fig. 4 involves $N(2^M - 1)/M$ registers for the implementation of the LUTs for an FIR filter of length $N$. However, registers are a scarce resource in an FPGA, since each LUT in many FPGA devices contains only two bits of registers. Therefore, the LUTs are required to be implemented by distributed RAM (DRAM) for FPGA implementation. Unlike the case of the RPPG in Fig. 5, however, the multiple partial inner products $S_{l,p}$ cannot be retrieved from the DRAM simultaneously, since only one LUT value can be read from the DRAM per cycle. Moreover, if $L$ is the bit width of the input, the sample period of the design is $L$ times the operating clock period, which may not be suitable for applications requiring high throughput. Using a separate DRAM to implement the LUT for each bit slice would lead to very high resource consumption. Thus, we decompose the partial inner-product generator into $Q$ parallel sections, where each section performs $R$ time-multiplexed operations corresponding to $R$ bit slices. When $L$ is a composite number given by $L = RQ$ ($R$ and $Q$ are two positive integers), the index $l$ in (8a) can be mapped into $(r + qR)$ for $r = 0, 1, \ldots, R - 1$ and $q = 0, 1, \ldots, Q - 1$, which gives

$$y = \sum_{q=0}^{Q-1} 2^{-Rq} \left[ \sum_{r=0}^{R-1} 2^{-r} \left( \sum_{p=0}^{P-1} S_{r+qR,p} \right) \right]$$
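The re-indexing $l = r + qR$ can be checked numerically with a short sketch. The function names are hypothetical, and the slice sum is assumed to have the form $y = \sum_{l} 2^{-l} \sum_{p} S_{l,p}$, since the equation cited as (8a) is not reproduced in this section. Exact rational arithmetic from Python's `fractions` module is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction


def direct_sum(S):
    """y = sum over l of 2**-l * (sum over p of S[l][p])."""
    return sum(Fraction(sum(row), 1 << l) for l, row in enumerate(S))


def sectioned_sum(S, R, Q):
    """Same y, as Q parallel sections of R time-multiplexed bit slices."""
    assert len(S) == R * Q
    y = Fraction(0)
    for q in range(Q):                    # sections that would run in parallel
        acc = Fraction(0)
        for r in range(R):                # R clock cycles inside one section
            acc += Fraction(sum(S[r + q * R]), 1 << r)
        y += acc / (1 << (R * q))         # section weight 2**(-R*q)
    return y


# L = 6 bit slices and P = 2 partial products per slice (arbitrary test values).
S = [[1, 2], [3, -4], [5, 6], [7, 8], [0, 1], [2, 2]]
print(direct_sum(S), sectioned_sum(S, R=2, Q=3), sectioned_sum(S, R=3, Q=2))
```

In this model each section needs only R cycles per output instead of L, so choosing R and Q trades clock cycles per sample against the number of parallel LUT copies.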



Fig. 5. The $p$th RPPG for $M = 2$.









Fig. 6. Proposed structure of the DA-based FIR filter for FPGA implementation. (a) Structure of the DA-based FIR filter. (b) Structure of the DRPPG for M = 2 and R = 2. (c) Structure of the shift-accumulator.

### V. IMPLEMENTATION RESULTS AND DISCUSSION

Simulation results for FPGA (m=2)



RTL schematic



Simulation results for ASIC (m=2)



RTL schematic



### VI. CONCLUSION

We have suggested efficient schemes for high-throughput reconfigurable DA-based implementation of FIR filters. It is shown that the hardware cost can be reduced substantially by sharing the same registers among the DA units for different bit slices. The proposed design has nearly 68% and 58% less area-delay product (ADP) and 78% and 59% less energy per sample (EPS) than the DA-based systolic structure and the DA-based structure using CSA, respectively, for the ASIC implementation. The proposed structure of the reconfigurable FIR filter for FPGA implementation supports up to 91 MHz input sampling frequency. It is found to use 54% and 29% fewer slices (NOS) than the systolic structure and the CSA-based structure, respectively.

### ACKNOWLEDGEMENT

I would like to express my sincere gratitude and thanks to **Mr. K. Yuvaraj, M.Tech**, Assistant Professor in the Electronics and Communication Engineering department, Sri Venkateswara College of Engineering, Tirupati, for his constant help, valuable guidance, and useful suggestions, which helped me in the successful completion of the work.

### REFERENCES

[1] T. Hentschel, M. Henker, and G. Fettweis, "The digital front-end of software radio terminals," IEEE Pers. Commun. Mag., vol. 6, no. 4, pp. 40–46, Aug. 1999.

[2] K.-H. Chen and T.-D. Chiueh, "A low-power digit-based reconfigurable FIR filter," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 53, no. 8, pp. 617–621, Aug. 2006.

[3] L. Ming and Y. Chao, "The multiplexed structure of multi-channel FIR filter and its resources evaluation," in Proc. Int. Conf. CDCIEM, Mar. 2012, pp. 764–768.

[4] I. Hatai, I. Chakrabarti, and S. Banerjee, "Reconfigurable architecture of a RRC FIR interpolator for multi-standard digital up converter," in Proc. IEEE 27th IPDPSW, May 2013, pp. 247–251.

[5] A. G. Dempster and M. D. Macleod, "Use of minimum-adder multiplier blocks in FIR digital filters," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 42, no. 9, pp. 569–577, Sep. 1995.

[6] S. A. White, "Applications of distributed arithmetic to digital signal processing: A tutorial review," IEEE ASSP Mag., vol. 6, no. 3, pp. 4–19, Jul. 1989.

[7] P. K. Meher, "Hardware-efficient systolization of DA-based calculation of finite digital convolution," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 53, no. 8, pp. 707–711, Aug. 2006.

[8] P. K. Meher, S. Chandrasekaran, and A. Amira, "FPGA realization of FIR filters by efficient and flexible systolization using distributed arithmetic," IEEE Trans. Signal Process., vol. 56, no. 7, pp. 3009–3017, Jul. 2008.

[9] M. Kumm, K. Moller, and P. Zipf, "Dynamically reconfigurable FIR filter architectures with fast reconfiguration," in Proc. 8th Int. Workshop ReCoSoC, Jul. 2013, pp. 1–8.

[10] E. Ozalevli, W. Huang, P. E. Hasler, and D. V. Anderson, "A reconfigurable mixed-signal VLSI implementation of distributed arithmetic used for finite-impulse response filtering," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 55, no. 2, pp. 510–521, Mar. 2008.