

# Time and Area Efficient 2-D DWT using Multiplier-less Canonic Signed Digit Technique

#### Akhilesh Jain, Karishma Rushiya, Rakesh Singhai

Abstract: Several architectures have been proposed for efficient VLSI implementation of the 2-D DWT for real-time applications. It has been found that multipliers consume more chip area and increase the complexity of the DWT architecture. A multiplier-less hardware implementation approach offers a way to reduce chip area, lower hardware complexity and raise the computational throughput of the DWT architecture. The proposed design outline is that (i) priority must be given to memory complexity optimization over arithmetic complexity optimization or reduction of the cycle period, and (ii) memory utilization efficiency is to be considered ahead of memory reduction because of the design complexity of memory optimization methods. Based on the proposed design outline, four separate design approaches and concurrent architectures are presented in this thesis for area-delay and power efficient realization of the multilevel 2-D DWT. In this thesis a multiplier-less VLSI architecture is proposed using a new distributed arithmetic algorithm named CSD. We show that CSD is an efficient architecture with adders as the main component and free of ROM, multiplication, and subtraction. The proposed architecture using CSD gives less delay and a smaller number of slices compared with the existing architecture. The simulation was performed using Xilinx 14.1i and the ModelSim simulator.

Keywords: 2-D Discrete Wavelet Transform (DWT), CSD, Low Filter Bank, High Filter Bank, Xilinx Simulation.

#### I. INTRODUCTION

The Discrete Wavelet Transform (DWT) is a transform technique used in signal and image processing applications to convert input data from the time domain to the wavelet domain. The wavelet-domain information obtained after the transform describes both the time and the frequency resolution of the input signal. Many signal and image processing applications, such as medical image processing, remote sensing, satellite imaging, speech processing, communication and hyperspectral image processing, require image processing techniques such as compression, fusion, registration, object detection and motion estimation, and for these it is beneficial to work in the wavelet domain. In this section a detailed discussion of the wavelet transform and its computational complexity is presented. The major challenges in computing the wavelet transform and implementing it on hardware platforms are also presented. Remote sensing using Micro Air Vehicles (MAV) is one of the time-critical applications that require image processing algorithms to be faster and cost-effective in terms of area and power [1].

Manuscript published on November 30, 2019. \* Correspondence Author

Akhilesh Jain\*, Assistant Professor, Department of Electronics and Communication, NRI Bhopal

Karishma Rushiya, M. Tech Scholar, Department of Electronics and Communication, NRI Bhopal

Dr. Rakesh Singhai, Dy. Registrar (Admn.), RGPV, Bhopal

© The Authors. Published by Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an <u>open access</u> article under the CC-BY-NC-ND license <u>http://creativecommons.org/licenses/by-nc-nd/4.0/</u>

The basic steps in wavelet compression are performing a discrete wavelet transform (DWT), quantizing the wavelet-domain image subbands, and then encoding these subbands. Wavelet images in and of themselves are not compressed images; rather, it is the quantization and encoding stages that perform the image compression. Image decompression, or reconstruction, is achieved by carrying out the above steps in reverse order, using the inverse of each step. Thus, to restore the original image, the compressed image is decoded, dequantized, and then an inverse DWT is performed. Since wavelet compression inherently results in a set of multi-resolution images, it is well suited to large imagery that needs to be viewed selectively at different resolutions, as only the levels containing the required degree of detail need to be decompressed [2, 3].
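The transform, quantize, dequantize and inverse-transform chain described above can be sketched in a few lines. This is only an illustration under stated assumptions: a one-level 1-D Haar transform stands in for the DWT, a uniform scalar quantizer stands in for the quantization stage, the entropy encoder is omitted, and all function names are hypothetical.

```python
# Illustrative sketch only: Haar stands in for the DWT, and the
# encoding/decoding stage is omitted. All names are hypothetical.

def haar_forward(x):
    """One-level 1-D Haar analysis: returns (approximation, detail)."""
    approx = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Inverse of haar_forward: interleaves s + d and s - d."""
    out = []
    for s, d in zip(approx, detail):
        out.extend([s + d, s - d])
    return out


def quantize(coeffs, step):
    """Uniform scalar quantizer: this is the lossy stage."""
    return [round(c / step) for c in coeffs]


def dequantize(indices, step):
    return [k * step for k in indices]


def compress_decompress(x, step):
    approx, detail = haar_forward(x)
    qa, qd = quantize(approx, step), quantize(detail, step)
    # An entropy coder and decoder would sit here in a real codec.
    return haar_inverse(dequantize(qa, step), dequantize(qd, step))


signal = [10, 12, 14, 200, 202, 50, 48, 47]
restored = compress_decompress(signal, step=4)
max_error = max(abs(a - b) for a, b in zip(signal, restored))
```

Without the quantizer the transform pair is exactly invertible, which illustrates the point that the transform alone does not compress; the reconstruction error comes entirely from the quantization step and is bounded by the step size in this sketch.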

The core of the image compression unit is the DWT. Designing the DWT-IDWT as an IP core is one of the major challenges of this research work. The two-dimensional DWT is becoming one of the standard tools [4] for image fusion in the image and signal processing field. The DWT is carried out by successive low-pass and high-pass filtering of the digital image or images.

Wavelets play a vital role in image processing applications. Since dual-tree wavelets have advantages over ordinary wavelets in terms of directionality and shift invariance, various dual-tree wavelet filters need to be evaluated for image processing applications. Wavelet filter coefficients need to be represented as decimal numbers, so the appropriate number of bits for representing the filter coefficients must be determined. In this section various wavelets and dual-tree wavelets are evaluated for their properties, and a suitable number system is identified for representing the filter coefficients. Image reconstruction is defined as the process of introducing two-dimensional images into a computer by investigating the condition of the image. Image reconstruction is mainly used in applications such as medicine, robotics, and gaming [5]. In the Discrete Wavelet Transform, sets of wavelet functions are used for compression, noise reduction, and reconstruction. In general, all communication channels have random noise due to these characteristics, and these channels are affected by a bad connection from the source of the channel.

Retrieval Number: D7419118419/2019©BEIESP DOI:10.35940/ijrte.D7419.118419 Journal Website: <u>www.ijrte.org</u>


Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP) © Copyright: All rights reserved.



Image reconstruction is performed by up-sampling followed by digital filters. The multi-resolution wavelet transform is the conventional approach to reconstruction. The main disadvantage of the conventional approach is the high hardware requirement for storing the intermediate values. Its computational delay is also large. To overcome these issues, the multi-band wavelet transform is mainly used for the image reconstruction process. By using the proposed multiband wavelet transform, the frequency overlapping of the hardware is reduced. Summation filters are mainly used to build the reconstruction block. The multiband wavelet transform is more efficient in image complexity and power than the conventional multiresolution wavelet transform [6].

#### II. MULTI RESOLUTION ANALYSIS (MRA)

In signal processing and compression, the wavelet transform has gained recognition for its efficient implementation. The JPEG committee has introduced a new advanced image coding method based on the DWT, known as JPEG-2000. The given input signal is decomposed into various substructures based on basis functions. Those basis functions are known as wavelets. The basis wavelets are derived from a single prototype wavelet known as the mother wavelet. For the subband decomposition of the transform, an efficient and improved DWT has been introduced. The 2-D DWT is implemented as the key operation in image processing and multi-resolution wavelet analysis. The given input images are decomposed into various wavelet coefficients, and the scaling function is also applied during the decomposition of the image [7].

2-D DWT techniques are widely used for image and video compression. The 2-D DWT has multi-resolution decomposition capability, so it plays a vital role in many engineering fields. However, the large amount of data accumulated across the different decomposition levels of the transform makes it computationally intensive. Considerable effort has gone into designing architectures that provide fast 2-D DWT computation with reasonable hardware usage.

These architectures can be classified as separable and non-separable architectures. In a separable architecture, the 2-D filtering operation is carried out through two 1-D filtering operations, one processing the data row-wise and the other processing the data column-wise [8].
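The separable approach can be sketched as a row pass followed by a column pass. The code below is a minimal software illustration, not the hardware architecture: it assumes the LeGall 5/3 analysis coefficients, mirror-reflection boundary handling, and even image dimensions of at least 4, none of which are stated in the text. The subband labels take the first letter from the row filter and the second from the column filter, which is also an assumption.

```python
# Sketch of a separable one-level 2-D DWT: filter rows, then columns.
# Assumed (not from the text): LeGall 5/3 taps, mirror boundary
# extension, even image dimensions >= 4.

LOW_TAPS = [-1 / 8, 2 / 8, 6 / 8, 2 / 8, -1 / 8]   # h(0..4)
HIGH_TAPS = [-1 / 2, 1, -1 / 2]                    # g(0..2)


def reflect(i, n):
    """Map an out-of-range index back into 0..n-1 by mirroring."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i


def dwt_1d(x):
    """Return (low, high) subbands, each half the length of x."""
    n = len(x)
    low = [sum(LOW_TAPS[k] * x[reflect(2 * m + k - 2, n)] for k in range(5))
           for m in range(n // 2)]
    high = [sum(HIGH_TAPS[k] * x[reflect(2 * m + k, n)] for k in range(3))
            for m in range(n // 2)]
    return low, high


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def filter_columns(block):
    """Apply dwt_1d to every column; return (low, high) blocks."""
    pairs = [dwt_1d(col) for col in transpose(block)]
    low = transpose([lo for lo, _ in pairs])
    high = transpose([hi for _, hi in pairs])
    return low, high


def dwt_2d(image):
    # Row-wise pass: every row splits into a low half and a high half.
    pairs = [dwt_1d(row) for row in image]
    row_low = [lo for lo, _ in pairs]
    row_high = [hi for _, hi in pairs]
    # Column-wise pass on each half gives the four subbands.
    ll, lh = filter_columns(row_low)
    hl, hh = filter_columns(row_high)
    return ll, lh, hl, hh


flat_image = [[8] * 4 for _ in range(4)]
ll, lh, hl, hh = dwt_2d(flat_image)
```

For a constant image the low-pass taps sum to one and the high-pass taps sum to zero, so all of the energy ends up in the low-low subband and the other three subbands are zero.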



Figure 1: Non-stationary Signals



Figure 3: Time-frequency plane of a Wavelet Transform

#### III. PROPOSED ARCHITECTURE

Inner product computation can be expressed using CSD. The DWT formulation using the convolution scheme can be expressed as an inner product, whereas the 1-D DWT formulation given in (1) - (2) cannot be expressed as an inner product. Although the convolution DWT demands more arithmetic resources than the DWT, the convolution DWT is chosen in order to take advantage of the CSD-based design. The CSD formulation of the convolution-based DWT using the 5/3 biorthogonal filter is presented here.
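To make the idea of multiplier-less constant multiplication concrete, the sketch below recodes an integer constant into canonic signed digit form, with digits in {-1, 0, +1} and no two adjacent non-zero digits, and then multiplies using shifts only. This is the textbook CSD recoding, not necessarily the authors' datapath; in particular it uses subtraction for the -1 digits, and the function names are hypothetical.

```python
# Textbook CSD recoding and shift-based constant multiplication.
# Function names are hypothetical; this is not the authors' circuit.

def to_csd(value):
    """Return the CSD digits of a non-negative integer, LSB first.

    Every digit is -1, 0 or +1, and no two neighbours are both non-zero.
    """
    digits = []
    while value != 0:
        if value % 2 == 0:
            digit = 0
        else:
            # value mod 4 == 1 gives +1, value mod 4 == 3 gives -1
            digit = 2 - (value % 4)
        digits.append(digit)
        value = (value - digit) // 2
    return digits


def csd_multiply(x, constant):
    """Multiply x by a non-negative constant with shifts and add/subtract."""
    acc = 0
    for shift, digit in enumerate(to_csd(constant)):
        if digit == 1:
            acc += x << shift
        elif digit == -1:
            acc -= x << shift
    return acc


csd_of_seven = to_csd(7)          # 7 = 8 - 1
product = csd_multiply(13, 7)
```

The benefit of the recoding is that a constant such as 7, which needs three non-zero bits in plain binary, needs only two non-zero CSD digits, so one fewer adder is required.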



Figure 4: Block Diagram of 5/3 1-D DWT using CSD Technique

Where




B: Buffer

D: Delay flip flop

$A_1$: First output of the LUT

$A_2$: Second output of the LUT with one '0' bit added

$A_N$: N-th output of the LUT with (N-1) zero bits added



Figure 5: Block Diagram of 5/3 2-D DWT using CSD Technique

Where

$Y_{LL}$ is the low-low output of the 2-D DWT

$Y_{LH}$ is the low-high output of the 2-D DWT

$Y_{HL}$ is the high-low output of the 2-D DWT

$Y_{HH}$ is the high-high output of the 2-D DWT

The 5/3 wavelet filter computation in convolution form is expressed as

$$Y_{L} = \sum_{i=0}^{4} h(i) X_{n}(i) \tag{1}$$

$$Y_{H} = \sum_{i=0}^{2} g(i) X_{n}(i) \tag{2}$$

where {h(i)} are the low-pass filter coefficients and {g(i)} are the high-pass filter coefficients of the 5/3 wavelet filter, $Y_H$ is the high-pass filter output and $Y_L$ is the low-pass filter output.
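As a small worked instance of (1) and (2), the sketch below evaluates both inner products on one window of samples using only shifts, additions and subtractions. The coefficient values are an assumption: the text does not list them, so the commonly used LeGall 5/3 analysis taps are taken, scaled to integers (8h = [-1, 2, 6, 2, -1] and 2g = [-1, 2, -1]). The sample values and function names are illustrative only.

```python
# Assumed LeGall 5/3 analysis taps, scaled to integers so that the
# outputs below are 8*Y_L and 2*Y_H. Not taken from the text.
H8 = [-1, 2, 6, 2, -1]
G2 = [-1, 2, -1]


def inner_product(coeffs, window):
    """Reference computation with ordinary multiplications."""
    return sum(c * x for c, x in zip(coeffs, window))


def low_pass_shift_add(window):
    """8*Y_L without multipliers; 6*x is formed as (x << 3) - (x << 1)."""
    x0, x1, x2, x3, x4 = window
    return -x0 + (x1 << 1) + ((x2 << 3) - (x2 << 1)) + (x3 << 1) - x4


def high_pass_shift_add(window):
    """2*Y_H without multipliers."""
    x0, x1, x2 = window
    return -x0 + (x1 << 1) - x2


window5 = [12, 15, 20, 18, 11]
window3 = [12, 15, 20]
yl_scaled = low_pass_shift_add(window5)   # equals 8 * Y_L
yh_scaled = high_pass_shift_add(window3)  # equals 2 * Y_H
```

The scaled results agree with the plain inner products, and the final division by 8 or 2 is a right shift in hardware, which is why this filter pair suits a multiplier-less design.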

#### IV. RESULT AND SIMULATION

#### A. Hardware Utilization

VHDL was used to design the VLSI architecture modules, which were synthesized for Virtex-5 (xc5vlx110-2ff676 and xc5vlx330t) and Virtex-4 (xc4vfx140) FPGA devices. The hardware description language (HDL) synthesis reports for the 5/3 1-D and 2-D DWT using the CSD technique are shown in Table I and Table III respectively. Table I shows that the preprocessing unit for the 5/3 1-D DWT using the CSD technique uses 48 registers, 10 latches, 8 multiplexers, 1854 XOR gates and 353632 Kbytes of memory, with 20.00 sec real time and 20.71 sec central processing unit (CPU) time for Xilinx Synthesis Technology (XST). Table III shows that the preprocessing unit for the 5/3 2-D DWT using the CSD technique uses 240 registers, 10 latches, 8 multiplexers, 2384 XOR gates and 373088 Kbytes of memory, with 29.00 sec real time and 29.44 sec CPU time for XST.

#### B. Synthesis Utilization

The device utilization summaries for the 5/3 1-D and 2-D DWT using the CSD technique are shown in Table II and Table IV respectively. Table II shows that the processing unit for the 5/3 1-D DWT using the CSD technique uses 99 slice registers, 1164 4-input look-up tables (LUTs), 68 LUT-FF and 74 input/output blocks (IOBs), with a minimum period of 1.552 nsec, a maximum frequency of 644.330 MHz and a maximum combinational path delay of 10.162 nsec. Table IV shows that the processing unit for the 5/3 2-D DWT using the CSD technique uses 236 slice registers, 1453 4-input LUTs, 148 LUT-FF and 88 IOBs, with a minimum period of 3.591 nsec, a maximum frequency of 278.489 MHz and a maximum combinational path delay of 12.368 nsec.
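As a quick consistency check on these timing figures, the maximum frequency should be the reciprocal of the minimum period, assuming that is how the synthesis report derives it:

$$f_{max} = \frac{1}{T_{min}}, \qquad \frac{1}{1.552\ \text{ns}} \approx 644.33\ \text{MHz}, \qquad \frac{1}{3.591\ \text{ns}} \approx 278.47\ \text{MHz}$$

The 1-D value matches the reported 644.330 MHz, and the 2-D value agrees with the reported 278.489 MHz to within the rounding of the period to three decimal places.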

Table I: HDL Synthesis Report for 5/3 1-D DWT using CSD Technique

| Parameter        | Value                    |
|------------------|--------------------------|
| Device           | Virtex5 xc5vlx110-2ff676 |
| Register         | 48                       |
| Latch            | 10                       |
| Multiplexer      | 8                        |
| XOR Gate         | 1854                     |
| Memory           | 353632 Kilobytes         |
| Real Time to XST | 20.00 sec                |
| CPU to XST       | 20.71 sec                |

Table II: Device utilization summary for 5/3 1-D DWT using CSD Technique

| Parameter                        | Value       | Out of |
|----------------------------------|-------------|--------|
| Number of Slice Register         | 99          | 207360 |
| Number of 4 input LUTs           | 1164        | 207360 |
| LUT-FF                           | 68          | 1541   |
| Number of IOBs                   | 74          | 960    |
| Minimum Period                   | 1.552 nsec  |        |
| Maximum Frequency                | 644.330 MHz |        |
| Maximum Combinational Path Delay | 10.162 nsec |        |

Table III: HDL Synthesis Report for 5/3 2-D DWT using CSD Technique

| Parameter        | Value                    |
|------------------|--------------------------|
| Device           | Virtex5 xc5vlx110-2ff676 |
| Register         | 240                      |
| Latch            | 10                       |
| Multiplexer      | 8                        |
| XOR Gate         | 2384                     |
| Memory           | 373088 Kilobytes         |
| Real Time to XST | 29.00 sec                |
| CPU to XST       | 29.44 sec                |


Table IV: Device utilization summary for 5/3 2-D DWT using CSD Technique

| Parameter                        | Value       | Out of |
|----------------------------------|-------------|--------|
| Number of Slice Register         | 236         | 207360 |
| Number of 4 input LUTs           | 1453        | 207360 |
| LUT-FF                           | 148         | 1541   |
| Number of IOBs                   | 88          | 960    |
| Minimum Period                   | 3.591 nsec  |        |
| Maximum Frequency                | 278.489 MHz |        |
| Maximum Combinational Path Delay | 12.368 nsec |        |

#### C. Comparison Result

Table V lists the maximum frequency and number of slices obtained for the proposed 5/3 2-D DWT using the CSD algorithm and for the previous algorithm [1]. The results show that the proposed design is superior in area, using fewer slices on both device families, while the previous algorithm attains the higher maximum frequency.

On the Virtex-5 device family the proposed algorithm reaches a maximum frequency of 190.54 MHz, compared with 365 MHz for the previous algorithm. The proposed algorithm uses fewer slices: 236 on the Virtex-5 device family and 1235 on the Virtex-4 device family, compared with 1261 and 2278 respectively for the previous algorithm.

Table V: Comparison of Result with Previous 2-D DWT Implementation

| Measure           | Proposed Design       |                       | Rakesh Biswas et al. [1] |                       |
|-------------------|-----------------------|-----------------------|--------------------------|-----------------------|
| Device            | Virtex-5 (xc5vlx330t) | Virtex-4 (xc4vfx140)  | Virtex-5 (xc5vlx330t)    | Virtex-4 (xc4vfx140)  |
| Image Size        | 256×256               | 256×256               | 256×256                  | 256×256               |
| Maximum Frequency | 190.54 MHz            | 234.97 MHz            | 365 MHz                  | 264.97 MHz            |
| Number of Slice   | 236                   | 1235                  | 1261                     | 2278                  |
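The slice counts in Table V can be turned into relative area savings with a line of arithmetic. The snippet below is only a convenience calculation on the tabulated numbers; the percentage figures are derived here and are not reported in the original results.

```python
# Relative slice reduction computed from the Table V entries.
# (proposed, previous) slice counts per device family.
slices = {"Virtex-5": (236, 1261), "Virtex-4": (1235, 2278)}


def reduction_percent(proposed, previous):
    """Percentage of slices saved relative to the previous design."""
    return 100.0 * (previous - proposed) / previous


savings = {device: round(reduction_percent(p, q), 1)
           for device, (p, q) in slices.items()}
```

On these numbers the reduction is roughly 81% on Virtex-5 and roughly 46% on Virtex-4.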



Figure 6: Bar graph of the 5/3 2-D DWT for Virtex-5 device family



Figure 7: Bar graph of the 5/3 2-D DWT for Virtex-4 device family

The proposed algorithm uses 236 slice registers on the Virtex-5 device family, compared with 645 for the previous algorithm. Similarly, the proposed algorithm uses fewer slice LUTs, 1453 on the Virtex-5 device family compared with 5485 for the previous algorithm, as shown in Table VI.

Table VI: Comparison of Result with Previous 2-D DWT Implementation

| Measure           | Proposed Design      | Satish S Bhairannawar et al. [2] | Senthil Singh et al. [6] |
|-------------------|----------------------|----------------------------------|--------------------------|
| Device            | Virtex-5 (xc5vlx110) | Virtex-5 (xc5vlx110)             | Virtex-5 (xc5vlx110)     |
| Slice Register    | 236                  | 645                              | 302                      |
| Slice LUTs        | 1453                 | 5485                             | 2643                     |
| Maximum Frequency | 278.489 MHz          | 258.35 MHz                       | 207.009 MHz              |


Figure 8: Bar graph of the 5/3 2-D DWT for Virtex-5 device family

#### V. CONCLUSION

In this paper, a CSD-based architecture for computation of the 1-D and 2-D DWT is presented. The proposed CSD-based 1-D DWT structure uses significantly fewer logic resources than similar existing multiplier-less designs, and it has a shorter bit-cycle period than the others.

The proposed CSD-based 2-D DWT architectures (architecture-1 and architecture-2) involve the same logic components but differ in on-chip memory size and frame buffer size. Architecture-1 is based on line scanning and architecture-2 is based on a parallel data access scheme.






#### REFERENCES

1. Rakesh Biswas, Siddarth Reddy Malreddy and Swapna Banerjee, "A High Precision-Low Area Unified Architecture for Lossy and Lossless 3D Multi-Level Discrete Wavelet Transform", Transactions on Circuits and Systems for Video Technology, Vol. 45, No. 5, May 2017.
2. Satish S Bhairannawar and Rajath Kumar, "FPGA Implementation of Face Recognition System using Efficient 5/3 2D-Lifting Scheme", 2016 International Conference on VLSI Systems, Architectures, Technology and Applications (VLSI-SATA).
3. Maurizio Martina, Guido Masera, Massimo Ruo Roch and Gianluca Piccinini, "Result-Biased Distributed-Arithmetic-Based Filter Architectures for Approximately Computing the DWT", IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 62, No. 8, August 2015.
4. M. Alam, C. A. Rahman and G. Jullian, "Efficient distributed arithmetic based DWT architectures for multimedia applications", in Proc. IEEE Workshop on SoC for real-time applications, pp. 333-336, 2003.
5. X. Cao, Q. Xie, C. Peng, Q. Wang and D. Yu, "An efficient VLSI implementation of distributed architecture for DWT", in Proc. IEEE Workshop on Multimedia and Signal Process., pp. 364-367, 2006.
6. Senthil Singh C and Manikandan M, "Design and Implementation of an FPGA-Based Real-Time Very Low Resolution Face Recognition System", International Journal of Advanced Information Science and Technology, Vol. 7, No. 7, pp. 59-65, November 2012.
7. Archana Chidanandan and Magdy Bayoumi, "Area-Efficient MDA Architecture for the 1-D DCT/IDCT", ICASSP 2006.

