Approximate multipliers attract considerable interest in the scientific literature, which proposes several circuits built with approximate 4-2 compressors. Due to the large number of proposed solutions, the ...designer who wishes to use an approximate 4-2 compressor is faced with the problem of selecting the right topology. In this paper, we present a comprehensive survey and comparison of approximate 4-2 compressors previously proposed in the literature. We also present a novel approximate compressor, so that a total of twelve different approximate 4-2 compressors are analyzed. The investigated circuits are employed to design 8×8 and 16×16 multipliers, implemented in 28 nm CMOS technology. For each operand size we analyze two multiplier configurations, with different levels of approximation, both signed and unsigned. Our study highlights that there is no unique winning approximate compressor topology, since the best solution depends on the required precision, on the signedness of the multiplier, and on the considered error metric.
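As a behavioral illustration of what a 4-2 compressor computes, the sketch below models an exact compressor built from two full adders and contrasts it with a hypothetical approximate variant. The approximate variant is an assumption made for illustration only (it drops the carry-in and carry-out and clips the count of ones at 3); it is not one of the twelve circuits analyzed in the paper. The error statistics assume uniformly distributed inputs, which is not generally true of partial-product bits.

```python
from itertools import product

def full_adder(a, b, c):
    """Return (sum, carry) of three input bits."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)

def exact_compressor(x1, x2, x3, x4, cin):
    """Exact 4-2 compressor: x1+x2+x3+x4+cin == s + 2*(carry + cout)."""
    s1, cout = full_adder(x1, x2, x3)
    s, carry = full_adder(s1, x4, cin)
    return s, carry, cout

def saturating_compressor(x1, x2, x3, x4):
    """Hypothetical approximation: no cin/cout, count of ones clipped at 3."""
    value = min(x1 + x2 + x3 + x4, 3)
    return value & 1, value >> 1  # (sum, carry)

def error_stats(approx):
    """Error rate and mean error over all 16 input patterns (uniform inputs assumed)."""
    errors = []
    for bits in product((0, 1), repeat=4):
        s, carry = approx(*bits)
        errors.append((s + 2 * carry) - sum(bits))
    error_rate = sum(e != 0 for e in errors) / len(errors)
    mean_error = sum(errors) / len(errors)
    return error_rate, mean_error

if __name__ == "__main__":
    print(error_stats(saturating_compressor))
```

Only the all-ones input is encoded incorrectly by this particular approximation, which is why published designs differ mainly in where they place the erroneous input patterns and how much logic they save in exchange.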
Approximate Multipliers Based on New Approximate Compressors Esposito, Darjn; Strollo, Antonio Giuseppe Maria; Napoli, Ettore ...
IEEE transactions on circuits and systems. I, Regular papers,
12/2018, Volume: 65, Issue: 12
Journal Article
Peer reviewed
Open access
Approximate computing is an emerging trend in digital design that trades off the requirement of exact computation for improved speed and power performance. This paper proposes novel approximate ...compressors and an algorithm to exploit them for the design of efficient approximate multipliers. By using the proposed approach, we have synthesized approximate multipliers for several operand lengths using a 40-nm library. Comparison with previously presented approximate multipliers shows that the proposed circuits provide better power or speed for a target precision. Applications to image filtering and to adaptive least mean squares filtering are also presented in the paper.
A scalable approximate multiplier, called truncation- and rounding-based scalable approximate multiplier (TOSAM), is presented, which reduces the number of partial products by truncating each of the ...input operands based on their leading one-bit position. In the proposed design, multiplication is performed by shift, add, and small fixed-width multiplication operations, resulting in large improvements in energy consumption and area compared to those of the exact multiplier. To improve the total accuracy, the input operands of the multiplication part are rounded to the nearest odd number. Because the input operands are truncated based on their leading one-bit positions, the accuracy becomes only weakly dependent on the width of the input operands, and the multiplier becomes scalable. Larger improvements in design parameters (e.g., area and energy consumption) can be achieved as the input operand widths increase. To evaluate the efficiency of the proposed approximate multiplier, its design parameters are compared with those of an exact multiplier and some other recently proposed approximate multipliers. Results reveal that the proposed approximate multiplier, with a mean absolute relative error ranging from 11% down to 0.3%, improves delay, area, and energy consumption by up to 41%, 90%, and 98%, respectively, compared to those of the exact multiplier. It also outperforms other approximate multipliers in terms of speed, area, and energy consumption. The proposed approximate multiplier has an almost Gaussian error distribution with a near-zero mean value. We exploit it in the structure of a JPEG encoder and in sharpening and classification applications. The results indicate that the quality degradation of the output is negligible. In addition, we suggest an accuracy-configurable TOSAM in which the energy consumption of the multiplication operation can be adjusted based on the minimum required accuracy.
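The shift, add, and small-multiplication structure described above can be sketched behaviorally as follows. This is a simplified model written from the abstract alone, not the authors' circuit: each operand is written as 2^k · (1 + x) with k the leading-one position, so the product is 2^(ka+kb) · (1 + xa + xb + xa·xb). The function name `tosam_like`, the parameters `h` (fraction bits kept for the small multiplication) and `t` (fraction bits kept for the addition), and the exact rounding scheme (appending a 1 after the `h` kept bits) are assumptions made for illustration.

```python
def leading_one(x: int) -> int:
    """Bit position of the most significant 1 (x must be > 0)."""
    return x.bit_length() - 1

def top_fraction_bits(x: int, k: int, bits: int) -> int:
    """Top `bits` bits of the fraction below the leading one at position k."""
    frac = x - (1 << k)
    return frac >> (k - bits) if k >= bits else frac << (bits - k)

def tosam_like(a: int, b: int, h: int = 3, t: int = 7) -> int:
    """Approximate a*b for unsigned integers (illustrative model only)."""
    if t > 2 * h + 2:
        raise ValueError("this sketch assumes t <= 2*h + 2")
    if a == 0 or b == 0:
        return 0
    ka, kb = leading_one(a), leading_one(b)
    # Addition part: fractions truncated to t bits.
    xa_t = top_fraction_bits(a, ka, t)
    xb_t = top_fraction_bits(b, kb, t)
    # Multiplication part: truncate to h bits, then append a 1 (nearest odd).
    xa_h = (top_fraction_bits(a, ka, h) << 1) | 1
    xb_h = (top_fraction_bits(b, kb, h) << 1) | 1
    scale = 2 * h + 2  # number of fractional bits in xa_h * xb_h
    mantissa = (1 << scale) + ((xa_t + xb_t) << (scale - t)) + xa_h * xb_h
    shift = ka + kb - scale  # final shift by the leading-one positions
    return mantissa << shift if shift >= 0 else mantissa >> -shift

if __name__ == "__main__":
    print(tosam_like(200, 100), 200 * 100)
```

In this model the only true multiplication is `xa_h * xb_h`, whose operand width depends on `h` rather than on the input width, which is the property the abstract credits for scalability.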
Approximate computing has received significant attention as a promising strategy to decrease power consumption of inherently error tolerant applications. In this paper, we focus on hardware-level ...approximation by introducing the partial product perforation technique for designing approximate multiplication circuits. We prove in a mathematically rigorous manner that in partial product perforation, the imposed errors are bounded and predictable, depending only on the input distribution. Through extensive experimental evaluation, we apply the partial product perforation method on different multiplier architectures and expose the optimal architecture-perforation configuration pairs for different error constraints. We show that, compared with the respective exact design, the partial product perforation delivers reductions of up to 50% in power consumption, 45% in area, and 35% in critical delay. In addition, the product perforation method is compared with the state-of-the-art approximation techniques, i.e., truncation, voltage overscaling, and logic approximation, showing that it outperforms them in terms of power dissipation and error.
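A minimal behavioral sketch of partial product perforation follows, assuming an unsigned shift-and-add multiplier in which `k` successive partial products starting at row `j` are simply never generated. The function names and the choice of which operand selects the rows are illustrative assumptions; the paper's architectures and notation may differ.

```python
def perforated_multiply(a: int, b: int, j: int, k: int, width: int = 8) -> int:
    """Multiply a*b while skipping partial products j .. j+k-1."""
    product = 0
    for i in range(width):
        if j <= i < j + k:
            continue  # perforated row: this partial product is omitted
        if (b >> i) & 1:
            product += a << i
    return product

def perforation_error(a: int, b: int, j: int, k: int) -> int:
    """Closed-form error: a times the value of the skipped bits of b."""
    skipped = (b >> j) & ((1 << k) - 1)
    return a * (skipped << j)

if __name__ == "__main__":
    a, b, j, k = 13, 11, 0, 2
    approx = perforated_multiply(a, b, j, k)
    print(approx, a * b - approx, perforation_error(a, b, j, k))
```

In this model the error equals `a` times the value of the skipped bits of `b`, so it never exceeds `a * (2**k - 1) * 2**j`, which illustrates how an error of this kind can be bounded and can depend only on the inputs.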
Summary
The Recustomize finite impulse response (RFIR) filter is designed to achieve lower power consumption, cost, and area, together with higher operating speed. It is used to remove noise from the ...image and signal. Several filters have previously been designed for noise removal, but those filters consume more area, power, and cost, incur more delay, and do not provide accurate results, so the error rate increases. This work is proposed to overcome these issues. In this work, the RFIR filter is designed using a truncation‐based scalable rounding approximate multiplier (TOSAM) and an error reduced carry prediction approximate adder (ERCPAA) for image processing applications. Here, TOSAM‐ERCPAA is used to speed up the filter with less area and lower power consumption. The proposed ERCPAA adder is divided into three blocks: carry prediction logic, an array of approximate full adder cells, and constant truncation with error reduction logic, which together reduce power and area. The proposed RFIR‐TOSAM‐ERCPAA filter is implemented in Verilog, and the simulation is carried out in the Xilinx ISE 14.5 design tools.
This article proposes boosting the multiplication performance for convolutional neural network (CNN) inference using a precision prediction preprocessor which controls various precision approximate ...multipliers. Previously, utilizing approximate multipliers for CNN inference was proposed to improve power, speed, and area at the cost of a tolerable drop in accuracy. Low-precision approximate multipliers can achieve massive performance gains; however, utilizing them is not feasible due to the large accuracy loss they cause. To maximize the multiplication performance gains while minimizing the accuracy loss, this article proposes using a tiny two-class precision controller to combine low- and high-precision approximate multipliers in a hybrid fashion. The performance benefits of the proposed concept are presented for multi-core multi-precision architectures and single-core reconfigurable architectures. Additionally, a design for a merged reconfigurable approximate multiplier with two precisions is proposed for use in single-core architectures. For performance comparison, several segment-based approximate multipliers with different precisions were synthesized using 15 nm CMOS technology. For accuracy evaluation, the concept was simulated on VGG19, Xception, and DenseNet201 using the ImageNetV2 dataset. This article demonstrates that the proposed concept can achieve significant performance gains with minimal accuracy loss when compared to designs that utilize exact multipliers or single-precision approximate multipliers.
This manuscript proposes a low-power and high-speed hybrid approximate multiplier using 15-4 approximate compressors in the partial product stage for image processing applications. Initially, the most ...significant bits (MSBs) of the approximate multiplier are encoded by approximate radix-8 Booth (R-8B) encoding, and the least significant bits (LSBs) are encoded by approximate truncated-round approximate multiplier (TRAM) encoding; both are used to round the LSBs to the adjacent power of two. Then, approximate 15-4 compressors are used in the partial product reduction stage to produce the MSB result. Finally, the hybrid approximate multiplier with 15-4 approximate compressors is applied to image processing. The proposed approach is implemented in MATLAB and the Vivado Design Suite 2018.1 simulator. The results show that the power consumption of the proposed design is 31.814% and 23.562% lower than that of existing models. Similarly, its speed is 42.63% and 6.263% higher than that of the existing models.
Approximate computing has received significant attention as an attractive paradigm for error-tolerant applications to reduce the power consumption, delay and area with some trade-off in accuracy. ...This paper proposes the design of a novel approximate 4–2 compressor. A modified architecture of the Dadda multiplier is presented for the effective utilization of the proposed compressor and to reduce the error at the output. Through extensive experimental evaluation, the efficiency of the proposed compressor and multiplier is evaluated in a 45 nm standard CMOS technology, and their parameters are compared with those of state-of-the-art approximate multipliers. The results show that the proposed compressor achieves a significant reduction in error rate compared to other approximate compressors available in the literature. In addition, the proposed multiplier shows 35%, 36% and 17% reductions in power consumption, delay and area, respectively, compared to those of the exact multiplier. The effectiveness of the multiplier is assessed in several image processing applications. On average, the proposed multiplier processes images with 85% structural similarity compared to the exact output image.
Approximate multipliers (AMs) have been widely investigated to pursue high-performance and energy-efficient hardware designs for error-tolerant applications, such as neural networks (NNs). The ...computing accuracy of an AM has been evaluated by using statistical error features; however, it is difficult to estimate the quality of a specific application using AMs. Thus, it is a great challenge to select or design appropriate AMs for an accuracy-constrained application. This paper proposes an application-oriented error evaluation framework for AMs with the aim of exploring the correlation between statistical error features of AMs and the accuracy degradation in AM-based NN applications. Specifically, based on the Dropout Feature Ranking technique, statistical error features of AMs are extensively studied and ranked by their importance to the accuracy of AM-based NN applications. The three most informative features are obtained to construct error models to predict the accuracy loss of AM-based NN applications. The constructed classification models show a probability higher than 96% of correctly classifying the AMs into three categories in accordance with the induced accuracy loss in AM-based NN applications. Furthermore, regression models can predict the accuracy of NN applications using an AM with a deviation as low as 6%. These results show that the proposed error evaluation framework can guide an efficient selection of AMs for NN applications by using only a few AM error features, instead of running time-consuming and complicated hardware simulations. The obtained statistical error features can also provide guidance for the design or generation of application-oriented AMs. Moreover, the proposed framework is applicable to quickly analyzing and selecting other approximate circuits for error-tolerant applications.
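To make the notion of statistical error features concrete, the sketch below computes a few commonly used ones (error rate, mean error distance, mean relative error distance, worst-case error) by exhaustive enumeration. Which features the paper actually ranks is not stated in the abstract, so this selection is an assumption, and `truncated_mul` is a hypothetical stand-in multiplier used only to exercise the function.

```python
def error_features(approx_mul, width: int = 4) -> dict:
    """Exhaustively compare approx_mul against exact unsigned multiplication."""
    total = wrong = sum_ed = max_ed = nonzero = 0
    sum_red = 0.0
    for a in range(1 << width):
        for b in range(1 << width):
            exact = a * b
            ed = abs(approx_mul(a, b) - exact)  # error distance
            total += 1
            wrong += ed != 0
            sum_ed += ed
            max_ed = max(max_ed, ed)
            if exact:  # relative error is undefined when the exact product is 0
                nonzero += 1
                sum_red += ed / exact
    return {
        "error_rate": wrong / total,
        "mean_error_distance": sum_ed / total,
        "mean_relative_error_distance": sum_red / nonzero,
        "worst_case_error": max_ed,
    }

def truncated_mul(a: int, b: int, drop: int = 2) -> int:
    """Hypothetical stand-in AM: clears the `drop` low bits of each operand."""
    mask = ~((1 << drop) - 1)
    return (a & mask) * (b & mask)

if __name__ == "__main__":
    for name, value in error_features(truncated_mul).items():
        print(name, value)
```

Exhaustive enumeration is only practical for small operand widths; for wider multipliers the same features are typically estimated from random samples.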