•The fixed-point format fosters the implementation of intelligent sensors.
•Performance of the fixed-point format depends on the variance of the measurement data.
•The optimized fixed-point format provides almost constant quality over a wide dynamic range.
The development of intelligent measurement systems by implementing AI (artificial intelligence) algorithms on edge sensor nodes is a very topical research area. Due to the very limited hardware of sensor nodes, the choice of an optimal digital format for representing measurement data and the parameters of AI algorithms becomes an important issue. The floating-point format is impractical to implement on sensor nodes due to its high complexity, while the fixed-point format is significantly less complex, making it an ideal solution for intelligent sensor nodes. This paper deals with the performance analysis and optimization of the fixed-point format for measurement data modeled by the Gaussian distribution. The performance analysis is carried out in a generalized way, giving it wide applicability. Furthermore, the paper proposes a rule for optimizing the fixed-point format that provides a high and almost constant level of quality of the fixed-point representation over a very wide variance range of the measurement data. The results are confirmed by simulations.
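As a rough illustration of why fixed-point performance depends on the variance of the data, the sketch below quantizes Gaussian samples to a signed fixed-point grid and measures the resulting SQNR. All names and bit widths here are our own illustrative choices, not the format or optimization rule from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def fixed_point_quantize(x, int_bits, frac_bits):
    """Round x to a signed fixed-point grid with the given integer and
    fractional bit widths, saturating at the representable range."""
    step = 2.0 ** -frac_bits
    limit = 2.0 ** int_bits - step          # largest representable value
    return np.clip(np.round(x / step) * step, -limit - step, limit)

def sqnr_db(x, xq):
    """Signal-to-quantization-noise ratio in dB."""
    return 10.0 * np.log10(np.mean(x ** 2) / np.mean((x - xq) ** 2))

# With a fixed 8-bit layout (1 sign + 3 integer + 4 fractional bits),
# the SQNR varies strongly with the variance of the Gaussian input:
# small variances waste the integer bits, large ones suffer clipping.
for sigma in (0.25, 1.0, 4.0):
    x = rng.normal(0.0, sigma, 100_000)
    xq = fixed_point_quantize(x, int_bits=3, frac_bits=4)
    print(f"sigma = {sigma}: SQNR = {sqnr_db(x, xq):.1f} dB")
```

This is exactly the behavior the paper's optimization rule targets: choosing the split between integer and fractional bits as a function of the data variance keeps the SQNR nearly constant.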
Achieving real-time inference is one of the major issues in contemporary neural network applications, as complex algorithms are frequently deployed to mobile devices with constrained storage and computing power. Moving from a full-precision neural network model to a lower-precision representation by applying quantization techniques is a popular approach to addressing this issue. Here, we analyze in detail and design a 2-bit uniform quantization model for the Laplacian source, chosen for its implementation simplicity, which in turn leads to shorter processing time and faster inference. The results show that high classification accuracy can be achieved with the proposed model (more than 96% in the case of MLP and more than 98% in the case of CNN), competitive with other quantization solutions of almost optimal precision.
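A symmetric 2-bit uniform quantizer of the general kind discussed here has four representation levels at ±Δ/2 and ±3Δ/2 with decision thresholds at 0 and ±Δ. The sketch below applies such a quantizer to unit-variance Laplacian samples; the step sizes tried are arbitrary illustrative values, not the optimized parameters from the paper:

```python
import numpy as np

def uniform_quantize_2bit(x, step):
    """Symmetric 2-bit (4-level) uniform quantizer: representation levels
    at +/- step/2 and +/- 3*step/2, decision thresholds at 0 and +/- step."""
    idx = np.minimum(np.floor(np.abs(x) / step), 1)   # inner or outer cell
    sign = np.where(x >= 0, 1.0, -1.0)
    return sign * (idx + 0.5) * step

# Distortion (MSE) on unit-variance Laplacian data for a few step sizes.
rng = np.random.default_rng(1)
x = rng.laplace(0.0, 1.0 / np.sqrt(2.0), 200_000)     # unit variance
for step in (0.8, 1.0, 1.2):
    d = np.mean((x - uniform_quantize_2bit(x, step)) ** 2)
    print(f"step = {step}: MSE = {d:.4f}")
```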
The main aim of the paper is to provide effective and accurate solutions for calculating the support region of μ-law logarithmic companding quantizers. A new starting point for iterative methods is proposed that yields a very accurate value of the support region (the main parameter needed for the design of the quantizer) after only one iteration. Based on this new starting point, an accurate closed-form approximate expression for the support region is derived, as one of the main contributions of the paper. To significantly simplify the implementation of the μ-law companding quantizer, piecewise linearization is performed. A new linearization method is presented, based on optimization of the last segments. As another important contribution, an accurate closed-form formula for the support region of the linearized quantizer is derived. The resulting linearized μ-law companding quantizer is very simple to design (due to the closed-form formulas) and to implement (due to the linearization), while providing very high performance (due to the optimization of the last segments). Owing to these and other advantages (robustness, adjustability to the statistical distribution of the input signal), the proposed quantizer can be used in many topical applications, such as receivers of 5G wireless systems or quantization of weights and activations in neural networks. The paper applies the designed quantizers to the quantization of neural network weights, showing a significant reduction in bit rate compared to the standard full-precision representation (from 32 bits to just 5 bits) with the same prediction accuracy of the network.
•A very accurate closed-form expression for the support region of the µ-law logarithmic quantizer is provided.
•The implementation complexity of the µ-law logarithmic quantizer is drastically reduced by optimized linearization.
•Weights of neural networks can be drastically compressed if they are quantized with the proposed quantizer.
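For readers unfamiliar with companding, a µ-law companding quantizer compresses the input with the standard µ-law characteristic, quantizes uniformly, and expands back. The sketch below shows this pipeline for 5-bit quantization of toy "weights"; the support region `x_max` (whose closed-form computation is the paper's contribution) and all other parameters are fixed by hand here, and no linearization is included:

```python
import numpy as np

MU = 255.0  # standard mu-law constant

def mu_law_compress(x, x_max):
    """Standard mu-law compressor characteristic on [-x_max, x_max]."""
    return x_max * np.sign(x) * np.log1p(MU * np.abs(x) / x_max) / np.log1p(MU)

def mu_law_expand(y, x_max):
    """Inverse of the mu-law compressor."""
    return x_max * np.sign(y) * np.expm1(np.abs(y) / x_max * np.log1p(MU)) / MU

def companding_quantize(x, n_bits, x_max):
    """Compress, quantize uniformly on [-x_max, x_max], then expand."""
    n_levels = 2 ** n_bits
    step = 2.0 * x_max / n_levels
    y = mu_law_compress(np.clip(x, -x_max, x_max), x_max)
    yq = (np.floor(y / step) + 0.5) * step            # mid-rise cells
    yq = np.clip(yq, -x_max + step / 2, x_max - step / 2)
    return mu_law_expand(yq, x_max)

# 5-bit (32-level) quantization of toy Gaussian "weights".
rng = np.random.default_rng(3)
w = rng.normal(0.0, 0.1, 10_000)
wq = companding_quantize(w, n_bits=5, x_max=0.5)
print(f"5-bit companding MSE: {np.mean((w - wq) ** 2):.2e}")
```

The logarithmic compressor allocates finer cells near zero, which matches the sharply peaked distributions typical of neural network weights.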
The G.711 codec has been accepted as a standard for high-quality coding in many applications. This paper proposes a dual-mode quantizer that combines a nonlinear logarithmic quantizer for restricted input signals with the G.711 quantizer for unrestricted input signals. The parameters of the proposed quantizer are optimized using minimal distortion as the criterion. It is shown that the optimized version of the proposed quantizer provides 5.4 dB higher SQNR (signal-to-quantization-noise ratio) than the G.711 quantizer or, equivalently, saves approximately 0.9 bits/sample at the same signal quality. Although the complexity is slightly increased, we believe that, due to its superior performance, the quantizer can be successfully applied to high-quality quantization.
A novel 2‐bit adaptive delta modulation (ADM) algorithm is presented, based on uniform scalar quantization and fractional linear prediction (FLP), for encoding signals modelled by a Gaussian probability density function. The study focuses on two major areas: the realization of a 2‐bit adaptive quantizer based on a Q‐function approximation that significantly facilitates quantizer design; and the implementation of a recently introduced FLP approach with a memory of two samples, which replaces the first‐order linear prediction used in standard ADM algorithms and enables improved performance without increasing transmission costs. This also represents the first implementation of FLP in signal encoding, confirming its applicability in a real signal‐processing scenario. Based on a performance analysis conducted on a real speech signal, the proposed ADM algorithm with FLP is demonstrated to outperform other 2‐bit ADM baselines by a large margin in the signal‐to‐noise ratio gain achieved over a wide dynamic range of input signals. The results of this research indicate that ADM with adaptive quantization based on a Q‐function approximation and adaptive FLP is a promising solution for the encoding/compression of correlated time‐varying signals following the Gaussian distribution.
A compression method based on non-uniform binary scalar quantization, designed for the memoryless Laplacian source with zero mean and unit variance, is analyzed in this paper. Two quantizer design approaches are presented that investigate the effect of clipping with the aim of reducing quantization noise, where the minimal mean-squared-error distortion is used to determine the optimal clipping factor. A detailed comparison of both models is provided, and their performance is evaluated over a wide dynamic range of input data variances. The binary scalar quantization models are applied to standard signal processing tasks, such as speech and image quantization, as well as to the quantization of neural network parameters. The motivation behind binary quantization of neural network weights is model compression by a factor of 32, which is crucial for deployment on mobile or embedded devices with limited memory and processing power. The experimental results follow the theoretical models well, confirming their applicability in real-world applications.
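A minimal sketch of clipped binary (1-bit) quantization of Laplacian weights is shown below. The clipping values and the choice of representation level (the mean clipped magnitude, i.e. the centroid of each half of the clipped data) are our own simple illustrative choices, not the paper's optimized design; storing one bit per weight instead of a 32-bit float gives the factor-of-32 compression mentioned in the abstract:

```python
import numpy as np

def binary_quantize(w, clip):
    """Two-level (1-bit) quantizer with threshold 0: clip weights to
    [-clip, clip], then map each weight to +/- r, where r is the mean
    clipped magnitude (the centroid representation level)."""
    wc = np.clip(w, -clip, clip)
    r = np.mean(np.abs(wc))                 # representation level
    return np.where(wc >= 0, r, -r), r

# Effect of the clipping factor on distortion for unit-variance
# Laplacian "weights" (clip = inf means no clipping at all).
rng = np.random.default_rng(2)
w = rng.laplace(0.0, 1.0 / np.sqrt(2.0), 100_000)
for clip in (1.0, 2.0, np.inf):
    wq, r = binary_quantize(w, clip)
    mse = np.mean((w - wq) ** 2)
    print(f"clip = {clip}: level = {r:.3f}, MSE = {mse:.3f}")
```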