The development of an automatic telemedicine system for computer-aided screening and grading of diabetic retinopathy depends on reliable detection of retinal lesions in fundus images. In this paper, a novel method for automatic detection of both microaneurysms and hemorrhages in color fundus images is described and validated. The main contribution is a new set of shape features, called Dynamic Shape Features, that do not require precise segmentation of the regions to be classified. These features represent the evolution of the shape during image flooding and make it possible to discriminate between lesions and vessel segments. The method is validated per-lesion and per-image using six databases, four of which are publicly available. It proves to be robust with respect to variability in image resolution, quality and acquisition system. On the Retinopathy Online Challenge's database, the method achieves a FROC score of 0.420, which ranks it fourth. On the Messidor database, when detecting images with diabetic retinopathy, the proposed method achieves an area under the ROC curve of 0.899, comparable to the score of human experts, and it outperforms state-of-the-art approaches.
Convolutional Neural Networks (CNNs) have proven to be extremely accurate for image recognition, even outperforming human recognition capability. When deployed on battery-powered mobile devices, efficient computer architectures are required to enable fast and energy-efficient computation of costly convolution operations. Despite recent advances in hardware accelerator design for CNNs, two major problems have not yet been addressed effectively, particularly when the convolution layers have highly diverse structures: (1) minimizing energy-hungry off-chip DRAM data movements; (2) maximizing the utilization factor of processing resources to perform convolutions. This work thus proposes an energy-efficient architecture equipped with several optimized dataflows to support the structural diversity of modern CNNs. The proposed approach is evaluated on convolutional layers of VGGNet-16 and ResNet-50. Results show that the architecture achieves a Processing Element (PE) utilization factor of 98% for the majority of 3 × 3 and 1 × 1 convolutional layers, while limiting latency to 396.9 ms and 92.7 ms when performing convolutional layers of VGGNet-16 and ResNet-50, respectively. In addition, the proposed architecture benefits from the structured sparsity in ResNet-50 to reduce the latency to 42.5 ms when half of the channels are pruned.
In this paper, we present a simple yet effective image deblurring method to produce ringing-free deblurred images. Our work is inspired by the observation that large-scale deblurring ringing artifacts are measurable through a multiresolution pyramid of low-pass filtering of the blurred-deblurred image pair. We propose to model such a quantification as a convex cost function and minimize it directly in the deblurring process in order to reduce ringing regardless of its cause. An efficient primal-dual algorithm is proposed as a solution to this optimization problem. Since the regularization is more biased toward ringing patterns, the details of the reconstructed image are prevented from over-smoothing. An inevitable source of ringing is sensor saturation, which, unlike most other sources of ringing, can be detected at no cost. However, dealing with the saturation effect in deblurring introduces a non-linear operator into the optimization problem. In this paper, we also introduce a linear approximation as a solution to handling saturation in the proposed deblurring method. As a result of these steps, we significantly enhance the quality of the deblurred images. Experimental results and quantitative evaluations demonstrate that the proposed method performs favorably against state-of-the-art image deblurring methods.
Oscillations in the granule cell layer (GCL) of the cerebellar cortex have been related to behavior and could facilitate communication with the cerebral cortex. These local field potential (LFP) oscillations, strong at 4–12 Hz in the rodent cerebellar cortex during awake immobility, should also be an indicator of an underlying influence on the patterns of the cerebellar cortex neuronal firing during rest. To address this hypothesis, cerebellar cortex LFPs and simultaneous single-neuron activity were collected during LFP oscillatory periods in the GCL of awake resting rats. During these oscillatory episodes, different types of units across the GCL and Purkinje cell layers showed variable phase relations with the oscillatory cycles. Overall, 74% of the Golgi cell firing and 54% of the Purkinje cell simple spike (SS) firing were phase-locked with the oscillations, displaying a clear phase relationship. Despite this tendency, fewer Golgi cells (50%) and Purkinje cell SSs (25%) showed an oscillatory firing pattern. Oscillatory phase-locked spikes for the Golgi and Purkinje cells occurred towards the peak of the LFP cycle. GCL LFP oscillations had a strong capacity to predict the timing of Golgi cell spiking activity, indicating a strong influence of this oscillatory phenomenon over the GCL. Phase-locking was not as prominent for the Purkinje cell SS firing, indicating a weaker influence over the Purkinje cell layer, yet a similar phase relation. Overall, synaptic activity underlying GCL LFP oscillations likely exerts an influence on neuronal population firing patterns in the cerebellar cortex in the awake resting state and could have a preparatory neural network shaping capacity serving as a neural baseline for upcoming cerebellar operations.
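The notion of spikes being phase-locked to an oscillation can be illustrated with a minimal sketch. It uses vector strength, a common measure of phase-locking; the abstract does not state which statistic the authors used, so the measure, the function name `vector_strength`, and the convention that phase 0 rad marks the LFP peak are all assumptions made for illustration.

```python
import cmath
import math


def vector_strength(spike_phases):
    """Return (strength, preferred_phase) for spike phases given in radians.

    Each spike is mapped to a unit vector at its oscillation phase. The length
    of the mean vector is 1.0 when every spike falls at the same phase and
    close to 0.0 when spikes are spread evenly over the cycle.
    """
    if not spike_phases:
        raise ValueError("at least one spike phase is required")
    resultant = sum(cmath.exp(1j * p) for p in spike_phases) / len(spike_phases)
    return abs(resultant), cmath.phase(resultant)


# Hypothetical data: spikes clustered just after the peak (assumed phase 0).
locked_strength, locked_phase = vector_strength([0.1] * 20)

# Hypothetical data: spikes spread evenly over one full cycle.
spread_strength, _ = vector_strength([k * 2 * math.pi / 8 for k in range(8)])
```

A unit whose strength is close to 1 fires at a consistent phase of the cycle, while a value near 0 indicates no phase preference.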
Bit-width allocation has a crucial impact on hardware efficiency and accuracy of fixed-point arithmetic circuits. This paper introduces a new accuracy-guaranteed word-length optimization approach for feed-forward fixed-point designs. This method uses affine arithmetic, which is a well-known analytical technique, for both range and precision analyses. This paper introduces an acceleration technique and two new semianalytical algorithms for precision analysis. While the first algorithm follows a progressive search strategy, the second one uses a tree-shaped search method for fractional width optimization. The algorithms offer two different time-complexity/cost efficiency tradeoffs. The first algorithm has polynomial complexity and achieves results comparable to existing heuristic approaches. The second algorithm has exponential complexity, but it achieves near-optimal results compared to the exhaustive search method. A commonly used set of case studies is used to evaluate the efficiency of the proposed techniques and algorithms in terms of optimization time and hardware cost. The first and second algorithms achieve 10.9% and 13.1% improvements in area, respectively, over uniform fractional width allocation. The proposed acceleration technique reduces the complexity of the fractional width selection problem by an average of 20.3%.
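As background on affine arithmetic, the following minimal sketch shows how an affine form tracks a value range and how a rounding error can be added as an extra noise symbol. It is a generic illustration, not the paper's algorithms: the `Affine` class, its method names, and the round-to-nearest error bound of 2^-(f+1) for f fractional bits are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class Affine:
    """Affine form: center + sum(coeff_i * eps_i), with each eps_i in [-1, 1]."""
    center: float
    terms: Dict[str, float] = field(default_factory=dict)

    @staticmethod
    def from_interval(lo: float, hi: float, name: str) -> "Affine":
        # One noise symbol per independent input.
        return Affine((lo + hi) / 2, {name: (hi - lo) / 2})

    def __add__(self, other: "Affine") -> "Affine":
        terms = dict(self.terms)
        for name, coeff in other.terms.items():
            terms[name] = terms.get(name, 0.0) + coeff
        return Affine(self.center + other.center, terms)

    def scale(self, factor: float) -> "Affine":
        return Affine(self.center * factor,
                      {n: c * factor for n, c in self.terms.items()})

    def __sub__(self, other: "Affine") -> "Affine":
        return self + other.scale(-1.0)

    def quantize(self, frac_bits: int, name: str) -> "Affine":
        # Assumed model: round-to-nearest adds an error of at most 2^-(f+1).
        return self + Affine(0.0, {name: 2.0 ** -(frac_bits + 1)})

    def interval(self) -> Tuple[float, float]:
        radius = sum(abs(c) for c in self.terms.values())
        return (self.center - radius, self.center + radius)


x = Affine.from_interval(0.0, 2.0, "x")
cancelled = (x - x).interval()             # shared symbol cancels exactly
doubled = (x.scale(3.0) - x).interval()    # 3x - x = 2x over [0, 2]
rounded = x.quantize(3, "q_x").interval()  # range widened by the rounding error
```

Because both operands of `x - x` share the same noise symbol, the result collapses to zero, whereas plain interval arithmetic would report [-2, 2]. This tracking of correlations is what makes affine forms attractive for range and precision analysis.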
Application-specific customisation of micro-processor architectures has been widely accepted as an effective way to improve the efficiency of processor-based designs. In this work, the authors propose a new processor customisation method based on fixed-point word-length optimisation. Accuracy-aware word-length optimisation (WLO) of fixed-point circuits is an active research area with a large body of literature. For the first time, this work introduces a method to combine the WLO with the processor customisation. The data type word-lengths, the size of register-files and the architecture of the functional units are the main target objectives to be optimised. The accuracy requirement, defined as the worst-case error bound, is the key constraint that must be met by any solution. A custom processor design environment, called PolyCuSP, is used to realise the processor architecture based on the solution found in the proposed optimisation algorithm. The results achieved by evaluating five benchmarks show that this method can reduce the number of necessary LUTs and flip-flops by an average of 11.9% and 5.1%, respectively. The latency is also improved by an average of 33.4%. Moreover, the method was further examined through a case study on a JPEG decoder. The results suggest 16.2% and 56.2% reduction in area consumption and latency, respectively.
In this paper, a novel BCD multiplier approach is proposed. The main highlight of the proposed architecture is the generation of the partial products and parallel binary operations based on 2-digit columns. 1 × 1-digit multipliers used for the partial product generation are implemented directly by 4-bit binary multipliers without any code conversion. The binary results of the 1 × 1-digit multiplications are organized according to their two-digit positions to generate the 2-digit column-based partial products. A binary-decimal compressor structure is developed and used for partial product reduction. These reduced partial products are added in optimized 6-LUT BCD adders. The parallel binary operations and the improved BCD addition result in improved performance and reduced resource usage. The proposed approach was implemented on Xilinx Virtex-5 and Virtex-6 FPGAs with emphasis on the critical path delay reduction. Pipelined BCD multipliers were implemented for 4 × 4, 8 × 8, and 16 × 16-digit multipliers. Our realizations achieve an increase in speed by up to 22% and a reduction of LUT count by up to 14% over previously reported results.
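The idea of grouping binary 1 × 1-digit products into 2-digit columns can be mimicked in software. The sketch below is only an arithmetic analogy under stated assumptions, not the paper's circuit: the function names, the least-significant-digit-first ordering, and the final base-100 carry propagation (standing in for the compressor and BCD adders) are all hypothetical choices.

```python
def to_digits(n):
    """Split a non-negative integer into decimal digits, least significant first."""
    digits = [n % 10]
    n //= 10
    while n:
        digits.append(n % 10)
        n //= 10
    return digits


def from_digits(digits):
    """Inverse of to_digits."""
    return sum(d * 10 ** i for i, d in enumerate(digits))


def bcd_multiply(a_digits, b_digits):
    """Multiply two decimal digit lists using 2-digit (base-100) columns."""
    columns = [0] * ((len(a_digits) + len(b_digits) + 1) // 2)
    for i, a in enumerate(a_digits):
        for j, b in enumerate(b_digits):
            product = a * b        # plain binary product of two digits, 0..81
            weight = i + j         # decimal weight 10**weight
            # Even weights align with the column; odd weights sit one digit up.
            columns[weight // 2] += product * (10 if weight % 2 else 1)

    # Binary-to-decimal step: propagate carries between base-100 columns.
    result, carry = [], 0
    for total in columns:
        carry, two_digits = divmod(total + carry, 100)
        result.extend([two_digits % 10, two_digits // 10])
    while carry:                   # defensive; columns already cover the product
        carry, digit = divmod(carry, 10)
        result.append(digit)
    return result


product_value = from_digits(bcd_multiply(to_digits(1234), to_digits(5678)))
```

All column sums are accumulated independently in binary, and decimal correction happens only once per column at the end, which mirrors the separation between parallel binary operations and BCD addition described above.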
Catching a Rat by Its Edglets. Farah, R.; Langlois, J. M. P.; Bilodeau, G. IEEE Transactions on Image Processing, 02/2013, Volume 22, Issue 2. Journal article, peer reviewed.
Computer vision is a noninvasive method for monitoring laboratory animals. In this article, we propose a robust tracking method that is capable of extracting a rodent from a frame under uncontrolled normal laboratory conditions. The method consists of two steps. First, a sliding window combines three features to coarsely track the animal. Then, it uses the edglets of the rodent to adjust the tracked region to the animal's boundary. The method achieves an average tracking error that is smaller than that of a representative state-of-the-art method.
Convolutional Neural Networks (CNNs) and Deep Neural Networks (DNNs) have gained significant popularity in several classification and regression applications. The massive computation and memory requirements of DNN and CNN architectures pose particular challenges for their FPGA implementation. Moreover, programming FPGAs requires hardware-specific knowledge that many machine-learning researchers do not possess. To make the power and versatility of FPGAs available to a wider deep learning user community and to improve DNN design efficiency, we introduce POLYBiNN, an efficient FPGA-based inference engine for DNNs and CNNs. POLYBiNN is composed of a stack of decision trees, which are binary classifiers in nature, and it utilizes AND-OR gates instead of multipliers and accumulators. POLYBiNN is a memory-free inference engine that drastically cuts hardware costs. We also propose a tool for the automatic generation of a low-level hardware description of the trained POLYBiNN for a given application. We evaluate POLYBiNN and the tool for several datasets that are normally solved using fully connected layers. On the MNIST dataset, when implemented in a ZYNQ-7000 ZC706 FPGA, the system achieves a throughput of up to 100 million image classifications per second with 90 ns latency and 97.26% accuracy. Moreover, POLYBiNN consumes 8× less power than the best previously published implementations, and it does not require any memory access. We also show how POLYBiNN can be used instead of the fully connected layers of a CNN and apply this approach to the CIFAR-10 dataset.
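Why a decision tree over binary features needs only AND-OR logic can be shown with a small sketch. Each leaf labelled true corresponds to an AND of the feature tests along its path, and the tree output is the OR of those terms. The tuple-based tree encoding, the function names, and the example tree are hypothetical; this is not the POLYBiNN tool or its generated hardware description.

```python
from itertools import product

# A tree is either a bool leaf or a tuple (feature_index, subtree_if_0, subtree_if_1).
EXAMPLE_TREE = (0, (1, False, True), (2, True, False))


def evaluate_tree(tree, x):
    """Walk the tree for a binary input vector x and return the leaf label."""
    while not isinstance(tree, bool):
        feature, if_zero, if_one = tree
        tree = if_one if x[feature] else if_zero
    return tree


def tree_to_and_or(tree, path=()):
    """Return one AND term per true leaf; a term is a tuple of (feature, value) tests."""
    if isinstance(tree, bool):
        return [path] if tree else []
    feature, if_zero, if_one = tree
    return (tree_to_and_or(if_zero, path + ((feature, 0),))
            + tree_to_and_or(if_one, path + ((feature, 1),)))


def evaluate_and_or(terms, x):
    """OR over the AND terms: the two-level logic a tree reduces to."""
    return any(all(x[f] == v for f, v in term) for term in terms)


terms = tree_to_and_or(EXAMPLE_TREE)
all_inputs = list(product((0, 1), repeat=3))
equivalent = all(evaluate_tree(EXAMPLE_TREE, x) == evaluate_and_or(terms, x)
                 for x in all_inputs)
```

For the example tree, the two resulting terms read as (NOT x0 AND x1) OR (x0 AND NOT x2), a form that maps to gates without multipliers or accumulators.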
In this study, we demonstrate that gamma oscillations (30–50 Hz) recorded in the local field potentials (LFP) of the hippocampus are a marker of temporal lobe seizure propagation and that the level of LFP synchrony in the amygdalo-hippocampal network, during these oscillations, is related to the severity of seizures. Sprague–Dawley rats were given a single systemic dose of kainic acid (KA; 6 mg/kg, i.p.) and local field potential activity (1–475 Hz) of the dorsal hippocampus, the amygdala and the neocortex was recorded. Of 135 ictal discharges, 55 (40.7%) involved both limbic structures. We demonstrated that 78.2% of seizures involving both the hippocampus and amygdala showed hippocampal gamma oscillations. Seizure duration was also significantly correlated with the frequency of hippocampal gamma oscillations (r² = 0.31, p < 0.01) and LFP synchrony in the amygdalo-hippocampal network (r² = 0.21, p < 0.05). These results suggest that gamma oscillations in the amygdalo-hippocampal network could facilitate long-range synchrony and participate in the propagation of seizures.