With the rapid development of semiconductor technology, FPGA scale has grown exponentially, and so has the difficulty of testing. Traditional test methods are expensive, inconvenient and time-consuming due to a lack of controllability and observability. This paper presents a network-based chip verification platform that enables users to program FPGAs and verify their designs conveniently from their own terminals. The platform balances configurability and observability: researchers can not only configure the number and order of the bitstream files and testbenches to be downloaded, but also receive direct output feedback immediately. We integrate JTAG and a CF card into the system, enhancing its functionality and improving the verification process. We verified both modules, confirmed that test patterns of different widths and depths work well in the system, and achieved the desired goals in the overall tests, thereby equipping the system with controllability and observability.
Convolutional neural networks have achieved tremendous success in computer vision and medical imaging applications. To make these models truly portable and suitable for prototyping, efficient implementations on low-power devices are imperative. In this brief, we propose EdgeMedNet, a lightweight and accurate U-Net model that enables efficient medical image segmentation on the Intel/Movidius Neural Compute Stick 2 (NCS-2). First, we optimize the feature maps of the baseline U-Net model to strike a balance between parameter reduction and segmentation performance. Then, we propose an enhanced activation block that provides comprehensive multi-scale analysis, helping to avoid the representational bottleneck. Using these elementary units, we further introduce a cascaded attention module and customize the layers to construct EdgeMedNet. Extensive experiments are conducted on three medical imaging datasets: the BraTS brain MRI dataset, a heart MRI dataset, and a COVID-19 CT lung and infection segmentation dataset. The experimental results demonstrate up to 6.1× reduction in model parameters. Compared with state-of-the-art works, EdgeMedNet achieves up to 2.0× inference speedup on the NCS-2 while maintaining segmentation performance.
A novel flame retardant, silicone elastomeric nanoparticle (S-ENP), with a glass-transition temperature (Tg) of −120 °C and a particle size of ∼100 nm, has been developed and used as a modifier for polyamide 6 (nylon-6). It has been found that S-ENP not only increases the toughness and improves the flame retardancy of nylon-6, but also helps unmodified clay exfoliate in the nylon-6 matrix. It has also been found that S-ENP and the exfoliated clay platelets have a synergistic flame-retardant effect on nylon-6. A novel flame-retardant nylon-6/unmodified clay/S-ENP nanocomposite with high toughness, high heat resistance, high stiffness and good flowability has been prepared, and a mechanism for the synergistic flame retardancy has also been proposed.
Due to the rise in the number of vehicles in recent years, the frequency of traffic accidents has increased as well, resulting in huge losses. As a means to improve traffic safety, advanced driver assistance systems (ADAS) have gradually gained more attention. However, it is difficult for traditional traffic sign recognition algorithms to achieve high accuracy in ADAS, whose practical scenarios are highly varied. Moreover, most current convolutional neural network (CNN) methods for traffic sign recognition have large numbers of parameters, making their implementation on resource-limited hardware platforms challenging. In this work, we present an FPGA-based convolutional neural network module for the recognition of traffic signs in ADAS. Experimental results show that the accuracy of the model is 98.1%; its total number of parameters is 4.7M, only 12.5% of AlexNet's; and the number of operations in a single forward pass is 703.2M, 61.4% of AlexNet's.
Projects involving both software and hardware design are usually slowed by expensive equipment, complex simulation and challenging modification. To reduce designers' time and economic costs in SW/HW (software/hardware) co-design simulation, this paper demonstrates a remote embedded simulation system that helps multiple users manage and simulate their SW/HW co-design projects remotely while scheduling access to on-chip FPGA resources. We built a small-scale, highly concurrent multiuser management service system on board. The system offers TCP/IP connection and transmission, flexible Wi-Fi networking, secure multiuser information and file management, real-time task-progress notification, compilation and execution of multiple programming languages, and run-time FPGA configuration and simulation, which considerably extends the usability of the embedded simulation service. Meanwhile, we offer users a supporting PC (personal computer) application with a multithreaded GUI (graphical user interface) to access these features. To verify the design, we deploy a prototype on a Xilinx Zynq™-7000 AP (All Programmable) SoC (system on chip) Z-7010 on a ZYBO board and apply an image-processing SW/HW co-design project to it. The experimental results demonstrate the system's portability and efficacy for remote access, and its flexibility when simulating SW/HW co-design projects through PR (partial reconfiguration) within reasonable latency: the end-to-end reconfiguration of a 306.60 KB partial bitfile takes 8.819 ms.
In this paper, we propose an FPGA bitstream compression method based on the LZ77 algorithm and the bitmask-based compression (BMC) technique. To improve the compression ratio, we optimize LZ77 by encoding the variable `length' with a Golomb code and simplifying the encoding output. We develop an appropriate adding strategy, choose proper cost-benefit parameters for BMC, and combine BMC with LZ77 to further improve the compression ratio. Experimental results show that our method improves the compression ratio by 12.9% over the fixed-dictionary approach and by 9.3% over LZSS on average. The method also has good compatibility. A decompressor with two look-ahead buffers is designed for decompression; its pipelined architecture improves throughput, which reaches 1692 Mb/s according to the Xilinx ISE report. We have verified the decompressor on a Virtex-5 FPGA.
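The Golomb coding of the LZ77 `length' field described above can be illustrated with a minimal sketch. This shows only the standard Golomb encoding step (unary quotient plus truncated-binary remainder); the divisor `m` and the bit-string representation are illustrative assumptions, not the paper's exact codec.

```python
def golomb_encode(n, m):
    """Golomb-encode a non-negative integer n with divisor m.

    Returns the codeword as a bit string: the quotient in unary
    (terminated by '0') followed by the remainder in truncated binary.
    This is a standard Golomb code, used here only to illustrate the
    idea of variable-length encoding of LZ77 match lengths.
    """
    q, r = divmod(n, m)
    unary = "1" * q + "0"                 # quotient q as q ones, then a zero
    if m & (m - 1) == 0:                  # m is a power of two: Rice code
        b = m.bit_length() - 1            # remainder always uses log2(m) bits
        rem = format(r, f"0{b}b") if b else ""
    else:                                 # truncated binary for general m
        b = m.bit_length()                # ceil(log2 m) since m is not a power of 2
        cutoff = (1 << b) - m             # first `cutoff` remainders get b-1 bits
        if r < cutoff:
            rem = format(r, f"0{b - 1}b")
        else:
            rem = format(r + cutoff, f"0{b}b")
    return unary + rem
```

Short match lengths, which dominate in practice, thus receive short codewords, which is what improves the compression ratio over a fixed-width length field.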
Binarized neural network (BNN) inference accelerators show great promise in cost- and power-restricted domains. However, the performance of these accelerators is still severely limited by significant redundancies in BNN inference. In this brief, we introduce a channel-aware sparse accelerator (CAA) to alleviate the performance degradation induced by these redundancies while maintaining the original accuracy. First, motivated by the observation that the convolution processes of our rebuilt rectangle kernels contain many redundant operations that can be skipped by exploiting a BNN-specific property, we convert the original XNOR-popcount convolutions of each neuron into channel-aware-popcount (CAP) operations for all binarized convolutional and fully connected layers in CAA, employing a rectangle-kernel simplification strategy to eliminate the unnecessary operations. These CAP operations directly yield the final output without any extra steps. Furthermore, inspired by our observations on two specific properties of the CAP operations, we adopt a group pruning approach to remove the remaining redundant CAP operations. Experimental results show that our design, evaluated on an embedded FPGA, achieves 4.2-6.6× inference speedup, 3.6-5.5× energy-efficiency enhancement, and 1.35× resource-efficiency improvement compared with state-of-the-art works.
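The baseline XNOR-popcount primitive that the CAP operations reorganize can be sketched as follows. This is the standard BNN dot product, not the brief's CAA/CAP design itself; the bit-packing convention (one integer per vector, bit 1 encoding +1 and bit 0 encoding −1) is an assumption for illustration.

```python
def xnor_popcount_dot(a_bits, w_bits, n):
    """Dot product of two {-1, +1} vectors via XNOR + popcount.

    a_bits, w_bits: n-bit integers packing the vectors, where a set bit
    encodes +1 and a clear bit encodes -1. XNOR marks positions where
    the signs agree, so the dot product equals (#agreements - #disagreements)
    = 2 * popcount(xnor) - n. This single bitwise step replaces n multiplies.
    """
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ w_bits) & mask      # 1 where signs agree
    matches = bin(xnor).count("1")        # popcount
    return 2 * matches - n
```

For example, packing a = (+1, −1, +1) as 0b101 and w = (+1, +1, −1) as 0b011 gives one agreement and two disagreements, i.e. a dot product of −1; the CAP formulation in the brief then avoids recomputing the portions of such popcounts that are redundant across channels.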