The state-of-the-art models for medical image segmentation are variants of U-Net and fully convolutional networks (FCN). Despite their success, these models have two limitations: (1) their optimal depth is a priori unknown, requiring extensive architecture search or an inefficient ensemble of models of varying depths; and (2) their skip connections impose an unnecessarily restrictive fusion scheme, forcing aggregation only at the same-scale feature maps of the encoder and decoder sub-networks. To overcome these two limitations, we propose UNet++, a new neural architecture for semantic and instance segmentation, by (1) alleviating the unknown network depth with an efficient ensemble of U-Nets of varying depths, which partially share an encoder and co-learn simultaneously using deep supervision; (2) redesigning skip connections to aggregate features of varying semantic scales at the decoder sub-networks, leading to a highly flexible feature fusion scheme; and (3) devising a pruning scheme to accelerate the inference speed of UNet++. We evaluated UNet++ on six medical image segmentation datasets, covering multiple imaging modalities such as computed tomography (CT), magnetic resonance imaging (MRI), and electron microscopy (EM), and demonstrated that (1) UNet++ consistently outperforms the baseline models for semantic segmentation across different datasets and backbone architectures; (2) UNet++ enhances segmentation quality for objects of varying sizes, an improvement over the fixed-depth U-Net; (3) Mask RCNN++ (Mask R-CNN with the UNet++ design) outperforms the original Mask R-CNN for instance segmentation; and (4) pruned UNet++ models achieve significant speedup with only modest performance degradation. Our implementation and pre-trained models are available at https://github.com/MrGiovanni/UNetPlusPlus.
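The redesigned skip connections can be sketched as a connectivity rule over a grid of nodes (an illustration of the nesting scheme, not the authors' code; the helper name is ours):

```python
def unetpp_inputs(i, j):
    """Inputs feeding node X[i][j] in a UNet++ grid.

    i: resolution level (0 = full resolution), j: column along the
    dense skip pathway. X[i][0] is an encoder node with no skips.
    """
    if j == 0:
        return []
    same_scale = [(i, k) for k in range(j)]   # dense same-scale skips
    upsampled = [(i + 1, j - 1)]              # up-sampled deeper node
    return same_scale + upsampled

# Node X[0][2] fuses X[0][0], X[0][1], and up-sampled X[1][1]:
print(unetpp_inputs(0, 2))  # → [(0, 0), (0, 1), (1, 1)]
```

Because every column j ends in a supervised output at the top row, inference can stop at any column, which is what makes the pruning scheme possible.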
Federated learning (FL) allows model training from local data collected by edge/mobile devices while preserving data privacy, which has wide applicability to image and vision applications. A challenge is that client devices in FL usually have much more limited computation and communication resources compared to servers in a data center. To overcome this challenge, we propose PruneFL, a novel FL approach with adaptive and distributed parameter pruning, which adapts the model size during FL to reduce both communication and computation overhead and minimize the overall training time, while maintaining accuracy similar to that of the original model. PruneFL includes initial pruning at a selected client and further pruning as part of the FL process. The model size is adapted during this process by maximizing the approximate empirical risk reduction divided by the time of one FL round. Our experiments with various datasets on edge devices (e.g., Raspberry Pi) show that 1) we significantly reduce the training time compared to conventional FL and various other pruning-based methods, and 2) the pruned model with automatically determined size converges to an accuracy very similar to that of the original model, and is also a lottery ticket of the original model.
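The size-adaptation objective — approximate empirical risk reduction divided by the time of one FL round — can be sketched as a greedy search over how many parameters to keep (a toy illustration with an assumed linear round-time model, not the PruneFL implementation):

```python
def adapt_model_size(grad_sq, t_fixed, t_per_param):
    """Pick how many parameters to keep so that the approximate
    empirical-risk reduction per FL-round time is maximized.

    grad_sq: squared gradient per parameter (importance proxy),
    t_fixed: fixed round overhead, t_per_param: marginal time per
    kept parameter (assumed linear time model).
    """
    order = sorted(grad_sq, reverse=True)  # most important first
    best_k, best_ratio, running = 0, 0.0, 0.0
    for k, g2 in enumerate(order, start=1):
        running += g2
        ratio = running / (t_fixed + t_per_param * k)
        if ratio > best_ratio:
            best_ratio, best_k = ratio, k
    return best_k

# With one dominant gradient, only that parameter is worth the time cost:
print(adapt_model_size([9.0, 0.1, 0.1], t_fixed=1.0, t_per_param=1.0))  # → 1
```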
•A lightweight tomato target detection algorithm based on YOLOv5 is proposed.•The improved model has a significantly higher detection speed on the CPU platform.•The model is deployed on smartphones using model quantization, and an app is developed for local real-time detection of tomatoes.•The 16-bit quantized model performs the best, and its detection speed meets real-time requirements.
Current deep-learning-based tomato target detection algorithms have many parameters, with the drawbacks of heavy computation, long run time, and reliance on high-computing-power devices such as graphics processing units (GPUs). In this study, we propose a lightweight improved YOLOv5 (You Only Look Once) algorithm for real-time localization and ripeness detection of tomato fruits. The algorithm first replaces the original focus layer with a down-sampling convolutional layer and reconstructs the backbone network of YOLOv5 using the bneck module of MobileNetV3. It then applies channel pruning to the neck layer to further reduce the model size and uses a genetic algorithm for hyperparameter optimization to improve detection accuracy. We evaluate the improved algorithm on a homemade tomato dataset. The experimental results demonstrate that the number of parameters and floating-point operations (FLOPs) of the improved model were reduced by 78% and 84.15% compared to the original YOLOv5s, while the mAP reached 0.969. Meanwhile, the detection time on the central processing unit (CPU) was 42.5 ms, a 64.88% improvement. This study further utilized the Nihui convolutional neural network (NCNN) framework to quantize the improved model and developed an Android-based real-time tomato monitoring application (app). Experimental results demonstrate that the 16-bit quantized model achieved an average detection frame rate of 26.5 frames per second (fps) on mobile devices with lower computing power, 268% better than the original YOLOv5s, while the model size was reduced by 51.1% and a 93% true detection rate was achieved.
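The channel-pruning step for the neck layer can be illustrated with a generic L1-norm criterion (a common choice for channel pruning; the paper's exact criterion may differ, and the helper below is hypothetical):

```python
def prune_channels(channel_weights, keep_ratio):
    """Return indices of output channels kept after L1-norm pruning.

    channel_weights: one flat weight list per output channel.
    Channels with the smallest L1 norm contribute least and are cut.
    """
    l1 = [sum(abs(w) for w in ch) for ch in channel_weights]
    n_keep = max(1, round(keep_ratio * len(channel_weights)))
    ranked = sorted(range(len(l1)), key=lambda i: l1[i], reverse=True)
    return sorted(ranked[:n_keep])  # kept indices in original order

weights = [[0.9, -0.8], [0.01, 0.02], [0.5, 0.4], [0.0, 0.03]]
print(prune_channels(weights, keep_ratio=0.5))  # → [0, 2]
```

Removing an output channel also removes the matching input slice of the next layer's kernels, which is where the parameter and FLOP savings compound.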
Full text
Available for:
GEOZS, IJS, IMTLJ, KILJ, KISLJ, NLZOH, NUK, OILJ, PNG, SAZU, SBCE, SBJE, UILJ, UL, UM, UPCLJ, UPUK, ZAGLJ, ZRSKP
The recent surge of Convolutional Neural Networks (CNNs) has brought successes in various applications. However, these successes are accompanied by a significant increase in computational cost and the demand for computational resources, which critically hampers the use of complex CNNs on devices with limited computational power. In this work, we propose a feature-representation-based layer-wise pruning method that aims at reducing complex CNNs to more compact ones with equivalent performance. Different from previous parameter pruning methods that conduct connection-wise or filter-wise pruning based on weight information, our method determines redundant parameters by investigating the features learned in the convolutional layers, and the pruning process operates at the layer level. Experiments demonstrate that the proposed method significantly reduces computational cost, and the pruned models achieve equivalent or even better performance compared to the original models on various datasets.
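The layer-level redundancy idea — prune a layer whose learned features add little over its input — can be sketched with cosine similarity between consecutive layers' features (a simplification; the paper's actual feature-representation measure may differ):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def redundant_layers(features, threshold=0.99):
    """Flag layer i as redundant when its output features are nearly
    identical to its input features (the layer learned little)."""
    return [i for i in range(1, len(features))
            if cosine(features[i - 1], features[i]) >= threshold]

# Layer 1 barely changes its input, layer 2 transforms it strongly:
feats = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.001], [-3.0, 0.5, 1.0]]
print(redundant_layers(feats))  # → [1]
```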
Inevitable multi-component assembly errors and complex data collection sites lead to coupled fault information and global distribution differences among individual machines, making fault diagnosis of machine-level motors more challenging. This article proposes a lightweight cross-machine model, the coarse-fine signal pruning transformer (CFSPT), designed specifically for compound fault diagnosis. Specifically, unidirectional multi-scale convolutional patches (UDMCP) are proposed to provide flexible global information interaction and fusion. A coarse-grained temporal locator (CTL) and a pruned fine-grained feature extractor (PFFE) are designed as the multi-process feature pruner and extractor, which not only improve attention to key temporal blocks but also achieve a lightweight design. The superiority of the proposed CFSPT is validated on real industrial production-line motors rather than laboratory part-level signals. Comprehensive experimental results with visualization show that the proposed method achieves the highest generalization performance, with 94.74% cross-machine accuracy (CMA). With its interpretable design, the proposed CFSPT is a lightweight, efficient, and reliable method with great application potential in cross-machine fault diagnosis of machine-level motors.
•A cross-machine model for machine-level motor compound fault diagnosis.•A lightweight coarse-fine signal pruning transformer with interpretable design.•A convolutional tokenizer for multiscale information interaction and fusion.•A coarse-grained pruning structure used as the multi-process feature pruner.•Experiments conducted on motor production lines rather than laboratory benches.
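The coarse-grained pruning idea above — keep only the most attended temporal blocks before fine-grained extraction — can be sketched generically (a simplification; CFSPT's actual scoring and structure are in the paper):

```python
def prune_tokens(tokens, attn_scores, keep_ratio):
    """Keep the temporal patches with the highest attention scores,
    preserving their original (temporal) order."""
    n_keep = max(1, round(keep_ratio * len(tokens)))
    top = sorted(range(len(tokens)), key=lambda i: attn_scores[i],
                 reverse=True)[:n_keep]
    return [tokens[i] for i in sorted(top)]

tokens = ["t0", "t1", "t2", "t3"]
scores = [0.05, 0.40, 0.10, 0.45]
print(prune_tokens(tokens, scores, keep_ratio=0.5))  # → ['t1', 't3']
```

Dropping low-attention blocks before the fine-grained stage is what yields the lightweight design: later layers only process the surviving fraction of tokens.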
•We propose a UNet-based nested segmentation network for medical image segmentation.•The nested UNet architecture integrates features at different levels to improve performance.•An attention mechanism focuses on the target organ while suppressing irrelevant tissue.•Deep supervision is introduced to enable network pruning during testing.
Organ cancers have a high mortality rate. To help doctors diagnose and treat organ lesions, an automatic medical image segmentation model is urgently needed, as manual segmentation is time-consuming and error-prone. However, automatic segmentation of target organs from medical images is a challenging task because of organs' uneven and irregular shapes. In this paper, we propose an attention-based nested segmentation network, named ANU-Net. Our proposed network has a deeply supervised encoder-decoder architecture and redesigned dense skip connections. ANU-Net introduces an attention mechanism between nested convolutional blocks so that the features extracted at different levels can be merged with a task-related selection. In addition, we design a hybrid loss function combining three kinds of losses to make full use of full-resolution feature information. We evaluated the proposed model on the MICCAI 2017 Liver Tumor Segmentation (LiTS) Challenge dataset and the ISBI 2019 Combined Healthy Abdominal Organ Segmentation (CHAOS) Challenge. ANU-Net achieved very competitive performance on four kinds of medical image segmentation tasks.
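The attention mechanism between nested blocks can be sketched in the usual additive attention-gate form, where a gating signal from the deeper level weights the skip features (a per-position scalar illustration with made-up weights, not the paper's exact module):

```python
import math

def attention_gate(x, g, wx, wg, psi):
    """Additive attention gate: a = sigmoid(psi * relu(wx*x + wg*g)),
    returning the gated skip feature x * a at each position.

    x: skip features from the encoder side,
    g: gating signal from the deeper decoder level.
    """
    gated = []
    for xi, gi in zip(x, g):
        pre = max(0.0, wx * xi + wg * gi)        # ReLU
        a = 1.0 / (1.0 + math.exp(-psi * pre))   # sigmoid coefficient
        gated.append(xi * a)
    return gated

# Positions supported by the gating signal pass through more strongly:
out = attention_gate(x=[1.0, 2.0], g=[0.0, 3.0], wx=1.0, wg=1.0, psi=1.0)
print(out)
```

The second position, reinforced by the gating signal, is attenuated far less than the first; this is the "task-related selection" applied before features are merged.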
Automated fault detection and diagnosis (AFDD) plays a crucial role in enhancing the energy efficiency of air conditioning systems. Deep learning has emerged as a promising tool for image classification, and its application to AFDD of HVAC systems is gaining traction due to its exceptional performance. However, the deployment cost of deep models in practical scenarios is increased by the large number of parameters and the lack of interpretability. This paper focuses on improving the potential of deep learning models for AFDD in real HVAC systems. We use pruning to reduce the number of parameters in the model and layer-wise relevance propagation (LRP) to improve the interpretability of the model. The case study builds a simulation model and 31 kinds of fault datasets based on an actual HVAC system in Japan. The findings show that the pruning method can reduce the model size by more than 99% while maintaining 90% classification accuracy. The LRP score lets model users identify the input data that most affects the result of each diagnosis, improving interpretability.
•The study presents a solution for cutting the cost of deep learning deployment in HVAC system AFDD.•The method combines pruning and LRP, reducing model size by over 99%.•The LRP score improves model interpretability by showing which input data most affects diagnosis results.•The study shows the potential of deep learning in HVAC AFDD, enhancing its practicality and usefulness.
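The LRP step can be sketched for a single linear layer with the basic z-rule, which redistributes output relevance to inputs in proportion to their contributions (a generic illustration; the paper may use a different LRP variant):

```python
def lrp_linear(x, w, relevance_out, eps=1e-9):
    """z-rule LRP for one linear layer:
    R_i = sum_j (x_i * w[i][j] / sum_i' x_i' * w[i'][j]) * R_j
    eps stabilizes near-zero denominators."""
    n_in, n_out = len(x), len(relevance_out)
    z = [sum(x[i] * w[i][j] for i in range(n_in)) for j in range(n_out)]
    rel_in = [0.0] * n_in
    for j in range(n_out):
        denom = z[j] + (eps if z[j] >= 0 else -eps)
        for i in range(n_in):
            rel_in[i] += x[i] * w[i][j] / denom * relevance_out[j]
    return rel_in

x = [1.0, 2.0]
w = [[0.5], [0.25]]  # two inputs, one output
print(lrp_linear(x, w, relevance_out=[1.0]))
```

Relevance is (approximately) conserved: the input relevances sum to the output relevance, which is what makes the per-input scores comparable across diagnoses.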
Single image super-resolution (SISR) has achieved significant performance improvements thanks to deep convolutional neural networks (CNNs). However, deep learning-based methods are computationally intensive and memory demanding, which limits their practical deployment, especially on mobile devices. Addressing this issue, we present a novel approach to compress SR networks by weight pruning. To achieve this goal, we first explore a progressive optimization method that gradually zeroes out redundant parameters. We then construct a sparse-aware attention module by exploring a pruning-based, well-suited attention strategy. Finally, we propose an information multi-slicing network that extracts and integrates multi-scale features at a granular level to obtain a more lightweight and accurate SR network. Extensive experiments show that the pruning method reduces the model size without a noticeable drop in performance, making it possible to apply state-of-the-art SR models in real-world applications. Furthermore, our pruned versions achieve better accuracy and visual quality than state-of-the-art methods.
•We show that most existing SR networks are over-parameterized and that the model size can be dramatically reduced without a noticeable drop in performance, making it possible to apply SR models in real-world applications.•We introduce a progressive global sparse optimization method to prune the SR network and explore a sparse-aware attention module to reduce the performance gap between the pruned version and the original one.
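The progressive zeroing-out of redundant parameters can be sketched as a sparsity schedule combined with magnitude pruning (the cubic ramp below is a common choice for such schedules, not necessarily the paper's):

```python
def sparsity_at(step, total_steps, final_sparsity):
    """Cubic ramp from 0 to final_sparsity over optimization steps,
    so pruning starts gently and the network can adapt."""
    frac = min(1.0, step / total_steps)
    return final_sparsity * (1.0 - (1.0 - frac) ** 3)

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of the weights."""
    n_zero = int(sparsity * len(weights))
    cutoff = sorted(abs(w) for w in weights)[n_zero - 1] if n_zero else -1.0
    return [0.0 if abs(w) <= cutoff else w for w in weights]

w = [0.9, -0.05, 0.4, 0.01, -0.7]
# At the end of training the full target sparsity (40%) is applied:
print(magnitude_prune(w, sparsity_at(100, 100, 0.4)))  # → [0.9, 0.0, 0.4, 0.0, -0.7]
```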
A practical wireless network has many tiers in which end users do not directly communicate with the central server; the users' devices have limited computation and battery power, and the serving base station (BS) has a fixed bandwidth. Motivated by these practical constraints and system models, this paper leverages model pruning and proposes pruning-enabled hierarchical federated learning (PHFL) in heterogeneous networks (HetNets). We first derive an upper bound on the convergence rate that clearly demonstrates the impact of model pruning and of the wireless communication between the clients and the associated BS. We then jointly optimize the model pruning ratio, central processing unit (CPU) frequency, and transmission power of the clients in order to minimize the controllable terms of the convergence bound under strict delay and energy constraints. However, since the original problem is not convex, we perform successive convex approximation (SCA) and jointly optimize the parameters for the relaxed convex problem. Through extensive simulation, we validate the effectiveness of the proposed PHFL algorithm in terms of test accuracy, wall-clock time, energy consumption, and bandwidth requirement.
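One ingredient of the joint optimization — picking a pruning ratio that satisfies the delay constraint — can be sketched in isolation (a toy feasibility search assuming the round delay is linear in the kept weights; the paper solves the full non-convex problem jointly via SCA):

```python
def min_pruning_ratio(delay_full, deadline, step=0.01):
    """Smallest fraction of weights to prune so the estimated round
    delay (assumed proportional to the kept weights) meets the deadline.

    Returns None when even a fully pruned model cannot meet it.
    """
    ratio = 0.0
    while ratio <= 1.0:
        if delay_full * (1.0 - ratio) <= deadline:
            return round(ratio, 2)
        ratio += step
    return None

# A 10 s round must fit a 4 s deadline → prune at least 60% of weights:
print(min_pruning_ratio(delay_full=10.0, deadline=4.0))  # → 0.6
```

In the actual PHFL formulation this feasibility trade-off is coupled with CPU frequency and transmit power, so the chosen ratio also shapes the convergence bound, not just the delay.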
Deep Neural Networks (DNNs) have made significant progress in recent years. However, their high computing and storage costs make them challenging to apply on resource-limited platforms or in edge computing scenarios. Recent studies have shown that model pruning is an effective way to address this problem. Typically, model pruning is a three-stage pipeline: training, pruning, and fine-tuning. In this work, a novel structured pruning method for Convolutional Neural Network (CNN) compression is proposed, in which filter-level redundant weights are pruned according to an entropy importance criterion (termed FPEI). In short, the FPEI criterion, which works in the pruning stage, defines the importance of a filter according to the entropy of its feature maps. If a feature map contains very little information, it should not contribute much to the whole network. By removing these uninformative feature maps, their corresponding filters in the current layer and kernels in the next layer can be removed simultaneously. Consequently, computing and storage costs are significantly reduced. Moreover, because our method cannot exploit the existing ResNet pruning strategy, we propose a dimensionality reduction (DR) pruning strategy for ResNet-structured networks. Experiments on several datasets demonstrate that our method is effective. In the experiment with the VGG-16 model on the SVHN dataset, we removed 91.31% of the parameters, from 14.73M to 1.28M, achieving a 63.77% reduction in FLOPs, from 313.4M to 113.5M, and a 1.73× speedup in model inference.
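The entropy criterion can be sketched as follows: histogram each filter's feature-map activations, compute the Shannon entropy, and prune the filters whose maps are least informative (a simplified illustration of the idea, not the authors' code; bin count and tie-breaking are our assumptions):

```python
import math

def feature_entropy(feature_map, n_bins=4):
    """Shannon entropy of a feature map's activation histogram.
    Near-constant maps carry little information → low entropy."""
    lo, hi = min(feature_map), max(feature_map)
    if hi == lo:
        return 0.0
    counts = [0] * n_bins
    for v in feature_map:
        b = min(n_bins - 1, int((v - lo) / (hi - lo) * n_bins))
        counts[b] += 1
    n = len(feature_map)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

def filters_to_prune(feature_maps, n_prune):
    """Indices of the n_prune filters whose maps are least informative."""
    ent = [feature_entropy(fm) for fm in feature_maps]
    return sorted(sorted(range(len(ent)), key=lambda i: ent[i])[:n_prune])

# Two near-constant maps are flagged; the varied map survives:
maps = [[0.0, 0.0, 0.0, 0.0], [0.1, 0.9, 0.4, 0.6], [0.5, 0.5, 0.5, 0.5]]
print(filters_to_prune(maps, n_prune=2))  # → [0, 2]
```

Pruning filter k then also removes the k-th input kernel slice in the next layer, which is why both computing and storage costs drop together.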