Covid-19 has spread worldwide for nearly three years and has deeply affected the economy and our daily life. Even though Tedros Adhanom Ghebreyesus, director-general of the WHO, declared Covid-19 over as a global health emergency, human beings will continue living with it and fighting against it for a long time in the coming period. Distinguishing between Covid-19 and community-acquired pneumonia (CAP) patients will therefore become a basic clinical task. Apart from the RT-PCR test, CT is a complementary tool for detecting Covid-19 infection. In this paper, a TransSE-ResNet model with a multi-head attention mechanism is proposed for Covid-19 chest CT image classification. The input images are classified into three categories: Covid-19 patients, CAP patients, and healthy individuals. In the experiments, the proposed model obtains an accuracy of 96.76%, outperforming the related models tested. It is suitable for use as part of the Covid-19 infection detection system we are developing.
Fault coupling and fault override are common phenomena when faults occur in different parts of a planetary gearbox. Labeled compound fault samples are very rare or even unavailable in industrial scenarios. Thus, it is a challenging task to decouple a compound fault and to detect compound faults of unequal severity using only single-fault samples. In this paper, an untrained planetary gearbox compound fault diagnosis method based on Adaptive Learning Variational Mode Decomposition (ALVMD) and a Dual Scale Squeeze-and-Excitation Convolutional Neural Network (DSSECNN) is proposed. The ALVMD algorithm is established to enhance weak fault characteristics and enrich the diversity of sample information. The DSSECNN intelligent fault diagnosis model is presented, in which the lightweight SENet enhances the sensitivity of the model to channel features. Moreover, the mapping relationship between the compound fault mode and single faults is revealed through the proposed probability formula, which reduces the dependence of the trained model on compound fault samples. Experiments have been performed on a laboratory gearbox fault test bench to verify the effectiveness and generalization performance of the method. In addition, wind turbine gearbox test results show that ALVMD-DSSECNN achieves an average accuracy of 98.83% under variable working conditions, the highest among the related methods compared, which offers guidance for practical engineering.
Semantic segmentation is a fundamental research topic in optical remote sensing image processing. Because of the complex maritime environment, sea-land segmentation is a challenging task. Although neural networks have achieved excellent performance in semantic segmentation in recent years, few works have used CNNs for sea-land segmentation, and the results can be further improved. This paper proposes a novel deep convolutional neural network named DeepUNet. Like the U-Net, its structure has a contracting path and an expansive path to produce high-resolution optical output. Unlike the U-Net, DeepUNet uses DownBlocks instead of convolution layers in the contracting path and UpBlocks in the expansive path. The two novel blocks bring two new connections, the U-connection and the Plus connection, which are intended to yield more precise segmentation results. To verify the network architecture, we construct a new challenging sea-land dataset and compare DeepUNet on it with the U-Net, SegNet, and SeNet. Experimental results show that DeepUNet improves accuracy by 1-2% compared with the other architectures, especially on high-resolution optical remote sensing imagery.
Text-independent speaker verification is an important artificial intelligence problem with a wide spectrum of applications, such as criminal investigation, payment certification, and interest-based customer services. The purpose of text-independent speaker verification is to determine whether two given uncontrolled utterances originate from the same speaker. Extracting speech features for each speaker using deep neural networks is a promising direction, and a straightforward solution is to train a discriminative feature extraction network with a metric learning loss function. However, a single loss function often has certain limitations. We therefore use deep multi-metric learning and introduce three different losses for this problem: triplet loss, n-pair loss, and angular loss. The three loss functions work cooperatively to train a feature extraction network equipped with residual connections and squeeze-and-excitation attention. We conduct experiments on the large-scale VoxCeleb2 dataset, which contains over a million utterances from over 6,000 speakers, and the proposed deep neural network obtains an equal error rate of 3.48%, which is a very competitive result. Code for both training and testing, along with pretrained models, is available at https://github.com/GreatJiweix/DmmlTiSV, which is the first publicly available code repository for large-scale text-independent speaker verification with performance on par with state-of-the-art systems.
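As an illustration of the multi-metric idea described above, the following minimal sketch computes a triplet loss on toy embeddings and combines several loss values with a weighted sum. It is not taken from the cited repository: the margin, the weights, the toy vectors, and the helper names (`triplet_loss`, `combine_losses`) are assumptions made for this example, and the n-pair and angular losses are represented only by placeholder numbers.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin.

    The margin value is an assumption chosen for illustration.
    """
    gap = l2_distance(anchor, positive) - l2_distance(anchor, negative)
    return max(0.0, gap + margin)

def combine_losses(losses, weights):
    """Weighted sum of several metric-learning losses (weights are hypothetical)."""
    return sum(w * l for w, l in zip(weights, losses))

# Toy 2-D "speaker embeddings": anchor and positive share a speaker.
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
negative = [0.0, 1.0]

easy = triplet_loss(anchor, positive, negative)  # well separated, hinge is inactive
hard = triplet_loss(anchor, negative, positive)  # roles swapped, hinge is active

# Placeholder values stand in for the n-pair and angular losses.
total = combine_losses([hard, 2.0, 3.0], [1.0, 0.5, 0.5])
```

A weighted sum is only one way to let several losses cooperate; the abstract does not state how the three losses are balanced.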
Analyzing fish school feeding behavior can assist aquaculturists in making successful feeding decisions, which is critical for enhancing farming efficiency and supporting healthy fish growth. Because of the vigorous gathering, jumping, chasing, and other actions that occur during fish school feeding, images contain varying degrees of noise and significant overlap between fish targets, which makes accurate identification of feeding behavior challenging. To address this problem, the study proposes a MobileNetV2-SENet-based method for identifying fish school feeding behavior. To enhance sample diversity, fish school images are preprocessed with operations such as random cropping and brightness-contrast adjustment. MobileNetV2 is then used to extract fish school image features, and a feature weighting network based on SENet (Squeeze-and-Excitation Networks) is built to assign weights to features according to their value. The feeding behavior of the fish school is identified using a linear classifier. Finally, a method of determining the feeding amount is provided based on the identification result to reduce feed consumption. By adding SENet to the original MobileNetV2 feature extraction network, the method builds a feature extraction and weighting network that increases the weights of features relevant to fish school state identification while suppressing the weights of interference such as noise, resulting in more accurate identification. The proposed method was tested on real fish school images and yielded an accuracy of 97.76%. The accuracy of our model was improved by 3.81%, 9.99%, 50.35%, and 100.37% compared with fish school feeding behavior identification models based on EfficientNet_B0, ShuffleNetV2, AlexNet, and EfficientNetV2, respectively, indicating that our method can accurately identify the feeding behavior of fish schools in real aquaculture.
•A feature extraction and weighting network based on MobileNetV2-SENet is built.
•An efficient fish school feeding identification method is proposed.
•An optimized method of determining feeding amount is provided.
•The method accurately and stably realizes behavior identification.
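The SENet-style feature weighting mentioned above can be sketched in a few lines. This is a generic squeeze-and-excitation block written in plain Python, not the authors' MobileNetV2-SENet network: the tiny feature maps, the bottleneck of one hidden unit, and all weight values are hypothetical and chosen only to make the channel reweighting visible.

```python
import math

def squeeze(feature_maps):
    """Global average pooling: one scalar per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_maps]

def dense(vec, weights, bias):
    """Fully connected layer: weights is a list of rows, one per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def excite(z, w1, b1, w2, b2):
    """Bottleneck FC -> ReLU -> FC -> sigmoid, giving one scale per channel."""
    hidden = [max(0.0, h) for h in dense(z, w1, b1)]
    return [1.0 / (1.0 + math.exp(-s)) for s in dense(hidden, w2, b2)]

def se_block(feature_maps, w1, b1, w2, b2):
    """Rescale every channel by its learned importance in (0, 1)."""
    scales = excite(squeeze(feature_maps), w1, b1, w2, b2)
    scaled = [[[x * s for x in row] for row in ch]
              for ch, s in zip(feature_maps, scales)]
    return scaled, scales

# Two 2x2 channels; the weights below are hypothetical, not trained values.
maps = [[[2.0, 2.0], [2.0, 2.0]],
        [[1.0, 1.0], [1.0, 1.0]]]
w1, b1 = [[1.0, -1.0]], [0.0]            # 2 channels -> 1 hidden unit
w2, b2 = [[2.0], [-2.0]], [0.0, 0.0]     # 1 hidden unit -> 2 channel scales

scaled, scales = se_block(maps, w1, b1, w2, b2)
```

With these assumed weights the first channel receives a scale above 0.5 and the second a scale below 0.5, which is the emphasize-or-suppress behavior the abstract attributes to the weighting network.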
Images captured in low-light environments typically have poor visibility, which affects many advanced computer vision tasks. In recent years, several low-light image enhancement models based on deep learning have been proposed, but they have not been able to effectively mine the deep multiscale features in the image, resulting in poor generalization performance and instability. These disadvantages are mainly reflected in color distortion, color unsaturation, and artifacts. Current methods are also unable to adjust the exposure effectively, resulting in uneven exposure or partial overexposure. To address these issues, we propose an end-to-end low-light image enhancement model, called the multiscale low-light image enhancement network with illumination constraint (MLLEN-IC), to achieve better generalization ability and stable performance. On the one hand, we use the squeeze-and-excitation-Res2Net block (SE-Res2block) as a base unit to enhance the model's ability to extract deep multiscale features. On the other hand, to make the model more adaptable to low-light image enhancement tasks, we calculate the illumination constraint from the low-light image itself to prevent overexposure, uneven exposure, and unsaturated colors. Extensive experiments demonstrate that MLLEN-IC not only adjusts light levels but also produces a more natural visual effect and avoids problems such as color distortion, artifacts, and uneven exposure. In particular, MLLEN-IC shows good generalization and stability. The source code and supplementary material are available at https://github.com/CCECfgd/MLLEN-IC .
Protocol recognition technology plays a crucial role in the domains of network communication and information security. Existing protocol recognition methods based on spatio-temporal features cannot adequately and comprehensively extract protocol features. An application layer protocol recognition method incorporating SENet channel attention and Transformer is proposed. The model focuses on spatio-temporal feature extraction of protocol data and consists of a spatial feature extraction module and a temporal feature extraction module. SE blocks are added to the residual network to capture the associations between multiple channels and adaptively assign weights, so as to extract the key spatial features in different channels. The temporal feature extraction module is constructed by stacking Transformer encoders based on the multi-head attention mechanism. This module is used to comprehensively capture temporal features of the protocol data by directly leveraging the positiona
This study addresses the crucial aspect of identifying individual appliance power consumption without extensive sensor deployment, a cornerstone of modern smart grid planning and demand response. Non-Intrusive Load Monitoring (NILM) offers a solution by estimating appliance energy usage from aggregated meter data, and is increasingly powered by deep learning techniques. However, the depth-dependent effectiveness of load feature extraction can lead to gradient-related issues. To mitigate this, a novel approach is introduced that uses a Bi-directional Temporal Convolutional Network (BiTCN) as the residual block foundation, enhanced by a Squeeze-and-Excitation Network (SENet) attention mechanism for channel-wise feature extraction. In a departure from conventional methods, the approach involves substituting bidirectional non-causal convolution with channel attention for improved feature extraction. Evaluation on the REDD and UK-DALE datasets demonstrates our model’s superiority in load disaggregation compared with existing approaches, emphasizing its potential for advancing NILM through effective deep learning strategies.
•A Channel Attention Bi-directional Temporal Convolutional Network (CABTCN) was introduced to effectively learn long-term dependencies in temporal sequences while simultaneously recalibrating the importance of each channel.
•A Squeeze-and-Excitation Network was used along with the bi-directional TCN residual block to capture the high-level features and activations (ON/OFF states) of the appliances.
•A hybrid loss function that combines Huber and quantile regression addresses overestimation and underestimation in energy consumption and captures variations among appliances.
•Performance of the model was evaluated using accuracy, F1-score, Mean Absolute Error (MAE), and Signal Aggregate Error (SAE).
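The hybrid loss named in the highlights above can be illustrated with a minimal sketch that blends a Huber term with a quantile (pinball) term. The highlights do not give the exact formulation, so the convex blend with weight `alpha`, the threshold `delta`, the quantile `tau`, and the function names are all assumptions made for this example.

```python
def huber(residual, delta=1.0):
    """Quadratic near zero, linear in the tails (robust to outliers)."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual ** 2
    return delta * (a - 0.5 * delta)

def pinball(residual, tau=0.5):
    """Quantile loss on residual = target - prediction.

    tau > 0.5 penalizes underestimation more than overestimation.
    """
    if residual >= 0:
        return tau * residual
    return (tau - 1.0) * residual

def hybrid_loss(targets, preds, delta=1.0, tau=0.5, alpha=0.5):
    """Mean of alpha * Huber + (1 - alpha) * pinball.

    The blend weight alpha is a hypothetical choice, not the paper's.
    """
    total = 0.0
    for t, p in zip(targets, preds):
        r = t - p
        total += alpha * huber(r, delta) + (1.0 - alpha) * pinball(r, tau)
    return total / len(targets)

# Toy appliance power readings in watts (illustrative values only).
under = hybrid_loss([100.0], [98.0], tau=0.9)   # prediction too low
over = hybrid_loss([100.0], [102.0], tau=0.9)   # prediction too high
```

With `tau=0.9`, the same 2 W error costs more when the model underestimates than when it overestimates, which is how a quantile term can treat the two error directions asymmetrically.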
•Customize a formula for sensitivity analysis to screen characteristic parameters.
•Use an improved hybrid neural network with high fitting accuracy and good prediction performance to establish efficiency regression prediction of the SRM.
•Propose improved northern goshawk optimization with strong search ability and fast convergence to optimize the objective function of the model, achieving an increase of 2.975% in efficiency.
Six-degree-of-freedom (6-DOF) parallel mechanisms driven by switched reluctance motors (SRMs) can realize flexible control with high precision. Efficiency is an important indicator for evaluating the speed control system of SRMs. Many characteristic factors affect efficiency, and there are strong nonlinear relationships between the characteristic parameters, which makes it difficult for analytical models and traditional neural network models to express their spatial correlation. For this reason, a convolutional neural network (CNN)-bidirectional long short-term memory network (BiLSTM) efficiency regression prediction model that integrates the SENet attention mechanism (CNN-BiLSTM-SENet) is proposed. First, a formula for sensitivity analysis is customized to screen the characteristic parameters of efficiency. Second, the CNN-BiLSTM-SENet model is built and the sparrow search algorithm (SSA) is used for hyperparameter optimization. Data are input to the CNN to extract high-dimensional feature vectors that reflect the complex changing relationships between features and efficiency while establishing feature channels. The SENet is embedded to adaptively perceive the feature channels and assign them different weights, enhancing the influence of key features. The feature vectors output by the front-end network are then input to the BiLSTM, which bidirectionally learns the coupling relationships between sequences and completes the regression prediction. Finally, an improved northern goshawk optimization (MNGO) is proposed to solve the regression model and obtain the maximum efficiency and the corresponding characteristic parameters. The results show that the SSA-optimized CNN-BiLSTM-SENet model has higher fitting accuracy and better prediction performance for efficiency regression, and that the MNGO has stronger search ability and faster convergence.
A micro-expression is a facial feature that reflects the true emotional state a person is trying to hide. Most existing micro-expression recognition methods are based on manual feature extraction of subtle facial muscle movements. Because of their short duration and weak intensity, accurate identification of micro-expressions remains a challenging task. This paper investigates micro-expression recognition based on deep learning methods and proposes a three-dimensional SE-DenseNet architecture, which fuses Squeeze-and-Excitation Networks with a 3D DenseNet and can automatically integrate the spatiotemporal features extracted from each video to increase the weight of valid feature maps. The proposed architecture first obtains the apex frames from each video, where facial muscle movements are most obvious, and then amplifies the facial muscle movements using Euler video magnification to significantly alleviate the issues of small sample size and weak intensity in micro-expression recognition. Finally, the pre-processed videos are fed into the 3D SE-DenseNet for further feature extraction and micro-expression classification. Experiments are performed on three public datasets. Our best model obtains an overall accuracy of 95.12%, 92.96%, and 82.74% on the SMIC, CAS(ME)2, and CASME-II datasets, respectively. The experimental results show that the proposed methods capture the fine details of micro-expressions well and outperform most state-of-the-art methods on the three public datasets.
•Appropriate preprocessing promotes the extraction of micro-expression features.
•The three-dimensional DenseNet can extract facial features deeply.
•SE block combined with DenseNet can facilitate feature extraction.
•Different SE block combination methods significantly affect the recognition rate.