Hand Gesture Recognition (HGR) using Frequency Modulated Continuous Wave (FMCW) radars is difficult because of the inherent variability and ambiguity caused by individual habits and environmental differences. This paper proposes a deformable dual-stream fusion network based on CNN-TCN (DDF-CT) to solve this problem. First, we extract range, Doppler, and angle information from radar signals with the Fast Fourier Transform to produce range-time maps (RTMs) and range-angle maps (RAMs). Then, we reduce the noise of the feature maps. Subsequently, the RAM sequence (RAMS) is generated by temporally organizing the RAMs, which captures a target’s range and velocity characteristics at each time point while preserving the temporal feature information. To improve the accuracy and consistency of gesture recognition, DDF-CT incorporates deformable convolution and inter-frame attention mechanisms, which enhance the extraction of spatial features and the learning of temporal relationships. The experimental results show that our method achieves an accuracy of 98.61%, and that it still achieves 97.22% when tested in a novel environment. Owing to this robustness, our method outperforms other existing HGR approaches.
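The range-extraction step can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes an ideal, noise-free complex beat signal from a single static target, uses a naive DFT in place of an FFT library, and the chirp slope, sample rate, and sample count are hypothetical values chosen so that the target falls exactly on one range bin. It relies only on the standard FMCW relation R = c * f_b / (2 * S), where f_b is the beat frequency and S the chirp slope.

```python
import cmath
import math

C = 3.0e8  # speed of light in m/s (rounded)


def dft(samples):
    """Naive O(N^2) discrete Fourier transform; adequate for a short example."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def beat_signal(target_range_m, slope_hz_per_s, sample_rate_hz, n_samples):
    """Ideal complex beat signal of one static target (no noise, no Doppler)."""
    beat_freq = 2.0 * target_range_m * slope_hz_per_s / C
    return [
        cmath.exp(2j * math.pi * beat_freq * t / sample_rate_hz)
        for t in range(n_samples)
    ]


def estimate_range(samples, slope_hz_per_s, sample_rate_hz):
    """Find the strongest range bin and convert its beat frequency to metres."""
    magnitudes = [abs(x) for x in dft(samples)]
    peak_bin = max(range(len(samples)), key=lambda k: magnitudes[k])
    beat_freq = peak_bin * sample_rate_hz / len(samples)
    return C * beat_freq / (2.0 * slope_hz_per_s)


# Hypothetical radar settings: one range bin spans 2.34375 m.
SLOPE = 1.0e12  # chirp slope in Hz/s
FS = 1.0e6      # ADC sample rate in Hz
N = 64          # samples per chirp

chirp = beat_signal(23.4375, SLOPE, FS, N)  # target placed on bin 10
estimated_range = estimate_range(chirp, SLOPE, FS)
print(round(estimated_range, 4))
```

Stacking such range profiles over successive chirps gives a range-time view; angle and Doppler estimation need further transforms across antennas and chirps, which this sketch omits.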
A powerful machine learning detector based on the k-nearest neighbors (KNN) algorithm is proposed to overcome system impairments. The zero-dispersion link (ZDL), dispersion managed link (DML), and dispersion unmanaged link (DUL) are considered. In addition, an improved algorithm, the distance-weighted KNN, is introduced, which outperforms the conventional maximum likelihood-post compensation approach. The numerical results show that KNN is feasible for overcoming various impairments, especially non-Gaussian symmetric noise such as laser phase noise and nonlinear phase noise in the ZDL or DML.
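The idea behind distance-weighted KNN can be sketched as follows. This is an illustration under assumptions, not the detector from the abstract: the inverse-distance weighting, the value of k, and the toy two-dimensional (in-phase, quadrature) samples with labels "s0" and "s1" are all hypothetical.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train_points, train_labels, query, k=3, eps=1e-12):
    """Classify `query` by inverse-distance-weighted voting among its k nearest neighbors."""
    neighbors = sorted(
        ((math.dist(point, query), label)
         for point, label in zip(train_points, train_labels)),
        key=lambda pair: pair[0],
    )
    votes = defaultdict(float)
    for distance, label in neighbors[:k]:
        # Closer neighbors contribute more; eps avoids division by zero.
        votes[label] += 1.0 / (distance + eps)
    return max(votes, key=votes.get)


# Toy received samples: one close "s0" neighbor and two distant "s1" neighbors.
train_points = [(0.2, 0.2), (1.5, 1.5), (-1.4, 1.5)]
train_labels = ["s0", "s1", "s1"]
predicted = weighted_knn_predict(train_points, train_labels, (0.1, 0.1), k=3)
print(predicted)
```

With plain majority voting and k = 3, the two distant "s1" samples would outvote the single nearby "s0" sample; weighting by inverse distance lets the nearby sample dominate.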
Many epidemics, including COVID-19 and SARS, have swept the world quickly and claimed a large number of lives. Because such viruses spread rapidly and are hard to detect, it is difficult to track down individuals with mild or no symptoms using limited human resources. Building a low-cost, real-time epidemic early warning system that identifies individuals who have been in contact with infected individuals and determines whether they need to be quarantined is an effective means of mitigating the spread of an epidemic. In this paper, we propose a smartphone-based zero-effort epidemic warning method for mitigating epidemic propagation. First, we recognize voice activity relevant to epidemic spread using a hierarchical attention mechanism and a temporal convolutional network. Subsequently, we estimate the social distance between users through the smartphone's built-in sensors. Furthermore, we combine Wi-Fi network logs and social distance to judge whether there is spatiotemporal contact between users and to determine the duration of contact. Finally, we estimate infection risk based on epidemic-related voice activity, social distance, and contact time. We conduct a large number of well-designed experiments in typical scenarios to verify the proposed method. The proposed method does not rely on any additional infrastructure or historical training data, which is conducive to integration with epidemic prevention and control systems and to large-scale application.
Facial expression is the main medium of information transmission in human communication and plays an important role in daily life. Facial expression recognition remains challenging because of variations in occlusion, illumination, and posture. Most existing works focus on deeper or wider network structures and rarely explore high-level feature statistics. In this paper, we propose a second-order pooling convolutional neural network to explore the correlation information between facial features after deep network learning. At the final stage of the network, we add a new covariance pooling layer to replace the first-order pooling of standard convolutional networks. In the covariance pooling layer, the Newton iteration method is used to approximate the matrix square root instead of eigendecomposition (EIG) or singular value decomposition (SVD), which makes it more suitable for GPUs. Because facial expression data are scarce, this paper uses different data augmentation methods to increase the amount of training data and improve the generalization ability of the model. The proposed method, data augmentation and second-order pooling (DASOP), was evaluated on the real-world affective faces database (RAFDB) and the static facial expressions in the wild (SFEW) dataset, yielding accuracies of 88.625% and 59.518%, respectively. This performance is superior to that of existing methods.
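An iterative matrix square root of the kind mentioned above can be sketched with the coupled Newton-Schulz iteration, which uses only matrix multiplications and therefore maps well onto GPUs. Whether the paper uses exactly this variant is an assumption; the trace normalization, the iteration count, and the 2 x 2 example covariance matrix are likewise illustrative choices.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def newton_schulz_sqrt(matrix, iterations=15):
    """Approximate the square root of a symmetric positive definite matrix.

    Coupled Newton-Schulz iteration: Y converges to sqrt(A / trace(A)) and
    Z to its inverse; the result is rescaled by sqrt(trace(A)) at the end.
    """
    n = len(matrix)
    trace = sum(matrix[i][i] for i in range(n))
    # Pre-normalize so that all eigenvalues lie in (0, 1], which the iteration needs.
    y = [[value / trace for value in row] for row in matrix]
    z = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(iterations):
        zy = matmul(z, y)
        t = [
            [0.5 * ((3.0 if i == j else 0.0) - zy[i][j]) for j in range(n)]
            for i in range(n)
        ]
        y = matmul(y, t)
        z = matmul(t, z)
    scale = trace ** 0.5
    return [[value * scale for value in row] for row in y]


# Hypothetical covariance matrix of two feature channels.
cov = [[4.0, 1.0], [1.0, 3.0]]
root = newton_schulz_sqrt(cov)
reconstructed = matmul(root, root)  # should be close to cov
print(reconstructed)
```

Squaring the result and comparing it with the input is a simple way to check convergence for a given iteration count.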
Next‐generation 6G networks will fully drive the development of the industrial Internet of Things. Steel surface defect detection, an important application in the industrial Internet of Things that is closely related to the quality of industrial products, has recently received increasing attention from the military industry, the aviation industry and other fields. However, many typical convolutional neural network‐based methods are insensitive to unclear defect boundaries. In this article, the authors develop a region‐based fully convolutional network with deformable convolution and attention fusion to adaptively learn salient features for steel surface defect detection. Specifically, deformable convolution selectively replaces the standard convolution in the backbone of the region‐based fully convolutional network, which performs well in scenarios with unclear defect boundaries. Moreover, a convolutional block attention module is utilised in the region proposal network to further enhance detection accuracy. The proposed architecture is evaluated through extensive experiments on two popular steel defect detection benchmarks, NEU‐DET and GC10‐DET. The mean average precision on the two datasets reaches 80.9% and 66.2%, respectively. The average precision for the defect classes crazing, inclusion, patches, pitted‐surface, rolled‐in scale and scratches on NEU‐DET is 58.2%, 82.3%, 95.7%, 85.6%, 75.9%, and 87.9%, respectively.
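As a quick consistency check, the six per-class average precision values quoted for NEU‐DET can be averaged. The sketch assumes that mean average precision is the unweighted mean over classes, which is the usual convention; the variable names are illustrative.

```python
# Per-class average precision (%) on NEU-DET, as quoted in the abstract.
neu_det_ap = {
    "crazing": 58.2,
    "inclusion": 82.3,
    "patches": 95.7,
    "pitted-surface": 85.6,
    "rolled-in scale": 75.9,
    "scratches": 87.9,
}

# Unweighted mean over the six defect classes.
mean_ap = sum(neu_det_ap.values()) / len(neu_det_ap)
print(round(mean_ap, 1))
```

The six values sum to 485.6, so their mean is about 80.93, which agrees with the reported 80.9% after rounding.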
Next‐generation 6G networks will fully drive the development of the industrial Internet of Things (IIoT). Steel surface defect detection, an important application in IIoT, has recently received increasing attention from the military industry, the aviation industry and other fields. We develop a region‐based fully convolutional network with deformable convolution and attention fusion (DCA_RFCN) to adaptively learn salient features for steel surface defect detection.
The inertial navigation system (INS), which is frequently used in emergency rescue operations and similar situations, does not rely on infrastructure and offers high positioning frequency and strong real-time performance. However, intricate and unpredictable pedestrian motion patterns cause the INS localization error to diverge significantly over time. This paper aims to enhance the accuracy of zero-velocity interval (ZVI) detection and reduce the heading and altitude drift of foot-mounted INS via deep learning and an equation constraint between the two feet. To address the observational noise of low-cost inertial sensors, we utilize a denoising autoencoder to automatically remove the inherent noise. To address the obvious displacement error caused by inaccurate ZVI detection, we propose a sample-level ZVI detection algorithm based on the U-Net neural network, which effectively solves the mislabeling problem caused by sliding windows. To address the inability of the Zero-Velocity Update (ZUPT) to suppress heading and altitude error, we propose a bipedal INS method based on the equation constraint and an ellipsoid constraint, which uses the foot-to-foot distance as a new observation to correct heading and altitude error. We conduct extensive and well-designed experiments to evaluate the performance of the proposed method. The experimental results indicate that the position error of the proposed method does not exceed 0.83% of the total traveled distance.
Person re-identification aims to retrieve a given pedestrian across different cameras. It remains a challenging task for intelligent visual surveillance systems because of similar appearances and variations in camera shooting angle, scene illumination, and pedestrian pose. In this paper, we propose a novel two-stream network named the spatial segmentation network, which learns both global and local features in a unified framework for nonaligned person re-identification. One stream focuses on spatial feature learning using global adaptive average pooling in deep convolutional neural networks. The other stream learns fine local features by adopting horizontal average pooling, with a part division that does not depend on a pose predictor. To rank the importance of all features, we also report the performance of every part feature and of the global features. The proposed method achieves 94.51% Rank-1 and 90.78% mAP on Market-1501, 87.52% Rank-1 and 84.82% mAP on DukeMTMC-re-ID, and 69.71% Rank-1 and 71.67% mAP on CUHK03-detected; these findings verify the state-of-the-art performance of the proposed method.
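The difference between the two pooling styles can be sketched on a single-channel feature map. This is an illustration only: the number of stripes, the equal-height split, and the toy 6 x 2 map are assumptions, and a real network would pool every channel of a learned feature tensor.

```python
def global_average_pool(feature_map):
    """Average every value of an H x W feature map into one number."""
    values = [value for row in feature_map for value in row]
    return sum(values) / len(values)


def horizontal_stripe_pool(feature_map, num_stripes):
    """Split an H x W map into equal-height horizontal stripes and average each stripe."""
    height = len(feature_map)
    if height % num_stripes != 0:
        raise ValueError("height must be divisible by num_stripes")
    stripe_height = height // num_stripes
    return [
        global_average_pool(feature_map[s * stripe_height:(s + 1) * stripe_height])
        for s in range(num_stripes)
    ]


# Toy map whose rows differ by body region (for example head, torso, legs).
feature_map = [
    [1.0, 1.0],
    [1.0, 1.0],
    [2.0, 2.0],
    [2.0, 2.0],
    [6.0, 6.0],
    [6.0, 6.0],
]
global_feature = global_average_pool(feature_map)
local_features = horizontal_stripe_pool(feature_map, num_stripes=3)
print(global_feature, local_features)
```

The global feature collapses the whole map into one value, while the stripe features keep one value per horizontal part, which is what lets a local stream distinguish regions without a pose predictor.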
Intelligent manufacturing is a challenging and compelling topic in Industry 4.0. Many computer vision (CV)-based applications have attracted widespread interest from researchers and industries around the world. However, it is difficult to integrate visual recognition algorithms with industrial control systems. The low-level devices are controlled by traditional programmable logic controllers (PLCs) that cannot exchange data because they use different industrial control protocols. In this article, we develop a multi-crane visual sorting system with cloud PLCs in a 5G environment, in which deep convolutional neural network (CNN)-based character recognition and dynamic scheduling are designed for materials in intelligent manufacturing. First, a YOLOv5-based algorithm is applied to locate objects on the conveyor belt. Then, we propose a Chinese character recognition network (CCRNet) to recognize each object from the original image. The position, type, and timestamp of each object are sent to cloud PLCs, which are virtualized in the cloud to replace the function of traditional PLCs in the terminal. After that, we propose a dynamic scheduling method to sort the materials in minimum time. Finally, we establish a real experimental platform of a multi-crane visual sorting system to verify the performance of the proposed methods.
An intelligent eye-diagram analyzer is proposed to implement both modulation format recognition (MFR) and optical signal-to-noise ratio (OSNR) estimation by using a convolutional neural network (CNN)-based deep learning technique. With its capacity for feature extraction and self-learning, a CNN can process an eye diagram in its raw form (the pixel values of an image) from the perspective of image processing, without knowing other eye-diagram parameters or the original bit information. The eye-diagram images of four commonly used modulation formats over a wide OSNR range (10 to 25 dB) are obtained from an eye-diagram generation module in an oscilloscope combined with the simulation system. Compared with four other machine learning algorithms (decision tree, k-nearest neighbors, back-propagation artificial neural network, and support vector machine), the CNN achieves higher accuracies. The accuracies of OSNR estimation and MFR both reach 100%. The proposed technique has the potential to be embedded in test instruments to perform intelligent signal analysis or to be applied to optical performance monitoring.
Facial expressions, which contain rich behavioral information, are the primary vehicle for expressing emotions. Analyzing people's emotions by computer is important for achieving human-computer interaction. Feature extraction is the most important factor affecting recognition performance. However, existing deep learning methods for expression recognition are mainly based on global feature extraction, whereas local feature extraction provides more fine-grained information than global features. To strengthen the local discrimination of the image and pay more attention to small targets in local regions, we propose an innovative Adaptive Weight Based on Overlapping Blocks Network (AWOBNet) for learning feature representations. First, we spatially overlay the feature maps to obtain the local features of the face. Considering the correlation and proportion between different features, we model the correlation between feature channels after overlapping blocks. Moreover, a new adaptive weighting method is developed to enhance significant features. We evaluate the proposed network on two public datasets, the Real-World Affective Faces Database (RAFDB) and the Static Facial Expressions in the Wild (SFEW), and illustrate its performance using a visualization method. Our method achieves accuracies of 89.863% on RAFDB and 62.410% on SFEW, which are significantly higher than those of existing methods.
• Adaptive Weight Based on Overlapping Blocks Network for Facial Expression Recognition.
• Feature map blocking to extract local features of the network to improve performance.
• The relevant information between local features to achieve effective performance.
• Performance of the local network on facial expressions affected by occlusion and posture.
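The feature map blocking mentioned in the highlights can be pictured with a minimal sketch of splitting a two-dimensional map into overlapping square blocks. The block size, the stride, and the toy 4 x 4 map are hypothetical; the abstract does not specify how AWOBNet forms its blocks, so this only illustrates the general idea that a stride smaller than the block size makes neighboring blocks share values.

```python
def overlapping_blocks(feature_map, block, stride):
    """Return all block x block windows of an H x W map, stepping by `stride`."""
    height, width = len(feature_map), len(feature_map[0])
    blocks = []
    for top in range(0, height - block + 1, stride):
        for left in range(0, width - block + 1, stride):
            blocks.append(
                [row[left:left + block] for row in feature_map[top:top + block]]
            )
    return blocks


# Toy 4 x 4 map filled with 0..15 so that shared values are easy to spot.
feature_map = [[4 * r + c for c in range(4)] for r in range(4)]
blocks = overlapping_blocks(feature_map, block=2, stride=1)
print(len(blocks), blocks[0], blocks[1])
```

With a block size of 2 and a stride of 1, adjacent blocks share one column or row; setting the stride equal to the block size would give non-overlapping blocks instead.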