To address the insufficient samples and weak fault features of audio signals in plunger pump fault diagnosis, this paper proposes a plunger pump fault diagnosis method based on audio signals combined with meta-transfer learning (MTL-PAFD). The method uses audio signals of the plunger pump, acquired by a single sensor, as samples. Gammatone filter bank processing improves the representation ability of the audio signal under strong noise interference. Meta-transfer learning then enables few-shot fault diagnosis of the plunger pump. In addition, to meet the practical needs of plunger pump fault diagnosis, the test procedure of meta-transfer learning is modified so that it can adaptively handle unknown fault classes. Experimental results show that MTL-PAFD achieves a fault diagnosis accuracy of 91.41% for seen classes. After fast adaptive learning, it achieves an accuracy of 89.64% on unseen fault classes.
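As an illustration of the Gammatone filter bank step, the following is a minimal sketch and not the authors' implementation. It assumes the textbook fourth-order gammatone impulse response with the Glasberg and Moore ERB bandwidth; the function names, the 50 ms filter length, and the per-band energy feature are hypothetical choices made for the example.

```python
import math

def erb(fc_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)

def gammatone_ir(fc_hz, fs_hz, duration_s=0.05, order=4):
    """Sampled impulse response t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t), peak-normalized."""
    b = 1.019 * erb(fc_hz)  # bandwidth parameter (assumed textbook value)
    n_samples = int(duration_s * fs_hz)
    ir = []
    for k in range(n_samples):
        t = k / fs_hz
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        ir.append(envelope * math.cos(2.0 * math.pi * fc_hz * t))
    peak = max(abs(v) for v in ir) or 1.0
    return [v / peak for v in ir]

def convolve(x, h):
    """Direct full linear convolution of two sequences."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xv in enumerate(x):
        for j, hv in enumerate(h):
            out[i + j] += xv * hv
    return out

def band_energies(signal, fs_hz, centre_freqs_hz):
    """Mean output energy of each gammatone channel (hypothetical feature)."""
    energies = []
    for fc in centre_freqs_hz:
        filtered = convolve(signal, gammatone_ir(fc, fs_hz))
        energies.append(sum(v * v for v in filtered) / len(filtered))
    return energies

def sine_tone(freq_hz, fs_hz, n_samples):
    """Helper for the usage example: a unit-amplitude sine wave."""
    return [math.sin(2.0 * math.pi * freq_hz * k / fs_hz) for k in range(n_samples)]
```

For example, `band_energies(sine_tone(1000.0, 8000, 800), 8000, [250.0, 1000.0, 3000.0])` concentrates the energy in the 1000 Hz channel.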
In addition to being extremely non-linear, modern problems require millions if not billions of parameters to solve, or at least to approximate well, and neural networks are known to absorb that complexity by deepening and widening their topology to increase the level of non-linearity needed for a better approximation. However, compact topologies are always preferred to deeper ones because they use fewer computational units and fewer parameters. This compactness comes at the price of reduced non-linearity and thus a limited solution search space. We propose the 1-Dimensional Polynomial Neural Network (1DPNN) model, which uses automatic polynomial kernel estimation for 1-Dimensional Convolutional Neural Networks (1DCNNs) and introduces a high degree of non-linearity from the first layer, which can compensate for the need for deep and/or wide topologies. We show that this non-linearity enables the model to yield better results with less computational and spatial complexity than a regular 1DCNN on various classification and regression problems related to audio signals, even though it introduces more computational and spatial complexity at the neuronal level. The experiments were conducted on three publicly available datasets and demonstrate that, on the problems tackled, the proposed model can extract more relevant information from the data than a 1DCNN in less time and with less memory.
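The idea of a polynomial kernel inside a 1D convolution can be sketched as follows. This is an assumption about the general form, not the paper's exact formulation: each neuron is taken to sum, over degrees d = 1..D, a convolution of the element-wise power x^d with its own kernel w_d. The names `conv1d_valid` and `poly_conv1d` are hypothetical.

```python
def conv1d_valid(x, w):
    """Plain 'valid' 1D cross-correlation, as used in CNN layers."""
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k)) for i in range(len(x) - k + 1)]

def poly_conv1d(x, weights_by_degree, bias=0.0):
    """Assumed polynomial neuron: bias + sum over d of conv(x**d, w_d).

    weights_by_degree[0] is the kernel for degree 1, weights_by_degree[1]
    for degree 2, and so on; all kernels must share one length.
    """
    out_len = len(x) - len(weights_by_degree[0]) + 1
    out = [bias] * out_len
    for degree, kernel in enumerate(weights_by_degree, start=1):
        powered = [v ** degree for v in x]  # element-wise power of the input
        for i, v in enumerate(conv1d_valid(powered, kernel)):
            out[i] += v
    return out
```

With a single degree, `poly_conv1d` reduces to an ordinary convolution, so the extra kernels are what add non-linearity per neuron, at the cost of D times as many weights in that neuron.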
Tool wear is an important parameter in machining because production, cost, and performance depend heavily on it. Monitoring cutting tool wear therefore plays an important role in mechanical machining processes. With this aim, the present work applies a novel ensemble deep learning model to cutting tool wear monitoring using audio sensors. The tool wear data during machining were extracted with an audio denoising technique that combines the Fast Fourier Transform (FFT), bandpass filters, and dependent component analysis (DCA). Then, the ensemble convolutional neural network (CNN) detection model was trained, with the audio signals converted into audio images using different algorithms. Finally, the results confirm that this novel method predicts tool wear values with high accuracy under different cutting conditions.
• An audio-based tool wear monitoring method is proposed.
• A new denoising model is developed for the audio signals.
• An ensemble deep learning model is developed for tool wear degree identification.
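The FFT-plus-bandpass part of the denoising step can be illustrated with a minimal sketch. It is not the authors' denoising model: it assumes an ideal frequency-domain mask and uses a direct O(n^2) discrete Fourier transform for self-containment where a real pipeline would use an FFT routine. The function names and the pass band are hypothetical.

```python
import cmath
import math

def dft(x):
    """Direct discrete Fourier transform (O(n^2), for illustration only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft_real(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def bandpass(x, fs_hz, low_hz, high_hz):
    """Zero every frequency bin outside [low_hz, high_hz] and transform back."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        # Bins above n/2 are the mirrored negative frequencies.
        freq = min(k, n - k) * fs_hz / n
        if not (low_hz <= freq <= high_hz):
            spectrum[k] = 0
    return idft_real(spectrum)

def sine(freq_hz, fs_hz, n_samples):
    """Helper for the usage example: a unit-amplitude sine wave."""
    return [math.sin(2.0 * math.pi * freq_hz * t / fs_hz) for t in range(n_samples)]
```

For instance, with `fs_hz=64` and 64 samples, passing the sum of a 4 Hz and a 20 Hz sine through `bandpass(x, 64, 2, 8)` returns the 4 Hz component alone, up to rounding error.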
Existing broiler health monitoring technology suffers from low automation, unstable monitoring results, and low practical value, making it difficult to provide timely and reliable broiler health monitoring results. Broiler sound signals can provide feedback on the birds' health. A widely validated practice is to analyze the frequency of coughs in a segment of broiler sound signal to determine the health of the broiler group. Based on this, the authors propose a new broiler health monitoring technology based on sound detection. The broiler health monitoring problem is transformed into a multi-classification problem, which can be solved by identifying the sound types in broiler sound signals. Specifically, an audio signal collection system was designed to perform signal collection and preliminary signal filtering. Wiener filtering was used for deep signal filtering. Sixty-dimensional sound features with good performance were extracted from three aspects (time-frequency domain, Mel-Frequency Cepstral Coefficients, and sparse representation), and a preliminary data set was created. Min-max normalization was used to align the numerical distribution of the data set, yielding a high-quality data set. Multi-classification models based on different classification algorithms and neural networks were trained; Random Forest performed best, and after parameter optimization the optimal multi-classification model achieved a classification accuracy of 91.14%. A visualization platform was built to process the classification results of the multi-classification model, performing majority voting and cough rate calculation and thereby achieving broiler health monitoring. In addition, the definitions of cough rate and prediction accuracy were newly proposed.
A large number of experiments have verified the feasibility of the broiler health monitoring technology proposed in this paper, with an average prediction accuracy of 98.97% achieved.
• Propose a complete broiler health monitoring technology based on sound detection.
• Transform the broiler health monitoring problem into a sound type identification problem.
• Propose a new index, the cough rate, to evaluate the health of broiler groups.
• Propose a new data quality improvement scheme.
• Obtain the highest prediction accuracy currently reported for broiler health monitoring.
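The normalization and post-processing steps named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not give the paper's definition of cough rate, so the one below (the fraction of voted labels that are coughs) is hypothetical, as are the function names, the label strings, and the window length.

```python
from collections import Counter

def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [max(c) - min(c) for c in columns]
    return [[(v - lo) / span if span else 0.0
             for v, lo, span in zip(row, lows, spans)]
            for row in rows]

def majority_vote(labels, window=3):
    """Replace each non-overlapping window of labels by its most common label."""
    voted = []
    for start in range(0, len(labels), window):
        chunk = labels[start:start + window]
        voted.append(Counter(chunk).most_common(1)[0][0])
    return voted

def cough_rate(labels, cough_label="cough"):
    """Hypothetical definition: share of labels equal to cough_label."""
    return labels.count(cough_label) / len(labels) if labels else 0.0
```

A typical use would be to normalize the feature rows before training the classifier, then apply `majority_vote` to its per-frame predictions and `cough_rate` to the voted sequence.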
Speaker verification models have achieved good results on single-genre data, but performance degrades when model training and testing are not in the same domain. Adversarial training has been proposed to solve this problem by minimizing domain distribution differences. However, adversarial training ignores domain-specific information in pursuit of domain-invariant speaker representations. In this paper, an improved collaborative adversarial network for domain adaptation in speaker verification is proposed. Compared to adversarial training, a collaborative discriminator is newly incorporated that learns domain-specific information at the lower layers. Further, a projection block is added to the collaborative discriminator to reduce the noise the discriminator introduces. Experiments are conducted in different mismatch scenarios and with different speaker encoders. All the experimental results show that this method performs better than the baseline and previous work using adversarial training.
This work can extract better speaker representations that are both domain-invariant and domain-specific. The proposed collaborative discriminator enables the speaker encoder to learn domain-specific information, which is beneficial for adversarial training. Further, the projection block is designed to reduce the noise introduced by the collaborative discriminator.
Without doubt, there is emotional information in almost any kind of sound received by humans every day: be it the affective state of a person transmitted by means of speech; the emotion intended by a composer while writing a musical piece, or conveyed by a musician while performing it; or the affective state connected to an acoustic event occurring in the environment, in the soundtrack of a movie, or in a radio play. In the field of affective computing, there is currently some loosely connected research concerning each of these phenomena, but a holistic computational model of affect in sound is still lacking. In turn, for tomorrow's pervasive technical systems, including affective companions and robots, it is expected to be highly beneficial to understand the affective dimensions of "the sound that something makes," in order to evaluate the system's auditory environment and its own audio output. This article aims at a first step toward a holistic computational model: starting from standard acoustic feature extraction schemes in the domains of speech, music, and sound analysis, we interpret the worth of individual features across these three domains, considering four audio databases with observer annotations in the arousal and valence dimensions. In the results, we find that by selecting appropriate descriptors, cross-domain arousal and valence regression is feasible, achieving significant correlations with the observer annotations of up to 0.78 for arousal (training on sound and testing on enacted speech) and 0.60 for valence (training on enacted speech and testing on music). The high degree of cross-domain consistency in encoding the two main dimensions of affect may be attributable to the co-evolution of speech and music from multimodal affect bursts, including the integration of nature sounds for expressive effects.
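The reported correlations compare regression outputs with observer annotations. The abstract does not say which correlation measure was used; the sketch below assumes the Pearson correlation coefficient, and the function name and sample values are hypothetical.

```python
import math

def pearson(predictions, annotations):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predictions) != len(annotations) or len(predictions) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(predictions) / len(predictions)
    mean_a = sum(annotations) / len(annotations)
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predictions, annotations))
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    var_a = sum((a - mean_a) ** 2 for a in annotations)
    if var_p == 0 or var_a == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_a)
```

In a cross-domain setting, `predictions` would come from a regressor trained on one domain (for example sound) and `annotations` from the observers' ratings in another (for example enacted speech).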
Tool wear in machining can result in poor surface finish, excessive vibration, and increased energy consumption. Monitoring tool wear in real time is crucial to improving manufacturing productivity and quality. While numerous sensor-based tool wear monitoring techniques have been demonstrated in laboratory environments, few tool wear monitoring systems have been deployed in factories because it is not realistic to install some of the important sensors, such as dynamometers, on manufacturing machines. To address this issue, a novel audio signal processing approach is introduced. This technique requires only audio sensors rather than expensive sensors. A blind source separation method is used to separate source signals from noise. An extended principal component analysis is used for dimensionality reduction. Real-time multi-channel audio signals are collected during a set of milling tests under varying cutting conditions. The experimental data are used to develop and validate a predictive model. Experimental results show that the predictive model is capable of classifying tool wear conditions with high accuracy.
The indexable insert drill, commonly known as the U drill, holds a significant market share of approximately 53% among drilling tools. It is therefore the subject of investigation and improvement studies in both industry and academia. U drills are usually produced in different length/diameter ratios and with two coolant holes. In some cases, however, manufacturers add a third coolant hole in the chip evacuation channel, where the central insert of the U drill is located. The coolant holes and length/diameter ratios are thought to change the conditions of the drilling process. In this study, the impact of U drills with various attributes was examined using thrust force, torque, spindle load, and audio signals. For this purpose, Al 7075-T651 aluminum alloy was drilled with four different U drills. The trials used three feed rates (0.06, 0.09, and 0.12 mm/rev) and three cutting speeds (200, 250, and 300 m/min). Experimental results show that the length/diameter ratio of U drills has the highest impact on thrust force (76.45%), spindle load (53.33%), and audio signal (87.53%). However, according to the ANOVA analysis, it had no significant effect on torque. Moreover, the U drill with an additional coolant hole generates higher thrust forces (39.89%) and audio signals (95.17%), lower spindle loads (41.28%), and lower torque (3.26%). Taguchi-based grey relational analysis, used to optimize the test parameters, provided an improvement of 26.06% according to the grey relational grade, relative to the method normally used. To sum up, these findings may contribute to improving the design and production of U drills to enhance their drilling performance.
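The grey relational grade used in the optimization can be illustrated with the standard grey relational analysis procedure. This is a minimal sketch, not the study's calculation: it assumes the common distinguishing coefficient of 0.5 and equal weighting of the responses, and the function names and sample values are hypothetical.

```python
def normalize(column, larger_is_better):
    """Map one response column to [0, 1], where 1 is the ideal value."""
    lo, hi = min(column), max(column)
    span = hi - lo
    if span == 0:
        return [1.0] * len(column)
    if larger_is_better:
        return [(v - lo) / span for v in column]
    return [(hi - v) / span for v in column]

def grey_relational_grades(responses, larger_is_better, zeta=0.5):
    """Grey relational grade of each trial.

    responses: one list of response values per trial.
    larger_is_better: one flag per response column.
    zeta: distinguishing coefficient (0.5 is the usual assumed default).
    """
    columns = [normalize(list(col), flag)
               for col, flag in zip(zip(*responses), larger_is_better)]
    # Deviation of each normalized value from the ideal value 1.
    deltas = [[1.0 - v for v in col] for col in columns]
    d_min = min(min(col) for col in deltas)
    d_max = max(max(col) for col in deltas)
    if d_max == 0:
        return [1.0] * len(responses)
    coefficients = [[(d_min + zeta * d_max) / (d + zeta * d_max) for d in col]
                    for col in deltas]
    # Equal-weight mean of the coefficients across responses.
    return [sum(col[i] for col in coefficients) / len(coefficients)
            for i in range(len(responses))]
```

The trial with the highest grade is taken as the best compromise across all responses; for example, thrust force and torque would both be treated as smaller-is-better columns.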
In the traditional network media education system, there is little content about skill performance, and the malleability is poor. With the spread of Internet technology, online teaching videos of performance skills began to be promoted on a large scale, and with the help of the Internet, the learning of online performance skills has been updated. The aim of this study is to develop a performance skills teaching system based on machine vision and sensor audio signal processing technology to provide a simulated performance environment and personalized instruction. In this study, multiple sensors were used to capture various movement data during students' performances. Through sensor technology, the system captures information about students' posture, finger position, and force in real time, and analyzes posture and movement in combination with machine vision technology. The system can also process and analyze the audio signals in the students' performance to evaluate the accuracy and fluency of the playing technique. The application of sensor technology enables the system to accurately capture students' movements and provide timely feedback. The use of machine vision technology allows the system to accurately analyze the student's posture and movements to provide more targeted guidance. The application of audio signal processing algorithms enables the system to objectively evaluate the students' playing skills and help them improve accuracy and fluency in the playing process.