Interest in additive manufacturing (AM) has grown recently because of its potential to reduce costs and lighten manufactured parts. However, materials produced through AM are prone to defects that can significantly impair their fatigue resistance. Identifying fatigue failure sources is crucial for characterizing critical manufacturing defects, especially for the future use of AM in main load-bearing structural parts. This currently requires conducting fatigue tests and manually inspecting fracture surfaces. In this research, we introduce a machine-learning model designed to detect the initiation defects that cause fatigue cracks in Ti-6Al-4V titanium samples manufactured by selective laser melting (SLM). The model also measures the distance between the detected fatigue failure source and the surface of the material. Our approach first segments out areas without initiation points and then identifies these points in the remaining areas. We then use established computer vision techniques to calculate their distance from the surface. The results highlight the significant potential of machine learning and computer vision to automate fractographic analysis, which could greatly improve the speed and efficiency of this process. This research contributes a new method to artificial intelligence and may also have important applications in engineering.
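The distance measurement described above can be illustrated with a minimal sketch. The abstract does not name the computer vision technique that was used, so the brute-force nearest-background search below is an assumption, as are the function name `distance_to_surface`, the toy mask, and the pixel size.

```python
import math

def distance_to_surface(mask, point):
    """Euclidean distance (in pixels) from `point` to the nearest background
    pixel of a binary mask (1 = material, 0 = background)."""
    row, col = point
    if mask[row][col] == 0:
        return 0.0  # the point already lies outside the material
    best = math.inf
    for r, line in enumerate(mask):
        for c, value in enumerate(line):
            if value == 0:
                best = min(best, math.hypot(r - row, c - col))
    return best

# Hypothetical 5x6 fracture-surface mask: a block of material inside background.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
initiation_point = (2, 2)   # hypothetical detected initiation defect (row, col)
pixel_size_um = 5.0         # hypothetical image scale, micrometres per pixel

distance_px = distance_to_surface(mask, initiation_point)
distance_um = distance_px * pixel_size_um
print(distance_px, distance_um)
```

On real fractographs a distance transform over the segmented sample outline would likely be faster than this exhaustive search; the sketch only shows the quantity being measured.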
This study explores the construction, characterization, and measurement of the design space using a novel approach that centres on First Occurrences (FOs) and Re-Occurrences (ROs) as metrics. Expert architects' cognitive behaviours during the design process were investigated empirically to gain insights into design space evolution. Findings reveal a consistent generation and revisiting of ideas, signifying an ongoing development of the design space. Future research should incorporate diverse methodologies and a broader participant sample for a more comprehensive understanding.
Unmanned Aerial Vehicles (UAVs) and the increasing variety of their applications are rising in popularity. The growing number of UAVs emphasizes the significance of drones' reliability and robustness. Thus, there is a need for an efficient self-observing sensing mechanism to detect anomalies in drone behavior in real time. Previous works suggested prediction models from control theory, yet these are complex by nature and hard to implement, while Deep Learning solutions are of great utility. In this paper, we propose a real-time framework to detect anomalies in drones by analyzing the sound they emit. For this purpose, we construct a hybrid Deep Learning model that combines a Transformer with a Convolutional Neural Network inspired by the well-known VGG architecture. Our approach is examined over a dataset collected in real time from a single microphone set located on a micro drone. It achieves an F1-score of 88.4% in detecting anomalies and outperforms the VGG-16 architecture. Moreover, the framework reduces the number of parameters of the well-known VGG-16 from 138M to a shrunk version with only 3.6M parameters. Despite the smaller network, our real-time approach still yields high accuracy in anomaly detection in drones, with an average inference time of 0.2 seconds per second. Furthermore, with an earphone that weighs less than 100 grams on top of the UAV, our method is shown to be beneficial even in extreme conditions such as a micro-size dataset composed of three hours of flight recordings. The presented self-observing method can be implemented by simply adding a microphone to drones and either transmitting the captured audio to the remote control for analysis or performing the analysis onboard the drone using a dedicated microcontroller.
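The figures quoted above can be put in context with a short arithmetic sketch. The confusion counts are hypothetical and are not taken from the paper. Reading "0.2 seconds per second" as compute time per second of recorded audio is an assumption.

```python
def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts, used only to show how the metric is formed.
example_f1 = f1_score(tp=8, fp=2, fn=2)

# Parameter counts as stated in the abstract.
vgg16_params = 138e6
compact_params = 3.6e6
compression_ratio = vgg16_params / compact_params  # roughly 38x fewer parameters

# Assumed reading: 0.2 s of compute for each 1 s of audio.
real_time_factor = 0.2 / 1.0
keeps_up_with_audio = real_time_factor < 1.0

print(example_f1, round(compression_ratio, 1), keeps_up_with_audio)
```

Under that reading, a real-time factor below 1 means the model processes audio faster than it is recorded, which is what makes onboard or streamed analysis feasible.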
Speaker identification, a cornerstone of speech processing, involves associating individuals with spoken segments within a known speaker pool. This paper presents a framework tailored for closed-set speaker identification and emphasizes its practical engineering application in speech analysis. The framework introduces substantial neural network architecture enhancements, focusing in particular on optimizing the Log-Softmax function, which is central to speaker attribution. Additionally, we incorporate recent data augmentation techniques into the Wav2Vec2 framework. Empirical validation demonstrates our framework’s efficacy, yielding a relative improvement of up to 3.16% in top-1% accuracy compared to the state-of-the-art. This result sets a new benchmark for closed-set Speaker Identification. In addition, the methodology presented in this paper can help advance Speaker Identification methodologies in engineering applications, underlining the potential of AI-driven innovations in this domain.
•Short-duration speech segment focus improves model’s practical applicability.
•Architectural changes increase neural networks’ vocal pattern precision.
•Integration of Wav2Vec2 framework advances state-of-the-art in SID.
•Novel dual Log-Softmax strategy enhances speaker identification accuracy.
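For readers unfamiliar with the Log-Softmax function mentioned above, here is a minimal sketch of the standard, numerically stable form and a top-1 speaker decision. It does not reproduce the paper's dual Log-Softmax strategy, which the abstract does not specify. The function names and logits are hypothetical.

```python
import math

def log_softmax(logits):
    """Standard log-softmax, shifted by the maximum for numerical stability."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

def top1(logits):
    """Index of the highest-scoring speaker in a closed set."""
    scores = log_softmax(logits)
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical network outputs for a pool of three known speakers.
logits = [2.0, 0.5, -1.0]
log_probs = log_softmax(logits)
predicted_speaker = top1(logits)
print(log_probs, predicted_speaker)
```

Subtracting the maximum before exponentiating avoids overflow for large logits without changing the result.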
This study proposes a methodology to enhance the performance of multilingual Automatic Speech Recognition (ASR) systems by capitalizing on the high semantic similarity between sentences across different languages and eliminating the requirement for Language Identification (LID). To achieve this, special bilingual datasets were created from the Mozilla Common Voice datasets in Spanish, Russian, and Portuguese. The process involves computing sentence embeddings using Language-agnostic BERT and selecting sentence pairs based on high and low cosine similarity. Subsequently, we train the Wav2vec 2.0 XLSR53 model on these datasets and assess its performance using the Character Error Rate (CER) and Word Error Rate (WER) metrics. The experimental results indicate that models trained on high-similarity samples consistently surpass their low-similarity counterparts, emphasizing the significance of selecting data with high semantic similarity for precise and dependable ASR performance. Furthermore, the elimination of LID contributes to a simplified system with reduced computational costs and the capacity for real-time text output. The findings of this research offer valuable insights for the development of more efficient and accurate multilingual ASR systems, particularly in real-time and on-device applications.
•MLASR data creation without LID to overcome accuracy loss & performance degradation.
•Handle various grammar rules & syntax in different languages for cross-lingual ASR.
•Semantic dataset creation incorporated into Wav2Vec prevents language output errors.
•Cope with real-life bilingual datasets for low-data languages using data augmentation.
•Improve CER for languages with limited datasets and similar alphabetic characters.
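The selection and evaluation steps described above can be sketched in plain Python. This is an illustration under stated assumptions, not the study's pipeline: the embeddings are toy 2-D vectors rather than Language-agnostic BERT outputs, and the 0.8 threshold and sentence identifiers are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def wer(reference, hypothesis):
    """Word Error Rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference, hypothesis):
    """Character Error Rate: character-level edits divided by reference length."""
    return edit_distance(reference, hypothesis) / len(reference)

# Hypothetical sentence pairs with toy embeddings standing in for real ones.
toy_pairs = [
    ("es_1", "pt_1", [1.0, 0.0], [0.9, 0.1]),
    ("es_2", "ru_1", [1.0, 0.0], [0.0, 1.0]),
]
threshold = 0.8  # hypothetical cut-off between "high" and "low" similarity
high_similarity = [(a, b) for a, b, u, v in toy_pairs
                   if cosine_similarity(u, v) >= threshold]

print(high_similarity, wer("the cat sat", "the dog sat"), cer("abc", "abd"))
```

Both error rates are normalized by the reference length, so they can exceed 1.0 when the hypothesis contains many insertions.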
In this study, we develop a machine learning modeling approach to predict the hysteretic Boundary Drying (BD) curves of unsaturated porous media from the known Boundary Wetting (BW) curves, measured at a constant void ratio. The relationship between the families of BW and BD curves of the porous media is considered to consist of regular and random constituents, and it is represented by a limited set of N known pairs of these curves. The desired BD curve of a porous medium is predicted from its associated known BW curve as a product of two mappings: (i) a nonlinear mapping of the known BW curve to its corresponding Hypothetical Drying (HD) curve, as defined in "The modified dependent-domain theory of hysteresis" of Mualem (1984, 2009), and (ii) a linear mapping of this HD curve to the desired BD curve. The latter mapping is obtained by an optimization algorithm based on a training set of k known BW-BD pairs (k≤N) of the k corresponding porous media. The predicted BD curves show generally good agreement with the measured ones. An advantage of the proposed approach is that the model can be continually updated by incorporating newly measured data.
•Predictive integral operator for modeling relationships between families of functions.
•Optimized operator for a dataset of measured function pairs from two associated families.
•Individual operator optimization, to predict boundary drying curve of a given soil.
•Operator kernel derivative as a common air-entry blockage function for a sample of soils.
•Permanently improvable operator’s predictive ability, due to its update by new data.
Speaker Change Detection (SCD) is the problem of splitting an audio-recording by its speaker-turns. Many real-world problems, such as Speaker Diarization (SD) or automatic speech transcription, are influenced by the quality of the speaker-turn estimation. Previous works have already shown that auxiliary textual information (for mono-lingual systems) can be of great use for detecting speaker-turns and for the performance of diarization systems. In this paper, we suggest a framework for speaker-turn estimation, as well as for assigning clustered speaker identities in the SD system, and examine our approach over a multi-lingual dataset that consists of three mono-lingual datasets in English, French, and Hebrew. As such, we propose a generic and language-independent framework for the SCD problem that is learned through textual information using state-of-the-art transformer-based techniques and speech-embedding modules. Comprehensive experimental evaluation shows that (i) our multi-lingual SCD framework is competitive with a framework trained over mono-lingual datasets, and that (ii) textual information improves the solution’s quality compared to the speech signal-based approach. In addition, we show that our multi-lingual SCD approach does not harm the performance of SD systems.
•Improvement of speaker change detection & diarization system by textual data.
•Multilingual training framework to compute speaker-turns in multiple languages.
•Competitive strategy transformers and speech embedding for speaker labeling.
•Additive approach for multilingual domain adaptation in NLP.
Short Message Service (SMS) spamming is a harmful phishing attack on mobile phones. That is, fraudsters try to misuse personal user information by sending deceptive text messages, sometimes including a fake URL that asks for personal information such as passwords and usernames. In the world of Machine Learning, several approaches have tried to address this problem, but the lack of available data resources has commonly been the main obstacle to a good enough solution. Different approaches such as Generative Adversarial Networks (GANs) have been suggested for extending datasets, yet GANs are hard to train whenever datasets are limited in terms of sample size. Therefore, in this paper, we suggest a dataset extension technique for small datasets, based on an Out Of Distribution (OOD) metric: a GAN-like method that imitates the generator concept of GANs for the purpose of extending limited datasets, using the OOD concept. By using a sophisticated text generation method, we show how to apply it over datasets from the domain of fraud and spam detection in SMS messages, and achieve over 25% relative improvement compared to two other solutions. In addition, due to the class imbalance in typical spam datasets, our approach is also examined over another dataset in order to verify that the false alarm rate is low enough.
Speaker Change Detection (SCD) is the task of segmenting an input audio-recording according to speaker interchanges. Nowadays, many applications, such as Speaker Diarization (SD) or automatic vocal transcription, depend on this segmentation task. In this paper, we focus on the essential task of the SD problem, the audio segmenting process, and suggest a solution for the SCD problem, as well as for the assignment of clustered speaker labels to the extracted segments. We apply the solution over two datasets: a commercial dataset in Hebrew and the ICSI Meeting Corpus. As such, we propose a hybrid framework for the SCD problem that is learned from textual information and speech signals and the meta-data features that can be extracted from them. Moreover, we demonstrate the negative correlation between an increase in the number of speakers in the training dataset and the overall diarization system's performance, which is improved using our efficient SCD component. Finally, we show how our proposed hybrid framework remains robust compared to the ICSI Meeting Corpus, as the training and testing in the experimental evaluation are based on two languages.