This study develops a deep learning (DL) model, named SSENet, to extract ship size from Sentinel-1 synthetic aperture radar (SAR) images. We employ a single shot multibox detector (SSD)-based model to generate a rotatable bounding box (RBB) for the ship, and design a deep-neural-network (DNN)-based regression model to estimate the accurate ship size. The hybrid input to the DNN-based model combines the initial ship size and orientation angle obtained from the RBB with abstract features extracted from the input SAR image. We design a custom loss function, named mean scaled square error (MSSE), to optimize the DNN-based model. The DNN-based model is concatenated with the SSD-based model to form the integrated SSENet. We employ a subset of OpenSARShip, a data set dedicated to Sentinel-1 ship interpretation, to train and test SSENet; the training/testing sets contain 1500/390 ship samples. Experiments show that SSENet is capable of extracting ship size from SAR images end to end. The mean absolute errors (MAEs) of length and width are 7.88 and 2.23 m, respectively, i.e., under 0.8 pixels. The hybrid input significantly improves model performance. Compared with the mean square error (MSE) loss function, MSSE reduces the MAE of length by nearly 1 m while increasing the MAE of width by only 0.03 m. Compared with a well-performing gradient boosting regression (GBR) model, SSENet reduces the MAE of length by nearly 2 m (18.68%) and that of width by 0.06 m (2.51%). SSENet remains robust across different training/testing splits.
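The abstract names the MSSE loss but does not give its formula. One plausible reading of "scaled square error" is a squared error normalized by the target magnitude, so that an error on a short ship counts relatively more than the same absolute error on a long ship; the sketch below is purely illustrative of that idea, not the paper's definition.

```python
import numpy as np

def msse(y_true, y_pred, eps=1e-8):
    """Mean scaled square error (illustrative form only).

    ASSUMPTION: the SSENet abstract does not spell out MSSE; here the
    squared error is scaled by the target magnitude before averaging.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(((y_pred - y_true) / (y_true + eps)) ** 2)

def mse(y_true, y_pred):
    """Plain mean square error, for comparison."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_pred - y_true) ** 2)
```

Under this form, a 10 m error on a 100 m ship and a 2 m error on a 20 m ship contribute equally, which matches the intuition of optimizing relative size accuracy.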
Direction-of-arrival (DOA) estimation is a key step in passive target location. The primary issues with traditional DOA estimation methods are their heavy computation and weak noise immunity in extreme noise environments. The random vector functional link (RVFL) network and its variants (RVFL without direct links, RVFLv) have demonstrated high learning efficiency and strong generalization ability in previous studies. However, owing to their shallow network structure, they may not be effective for underwater acoustic array signals with complex features. We therefore propose a model-embedded self-supervised ensemble deep RVFL (ME-SedRVFL) network to estimate the DOA of underwater acoustic array signals. To demonstrate its efficiency and generalization ability, ME-SedRVFL is compared with its variant (ME-SRVFLv) as well as other well-known randomization-based networks. The results show that the noise immunity of ME-SedRVFL and ME-SRVFLv is 9.62% and 9.34% better than that of traditional signal-model-based methods, and 1.68% and 1.40% better than that of randomization-based parameter estimation methods (signal-to-noise ratio of −20 dB, frequency of 200 Hz). Statistical box plots and statistical comparisons of the different methods indicate that ME-SedRVFL achieves superior DOA estimation performance to ME-SRVFLv in most cases, as the direct input–output connections help regularize the randomization. Hence, ME-SedRVFL is identified as the best-performing DOA estimation method through a comprehensive evaluation on real-world and simulated datasets.
• We design a signal-model-embedded loss function that admits a closed-form solution.
• We introduce an L1 auxiliary loss term to obtain a sparse signal solution.
• The proposed models are instantiated and show excellent generalization ability.
• Comparison with the RVFL variants demonstrates competitive performance.
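The abstract highlights two well-known RVFL properties: direct input–output links and a closed-form output solution. The sketch below shows the basic RVFL idea only (random fixed hidden layer, ridge-regression readout over the concatenation of raw inputs and hidden features); it is not the paper's ME-SedRVFL, whose ensemble, depth, and model-embedded loss are not reproducible from the abstract alone.

```python
import numpy as np

class RVFL:
    """Minimal random vector functional link regressor (sketch).

    Hidden weights are drawn once at random and never trained; only the
    output layer is solved in closed form (ridge regression) over
    [inputs, hidden features]. The raw-input columns are the "direct
    links" that the abstract credits with regularizing the randomization.
    """
    def __init__(self, n_hidden=40, reg=1e-4, seed=0):
        self.n_hidden = n_hidden
        self.reg = reg
        self.rng = np.random.default_rng(seed)

    def _features(self, X):
        H = np.tanh(X @ self.W + self.b)   # random nonlinear expansion
        return np.hstack([X, H])           # direct links + hidden features

    def fit(self, X, y):
        d = X.shape[1]
        self.W = self.rng.standard_normal((d, self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        D = self._features(X)
        # closed-form ridge solution for the output weights
        self.beta = np.linalg.solve(D.T @ D + self.reg * np.eye(D.shape[1]),
                                    D.T @ y)
        return self

    def predict(self, X):
        return self._features(X) @ self.beta
```

Because training reduces to one linear solve, RVFL retains the "high learning efficiency" the abstract mentions; removing the `np.hstack` direct links yields the RVFLv variant.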
This paper proposes CDRSHNet (Codec-Dirty-Rainy-Shadow-Haze Network), an architecture that fuses self-attention (SA) with a variance-guided multiscale attention (VGMA) mechanism to restore traffic sign images captured under challenging conditions, including raindrops, shadows, haze, blur from dirty camera lenses, and codec errors. SA captures global dependencies, whereas VGMA enhances the representation by emphasizing informative channels and spatial locations. To enhance image quality, a hybrid loss function is proposed that combines Gradient Magnitude Similarity Deviation (GMSD) and Charbonnier loss. CDRSHNet is trained on a dataset of real and synthesized images, and its performance is evaluated using the average Structural Similarity Index Measure (SSIM) and Peak Signal-to-Noise Ratio (PSNR) on Test RID (Real Image Dataset) and Test SID (Synthesized Image Dataset). CDRSHNet achieved an average SSIM of 0.978 and an average PSNR of 39.58 dB on Test RID; on Test SID, the average SSIM is 0.963 and the average PSNR is 39.46 dB.
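Both loss terms named in the abstract have standard published forms, sketched below in NumPy. The combination weight `lam` is an assumption (the abstract does not give it), and standard GMSD uses Prewitt filters, which `np.gradient` only approximates here.

```python
import numpy as np

def charbonnier(x, y, eps=1e-3):
    """Charbonnier loss: a smooth, outlier-robust variant of L1."""
    return np.mean(np.sqrt((x - y) ** 2 + eps ** 2))

def _grad_mag(img):
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    return np.sqrt(gx ** 2 + gy ** 2)

def gmsd(x, y, c=0.0026):
    """Gradient Magnitude Similarity Deviation: the standard deviation of
    the pointwise gradient-magnitude similarity map (0 for identical images)."""
    mx, my = _grad_mag(x), _grad_mag(y)
    gms = (2.0 * mx * my + c) / (mx ** 2 + my ** 2 + c)
    return np.std(gms)

def hybrid_loss(pred, target, lam=0.5):
    # lam is an assumed weight; the paper does not report its value here
    return gmsd(pred, target) + lam * charbonnier(pred, target)
```

The design intent is complementary: Charbonnier penalizes per-pixel intensity error robustly, while GMSD penalizes structural (edge) degradation, which matters for legibility of restored traffic signs.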
Clouds are one of the major limitations to crop monitoring with optical satellite images. Despite all efforts to provide decision-makers with high-quality agricultural statistics, there is still a lack of techniques to optimally process satellite image time series in the presence of clouds. In this regard, this article proposes adding a Multi-Layer Perceptron loss function to the objective function of the pix2pix conditional Generative Adversarial Network (cGAN). The aim is to force the generative model to learn to deliver synthetic pixels whose values are proxies for the spectral response, thereby improving subsequent crop type mapping. We also evaluated the generalization capacity of the generative models in producing plausible pixel values for images not used in training. To assess the proposed approach, real images were compared with synthetic images generated by the proposed approach and by the original pix2pix cGAN, through visual analysis, pixel-value analysis, semantic segmentation, and similarity metrics. In general, the proposed approach provided slightly better synthetic pixels than the original pix2pix cGAN, removing more noise as well as yielding better crop type semantic segmentation: the semantic segmentation of the synthetic image generated with the proposed approach achieved an F1-score of 44.2%, versus 44.7% for the real image. Regarding generalization, models trained on different regions of the same image provided better pixels than models trained on other images in the time series. The experiments also showed that models trained on a pair of images selected every three months along the time series provided acceptable results on images without cloud-free areas.
The increasing spread of malicious software (malware) through the internet remains a serious threat. Malware authors use obfuscation and deformation techniques to generate new variants that can evade traditional detection methods. Hence, machine learning methods are widely expected to classify malware and cleanware based on the characteristics of malware samples. This paper investigates malware classification accuracy using static detection methods based on LightGBM with a custom log loss function, which controls learning by applying a coefficient α to the false-negative side of the loss and a coefficient β to the false-positive side. By installing these coefficients, we can create a deliberately lopsided classifier. We used two malware datasets, one non-public and one public, to construct a baseline malware model and verify the effectiveness of the proposed method. We extracted dataset features from PE-file surface analysis and PE-header dumps, and customized the binary log loss function to improve all classification evaluation metrics to a certain extent. On the EMBER dataset, we obtained a better result (AUC = 0.979) at α = 430 and β = 339 than with the normal log loss function (AUC = 0.978). In addition, to maintain malware detection coverage and enable quick countermeasures for true-positive results, we propose the hybrid use of different custom models to prioritize positive results.
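The α/β-weighted binary log loss described above has a direct closed form, L(y, p) = −α·y·log p − β·(1−y)·log(1−p), and LightGBM accepts custom objectives as a function returning the gradient and Hessian with respect to the raw scores. The sketch below derives both analytically; the factory-function shape is an illustration of that convention (in real LightGBM the labels would come from `train_data.get_label()`).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def make_weighted_logloss(alpha, beta):
    """Build a LightGBM-style custom objective for the weighted log loss
    L = -alpha*y*log(p) - beta*(1-y)*log(1-p), with p = sigmoid(raw score).

    alpha weights the false-negative (y = 1) side, beta the false-positive
    (y = 0) side; alpha > beta tilts the classifier toward catching malware.
    """
    def objective(preds, labels):
        y = labels
        p = sigmoid(preds)
        # dL/dz and d2L/dz2, derived from the weighted loss above
        grad = alpha * y * (p - 1.0) + beta * (1.0 - y) * p
        hess = (alpha * y + beta * (1.0 - y)) * p * (1.0 - p)
        return grad, hess
    return objective
```

With α = β = 1 this reduces to the ordinary binary log loss; the paper's best operating point uses α = 430 and β = 339.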
We develop a deep learning (DL) model for Indian Summer Monsoon (ISM) short-range precipitation forecasting using a ConvLSTM network. The model is built using daily precipitation records from both ground-based observations and remote sensing. Precipitation datasets from the Tropical Rainfall Measuring Mission and the India Meteorological Department are used for training, testing, forecasting, and comparison. For lead days 1 and 2, the correlation coefficient (CC) between predictions for the previous five years and the corresponding observational records (from both in-situ and remote sensing products) is 0.67 and 0.42, respectively. Notably, the CCs are even higher over the Western Ghats and the monsoon trough region. Model performance evaluated with skill scores, normalized root mean squared error (NRMSE), mean absolute percentage error (MAPE), and ROC curves shows reasonable skill in short-range precipitation forecasting. Incorporating multivariable-based DL has the potential to match or even improve upon the forecasts made by state-of-the-art numerical weather prediction models.
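The verification metrics named in this abstract (CC, NRMSE, MAPE) are standard and can be sketched directly; the range normalization in NRMSE and the epsilon guard in MAPE are common conventions assumed here, since the abstract does not state which variants were used.

```python
import numpy as np

def corr_coef(obs, pred):
    """Pearson correlation coefficient between observations and forecasts."""
    return np.corrcoef(obs, pred)[0, 1]

def nrmse(obs, pred):
    """RMSE normalized by the observed range (one common convention)."""
    rmse = np.sqrt(np.mean((pred - obs) ** 2))
    return rmse / (obs.max() - obs.min())

def mape(obs, pred, eps=1e-8):
    """Mean absolute percentage error, in percent.

    eps guards against division by zero on dry days with zero rainfall.
    """
    return 100.0 * np.mean(np.abs((pred - obs) / (obs + eps)))
```

Note that MAPE is ill-behaved for near-zero rainfall, which is one reason precipitation studies report it alongside NRMSE and categorical skill scores rather than alone.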
In this paper, we present Machine Learning (ML) solutions to address the reliability challenges likely to be encountered in advanced wireless systems (5G, 6G, and beyond). Specifically, we introduce a novel loss function to minimize the outage probability of an ML-based resource allocation system. A single-user multi-resource greedy allocation strategy constitutes our application scenario, in which an ML binary classification predictor assists in selecting a resource satisfying the established outage criterion. While other resource allocation policies may be suitable, they are not the focus of our study; our primary emphasis is on theoretically developing this loss function and leveraging it to train an ML model that addresses the outage probability challenge. With no access to future channel state information, the predictor foresees each resource's likely future outage status. When the predictor encounters a resource it believes will be satisfactory, it allocates it to the user, aiming to ensure that the user avoids resources likely to undergo an outage. Our main result establishes exact and asymptotic expressions for this system's outage probability. These expressions reveal that focusing solely on optimizing the per-resource outage probability conditioned on the ML predictor recommending resource allocation (a strategy that, at face value, looks the most appropriate) may produce inadequate predictors that reject every resource. They also reveal that focusing on standard metrics, such as precision, false-positive rate, or recall, may not produce optimal predictors. With this result, we formulate a theoretically optimal, differentiable loss function to train our predictor. We then compare predictors trained using this loss with those trained using traditional loss functions, namely binary cross-entropy (BCE), mean squared error (MSE), and mean absolute error (MAE).
In all scenarios, predictors trained using our novel loss function provide superior outage probability performance. Moreover, in some cases, our loss function outperforms predictors trained with BCE, MAE, and MSE by multiple orders of magnitude. Additionally, when applied to another ML-based resource allocation scheme (a modified greedy algorithm), our proposed loss function maintains its efficacy.
Heart disease is the leading cause of mortality worldwide, and it is of utmost importance that clinicians and researchers understand the dynamics of the heart. As an electrical measure of the heart's activity, the electrocardiogram (ECG) is the gold standard for recording the cardiac state, whether monitoring the structure of the traces that make up the ECG or deriving key metrics such as heart rate variability. Long-term ECG monitoring is often required to identify cardiovascular issues but proves impractical in the clinic, so patients collect their data remotely. However, ECG signals can become contaminated with various noise sources during data collection. This paper proposes a custom loss function capable of denoising electrode motion artefact in ECG data to a higher standard than other, more common loss functions. We implement the custom loss function with a convolutional neural network to return high-quality ECG, suitable for calculating the aforementioned key metrics from a previously unusable state. The proposed model improves the overall signal-to-noise ratio (SNR) of ECG signals and preserves the structure of the R waves. It outperforms a standard mean squared error loss function by 0.5 dB in SNR and improves heart rate estimation by 25%.
Image classification finds wide applications in face recognition, cancer detection, and many more areas. However, classifier models such as convolutional neural networks (CNNs) used in safety-critical systems like self-driving cars and medical image diagnosis demand differential treatment of classification mistakes: certain misclassifications may have grave impacts, whereas others have only minor adverse effects. The severity of a misclassification can be associated with the semantic information possessed by the classes. The objective of this work is to minimize the severity of misclassification, which is highly relevant in safety-critical applications. To achieve differential treatment of mistakes, semantic information is incorporated into CNNs through a custom cross-entropy loss function; the semantic similarity information enables CNNs to distinguish between semantically similar and dissimilar images. We propose a model for building custom cross-entropy loss functions that penalize classification mistakes according to their severity. In addition, we propose two novel methods to build the semantic similarity matrix between classes that feeds into the custom loss function: the first uses directed acyclic hierarchies from the data, and the second uses a confusion matrix. The results show that the proposed solutions reduce the severity of misclassification and also improve the overall classification accuracy of the model. The first model achieved a classification accuracy of 78.8 ± 0.3% and the second 79.2 ± 0.3%; relative to the base CNN, the first model reduced the severity of misclassification by 2 Superclass Misclassification Error (SCME) and the second by 1.5 SCME. In addition, we compared our methods with recent related work in this domain to highlight their benefits.
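One simple way to realize a severity-aware cross-entropy of the kind described above is to add an expected-severity penalty to the standard cross-entropy, using a class-pair cost matrix (which the paper derives from a class hierarchy or a confusion matrix). The additive form, the weight `lam`, and the example matrix below are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def severity_ce(probs, label, severity, lam=1.0):
    """Cross-entropy plus an expected-misclassification-severity penalty.

    probs    : predicted class probabilities (1-D, sums to 1)
    label    : true class index
    severity : severity[i, j] = cost of predicting class j when truth is i
               (zero diagonal); ASSUMED form for illustration only.
    """
    ce = -np.log(probs[label] + 1e-12)
    # expected severity of the prediction under the model's own distribution
    expected_severity = np.dot(severity[label], probs)
    return ce + lam * expected_severity
```

Under this loss, two predictions with identical probability on the true class are ranked differently: the one that puts its remaining mass on a semantically close class is penalized less than the one that confuses a distant superclass.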
Sagittal cervical spine alignment measured on X-Ray is a key objective measure for clinicians caring for patients with a multitude of presenting symptoms. Despite its applications, there is as yet no research available in this field. This paper presents a framework for automatic detection of sagittal cervical spine landmark points. Inspired by UNet, we propose an encoder-decoder Convolutional Neural Network (CNN) called PoseNet. In developing our model, we first review the weaknesses of widely used regression loss functions such as the L1 and L2 losses. To address these issues, we propose a novel loss function specifically designed to improve the accuracy of the localization task under challenging conditions (extreme neck pose, low or high brightness and illumination, X-Ray noise, etc.). We validate our model and loss function on a dataset of X-Ray images. The results show that our framework performs precise sagittal cervical spine landmark detection even on challenging X-Ray images.
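The abstract cites the weaknesses of L1 and L2 losses without restating them. As background only (this is not the paper's proposed loss), the gradients below show the issue for landmark regression: L2's gradient grows with the error, so outliers dominate training, while L1's gradient has constant magnitude, which hampers fine convergence near zero error; a Huber-style smooth L1 is a common compromise.

```python
import numpy as np

def l1_grad(e):
    """Gradient of L1 loss |e|: constant magnitude even for tiny errors."""
    return np.sign(e)

def l2_grad(e):
    """Gradient of L2 loss e^2: grows with error, so outliers dominate."""
    return 2.0 * e

def smooth_l1_grad(e, delta=1.0):
    """Gradient of smooth L1 (Huber): 0.5*e^2 for |e| < delta,
    delta*(|e| - 0.5*delta) otherwise. Quadratic near zero, linear for
    large errors; a standard compromise, shown here only as context."""
    e = np.asarray(e, dtype=float)
    return np.where(np.abs(e) < delta, e, delta * np.sign(e))
```

Landmark losses proposed for pose estimation typically reshape this trade-off further (e.g., amplifying gradients for small-to-medium errors); the specific form used by PoseNet is given in the paper, not here.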