• The Matthews Correlation Coefficient (MCC) is not a balanced measurement.
• MCC is a biased measurement similar to F1 score and others.
• MCC measurement is unevenly distributed when the datasets are imbalanced.
• MCC is not suitable for classification measurement on imbalanced datasets.
The Matthews Correlation Coefficient (MCC) is one of the popular measurements of classification accuracy. It has generally been regarded as a balanced measure that can be used even if the classes are of very different sizes. This paper finds that this is not true: MCC deteriorates seriously when the datasets used in classification are imbalanced. Experimental results and analysis show that MCC is not suitable for measuring classification accuracy on imbalanced datasets.
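For reference, MCC is computed from the four cells of a binary confusion matrix as (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below is only an illustration of the sensitivity described above: the counts are hypothetical and are not taken from the paper's experiments, and returning 0.0 for an empty denominator is a common convention assumed here.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when a row or column of the matrix is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical classifier with a 90% true-positive rate and a 90% true-negative rate.
# Balanced case: 100 positives, 100 negatives.
balanced = mcc(tp=90, tn=90, fp=10, fn=10)

# Imbalanced case: 100 positives, 10,000 negatives, same per-class rates.
imbalanced = mcc(tp=90, tn=9000, fp=1000, fn=10)

print(f"balanced:   {balanced:.3f}")
print(f"imbalanced: {imbalanced:.3f}")
```

With identical per-class rates, the score falls from 0.8 in the balanced case to roughly 0.26 in the imbalanced one, which is the kind of dependence on class ratio the abstract refers to.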
Datasets play an important role in the progress of facial expression recognition algorithms, but they may suffer from obvious biases caused by different cultures and collection conditions. To look deeper into this bias, we first conduct comprehensive experiments on dataset recognition and cross-dataset generalization tasks and, for the first time, explore the intrinsic causes of the dataset discrepancy. The results quantitatively verify that current datasets have a strong built-in bias, and the corresponding analyses indicate that the conditional probability distributions of the source and target datasets are different. However, previous research is mainly based on shallow features with limited discriminative ability, under the assumption that the conditional distribution remains unchanged across domains. To address these issues, we further propose a novel deep Emotion-Conditional Adaption Network (ECAN) to learn domain-invariant and discriminative feature representations, which can match not only the marginal distribution but also the class-conditional distribution across domains by exploring the underlying label information of the target dataset. Moreover, the largely ignored expression class distribution bias is also addressed so that the training and testing domains can share a similar class distribution. Extensive cross-database experiments on both lab-controlled datasets (CK+, JAFFE, MMI, and Oulu-CASIA) and real-world databases (AffectNet, FER2013, RAF-DB 2.0, and SFEW 2.0) demonstrate that our ECAN can yield competitive performance across various cross-dataset facial expression recognition tasks and outperform the state-of-the-art methods.
Fault diagnostics and prognostics are important topics both in practice and research. There is intense pressure on industrial plants to continue reducing unscheduled downtime, performance degradation, and safety hazards, which requires detecting and recovering from potential faults in their early stages. Intelligent fault diagnosis is a promising tool due to its ability to rapidly and efficiently process collected signals and provide accurate diagnosis results. Although many studies have developed machine learning (ML) and deep learning (DL) algorithms for detecting bearing faults, the results have generally been limited to relatively small train and test datasets, and the input data has been manipulated (selective features used) to reach high accuracy. In this work, the raw data collected from accelerometers (time-domain features) are taken as the input of a novel temporal sequence prediction algorithm to present an end-to-end method for fault detection. We use equivalent temporal sequences as the input of a novel Convolutional Long-Short-Term-Memory Recurrent Neural Network (CRNN) to detect the bearing fault with the highest accuracy in the shortest possible time. To the best of the authors' knowledge, the method reaches the highest accuracy in the literature while avoiding any pre-processing or manipulation of the input data. The effectiveness and feasibility of the fault diagnosis method are validated by applying it to two commonly used benchmark real vibration datasets and comparing the results with other intelligent fault diagnosis methods.
The appealing features of Cloud Computing continue to fuel its adoption and integration in many sectors such as industry, government, education and entertainment. Nevertheless, uploading sensitive data to public cloud storage services exposes organizations to security risks affecting integrity, availability and confidentiality. Moreover, the open and distributed (decentralized) structure of the cloud has made this class of computing prone to cyber attackers and intruders. It is therefore imperative to develop an anomaly-based network intrusion detection system (IDS) that detects and prevents both inside and outside attacks in cloud environments with high detection precision and few false warnings. In this work, we propose an intelligent approach to automatically build an efficient and effective Deep Neural Network (DNN) based anomaly Network IDS using a hybrid optimization framework (IGASAA) based on an Improved Genetic Algorithm (IGA) and a Simulated Annealing Algorithm (SAA). The resulting IDS is called “MLIDS” (Machine Learning based Intrusion Detection System). The Genetic Algorithm (GA) is improved through optimization strategies, namely Parallel Processing and Fitness Value Hashing, which reduce execution time and convergence time and save processing power. Moreover, SAA was incorporated into IGA to optimize its heuristic search. Our approach uses IGASAA to search for the optimal or near-optimal combination of the most relevant values of the parameters involved in constructing the DNN-based IDS or affecting its performance, such as feature selection, data normalization, DNN architecture, activation function, learning rate and momentum term, so as to ensure a high detection rate, high accuracy and a low false alarm rate. For simulation and validation of the proposed method, the CloudSim 4.0 simulator platform and three benchmark IDS datasets were used, namely CICIDS2017, NSL-KDD version 2015 and CIDDS-001. The implementation results of our model demonstrate its ability to detect intrusions with high detection accuracy and a low false alarm rate, and indicate its superiority in comparison with state-of-the-art methods.
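The Fitness Value Hashing strategy named in this abstract is commonly understood as caching the fitness of chromosomes that have already been evaluated, so that duplicates in a population cost nothing extra. The following is a minimal sketch of that general idea only; the class `CachedFitness`, the tuple encoding, and the toy fitness function are hypothetical and do not reproduce the authors' IGASAA implementation, where each evaluation would involve training and scoring a DNN.

```python
from typing import Callable, Dict, List, Tuple

Chromosome = Tuple[int, ...]  # hypothetical encoding: a hashable tuple of genes


class CachedFitness:
    """Wraps a fitness function and remembers results per chromosome."""

    def __init__(self, fitness_fn: Callable[[Chromosome], float]) -> None:
        self._fitness_fn = fitness_fn
        self._cache: Dict[Chromosome, float] = {}
        self.evaluations = 0  # number of real (uncached) evaluations

    def __call__(self, chromosome: Chromosome) -> float:
        if chromosome not in self._cache:
            # Only unseen chromosomes trigger the expensive evaluation.
            self._cache[chromosome] = self._fitness_fn(chromosome)
            self.evaluations += 1
        return self._cache[chromosome]


def toy_fitness(chromosome: Chromosome) -> float:
    """Stand-in for an expensive evaluation; simply counts the set genes."""
    return float(sum(chromosome))


fitness = CachedFitness(toy_fitness)
population: List[Chromosome] = [(1, 0, 1), (1, 0, 1), (0, 1, 1), (1, 0, 1)]
scores = [fitness(c) for c in population]

print(scores, "real evaluations:", fitness.evaluations)
```

In this toy population three of the four chromosomes are identical, so only two real evaluations are performed; the saving grows as a GA population converges and duplicates become more frequent.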
Change detection (CD) aims to identify surface changes from bitemporal images. In recent years, deep learning (DL)-based methods have made substantial breakthroughs in the field of CD. However, CD results can be easily affected by external factors, including illumination, noise, and scale, which leads to pseudo-changes and noise in the detection map. To deal with these problems and achieve more accurate results, a deeply supervised (DS) attention metric-based network (DSAMNet) is proposed in this article. A metric module is employed in DSAMNet to learn change maps by means of deep metric learning, in which convolutional block attention modules (CBAM) are integrated to provide more discriminative features. As an auxiliary, a DS module is introduced to enhance the feature extractor's learning ability and generate more useful features. Moreover, another challenge encountered by data-driven DL algorithms is posed by the limitations in change detection datasets (CDDs). Therefore, we create a CD dataset, Sun Yat-Sen University (SYSU)-CD, for bitemporal image CD, which contains a total of 20 000 aerial image pairs of size 256 × 256. Experiments are conducted on both the CDD and the SYSU-CD dataset. Compared to other state-of-the-art methods, our network achieves the highest accuracy on both datasets, with an F1 of 93.69% on the CDD dataset and 78.18% on the SYSU-CD dataset.
Learning-based methods are believed to work well for unconstrained gaze estimation, i.e. gaze estimation from a monocular RGB camera without assumptions regarding user, environment, or camera. However, current gaze datasets were collected under laboratory conditions and methods were not evaluated across multiple datasets. Our work makes three contributions towards addressing these limitations. First, we present the MPIIGaze dataset, which contains 213,659 full face images and corresponding ground-truth gaze positions collected from 15 users during everyday laptop use over several months. An experience sampling approach ensured continuous gaze and head poses and realistic variation in eye appearance and illumination. To facilitate cross-dataset evaluations, 37,667 images were manually annotated with eye corners, mouth corners, and pupil centres. Second, we present an extensive evaluation of state-of-the-art gaze estimation methods on three current datasets, including MPIIGaze. We study key challenges including target gaze range, illumination conditions, and facial appearance variation. We show that image resolution and the use of both eyes affect gaze estimation performance, while head pose and pupil centre information are less informative. Finally, we propose GazeNet, the first deep appearance-based gaze estimation method. GazeNet improves on the state of the art by 22 percent (from a mean error of 13.9 degrees to 10.8 degrees) for the most challenging cross-dataset evaluation.
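The 22 percent figure is a relative reduction in mean angular error, which can be checked from the two values quoted in the abstract. The helper name `relative_improvement` below is introduced purely for illustration.

```python
def relative_improvement(baseline_error: float, new_error: float) -> float:
    """Fraction by which the error shrinks relative to the baseline."""
    return (baseline_error - new_error) / baseline_error


# Mean errors in degrees, as quoted in the abstract.
gain = relative_improvement(baseline_error=13.9, new_error=10.8)
print(f"{gain:.1%}")
```

The absolute reduction is 3.1 degrees, and dividing by the 13.9 degree baseline gives about 22 percent.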
In recent years, autonomous robots have become ubiquitous in research and daily life. Among many factors, public datasets play an important role in the progress of this field, as they waive the tall order of initial investment in hardware and manpower. However, for research on autonomous aerial systems, there appears to be a relative lack of public datasets on par with those used for autonomous driving and ground robots. Thus, to fill in this gap, we conduct a data collection exercise on an aerial platform equipped with an extensive and unique set of sensors: two 3D lidars, two hardware-synchronized global-shutter cameras, multiple Inertial Measurement Units (IMUs), and especially, multiple Ultra-wideband (UWB) ranging units. The comprehensive sensor suite resembles that of an autonomous driving car, but features distinct and challenging characteristics of aerial operations. We record multiple datasets in several challenging indoor and outdoor conditions. Calibration results and ground truth from a high-accuracy laser tracker are also included in each package. All resources can be accessed via our webpage https://ntu-aris.github.io/ntu_viral_dataset/.
Single-frame InfraRed Small Target (SIRST) detection has been a challenging task due to a lack of inherent characteristics, imprecise bounding box regression, a scarcity of real-world datasets, and sensitive localization evaluation. In this paper, we propose a comprehensive solution to these challenges. First, we find that the existing anchor-free label assignment method is prone to mislabeling small targets as background, leading to their omission by detectors. To overcome this issue, we propose an all-scale pseudo-box-based label assignment scheme that relaxes the constraints on scale and decouples the spatial assignment from the size of the ground-truth target. Second, motivated by the structured prior of feature pyramids, we introduce the one-stage cascade refinement network (OSCAR), which uses the high-level head as soft proposals for the low-level refinement head. This allows OSCAR to process the same target in a cascade coarse-to-fine manner. Finally, we present a new research benchmark for infrared small target detection, consisting of the SIRST-V2 dataset of real-world, high-resolution single-frame targets, the normalized contrast evaluation metric, and the DeepInfrared toolkit for detection. We conduct extensive ablation studies to evaluate the components of OSCAR and compare its performance to state-of-the-art model-driven and data-driven methods on the SIRST-V2 benchmark. Our results demonstrate that a top-down cascade refinement framework can improve the accuracy of infrared small target detection without sacrificing efficiency. The DeepInfrared toolkit, dataset, and trained models are available at https://github.com/YimianDai/open-deepinfrared.
Because Bengali is the seventh most-spoken language and the fifth most-spoken native language in the world, Bengali handwritten character recognition has fascinated researchers for decades. Although other popular languages, e.g., English, Chinese, Hindi, and Spanish, have received many contributions in the area of handwritten character recognition, Bengali has not received many noteworthy contributions in this domain because of the complex curvatures and similar writing fashions of Bengali characters. Previous studies used different approaches based on traditional learning and deep learning. In this research, we propose a low-cost novel convolutional neural network architecture for the recognition of Bengali characters with only 2.24 to 2.43 million parameters, depending on the number of output classes. We considered 8 different formations of CMATERdb datasets based on previous studies for the training phase. Through experimental analysis, we show that our proposed system outperforms previous works by a noteworthy margin for all 8 datasets. Moreover, we tested our trained models on other available Bengali character datasets such as Ekush, BanglaLekha, and NumtaDB. Our proposed architecture achieved 96–99% overall accuracy on these datasets as well. We believe our contributions will be beneficial for developing an automated high-performance recognition tool for Bengali handwritten characters.