Graphs are a widely used representation of the network structure of connected data. Graph data can be found in a broad spectrum of application domains such as social systems, ecosystems, biological networks, knowledge graphs, and information systems. With the continuous penetration of artificial intelligence technologies, graph learning (i.e., machine learning on graphs) is gaining attention from both researchers and practitioners. Graph learning proves effective for many tasks, such as classification, link prediction, and matching. Generally, graph learning methods extract relevant features of graphs by taking advantage of machine learning algorithms. In this survey, we present a comprehensive overview of the state of the art in graph learning. Special attention is paid to four categories of existing graph learning methods: graph signal processing, matrix factorization, random walk, and deep learning. Major models and algorithms under each of these categories are reviewed in turn. We examine graph learning applications in areas such as text, images, science, knowledge graphs, and combinatorial optimization. In addition, we discuss several promising research directions in this field.
Machine learning (ML) provides effective means to learn from spectrum data and solve complex tasks involved in wireless communications. Supported by recent advances in computational resources and algorithmic designs, deep learning (DL) has found success in performing various wireless communication tasks such as signal recognition, spectrum sensing, and waveform design. However, ML in general and DL in particular have been found vulnerable to manipulations, giving rise to a field of study called adversarial machine learning (AML). Although AML has been extensively studied in other data domains such as computer vision and natural language processing, research on AML in the wireless communications domain is still in its early stage. This paper presents a comprehensive review of the latest research efforts focused on AML in wireless communications while accounting for the unique characteristics of wireless systems. First, the background of AML attacks on deep neural networks is discussed and a taxonomy of AML attack types is provided. Various methods of generating adversarial examples and attack mechanisms are also described. In addition, a holistic survey of existing research on AML attacks for various wireless communication problems, as well as the corresponding defense mechanisms in the wireless domain, is presented. Finally, as new attacks and defense techniques are developed, recent research trends and the overarching future outlook for AML in next-generation wireless communications are discussed.
Many-body descriptors are widely used to represent atomic environments in the construction of machine-learned interatomic potentials and more broadly for fitting, classification, and embedding tasks on atomic structures. There is a widespread belief in the community that three-body correlations are likely to provide an overcomplete description of the environment of an atom. We produce several counterexamples to this belief, with the consequence that any classifier, regression, or embedding model for atom-centered properties that uses three- (or four-) body features will incorrectly give identical results for different configurations. Writing global properties (such as total energies) as a sum of many atom-centered contributions mitigates the impact of this fundamental deficiency, which explains the success of current "machine-learning" force fields. We anticipate the issues that will arise as the desired accuracy increases, and suggest potential solutions.
Machine learning based Intrusion Detection Systems (IDS) allow flexible and efficient automated detection of cyberattacks in Internet of Things (IoT) networks. However, this has also created an additional attack vector; the machine learning models which support the IDS’s decisions may also be subject to cyberattacks known as Adversarial Machine Learning (AML). In the context of IoT, AML can be used to manipulate data and network traffic that traverse such devices. These perturbations increase the confusion in the decision boundaries of the machine learning classifier, so that malicious network packets are often misclassified as benign. Consequently, such packets bypass machine learning based detectors, which can significantly delay attack detection and lead to further consequences such as personal information leakage, damaged hardware, and financial loss. Given the impact that these attacks may have, this paper proposes a rule-based approach towards generating AML attack samples and explores how they can be used to target a range of supervised machine learning classifiers used for detecting Denial of Service (DoS) attacks in an IoT smart home network. The analysis explores which DoS packet features to perturb and how such adversarial samples can support increasing the robustness of supervised models using adversarial training. The results demonstrated that the performance of all the top performing classifiers was affected, decreasing by up to 47.2 percentage points when adversarial samples were present. Their performance improved following adversarial training, demonstrating their robustness towards such attacks.
As machine learning becomes widely used for automated decisions, attackers have strong incentives to manipulate the results and models generated by machine learning algorithms. In this paper, we perform the first systematic study of poisoning attacks and their countermeasures for linear regression models. In poisoning attacks, attackers deliberately influence the training data to manipulate the results of a predictive model. We propose a theoretically grounded optimization framework specifically designed for linear regression and demonstrate its effectiveness on a range of datasets and models. We also introduce a fast statistical attack that requires limited knowledge of the training process. Finally, we design a new principled defense method that is highly resilient against all poisoning attacks. We provide formal guarantees about its convergence and an upper bound on the effect of poisoning attacks when the defense is deployed. We extensively evaluate our attacks and defenses on three realistic datasets from the health care, loan assessment, and real estate domains.
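The poisoning setting described in this abstract can be illustrated with a toy example. The sketch below is not the optimization framework or the statistical attack from the paper; it only assumes a one-dimensional ordinary least squares fit and two hand-picked poison points (all names and values are hypothetical) to show how a few injected training points can change the fitted model.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept (1-D case)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Clean training data lying exactly on y = 2x (illustrative values).
clean_x = [0.0, 1.0, 2.0, 3.0, 4.0]
clean_y = [0.0, 2.0, 4.0, 6.0, 8.0]

# Hypothetical poison points chosen by hand, far from the clean trend.
poison_x = [10.0, 10.0]
poison_y = [0.0, 0.0]

clean_slope, _ = fit_line(clean_x, clean_y)
poisoned_slope, _ = fit_line(clean_x + poison_x, clean_y + poison_y)

print("slope on clean data:   ", clean_slope)
print("slope on poisoned data:", poisoned_slope)
```

With these values, two injected points out of seven are enough to flip the sign of the fitted slope, which is the kind of effect a defense would need to bound.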
The k-means algorithm is generally the best-known and most widely used clustering method. Various extensions of k-means have been proposed in the literature. Although it is an unsupervised approach to clustering in pattern recognition and machine learning, the k-means algorithm and its extensions are always influenced by initialization and require the number of clusters to be given a priori. That is, the k-means algorithm is not exactly an unsupervised clustering method. In this paper, we construct an unsupervised learning schema for the k-means algorithm so that it is free of initialization and parameter selection and can also simultaneously find an optimal number of clusters. That is, we propose a novel unsupervised k-means (U-k-means) clustering algorithm that automatically finds an optimal number of clusters without requiring any initialization or parameter selection. The computational complexity of the proposed U-k-means clustering algorithm is also analyzed. Comparisons between the proposed U-k-means and other existing methods are made. Experimental results and comparisons demonstrate these advantages of the proposed U-k-means clustering algorithm.
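The dependence of standard k-means on initialization, noted in this abstract, can be seen in a small example. The following is a minimal sketch of plain Lloyd-style k-means on one-dimensional data, not the proposed U-k-means algorithm; the function name `kmeans_1d`, the data, and the two sets of initial centers are assumptions chosen for illustration.

```python
def kmeans_1d(points, init_centers, max_iter=100):
    """Plain Lloyd iterations in 1-D; the number of clusters is len(init_centers)."""
    centers = [float(c) for c in init_centers]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: abs(p - centers[j]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


# Three well-separated groups (illustrative data).
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0, 20.0, 21.0, 22.0]

spread_init = kmeans_1d(points, [0.0, 10.0, 20.0])   # one seed per group
crowded_init = kmeans_1d(points, [0.0, 1.0, 2.0])    # all seeds in one group

print("centers from spread-out seeds:", spread_init)
print("centers from crowded seeds:   ", crowded_init)
```

Both runs use the same data and the same k = 3, yet the crowded seeds converge to a different, poorer solution that splits the first group and merges the other two. Note also that k itself had to be supplied by the caller in both runs.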
Robust learning, an emerging research topic in recent years, is a promising branch of advanced artificial intelligence. Robust learning models mainly target noisy and rough datasets, predominantly in situations where noise and outliers are hard to remove. In this article, the concept of robust learning is combined with complex fuzzy theory for the first time, proposing a novel neuro-fuzzy system, ENCFIS, with extensive adaptability to numerical regression problems, with or without noise. Simulation results indicate that this architecture has excellent performance on a dataset with massive (45%) label noise and on a distorted time-series dataset (25% corrupted). In addition, experimental results on a metallurgy dataset also show that the approximation performance of ENCFIS is not compromised by the increase in robustness, making it an ideal candidate for general industrial scenarios with weak noise but difficult data characteristics.
Machine learning algorithms learn a desired input-output relation from examples in order to interpret new inputs. This is important for tasks such as image and speech recognition or strategy optimisation, with growing applications in the IT industry. In the last couple of years, researchers have investigated whether quantum computing can help to improve classical machine learning algorithms. Ideas range from running computationally costly algorithms or their subroutines efficiently on a quantum computer to the translation of stochastic methods into the language of quantum theory. This contribution gives a systematic overview of the emerging field of quantum machine learning. It presents the approaches as well as technical details in an accessible way, and discusses the potential of a future theory of quantum learning.
Neurobiological heterogeneity in schizophrenia is poorly understood and confounds current analyses. We investigated neuroanatomical subtypes in a multi-institutional multi-ethnic cohort, using novel semi-supervised machine learning methods designed to discover patterns associated with disease rather than normal anatomical variation. Structural MRI and clinical measures in established schizophrenia (n = 307) and healthy controls (n = 364) were analysed across three sites of the PHENOM (Psychosis Heterogeneity Evaluated via Dimensional Neuroimaging) consortium. Regional volumetric measures of grey matter, white matter, and CSF were used to identify distinct and reproducible neuroanatomical subtypes of schizophrenia. Two distinct neuroanatomical subtypes were found. Subtype 1 showed widespread lower grey matter volumes, most prominent in thalamus, nucleus accumbens, medial temporal, medial prefrontal/frontal and insular cortices. Subtype 2 showed increased volume in the basal ganglia and internal capsule, and otherwise normal brain volumes. Grey matter volume correlated negatively with illness duration in Subtype 1 (r = -0.201, P = 0.016) but not in Subtype 2 (r = -0.045, P = 0.652), potentially indicating different underlying neuropathological processes. The subtypes did not differ in age (t = -1.603, df = 305, P = 0.109), sex (chi-square = 0.013, df = 1, P = 0.910), illness duration (t = -0.167, df = 277, P = 0.868), antipsychotic dose (t = -0.439, df = 210, P = 0.521), age of illness onset (t = -1.355, df = 277, P = 0.177), positive symptoms (t = 0.249, df = 289, P = 0.803), negative symptoms (t = 0.151, df = 289, P = 0.879), or antipsychotic type (chi-square = 6.670, df = 3, P = 0.083). Subtype 1 had lower educational attainment than Subtype 2 (chi-square = 6.389, df = 2, P = 0.041). In conclusion, we discovered two distinct and highly reproducible neuroanatomical subtypes. Subtype 1 displayed widespread volume reduction correlating with illness duration, and worse premorbid functioning. Subtype 2 had normal and stable anatomy, except for larger basal ganglia and internal capsule, not explained by antipsychotic dose. These subtypes challenge the notion that brain volume loss is a general feature of schizophrenia and suggest differential aetiologies. They can facilitate strategies for clinical trial enrichment and stratification, and precision diagnostics.