Multi-label learning deals with data examples which are associated with multiple class labels simultaneously. Despite the success of existing approaches to multi-label learning, there is still a problem neglected by researchers, i.e., not only are some of the values of observed labels missing, but also some of the labels are completely unobserved for the training data. We refer to this problem as multi-label learning with missing and completely unobserved labels, and argue that it is necessary to discover these completely unobserved labels in order to mine useful knowledge and gain a deeper understanding of what lies behind the data. In this paper, we propose a new approach named MCUL to solve multi-label learning with Missing and Completely Unobserved Labels. We discover the unobserved labels of a multi-label data set with a clustering-based regularization term, describe their semantic meanings based on the label-specific features learned by MCUL, and overcome the problem of missing labels by exploiting label correlations. The proposed method MCUL can predict both the observed and newly discovered labels simultaneously for unseen data examples. Experimental results on ten benchmark datasets demonstrate that the proposed method outperforms other state-of-the-art approaches on the observed labels and achieves acceptable performance on the newly discovered labels as well.
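To make the high-level recipe above concrete, here is a minimal, hypothetical Python sketch of two of its ingredients: training a multi-label predictor only on the observed label entries (missing values masked out of the loss), and using instance clustering as a stand-in for the clustering-based regularization that hypothesizes completely unobserved labels. All names, sizes, and hyper-parameters are assumptions, not MCUL's actual formulation.

```python
# Minimal sketch (assumed setup, not the MCUL algorithm):
# (1) masked logistic regression over observed label entries only, and
# (2) K-means clusters of the instances as pseudo "completely unobserved" labels.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))                  # instances x features
Y = (rng.random((200, 5)) < 0.3).astype(float)  # observed label matrix
M = rng.random((200, 5)) < 0.8                  # mask: True where a label value is observed

W = np.zeros((20, 5))                           # label-specific weights
lr, lam = 0.1, 1e-2
for _ in range(300):                            # masked logistic regression
    P = 1.0 / (1.0 + np.exp(-X @ W))
    G = X.T @ ((P - Y) * M) / len(X) + lam * W  # gradient only over observed entries
    W -= lr * G

# Hypothesize k completely unobserved labels from instance clusters.
k = 3
clusters = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
Y_new = np.eye(k)[clusters]                     # pseudo-labels for the discovered labels
print("observed-label weights:", W.shape, "discovered pseudo-labels:", Y_new.shape)
```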
The detection of network changes over time is based on identifying deviations of the network structure. The challenge mainly lies in designing a good summary, or descriptor, of the network structure for facilitating the measurement of deviations. In particular, a network may have a huge number of nodes and edges. Moreover, there can exist complicated dependencies among edges, e.g., the existence of some edges may be due to others. Therefore, it is non-trivial to measure the contribution of each node and each edge to the deviation of the entire network structure. Existing descriptors are designed with far fewer factors than the number of nodes and edges, and can only partially model edge dependencies. In this paper, we propose a novel type of descriptor. We first obtain node coordinates, or positions, in a latent space via network embedding, such that nodes connected by edges have close positions. Node positions are low-dimensional and, more importantly, can fully model edge dependencies. We then design the descriptor based on random walks over the node positions. We conducted extensive experiments on synthetic datasets and three real-world datasets to demonstrate the effectiveness of our proposed change detection framework with the descriptor.
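As a rough illustration of the descriptor idea, and not the paper's exact construction, the sketch below obtains low-dimensional node positions with a simple spectral embedding of a reference snapshot, summarizes a uniform random walk over those positions into a descriptor vector, and scores change as the distance between the descriptors of two snapshots. The embedding, the descriptor (expected walker position per step), and the distance are all illustrative assumptions.

```python
# Hypothetical illustration only: positions come from a spectral embedding of the
# reference snapshot and are reused for both snapshots.
import numpy as np

def spectral_positions(A, dim=2):
    """Low-dimensional node positions: connected nodes get close coordinates."""
    L = np.diag(A.sum(axis=1)) - A             # graph Laplacian
    _, vecs = np.linalg.eigh(L)
    return vecs[:, 1:dim + 1]                  # skip the trivial constant eigenvector

def walk_descriptor(A, pos, steps=10):
    """Concatenate the expected walker position after each random-walk step."""
    P = A / np.maximum(A.sum(axis=1, keepdims=True), 1e-12)   # row-stochastic transitions
    p = np.full(len(A), 1.0 / len(A))                          # uniform start distribution
    desc = []
    for _ in range(steps):
        p = p @ P
        desc.append(p @ pos)                                   # expected walker position
    return np.concatenate(desc)

rng = np.random.default_rng(1)
A1 = np.triu((rng.random((30, 30)) < 0.1).astype(float), 1)
A1 += A1.T                                                     # reference snapshot
A2 = A1.copy(); A2[0, 1] = A2[1, 0] = 1.0                      # snapshot with one new edge

pos = spectral_positions(A1)                                   # shared node positions
score = np.linalg.norm(walk_descriptor(A1, pos) - walk_descriptor(A2, pos))
print("change score:", score)
```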
To predict the visual field (VF) of glaucoma patients within the central 10° from optical coherence tomography (OCT) measurements using deep learning and tensor regression.
Cross-sectional study.
Humphrey 10-2 VFs and OCT measurements were carried out in 505 eyes of 304 glaucoma patients and 86 eyes of 43 normal subjects. VF sensitivity at each test point was predicted from the OCT-measured thicknesses of the macular ganglion cell layer + inner plexiform layer, retinal nerve fiber layer, and outer segment + retinal pigment epithelium. Two convolutional neural network (CNN) models were generated: (1) CNN-PR, which simply connects the output of the CNN to each VF test point; and (2) CNN-TR, which connects the output of the CNN to each VF test point using tensor regression. Prediction performance was assessed with 5-fold cross-validation using the root mean squared error (RMSE). For comparison, RMSE values were also calculated using multiple linear regression (MLR) and support vector regression (SVR). In addition, the absolute prediction error for predicting mean sensitivity in the whole VF was analyzed.
RMSE with the CNN-TR model averaged 6.32 ± 3.76 (mean ± standard deviation) dB. Significantly (P < .05) larger RMSEs were obtained with other models: CNN-PR (6.76 ± 3.86 dB), SVR (7.18 ± 3.87 dB), and MLR (8.56 ± 3.69 dB). The absolute mean prediction error for the whole VF was 2.72 ± 2.60 dB with the CNN-TR model.
The Humphrey 10-2 VF can be predicted from OCT-measured retinal layer thicknesses using deep learning and tensor regression.
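As a purely illustrative companion to the abstract above, the following hypothetical PyTorch sketch contrasts a plain fully connected output (in the spirit of CNN-PR) with a low-rank, tensor-regression-style output (in the spirit of CNN-TR) on top of a small CNN over three OCT thickness maps. The 68 outputs correspond to the 10-2 test points; the input size, channel counts, and rank are assumptions, not the study's actual architecture.

```python
# Hypothetical sketch of the two output schemes (assumed sizes, not the published model).
import torch
import torch.nn as nn

class VFPredictor(nn.Module):
    def __init__(self, n_points=68, rank=8, tensor_regression=True):
        super().__init__()
        self.backbone = nn.Sequential(             # 3 OCT layer-thickness maps in
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        feat = 32 * 4 * 4
        if tensor_regression:
            # Low-rank factorized regression head: feat -> rank -> n_points
            self.head = nn.Sequential(nn.Linear(feat, rank, bias=False),
                                      nn.Linear(rank, n_points))
        else:
            self.head = nn.Linear(feat, n_points)   # plain pointwise regression head

    def forward(self, x):
        h = self.backbone(x).flatten(1)
        return self.head(h)                         # predicted dB sensitivity per test point

oct_maps = torch.randn(2, 3, 64, 64)                # toy batch of thickness maps
print(VFPredictor(tensor_regression=True)(oct_maps).shape)  # torch.Size([2, 68])
```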
We constructed a multitask learning model (latent space linear regression and deep learning; LSLR-DL) in which the two tasks of cross-sectional prediction of the visual field (VF; central 10°) from OCT and longitudinal prediction of VF (30°) progression were performed jointly by sharing the deep learning (DL) component, such that information from both tasks was used in an auxiliary manner (Association for Computing Machinery Special Interest Group on Knowledge Discovery and Data Mining [SIGKDD] 2021). The purpose of the current study was to investigate the prediction accuracy using an independently prepared validation dataset.
Cohort study.
Cross-sectional training and testing data sets included the VF (Humphrey Field Analyzer [HFA] 10-2 test) and an OCT measurement (obtained within 6 months) from 591 eyes of 351 healthy people or patients with open-angle glaucoma (OAG) and from 155 eyes of 131 patients with OAG, respectively. Longitudinal training and testing data sets included 7984 VF results (HFA 24-2 test) from 998 eyes of 592 patients with OAG and 1184 VF results (HFA 24-2 test) from 148 eyes of 84 patients with OAG, respectively. Each eye had 8 VF test results (HFA 24-2 test). The OCT sequences within the observation period were used.
Root mean square error (RMSE) was used to evaluate the accuracy of LSLR-DL for the cross-sectional prediction of VF (HFA 10-2 test). For the longitudinal prediction, the final (eighth) VF test (HFA 24-2 test) was predicted using a shorter VF series and relevant OCT images, and the RMSE was calculated. For comparison, RMSE values were calculated by applying the DL component (cross-sectional prediction) and the ordinary pointwise linear regression (longitudinal prediction).
Root mean square error in the cross-sectional and longitudinal predictions.
Using LSLR-DL, the mean RMSE in the cross-sectional prediction was 6.4 dB; in the longitudinal prediction, it ranged from 4.4 dB (using VF tests 1 and 2) to 3.7 dB (using VF tests 1–7), with LSLR-DL significantly outperforming the other methods.
The results of this study indicate that LSLR-DL is useful for both the cross-sectional prediction of VF (HFA 10-2 test) and the longitudinal progression prediction of VF (HFA 24-2 test).
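The multitask structure described above can be sketched, very loosely, as a shared deep-learning encoder feeding two branches: a cross-sectional head that predicts the 10-2 VF from a single OCT scan, and a longitudinal branch that fits a linear trend in the shared latent space over an OCT sequence and decodes the extrapolated latent vector into a future 24-2 VF. The hypothetical PyTorch code below is such a sketch; all sizes (including the 52 assumed 24-2 points) and the least-squares trend fit are assumptions, not the published LSLR-DL model.

```python
# Hypothetical multitask sketch: shared encoder, cross-sectional head, and a
# latent-space linear-trend branch for longitudinal prediction (assumed sizes).
import torch
import torch.nn as nn

class LSLRDLSketch(nn.Module):
    def __init__(self, latent=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                                     nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                     nn.Linear(16, latent))
        self.head_10_2 = nn.Linear(latent, 68)   # cross-sectional task (10-2 points)
        self.head_24_2 = nn.Linear(latent, 52)   # longitudinal task (assumed 24-2 points)

    def cross_sectional(self, oct_img):
        return self.head_10_2(self.encoder(oct_img))

    def longitudinal(self, oct_seq, times, t_future):
        """oct_seq: (T, 3, H, W); fit z(t) = a + b*t by least squares, then decode."""
        Z = self.encoder(oct_seq)                              # (T, latent)
        D = torch.stack([torch.ones_like(times), times], 1)    # design matrix (T, 2)
        coef = torch.linalg.lstsq(D, Z).solution               # (2, latent) trend
        z_future = coef[0] + coef[1] * t_future                # extrapolated latent vector
        return self.head_24_2(z_future)

model = LSLRDLSketch()
print(model.cross_sectional(torch.randn(1, 3, 64, 64)).shape)          # torch.Size([1, 68])
print(model.longitudinal(torch.randn(5, 3, 64, 64),
                         torch.arange(5.0), torch.tensor(7.0)).shape)  # torch.Size([52])
```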
Network embedding has been widely employed in networked data mining applications as it can learn low-dimensional and dense node representations from the high-dimensional and sparse network structure. While most existing network embedding methods only model the proximity between two nodes, regardless of the order of the proximity, this paper proposes to explicitly model multi-node proximities, which are widely observed in practice, e.g., multiple researchers coauthor a paper, and multiple genes co-express a protein. Explicitly modeling multi-node proximities is important because some two-node interactions may not come into existence without a third node. By proving that LINE(1st), a recent network embedding method, is equivalent to kernelized matrix factorization, this paper proposes coupled kernelized multi-dimensional array factorization (Cetera), which jointly factorizes multiple multi-dimensional arrays by enforcing a consensus representation for each node. In this way, node representations become more comprehensive and effective, which is demonstrated on three real-world networks through link prediction and multi-label classification.
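A toy version of the "coupled factorization with a consensus representation" idea, not the Cetera model itself, can be written as jointly factorizing a pairwise co-occurrence matrix and a three-way co-occurrence tensor while sharing a single embedding matrix, as in the hypothetical sketch below. The losses, data, and optimizer are illustrative assumptions.

```python
# Hypothetical toy: joint 2-way and 3-way factorization with one shared embedding U.
import torch

n, d = 20, 4
torch.manual_seed(0)
A = torch.rand(n, n); A = (A + A.T) / 2                  # pairwise co-occurrences
T = torch.rand(n, n, n)                                   # triple co-occurrences
U = torch.randn(n, d, requires_grad=True)                 # shared node embeddings

opt = torch.optim.Adam([U], lr=0.05)
for step in range(300):
    opt.zero_grad()
    loss_pair = ((U @ U.T - A) ** 2).mean()               # 2-way reconstruction error
    recon3 = torch.einsum('ir,jr,kr->ijk', U, U, U)       # rank-d CP reconstruction
    loss_triple = ((recon3 - T) ** 2).mean()              # 3-way reconstruction error
    loss = loss_pair + loss_triple                        # consensus: same U in both terms
    loss.backward()
    opt.step()
print("final joint loss:", float(loss))
```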
Multi-label image classification is a fundamental and practical task, which aims to assign multiple possible labels to an image. In recent years, many deep convolutional neural network (CNN) based approaches have been proposed which model label correlations to discover the semantics of labels and learn semantic representations of images. This paper advances this research direction by improving both the modeling of label correlations and the learning of semantic representations. On the one hand, besides the local semantics of each label, we propose to further explore global semantics shared by multiple labels. On the other hand, existing approaches mainly learn semantic representations at the last convolutional layer of a CNN, but it has been noted that the image representations at different layers of a CNN capture different levels or scales of features and have different discriminative abilities. We thus propose to learn semantic representations at multiple convolutional layers. To this end, this paper designs a Multi-layered Semantic Representation Network (MSRN), which discovers both local and global semantics of labels through modeling label correlations and uses the label semantics to guide semantic representation learning at multiple layers through an attention mechanism. Extensive experiments on five benchmark datasets, including VOC2007, VOC2012, MS-COCO, NUS-WIDE, and Apparel, show the competitive performance of the proposed MSRN against state-of-the-art models.
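As a hypothetical sketch of label-guided attention applied to more than one convolutional layer, loosely in the spirit of MSRN but not the authors' architecture, the code below lets each label embedding attend over the spatial features of two CNN stages and fuses the attended features into per-label scores. All sizes and the fusion scheme are assumptions.

```python
# Hypothetical sketch: label embeddings attend over two convolutional stages.
import torch
import torch.nn as nn

class MultiLayerLabelAttention(nn.Module):
    def __init__(self, n_labels=20, dim=32):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, dim, 3, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.MaxPool2d(2),
                                    nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU())
        self.label_emb = nn.Parameter(torch.randn(n_labels, dim))   # label semantics
        self.cls = nn.Linear(2 * dim, 1)                             # per-label score

    def attend(self, fmap):
        f = fmap.flatten(2).transpose(1, 2)                 # (B, HW, C) spatial features
        att = torch.softmax(f @ self.label_emb.T, dim=1)    # each label attends over locations
        return torch.einsum('bpl,bpc->blc', att, f)          # (B, L, C) label-specific features

    def forward(self, x):
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        sem = torch.cat([self.attend(f1), self.attend(f2)], dim=-1)  # fuse both layers
        return self.cls(sem).squeeze(-1)                              # (B, L) logits

print(MultiLayerLabelAttention()(torch.randn(2, 3, 32, 32)).shape)    # torch.Size([2, 20])
```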
Generative Adversarial Network (GAN) has been widely used to generate impressively plausible data. However, it is a non-trivial task to train the original GAN model in practice due to the vanishing gradient problem. This is because the JS divergence can be a constant (i.e., log 2) when the original data distribution and the generated data distribution have a negligible overlapping area. Under such a scenario, the gradient of the generator is 0. Most efforts have been devoted to designing a more suitable difference measure, while little attention has been paid to the overlap issue itself.
In this paper, we propose a new method to design a noise distribution with a guaranteed non-negligible overlapping area with the raw data distribution. The key idea is to transform the noise from the randomized space into the raw data space. We propose to use the basis matrix of a non-negative matrix factorization as the transformation, because the basis matrix captures the underlying features of the raw data. The proposed idea is instantiated as Sketch-then-Edit GAN (SEGAN), where the transformed noises are called sketches because they contain basic features of the raw data. Moreover, a new generator is designed to edit the sketches into realistic-looking data. We mathematically prove that SEGAN solves the gradient vanishing problem, and conduct extensive experiments on the MNIST, CIFAR10, SVHN and CelebA datasets to demonstrate the effectiveness of SEGAN.
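A minimal, hypothetical illustration of the sketch-construction step (not SEGAN's training procedure): non-negative matrix factorization of the training data yields a basis matrix whose rows capture data features, random non-negative codes are mapped through this basis into the data space to form "sketches", and a separate editing generator (omitted here) would then refine them. Data, component count, and the noise distribution are assumptions.

```python
# Hypothetical sketch-construction step; the editor network and GAN losses are omitted.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
data = rng.random((500, 64))                       # e.g. flattened 8x8 images in [0, 1]

nmf = NMF(n_components=16, init='nndsvda', max_iter=500, random_state=0)
codes_data = nmf.fit_transform(data)               # per-sample codes (used only for scale)
basis = nmf.components_                            # (16, 64) basis with raw-data features

codes = rng.exponential(scale=codes_data.mean(), size=(10, 16))  # random non-negative noise
sketches = codes @ basis                           # noise mapped into the raw data space
print("sketch batch:", sketches.shape)             # (10, 64): inputs for the editing generator
```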
Generative Adversarial Network (GAN) is a thriving generative model, and considerable efforts have been made to enhance its generation capabilities by designing a different adversarial framework (e.g., the discriminator and the generator) or redesigning the penalty function. Although existing models have been demonstrated to be very effective, their generation capabilities have limitations. Existing GAN variants either produce identical generated instances or generate low-quality simulation data when the training data are diverse and extremely limited (a dataset consists of a set of classes, but each class holds only several or even a single sample) or extremely imbalanced (one category holds a set of samples while the other categories hold a single sample each). In this paper, we present an innovative approach to tackle this issue, which employs a joint distribution together with the reparameterization method to reparameterize the randomized space as a mixture model and to learn the parameters of this mixture model along with those of the GAN; we therefore term our approach Joint Distribution GAN (JDGAN). In our work, we show that JDGAN can not only generate high-quality simulation data with diversity, but also increase the overlapping area between the generating distribution and the raw data distribution. We conduct extensive experiments on the MNIST, CIFAR10 and Mass Spectrometry datasets, all using extremely limited amounts of data, to demonstrate that JDGAN both achieves the smallest Fréchet Inception Distance (FID) score and produces diverse generated data.
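As a hypothetical sketch of the latent-space idea, not the JDGAN training algorithm, the code below draws the generator's noise from a learnable Gaussian mixture via the reparameterization trick, so the mixture parameters could be optimized jointly with the generator; the component count, dimensions, and the toy generator are assumptions.

```python
# Hypothetical sketch: learnable Gaussian-mixture noise for a GAN generator.
import torch
import torch.nn as nn

class MixtureLatent(nn.Module):
    def __init__(self, n_components=5, dim=16):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_components))        # mixture weights
        self.means = nn.Parameter(torch.randn(n_components, dim))
        self.log_var = nn.Parameter(torch.zeros(n_components, dim))

    def sample(self, batch):
        comp = torch.multinomial(torch.softmax(self.logits, 0), batch, replacement=True)
        mu = self.means[comp]
        sigma = torch.exp(0.5 * self.log_var[comp])
        return mu + sigma * torch.randn_like(mu)                      # reparameterized draw

latent = MixtureLatent()
generator = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784), nn.Tanh())
z = latent.sample(8)                       # (8, 16) noise from the learned mixture
print(generator(z).shape)                  # torch.Size([8, 784]) fake flattened images
# latent.parameters() and generator.parameters() would go into one optimizer during training.
```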
Heterogeneous information network (HIN) embedding aims to encode network structure into node representations while taking the heterogeneous semantics of different node and edge types into account. However, since each HIN may have a unique nature, e.g., a unique set of node and edge types, a model designed for one type of network may not be applicable to, or effective on, another type. In this article, we thus propose a framework for HINs with an arbitrary number of node and edge types. The proposed framework constructs a novel mixture-split representation of an HIN and is hence named MixSp. The mixture sub-representation and the split sub-representation serve as two different views of the network. Compared with existing models, which only learn from the original view, MixSp may thus exploit more comprehensive information. Node representations in each view are learned by embedding the respective network structure. Moreover, the node representations are further refined through cross-view co-regularization. The framework is instantiated in three models which differ from each other in the co-regularization. Extensive experiments on three real-world datasets show that MixSp outperforms several recent models in both node classification and link prediction tasks, even though MixSp is not designed for a particular type of HIN.
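A toy, hypothetical rendering of cross-view co-regularization (an illustrative simplification, not MixSp itself): each view keeps its own node embeddings trained to reconstruct that view's adjacency, and an extra penalty keeps the two views' representations of each node close. Adjacencies, losses, and weights below are assumptions.

```python
# Hypothetical toy of cross-view co-regularization between two views of one network.
import torch

n, d = 30, 8
torch.manual_seed(0)
A_mix = (torch.rand(n, n) < 0.1).float()          # "mixture" view adjacency
A_split = (torch.rand(n, n) < 0.1).float()        # "split" view adjacency
U = torch.randn(n, d, requires_grad=True)         # mixture-view node embeddings
V = torch.randn(n, d, requires_grad=True)         # split-view node embeddings

opt = torch.optim.Adam([U, V], lr=0.05)
for _ in range(200):
    opt.zero_grad()
    rec_mix = torch.nn.functional.binary_cross_entropy_with_logits(U @ U.T, A_mix)
    rec_split = torch.nn.functional.binary_cross_entropy_with_logits(V @ V.T, A_split)
    co_reg = ((U - V) ** 2).mean()                # cross-view co-regularization term
    loss = rec_mix + rec_split + 0.5 * co_reg
    loss.backward()
    opt.step()
print("final loss:", float(loss))
```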