A Survey on Deep Semi-Supervised Learning
Yang, Xiangli; Song, Zixing; King, Irwin ...
IEEE Transactions on Knowledge and Data Engineering, 1 September 2023, Volume 35, Issue 9
Journal Article
Peer-reviewed
Open access
Deep semi-supervised learning is a fast-growing field with a range of practical applications. This paper provides a comprehensive survey of both the fundamentals and recent advances in deep semi-supervised learning methods from the perspectives of model design and unsupervised loss functions. We first present a taxonomy for deep semi-supervised learning that categorizes existing methods, including deep generative methods, consistency regularization methods, graph-based methods, pseudo-labeling methods, and hybrid methods. Then we provide a comprehensive review of 60 representative methods and offer a detailed comparison in terms of loss types, architectural differences, and test performance. In addition to the progress of the past few years, we further discuss some shortcomings of existing methods and provide tentative heuristic solutions for these open problems.
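The survey's taxonomy includes pseudo-labeling methods, which share a common confidence-thresholding step: only unlabeled predictions the model is sufficiently sure about are converted into training targets. A minimal sketch of that step (the function name and threshold value are illustrative, not from the survey):

```python
import numpy as np

def pseudo_label_mask(probs: np.ndarray, threshold: float = 0.95):
    """Select confident unlabeled predictions for pseudo-labeling.

    probs: (N, C) softmax outputs on unlabeled data.
    Returns (labels, mask): hard labels and a boolean mask marking
    predictions whose confidence exceeds the threshold.
    """
    confidence = probs.max(axis=1)   # highest class probability per sample
    labels = probs.argmax(axis=1)    # hard pseudo-label per sample
    mask = confidence >= threshold   # keep only confident predictions
    return labels, mask

probs = np.array([
    [0.97, 0.02, 0.01],   # confident -> kept as pseudo-label 0
    [0.40, 0.35, 0.25],   # uncertain -> discarded
])
labels, mask = pseudo_label_mask(probs)
# labels -> [0, 0], mask -> [True, False]
```

The masked examples are then mixed into the labeled set (or weighted in the loss) for the next training round.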
Hyperspectral image (HSI) classification is an active research topic in remote sensing. Supervised learning-based methods have been widely used in HSI classification tasks due to their powerful feature extraction capabilities when labeled samples are sufficient. However, practical applications often have limited samples with accurate labels due to the high cost of labeling or unreliable visual interpretation. We introduce a contrastive self-supervised learning (SSL) algorithm to achieve HSI classification with few labeled samples. First, a new HSI-specific augmentation module is developed to generate sample pairs. Then, a contrastive SSL model based on Siamese networks is used to extract features from these easily accessible sample pairs. Finally, the labeled samples are used to fine-tune the parameters of the classification model to boost classification performance. The contrastive self-supervised algorithm has been tested on two widely used HSI datasets. The experimental results reveal that the proposed algorithm needs only a few labeled samples to achieve superior performance.
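As a sketch of the contrastive objective typically used with such Siamese augmented-sample pairs, here is an InfoNCE-style loss (the paper's exact loss and architecture may differ; this is the standard formulation, with row i of each view forming a positive pair):

```python
import numpy as np

def info_nce_loss(z1: np.ndarray, z2: np.ndarray, temperature: float = 0.5) -> float:
    """Contrastive (InfoNCE-style) loss over a batch of augmented pairs.

    z1, z2: (N, D) embeddings of two augmented views of N samples.
    Row i of z1 and row i of z2 are a positive pair; the other rows
    in the batch act as negatives.
    """
    # L2-normalize so dot products are cosine similarities
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature              # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # positives lie on the diagonal; minimize their negative log-probability
    return float(-np.mean(np.diag(log_probs)))
```

Matched views (positives aligned on the diagonal) should yield a lower loss than shuffled views, which is the signal that pulls augmented pairs together in embedding space.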
A Review of Research on Co-Training
Ning, Xin; Wang, Xinran; Xu, Shaohui ...
Concurrency and Computation, 15 August 2023, Volume 35, Issue 18
Journal Article
Peer-reviewed
Summary
The co-training algorithm is one of the main methods of semi-supervised learning in machine learning; it exploits the information in unlabeled data through multi-learner collaboration. This article summarizes recent research building on the development of the co-training algorithm. In particular, three main steps of co-training algorithms are introduced: view acquisition, learner differentiation, and label confidence estimation. Finally, we summarize the problems in current co-training methods, give some suggestions for improvement, and look ahead to future directions for the co-training algorithm.
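The multi-learner loop described above can be sketched as follows. The toy nearest-centroid learner, the confidence threshold, and the assumption that class labels are 0..C-1 are illustrative stand-ins, not from the article; the structure (two views, per-view learners, confident pseudo-labels fed back into the shared labeled set) is the co-training pattern itself:

```python
import numpy as np

class CentroidLearner:
    """Toy per-view learner: nearest class centroid (illustrative only)."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict_proba(self, X):
        # Inverse-distance scores, normalized to pseudo-probabilities
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None], axis=2)
        inv = 1.0 / (d + 1e-9)
        return inv / inv.sum(axis=1, keepdims=True)

def co_train(X1, X2, y, X1_u, X2_u, rounds=3, threshold=0.8):
    """Minimal co-training loop over two feature views.

    Each round, each learner scores the unlabeled pool from its own
    view; examples labeled confidently by either learner are moved
    into the shared labeled set for the next round.
    Assumes classes are integers 0..C-1 (toy simplification).
    """
    for _ in range(rounds):
        h1 = CentroidLearner().fit(X1, y)
        h2 = CentroidLearner().fit(X2, y)
        p1, p2 = h1.predict_proba(X1_u), h2.predict_proba(X2_u)
        conf = np.maximum(p1.max(axis=1), p2.max(axis=1))
        pick = conf >= threshold
        if not pick.any():
            break
        # Take the label from whichever learner is more confident
        yhat = np.where(p1.max(axis=1) >= p2.max(axis=1),
                        p1.argmax(axis=1), p2.argmax(axis=1))
        X1 = np.vstack([X1, X1_u[pick]])
        X2 = np.vstack([X2, X2_u[pick]])
        y = np.concatenate([y, yhat[pick]])
        X1_u, X2_u = X1_u[~pick], X2_u[~pick]
    return h1, h2
```

Real co-training variants differ mainly in the three steps the article names: how the views are obtained, how learner diversity is maintained, and how label confidence is estimated.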
Self-supervised co-localization aims to localize common objects in a data set containing only one superclass, without using human-annotated labels. Existing methods achieve impressive results by employing self-supervised pretext learning. However, a common limitation remains: they either tend to overextend activations to the background or tend to activate only the most discriminative object part. To alleviate this problem, we propose an object representation enhancement model that weakens background distraction and mines complementary object regions during object representation learning. Specifically, we first propose an Object-aware Representation Enhancement (ORE) module to estimate an object mask for each input image, guiding the model to disregard the background content and focus on the foreground object. The ORE module and the subsequent self-supervised learning mutually reinforce each other. We then propose a Masked Self-supervised Learning branch and design a masked attention consistency objective to induce the model to activate complementary parts of the object effectively. Extensive experiments on four fine-grained data sets demonstrate the superiority of the proposed model.
Recently, supervised deep learning has achieved great success in remote sensing image (RSI) semantic segmentation. However, supervised learning for semantic segmentation requires a large number of labeled samples, which are difficult to obtain in the remote sensing field. A new learning paradigm, self-supervised learning (SSL), can address this problem by pretraining a general model on a large number of unlabeled images and then fine-tuning it on a downstream task with very few labeled samples. Contrastive learning is a typical SSL method that can learn general invariant features. However, most existing contrastive learning methods are designed for classification tasks and produce an image-level representation, which may be suboptimal for semantic segmentation tasks that require pixel-level discrimination. Therefore, we propose a global style and local matching contrastive learning network (GLCNet) for RSI semantic segmentation. Specifically, the global style contrastive learning module is used to better learn an image-level representation, since we consider that style features can better represent the overall features of an image. Then, the local matching contrastive learning module is designed to learn representations of local regions, which is beneficial for semantic segmentation. We evaluate our method on four RSI semantic segmentation datasets, and the experimental results show that it mostly outperforms state-of-the-art self-supervised methods and the ImageNet pretraining method. Specifically, with 1% annotation of the original dataset, our approach improves Kappa by 6% on the International Society for Photogrammetry and Remote Sensing (ISPRS) Potsdam dataset relative to the existing baseline. Moreover, our method outperforms supervised learning methods when there are differences between the datasets of the upstream and downstream tasks. Our study promotes the development of SSL in the field of RSI semantic segmentation. Since SSL can directly learn the essential characteristics of data from unlabeled data, which are easy to obtain in the remote sensing field, it may be of great significance for tasks such as global mapping. The source code is available at https://github.com/GeoX-Lab/G-RSIM .
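The abstract's "style features" are not specified in detail here; one common formulation in the style-transfer literature represents the style of a feature map by its channel-wise statistics. A sketch under that assumption (illustrative, not necessarily GLCNet's exact design):

```python
import numpy as np

def style_vector(feature_map: np.ndarray) -> np.ndarray:
    """Channel-wise mean and std of a (C, H, W) feature map,
    concatenated into a single 2C-dimensional style descriptor.
    Spatial layout is discarded, so the descriptor captures the
    overall "look" of the image rather than where objects are.
    """
    c = feature_map.shape[0]
    flat = feature_map.reshape(c, -1)          # (C, H*W)
    return np.concatenate([flat.mean(axis=1), flat.std(axis=1)])

fm = np.zeros((2, 4, 4))
fm[1] = 3.0  # constant channel: mean 3, std 0
# style_vector(fm) -> [0., 3., 0., 0.]
```

Such descriptors from two augmented views of the same image would then be pulled together by a contrastive loss at the image level, while a separate local module handles pixel-level discrimination.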
Abstract
Supervised learning techniques construct predictive models by learning from a large number of training examples, where each training example has a label indicating its ground-truth output. Though current techniques have achieved great success, it is noteworthy that in many tasks it is difficult to get strong supervision information like fully ground-truth labels due to the high cost of the data-labeling process. Thus, it is desirable for machine-learning techniques to work with weak supervision. This article reviews some research progress of weakly supervised learning, focusing on three typical types of weak supervision: incomplete supervision, where only a subset of training data is given with labels; inexact supervision, where the training data are given with only coarse-grained labels; and inaccurate supervision, where the given labels are not always ground-truth.
Self-supervised learning (SSL) has achieved great success in speech recognition, while limited exploration has been attempted for other speech processing tasks. As the speech signal contains multi-faceted information, including speaker identity, paralinguistics, and spoken content, learning universal representations for all speech tasks is challenging. To tackle this problem, we propose a new pre-trained model, WavLM, to solve full-stack downstream speech tasks. WavLM jointly learns masked speech prediction and denoising in pre-training. By this means, WavLM not only keeps the speech content modeling capability through masked speech prediction, but also improves its potential on non-ASR tasks through speech denoising. In addition, WavLM employs gated relative position bias in the Transformer structure to better capture the sequence ordering of input speech. We also scale up the training dataset from 60k hours to 94k hours. WavLM Large achieves state-of-the-art performance on the SUPERB benchmark and brings significant improvements for various speech processing tasks on their representative benchmarks.
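Masked speech prediction starts by hiding a subset of input frames and training the model to predict what was hidden. A minimal sketch of that masking step (the frame-wise Bernoulli masking and mask probability here are illustrative simplifications; WavLM's actual masking operates on spans of frames):

```python
import numpy as np

def masked_prediction_targets(frames: np.ndarray, mask_prob: float = 0.15, rng=None):
    """Choose frame positions to mask for masked speech prediction.

    frames: (T, D) sequence of speech features.
    Returns (masked_frames, mask): the input with masked frames zeroed
    out, and the boolean mask marking the positions the model must
    reconstruct (the loss is computed only at masked positions).
    """
    if rng is None:
        rng = np.random.default_rng(0)
    mask = rng.random(frames.shape[0]) < mask_prob  # per-frame Bernoulli draw
    masked = frames.copy()
    masked[mask] = 0.0                              # hide the selected frames
    return masked, mask
```

In WavLM's pre-training the same prediction objective is also applied to inputs mixed with noise or overlapping speech, which is what adds the denoising capability on top of content modeling.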
Semi-supervised learning (SSL) suffers from severe performance degradation when labeled and unlabeled data come from inconsistent and imbalanced distributions. Nonetheless, there is a lack of theoretical guidance on remedying this issue. To bridge the gap between theoretical insights and practical solutions, we embark on an analysis of the generalization bound of classic SSL algorithms. This analysis reveals that distribution inconsistency between unlabeled and labeled data can cause a significant generalization error bound. Motivated by this theoretical insight, we present a Triplet Adaptation Framework (TAF) to reduce the distribution divergence and improve the generalization of SSL models. TAF comprises three adapters: the Balanced Residual Adapter, which maps the class distributions of labeled and unlabeled data to a uniform distribution to reduce class distribution divergence; the Representation Adapter, which maps the representation distribution of unlabeled data to that of labeled data to reduce representation distribution divergence; and the Pseudo-Label Adapter, which aligns the predicted pseudo-labels with the class distribution of the unlabeled data, preventing erroneous pseudo-labels from exacerbating representation divergence. These three adapters collaborate synergistically to reduce the generalization bound, ultimately yielding a more robust and generalizable SSL model. Extensive experiments across various robust SSL scenarios validate the efficacy of our method.
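A common distribution-alignment heuristic captures the spirit of aligning pseudo-labels with an estimated class distribution, as the Pseudo-Label Adapter does (the paper's adapter is more elaborate; this sketch only shows the reweight-and-renormalize idea, and all names are illustrative):

```python
import numpy as np

def align_pseudo_labels(probs: np.ndarray,
                        target_dist: np.ndarray,
                        model_dist: np.ndarray) -> np.ndarray:
    """Align batch pseudo-label probabilities with a target class distribution.

    probs:       (N, C) predicted probabilities on unlabeled data
    target_dist: (C,)   estimated class distribution of the unlabeled data
    model_dist:  (C,)   running average of the model's own predictions

    Scaling by target/model up-weights classes the model under-predicts
    relative to the target, then renormalizes each row to sum to 1.
    """
    aligned = probs * (target_dist / (model_dist + 1e-9))
    return aligned / aligned.sum(axis=1, keepdims=True)
```

For example, a maximally uncertain prediction [0.5, 0.5] under a skewed target distribution [0.9, 0.1] is pushed toward the target, which discourages pseudo-labels from drifting away from the unlabeled data's class balance.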
As an emerging and challenging problem in the computer vision community, weakly supervised object localization and detection plays an important role in developing new-generation computer vision systems and has received significant attention in the past decade. As many methods have been proposed, a comprehensive survey of these topics is of great importance. In this work, we review (1) classic models, (2) approaches using feature representations from off-the-shelf deep networks, (3) approaches based solely on deep learning, and (4) publicly available datasets and standard evaluation metrics widely used in this field. We also discuss the key challenges of the field, its development history, the advantages and disadvantages of the methods in each category, the relationships between methods in different categories, applications of weakly supervised object localization and detection methods, and potential future directions to further promote the development of this research field.