Deep neural network (DNN) architecture based models have high expressive power and learning capacity. However, they are essentially black-box methods, since it is not easy to mathematically formulate the functions that are learned within their many layers of representation. Realizing this, many researchers have started to design methods to exploit the drawbacks of deep learning based algorithms, questioning their robustness and exposing their singularities. In this paper, we attempt to unravel three aspects related to the robustness of DNNs for face recognition: (i) assessing the impact of deep architectures for face recognition in terms of vulnerabilities to attacks; (ii) detecting the singularities by characterizing abnormal filter response behavior in the hidden layers of deep networks; and (iii) making corrections to the processing pipeline to alleviate the problem. Our experimental evaluation using multiple open-source DNN-based face recognition networks and three publicly available face databases demonstrates that the performance of deep learning based face recognition algorithms can suffer greatly in the presence of such distortions. We also evaluate the proposed approaches on four existing quasi-imperceptible distortions: DeepFool, Universal adversarial perturbations, l2, and Elastic-Net (EAD). The proposed method is able to detect both types of attacks with very high accuracy by suitably designing a classifier using the response of the hidden layers in the network. Finally, we present effective countermeasures to mitigate the impact of adversarial attacks and improve the overall robustness of DNN-based face recognition.
Images captured in low-light environments often suffer from complex degradation. Simply adjusting light would inevitably result in a burst of hidden noise and color distortion. To seek results with satisfactory lighting, cleanliness, and realism from degraded inputs, this paper presents a novel framework inspired by the divide-and-rule principle, greatly alleviating the degradation entanglement. Assuming that an image can be decomposed into texture (with possible noise) and color components, one can specifically execute noise removal and color correction along with light adjustment. For this purpose, we propose to convert an image from the RGB colorspace into a luminance-chrominance one. An adjustable noise suppression network is designed to eliminate noise in the brightened luminance, with an estimated illumination map indicating noise amplification levels. The enhanced luminance further serves as guidance for the chrominance mapper to generate realistic colors. Extensive experiments are conducted to reveal the effectiveness of our design, and demonstrate its superiority over state-of-the-art alternatives both quantitatively and qualitatively on several benchmark datasets. Our code has been made publicly available at https://github.com/mingcv/Bread.
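The luminance-chrominance split described in the low-light abstract can be illustrated with a small sketch. The abstract does not say which colorspace is used, so full-range BT.601 YCbCr is an assumption here, and the fixed `gain` is a hypothetical stand-in for the learned brightening and denoising networks. The point is only that light can be adjusted on the luminance channel while the chrominance channels are left alone.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 RGB -> YCbCr (assumed colorspace, not confirmed by the abstract)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr, up to the rounding of the published coefficients."""
    r = y + 1.402 * (cr - 128.0)
    g = y - 0.344136 * (cb - 128.0) - 0.714136 * (cr - 128.0)
    b = y + 1.772 * (cb - 128.0)
    return r, g, b


def clamp(value, low=0.0, high=255.0):
    return max(low, min(high, value))


def brighten_luminance(rgb, gain):
    """Scale only the luminance of one pixel; chrominance is passed through unchanged.

    `gain` is a hypothetical constant used for illustration only.
    """
    y, cb, cr = rgb_to_ycbcr(*rgb)
    y = clamp(y * gain)
    return tuple(clamp(c) for c in ycbcr_to_rgb(y, cb, cr))


if __name__ == "__main__":
    print(brighten_luminance((40, 30, 20), gain=2.0))
```

In a real pipeline the same conversion would be applied to whole image arrays, and the brightened luminance would then be denoised before the colors are regenerated.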
With the increasing popularity of Unmanned Aerial Vehicles (UAVs) in computer vision-related applications, intelligent UAV video analysis has recently attracted the attention of an increasing number of researchers. To facilitate research in the UAV field, this paper presents a UAV dataset with 100 videos featuring approximately 2700 vehicles recorded under unconstrained conditions and 840k manually annotated bounding boxes. These UAV videos were recorded in complex real-world scenarios and pose significant new challenges, such as complex scenes, high density, small objects, and large camera motion, to the existing object detection and tracking methods. These challenges have encouraged us to define a benchmark for three fundamental computer vision tasks, namely, object detection, single object tracking (SOT) and multiple object tracking (MOT), on our UAV dataset. Specifically, our UAV benchmark facilitates evaluation and detailed analysis of state-of-the-art detection and tracking methods on the proposed UAV dataset. Furthermore, we propose a novel approach based on the so-called Context-aware Multi-task Siamese Network (CMSN) model that explores new cues in UAV videos by judging the consistency degree between objects and contexts and that can be used for SOT and MOT. The experimental results demonstrate that our model could make tracking results more robust in both SOT and MOT, showing that the current tracking and detection methods have limitations in dealing with the proposed UAV benchmark and that further research is indeed needed.
Psychiatric disorders are increasingly being recognised as having a biological basis, but their diagnosis is made exclusively behaviourally. A promising approach for 'biomarker' discovery has been based on pattern recognition methods applied to neuroimaging data, which could yield clinical utility in future. In this review we survey the literature on pattern recognition for making diagnostic predictions in psychiatric disorders, and evaluate progress made in translating such findings towards clinical application. We evaluate studies on many criteria, including data modalities used, the types of features extracted and the algorithms applied. We identify problems common to many studies, such as a relatively small sample size and a primary focus on estimating generalisability within a single study. Furthermore, we highlight challenges that are not widely acknowledged in the field, including the importance of accommodating disease prevalence, the necessity of more extensive validation using large carefully acquired samples, and the need for methodological innovations to improve accuracy and to discriminate between multiple disorders simultaneously. Finally, we identify specific clinical contexts in which pattern recognition can add value in the short to medium term.
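Why disease prevalence matters for a diagnostic classifier can be made concrete with Bayes' rule. The sketch below is illustrative only: the function name and the sensitivity, specificity, and prevalence figures are hypothetical and are not taken from the review. It shows that a classifier evaluated on a balanced sample can have a much lower positive predictive value once it is applied to a population in which the disorder is rare.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule.

    All three arguments are probabilities in [0, 1]; the values used in the
    demo below are hypothetical.
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")

    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence

    # Guard against a test that never returns a given outcome.
    ppv = true_pos / (true_pos + false_pos) if (true_pos + false_pos) > 0 else float("nan")
    npv = true_neg / (true_neg + false_neg) if (true_neg + false_neg) > 0 else float("nan")
    return ppv, npv


if __name__ == "__main__":
    # Same hypothetical classifier, two different populations.
    for prevalence in (0.5, 0.01):
        ppv, npv = predictive_values(0.9, 0.9, prevalence)
        print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

Reporting predictive values at a realistic prevalence, alongside accuracy on the study sample, is one simple way to accommodate prevalence when assessing clinical utility.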
Human keypoint detection from a single image is very challenging due to occlusion, blur, illumination, and scale variance. In this paper, we address this problem from three aspects by devising an efficient network structure, proposing three effective training strategies, and exploiting four useful postprocessing techniques. First, we find that context information plays an important role in reasoning about human body configuration and invisible keypoints. Inspired by this, we propose a cascaded context mixer (CCM), which efficiently integrates spatial and channel context information and progressively refines them. Then, to maximize CCM’s representation capability, we develop a hard-negative person detection mining strategy and a joint-training strategy by exploiting abundant unlabeled data. It enables CCM to learn discriminative features from massive diverse poses. Third, we present several sub-pixel refinement techniques for postprocessing keypoint predictions to improve detection accuracy. Extensive experiments on the MS COCO keypoint detection benchmark demonstrate the superiority of the proposed method over representative state-of-the-art (SOTA) methods. Our single model achieves performance comparable to that of the winner of the 2018 COCO Keypoint Detection Challenge. The final ensemble model sets a new SOTA on this benchmark. The source code will be released at https://github.com/chaimi2013/CCM.
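The abstract above does not specify its sub-pixel refinement techniques, so the following is only a sketch of one widely used generic option, not the authors' method: take the integer argmax of a keypoint heatmap and shift it by fitting a parabola through the peak and its two neighbours along each axis. The function name `refine_peak` and the toy heatmap are hypothetical.

```python
def refine_peak(heatmap):
    """Return a sub-pixel (x, y) peak location for a 2D heatmap (list of rows).

    Generic parabolic interpolation, applied independently per axis. This is an
    assumed illustration, not the refinement used in the paper.
    """
    rows, cols = len(heatmap), len(heatmap[0])

    # Integer location of the maximum response.
    py, px = max(((y, x) for y in range(rows) for x in range(cols)),
                 key=lambda p: heatmap[p[0]][p[1]])

    def offset(before, centre, after):
        # Vertex of the parabola through (-1, before), (0, centre), (1, after).
        curvature = before - 2.0 * centre + after
        if curvature >= 0.0:  # flat or not a maximum along this axis
            return 0.0
        return 0.5 * (before - after) / curvature

    dx = dy = 0.0
    if 0 < px < cols - 1:  # skip refinement on the image border
        dx = offset(heatmap[py][px - 1], heatmap[py][px], heatmap[py][px + 1])
    if 0 < py < rows - 1:
        dy = offset(heatmap[py - 1][px], heatmap[py][px], heatmap[py + 1][px])
    return px + dx, py + dy


if __name__ == "__main__":
    # Toy heatmap whose true peak lies between pixel centres.
    toy = [[-(x - 2.3) ** 2 - (y - 1.6) ** 2 for x in range(5)] for y in range(5)]
    print(refine_peak(toy))
```

Because heatmaps are usually predicted at a fraction of the input resolution, recovering even a fraction of a heatmap pixel can noticeably reduce localization error once coordinates are scaled back to the image.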
RIG-I and MDA5 sense virus-derived short 5′ppp blunt-ended or long dsRNA, respectively, causing interferon production. Non-signaling LGP2 appears to positively and negatively regulate MDA5 and RIG-I signaling, respectively. Co-crystal structures of chicken (ch) LGP2 with dsRNA display a fully or semi-closed conformation depending on the presence or absence of nucleotide. LGP2 caps blunt, 3′ or 5′ overhang dsRNA ends with 1 bp longer overall footprint than RIG-I. Structures of 1:1 and 2:1 complexes of chMDA5 with short dsRNA reveal head-to-head packing rather than the polar head-to-tail orientation described for long filaments. chLGP2 and chMDA5 make filaments with a similar axial repeat, although less co-operatively for chLGP2. Overall, LGP2 resembles a chimera combining an MDA5-like helicase domain and a RIG-I-like CTD, supporting both stem and end binding. Functionally, RNA binding is required for LGP2-mediated enhancement of MDA5 activation. We propose that LGP2 end-binding may promote nucleation of MDA5 oligomerization on dsRNA.
• chLGP2-dsRNA structures reveal RIG-I like end binding, but overhangs are possible
• chMDA5-dsRNA complex structures show head-to-head packing on short dsRNAs
• LGP2 also has MDA5-like behavior, coating dsRNA but with less cooperativity
• Both human and chicken LGP2 enhance MDA5 signaling in an RNA-dependent manner
Uchikawa et al. reveal structural details of dsRNA recognition by MDA5 and LGP2, which synergistically sense viral RNA and activate interferon expression. LGP2 is primarily a dsRNA end binder but can also coat dsRNA, although less co-operatively than MDA5. Functional studies show that LGP2 enhancement of MDA5 signaling is RNA dependent.
We investigate the role of sparsity and localized features in a biologically-inspired model of visual object classification. As in the model of Serre, Wolf, and Poggio, we first apply Gabor filters at all positions and scales; feature complexity and position/scale invariance are then built up by alternating template matching and max pooling operations. We refine the approach in several biologically plausible ways. Sparsity is increased by constraining the number of feature inputs, lateral inhibition, and feature selection. We also demonstrate the value of retaining some position and scale information above the intermediate feature level. Our final model is competitive with current computer vision algorithms on several standard datasets, including the Caltech 101 object categories and the UIUC car localization task. The results further the case for biologically-motivated approaches to object classification.
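The first two stages mentioned in this abstract, Gabor filtering followed by max pooling, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the model from the paper: it uses a single scale, pools over position only (the abstract also pools over scale), omits the later template-matching stages, and all parameter values and function names are hypothetical.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5):
    """Real-valued Gabor kernel (size x size, size odd); parameters are illustrative."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the cosine carrier varies along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
            row.append(envelope * math.cos(2.0 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel


def filter_valid(image, kernel):
    """Absolute filter response at every position where the kernel fits entirely."""
    kh, kw = len(kernel), len(kernel[0])
    return [[abs(sum(image[i + a][j + b] * kernel[a][b]
                     for a in range(kh) for b in range(kw)))
             for j in range(len(image[0]) - kw + 1)]
            for i in range(len(image) - kh + 1)]


def max_pool(fmap, pool):
    """Max over non-overlapping pool x pool neighbourhoods (local position invariance)."""
    return [[max(fmap[a][b]
                 for a in range(i, min(i + pool, len(fmap)))
                 for b in range(j, min(j + pool, len(fmap[0]))))
             for j in range(0, len(fmap[0]), pool)]
            for i in range(0, len(fmap), pool)]


if __name__ == "__main__":
    # Toy 8x8 image of vertical stripes, two pixels on and two pixels off.
    stripes = [[1.0 if col % 4 < 2 else 0.0 for col in range(8)] for _ in range(8)]
    for theta in (0.0, math.pi / 2):
        kernel = gabor_kernel(size=5, wavelength=4.0, theta=theta, sigma=2.0)
        pooled = max_pool(filter_valid(stripes, kernel), pool=2)
        print(f"theta={theta:.2f}", pooled)
```

The filter whose carrier varies across the stripes responds far more strongly than the orthogonal one, and pooling keeps that response while discarding its exact position within each neighbourhood.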