In video surveillance, captured face images are usually of low resolution (LR). This paper therefore proposes a framework based on singular value decomposition (SVD) for performing face hallucination and recognition simultaneously. Conventionally, LR face recognition is carried out by first super-resolving the LR input face and then performing face recognition to identify it. By considering face hallucination and recognition jointly, the accuracy of both tasks can be improved. We first show that singular values are effective for representing face images, and that the singular values of a face image at different resolutions have an approximately linear relation. In our algorithm, each face image is represented using its SVD. For each LR input face, corresponding LR and high-resolution (HR) face-image pairs are selected from the face gallery. Based on these selected LR-HR pairs, the mapping functions that interpolate the two matrices of the SVD representation for reconstructing the HR face image can be learned more accurately. The final estimation of the high-frequency details of the HR face image therefore becomes more reliable and effective. Experimental results demonstrate that the proposed framework achieves promising results for both face hallucination and recognition.
Unlike general face recognition, age-invariant face recognition (AIFR) aims to match faces across a large age gap. Previous discriminative methods usually focus on decomposing facial features into age-related and age-invariant components, which suffers from the loss of facial identity information. In this article, we propose a novel Multi-feature Fusion and Decomposition (MFD) framework for age-invariant face recognition, which learns more discriminative and robust features and reduces intra-class variance. Specifically, we first sample multiple face images of different ages with the same identity as a face time sequence. Multi-head attention is then employed to capture contextual information from the facial feature sequences extracted by the backbone network. Next, we combine feature decomposition with fusion based on the face time sequence, so that the final age-independent features effectively represent the identity information of the face and are more robust to the aging process. In addition, we mitigate the imbalanced age distribution in the training data with a re-weighted age loss. We evaluated the proposed MFD on the popular CACD and CACD-VS datasets, where our approach improves AIFR performance over previous state-of-the-art methods. We also report the performance of MFD on the LFW dataset.
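The multi-head attention step over a face time sequence can be sketched in numpy. This is a generic scaled dot-product self-attention, not the paper's exact architecture: the sequence length, feature dimension, head count, and random projection matrices are all illustrative stand-ins for learned parameters:

```python
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, num_heads, rng):
    """Self-attention over a sequence of per-image face features.

    x: (seq_len, dim) features of one identity at different ages.
    Projection matrices are random stand-ins for learned weights.
    """
    seq_len, dim = x.shape
    head_dim = dim // num_heads
    Wq, Wk, Wv, Wo = (rng.standard_normal((dim, dim)) / np.sqrt(dim)
                      for _ in range(4))

    def split(h):  # (seq, dim) -> (heads, seq, head_dim)
        return h.reshape(seq_len, num_heads, head_dim).transpose(1, 0, 2)

    q, k, v = split(x @ Wq), split(x @ Wk), split(x @ Wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)  # (heads, seq, seq)
    attn = softmax(scores, axis=-1)
    out = (attn @ v).transpose(1, 0, 2).reshape(seq_len, dim)
    return out @ Wo, attn

rng = np.random.default_rng(0)
seq = rng.standard_normal((5, 64))  # 5 face images, 64-d features each
out, attn = multi_head_attention(seq, num_heads=4, rng=rng)
print(out.shape, attn.shape)
```

Each output feature is a context-weighted mixture over the whole sequence, which is how contextual information across ages enters the representation.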
With the recent advancement of deep convolutional neural networks, significant progress has been made in general face recognition. However, state-of-the-art general face recognition models do not generalize well to occluded face images, which are common in real-world scenarios. The likely reasons are the absence of large-scale occluded face data for training and of specific designs for tackling the corrupted features brought about by occlusions. This article presents a novel face recognition method that is robust to occlusions, based on a single end-to-end deep neural network. Our approach, named FROM (Face Recognition with Occlusion Masks), learns to discover the corrupted features in the deep convolutional neural network and to clean them with dynamically learned masks. In addition, we construct massive numbers of occluded face images to train FROM effectively and efficiently. FROM is simple yet powerful compared with existing methods, which either rely on external detectors to discover the occlusions or employ shallow models that are less discriminative. Experimental results on the LFW, MegaFace Challenge 1, RMF2, and AR datasets, as well as other simulated occluded/masked datasets, confirm that FROM dramatically improves accuracy under occlusions and generalizes well to general face recognition.
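The mask-based cleaning idea can be sketched abstractly: a mask branch predicts per-element logits, a sigmoid squashes them to [0, 1], and the feature map is scaled elementwise so corrupted entries are suppressed. This is a minimal numpy illustration, not FROM's actual mask decoder; the logits here are placeholder values rather than predictions:

```python
import numpy as np

def mask_and_clean(features, mask_logits):
    """Suppress occlusion-corrupted feature elements with a soft mask.

    `mask_logits` would come from a learned mask branch; here it is a
    hypothetical placeholder. Sigmoid maps logits to [0, 1], so strongly
    negative logits zero out the corresponding feature elements.
    """
    mask = 1.0 / (1.0 + np.exp(-mask_logits))
    return features * mask

feats = np.ones(3)
logits = np.array([100.0, -100.0, 0.0])  # keep, drop, half-weight
print(mask_and_clean(feats, logits))
```

Because the mask is differentiable, the network can learn end-to-end which feature elements occlusions tend to corrupt.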
Cross-modal face matching between the thermal and visible spectra is a much-desired capability for night-time surveillance and security applications. Due to the very large modality gap, thermal-to-visible face recognition is one of the most challenging face matching problems. In this paper, we present an approach that bridges this modality gap by a significant margin. Our approach captures the highly non-linear relationship between the two modalities using a deep neural network, which learns a non-linear mapping from the visible to the thermal spectrum while preserving identity information. We show substantial performance improvements on three difficult thermal-visible face datasets. The presented approach improves the state of the art in Rank-1 identification by more than 10% on the UND-X1 dataset and by 15-30% on the NVESD dataset, and reduces the performance drop caused by the modality gap by more than 40%.
It has been claimed that faces are recognized as a "whole" rather than by the recognition of individual parts. In a paper published in the Quarterly Journal of Experimental Psychology in 1993, Martha Farah and I attempted to operationalize the holistic claim using the part/whole task. In this task, participants studied a face and then their memory for its parts was tested both in isolation and in the context of the whole face. Consistent with the holistic view, recognition of a part was superior when it was tested in the whole-face condition rather than in isolation. This "whole-face" or holistic advantage was not found for inverted or scrambled faces, nor for non-face objects, suggesting that holistic encoding is specific to normal, intact faces. In this paper, we reflect on the part/whole paradigm and how it has contributed to our understanding of what it means to recognize a face as a "whole" stimulus. We describe the value of the part/whole task for developing theories of holistic and non-holistic recognition of faces and objects. We discuss the research that has probed the neural substrates of holistic processing in healthy adults and in people with prosopagnosia and autism. Finally, we examine how experience shapes holistic face recognition in children and the recognition of own- and other-race faces in adults. The goal of this article is to summarize the research on the part/whole task and to speculate on how it has informed our understanding of holistic face processing.
Achieving the upper limits of face identification accuracy in forensic applications can minimize errors that have profound social and personal consequences. Although forensic examiners identify faces in these applications, systematic tests of their accuracy are rare. How can we achieve the most accurate face identification: using people and/or machines, working alone or in collaboration? In a comprehensive comparison of face identification by humans and computers, we found that forensic facial examiners, facial reviewers, and superrecognizers were more accurate than fingerprint examiners and students on a challenging face identification test. Individual performance on the test varied widely. On the same test, four deep convolutional neural networks (DCNNs), developed between 2015 and 2017, identified faces within the range of human accuracy. Accuracy of the algorithms increased steadily over time, with the most recent DCNN scoring above the median of the forensic facial examiners. Using crowd-sourcing methods, we fused the judgments of multiple forensic facial examiners by averaging their rating-based identity judgments. Accuracy was substantially better for fused judgments than for individuals working alone. Fusion also served to stabilize performance, boosting the scores of lower-performing individuals and decreasing variability. A single forensic facial examiner fused with the best algorithm was more accurate than the combination of two examiners. Therefore, collaboration among humans and between humans and machines offers tangible benefits to face identification accuracy in important applications. These results offer an evidence-based roadmap for achieving the most accurate face identification possible.
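The fusion scheme described here is simple averaging of rating-based judgments. The following numpy sketch illustrates why averaging helps, using a toy generative model (true identity signal plus independent per-examiner noise) that is an assumption of this illustration, not the study's data:

```python
import numpy as np

rng = np.random.default_rng(0)

n_pairs, n_examiners = 200, 4
# Hypothetical ground truth per face pair: +1 same identity, -1 different.
truth = rng.choice([-1.0, 1.0], size=n_pairs)

# Each examiner's rating = true signal + independent judgment noise.
ratings = truth + rng.standard_normal((n_examiners, n_pairs))

# Crowd-style fusion: average the rating-based identity judgments.
fused = ratings.mean(axis=0)

individual_acc = ((ratings > 0) == (truth > 0)).mean()
fused_acc = ((fused > 0) == (truth > 0)).mean()
print(f"individual accuracy: {individual_acc:.3f}, fused: {fused_acc:.3f}")
```

Averaging shrinks the independent noise while preserving the shared signal, which is the statistical mechanism behind both the accuracy gain and the reduced variability reported above.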
Typically, face recognition models deployed in the wild need to identify low-resolution faces at extremely low computational cost. A feasible solution to this problem is to compress a complex face model, achieving higher speed and a lower memory footprint at the cost of a minimal performance drop. Inspired by this, this paper proposes a learning approach that recognizes low-resolution faces via selective knowledge distillation. In this approach, a two-stream convolutional neural network (CNN) is first initialized to recognize high-resolution faces and resolution-degraded faces with a teacher stream and a student stream, respectively. The teacher stream is a complex CNN for high-accuracy recognition, and the student stream is a much simpler CNN for low-complexity recognition. To avoid a significant performance drop in the student stream, we then selectively distill the most informative facial features from the teacher stream by solving a sparse graph optimization problem; these features are used to regularize the fine-tuning of the student stream. In this way, the student stream is trained to handle two tasks simultaneously with limited computational resources: approximating the most informative facial cues via feature regression, and recovering the missing facial cues via low-resolution face classification. Experimental results show that the student stream performs impressively in recognizing low-resolution faces while requiring only 0.15 MB of memory and running at 418 faces per second on a CPU and 9433 faces per second on a GPU.
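The student's two-task objective, feature regression toward selected teacher features plus low-resolution classification, can be sketched as a combined loss. This numpy version is a schematic only: the `selected` index set stands in for the output of the sparse graph optimization (not shown), and all tensors are random placeholders:

```python
import numpy as np

def distillation_loss(student_feat, teacher_feat, selected,
                      student_logits, labels, lam=0.5):
    """Feature regression on selected teacher features + classification.

    `selected` is a hypothetical index set standing in for the
    dimensions chosen by the sparse-graph selection step.
    """
    # Regression term on the distilled feature dimensions only.
    diff = student_feat[:, selected] - teacher_feat[:, selected]
    reg = np.mean(np.sum(diff ** 2, axis=1))

    # Standard softmax cross-entropy for face classification.
    z = student_logits - student_logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    ce = -np.mean(log_probs[np.arange(len(labels)), labels])
    return ce + lam * reg

rng = np.random.default_rng(0)
batch, dim, n_ids = 8, 32, 10
student_feat = rng.standard_normal((batch, dim))
teacher_feat = rng.standard_normal((batch, dim))
selected = np.array([0, 3, 7, 12])  # hypothetical selected dimensions
logits = rng.standard_normal((batch, n_ids))
labels = rng.integers(0, n_ids, size=batch)
loss = distillation_loss(student_feat, teacher_feat, selected, logits, labels)
print(loss)
```

Restricting the regression term to the selected dimensions is what makes the distillation "selective": the small student is not forced to match every teacher feature.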
Face recognition has advanced considerably with convolutional neural network (CNN) based recognizers. Existing recognizers typically demonstrate powerful capacity in recognizing un-occluded faces, but often suffer accuracy degradation when directly identifying occluded faces, mainly because occlusions leave insufficient visual and identity cues. On the other hand, a generative adversarial network (GAN) is particularly suitable for reconstructing visually plausible occluded regions via face inpainting. Motivated by these observations, this paper proposes identity-diversity inpainting to facilitate occluded face recognition. The core idea is to integrate a GAN with an optimized pre-trained CNN recognizer, which serves as a third player competing with the generator by distinguishing diversity within the same identity class. To this end, a set of identity-centered features is used in the recognizer as supervision, encouraging the inpainted faces to cluster toward their identity centers. In this way, our approach benefits from the GAN for reconstruction and the CNN for representation, and simultaneously addresses two challenging tasks: face inpainting and face recognition. Experimental results compared with four state-of-the-art methods demonstrate the efficacy of the proposed approach.
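The identity-centered supervision can be read as a center-loss-style penalty: features of inpainted faces are pulled toward a per-identity center. The numpy sketch below assumes that reading; the center table and features are random placeholders, whereas in the paper the centers come from the pre-trained recognizer:

```python
import numpy as np

def identity_center_loss(features, labels, centers):
    """Mean squared distance of each feature to its identity center.

    `centers` is a hypothetical (num_ids, dim) table of
    identity-centered features; `labels[i]` selects the center
    for sample i.
    """
    diff = features - centers[labels]
    return float(np.mean(np.sum(diff ** 2, axis=1)))

rng = np.random.default_rng(0)
n_ids, dim, batch = 10, 32, 8
centers = rng.standard_normal((n_ids, dim))
labels = rng.integers(0, n_ids, size=batch)
features = rng.standard_normal((batch, dim))
print(identity_center_loss(features, labels, centers))
```

Minimizing this term drives inpainted-face features of the same identity together, which is the "distinguishing diversity within the same identity class" role the recognizer plays as the third player.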