Micro-expressions (MEs) are rapid and subtle facial movements that are difficult to detect and recognize. Most recent works have attempted to recognize MEs using spatial and temporal information from video clips. According to psychological studies, the apex frame conveys the most emotional information expressed in a facial expression. However, it remains unclear how a single apex frame contributes to micro-expression recognition. To address this problem, this paper first proposes a new method to detect the apex frame by estimating pixel-level change rates in the frequency domain. Using frequency information, it spots the apex frame more effectively than existing methods based on spatio-temporal change information. Second, given the apex frame, this paper proposes a joint feature learning architecture that couples local and global information to recognize MEs, because not all facial regions contribute equally to ME recognition and some contain no emotional information at all. More specifically, the proposed model combines local information learned from the facial regions that carry the major emotional information with global information learned from the whole face. Leveraging both local and global information enables the model to learn discriminative ME representations and to suppress the negative influence of regions unrelated to MEs. The proposed method is extensively evaluated on the CASME, CASME II, SAMM, SMIC, and composite databases. Experimental results demonstrate that our method, using only the detected apex frame, achieves highly promising ME recognition performance compared with state-of-the-art methods that employ the whole ME sequence. Moreover, the results indicate that the apex frame can contribute significantly to micro-expression recognition.
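The frequency-domain apex spotting idea can be sketched roughly as follows. This is a hypothetical illustration, not the paper's actual algorithm: each frame is scored by the spectral magnitude of its pixel-level change from the previous frame, and the highest-scoring frame is taken as the apex.

```python
import numpy as np

def spot_apex_frame(frames):
    """Score each frame by the frequency-domain magnitude of its
    pixel-level change from the previous frame; the highest-scoring
    frame is taken as the apex (illustrative sketch only)."""
    scores = []
    for prev, cur in zip(frames[:-1], frames[1:]):
        diff = cur.astype(float) - prev.astype(float)
        scores.append(np.abs(np.fft.fft2(diff)).sum())
    return int(np.argmax(scores)) + 1  # +1: scores[i] belongs to frame i+1

# toy sequence: a static "face" with a brief movement at frame 3
rng = np.random.default_rng(0)
base = rng.random((8, 8))
frames = [base.copy() for _ in range(6)]
frames[3] = base + 0.5
apex = spot_apex_frame(frames)
```

On this toy sequence the only large inter-frame change occurs entering frame 3, so that frame is returned as the apex.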
The vulnerability of face recognition systems to presentation attacks (also known as direct attacks or spoof attacks) has received a great deal of interest from the biometric community. The rapid evolution of face recognition systems into real-time applications has raised new concerns about their ability to resist presentation attacks, particularly in unattended application scenarios such as automated border control. The goal of a presentation attack is to subvert the face recognition system by presenting a facial biometric artifact. Popular face biometric artifacts include a printed photo, the electronic display of a facial photo, a video replayed on an electronic display, and 3D face masks. These have been shown to pose a high security risk to state-of-the-art face recognition systems. However, several presentation attack detection (PAD) algorithms (also known as countermeasures or anti-spoofing methods) have been proposed that can automatically detect and mitigate such attacks. The goal of this survey is to present a systematic overview of the existing work on face presentation attack detection. This paper describes the various aspects of face presentation attacks, including the different types of face artifacts, state-of-the-art PAD algorithms, an overview of the research labs working in this domain, vulnerability assessments and performance evaluation metrics, the outcomes of competitions, the availability of public databases for benchmarking new PAD algorithms in a reproducible manner, and finally a summary of the relevant international standardization in this field. Furthermore, we discuss the open challenges and future work that needs to be addressed in this evolving field of biometrics.
Generative adversarial networks can be exploited to launch attacks against detection systems that rely on artificial intelligence (AI). To build effective cyberphysical systems that are operationally ...robust and socially accepted, we must expend significant effort to develop novel AI-based safety-critical systems.
While accurate lip synchronization has been achieved for arbitrary-subject audio-driven talking face generation, the problem of how to efficiently drive the head pose remains. Previous methods rely on pre-estimated structural information such as landmarks and 3D parameters, aiming to generate personalized rhythmic movements. However, the inaccuracy of such estimated information under extreme conditions leads to degradation. In this paper, we propose a clean yet effective framework to generate pose-controllable talking faces. We operate on non-aligned raw face images, using only a single photo as an identity reference. The key is to modularize audio-visual representations by devising an implicit low-dimensional pose code. Specifically, both speech content and head pose information lie in a joint non-identity embedding space. While speech content information can be defined by learning the intrinsic synchronization between audio and visual modalities, we find that a pose code can be complementarily learned in a modulated-convolution-based reconstruction framework. Extensive experiments show that our method generates accurately lip-synced talking faces whose poses are controllable by other videos. Moreover, our model has multiple advanced capabilities, including extreme-view robustness and talking face frontalization.
Gold nanowires (AuNWs) only 2 nm thick have an extremely high aspect ratio (≈10 000) and act as soft nanoscale building blocks, in contrast to conventional, more rigid silver nanowires (AgNWs). Here, highly sensitive, stretchable, patchable, and transparent strain sensors are fabricated based on hybrid films of such soft/hard nanowire networks. They are mechanically stretchable, optically transparent, and electrically conductive, and are fabricated using a simple and cost-effective solution process. The combination of soft and more rigid nanowires enables their use as high-performance strain sensors with a maximum gauge factor (GF) of ≈236 at low strain (<5%), stretchability of up to 70% strain, and optical transparency ranging from 58.7% to 66.7% depending on the amount of the AuNW component. The sensors can detect strain as low as 0.05% and are energy efficient, operating at a voltage as low as 0.1 V. These attributes are difficult to achieve with a single component of either AuNWs or AgNWs. The outstanding sensing performance indicates their potential as "invisible" wearable sensors for biometric information collection, as demonstrated in applications detecting facial expressions, respiration, and the apexcardiogram.
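For reference, the gauge factor quoted above is the relative resistance change per unit strain, GF = (ΔR/R0)/ε. A minimal computation, using illustrative resistance values rather than measured data:

```python
def gauge_factor(r0, r, strain):
    """GF = (ΔR / R0) / ε: relative resistance change per unit strain."""
    return (r - r0) / r0 / strain

# illustrative numbers (not from the paper): a film whose resistance
# rises from 100 Ω to 1280 Ω at 5% strain has GF = 236, matching the
# maximum value reported above
gf = gauge_factor(100.0, 1280.0, 0.05)
```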
The AgNWs/AuNWs percolation networks are optically transparent, electrically conductive, and mechanically stretchable, enabling their use as novel skin-like "invisible" and "unfeelable" strain sensors for monitoring various biometric signals, including heartbeats, apexcardiograms, respiration, and facial expressions. The sensors are potentially useful for future practical applications in medical health care systems.
• Effective: PLFace significantly improves the accuracy of masked face recognition while maintaining the performance of regular face recognition on several face recognition benchmarks, including mask-free and masked datasets.
• Easy: PLFace can be easily combined with existing face recognition loss functions, e.g., CosFace, ArcFace, and CurricularFace.
• Efficient: PLFace adds only negligible computational complexity during training, and has the same cost as the backbone model during inference.
The outbreak of the COVID-19 epidemic has spurred the development of masked face recognition (MFR). Nevertheless, the performance of regular face recognition is severely compromised when MFR accuracy is blindly pursued. Growing evidence indicates that MFR should be regarded as a mask-induced bias of face recognition rather than an independent task. To mitigate this mask bias, we propose a novel Progressive Learning Loss (PLFace) that realizes a progressive training strategy for deep face recognition, learning balanced performance on masked and mask-free face recognition based on margin losses. In particular, PLFace adaptively adjusts the relative importance of masked and mask-free samples during different training stages. In the early stage of training, PLFace mainly learns the feature representations of mask-free samples, and the regular sample embeddings shrink toward their class prototypes. In the later stage, PLFace converges on mask-free samples and further focuses on masked samples until the masked sample embeddings also gather at the class centers. The entire training process follows the paradigm that normal samples shrink first and masked samples gather afterward. Extensive experimental results on popular regular and masked face benchmarks demonstrate the superiority of PLFace over state-of-the-art competitors.
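The progressive emphasis described above can be illustrated with a toy weighting schedule. This is our own assumption for illustration, not the paper's actual loss formulation: mask-free samples carry full weight throughout, while the weight of masked samples ramps smoothly from 0 to 1 as training progresses.

```python
import math

def masked_sample_weight(progress):
    """Toy schedule (an assumption, not PLFace's actual formulation):
    weight of masked samples ramps smoothly from 0 to 1 as training
    progress goes from 0 to 1, via a half-cosine ramp."""
    return 0.5 * (1.0 - math.cos(math.pi * progress))

# early training: masked samples contribute almost nothing;
# late training: they carry nearly full weight
early, late = masked_sample_weight(0.1), masked_sample_weight(0.9)
```

Any monotone ramp would express the same "normal samples first, masked samples later" paradigm; the cosine shape is chosen only for smoothness.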
A recent approach to implicitly studying face recognition skills is fast periodic visual stimulation (FPVS) coupled with electroencephalography (EEG). Its relationship with explicit behavioral measures of face individuation remains largely undocumented. We evaluated the relationship between the FPVS–EEG measure of individuation and performance on a computer version of the Benton Face Recognition Test. High-density EEG was recorded in 32 participants presented with unfamiliar faces at a rate of 6 Hz (F) for 60 s. Every fifth face, a new identity was inserted. The resulting 1.2 Hz (F/5) EEG response and its harmonics objectively indexed rapid individuation of unfamiliar faces. The robust individuation response, observed over occipitotemporal sites, was significantly correlated with speed, but not accuracy rate, on the computer version of the Benton Face Recognition Test. This effect was driven by a few individuals who were particularly slow at the behavioral test and also showed the lowest face individuation response. These results highlight the importance of considering the time taken to recognize a face as a variable complementary to accuracy rate, providing valuable information about one's recognition skills. Overall, these observations strengthen the diagnostic value of FPVS–EEG as an objective and rapid flag for specific difficulties in individual face recognition in the human population.
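The frequency-tagging logic is easy to verify numerically: with a 6 Hz base presentation rate and an identity change every fifth face, the individuation response is expected at 6/5 = 1.2 Hz and its harmonics. A toy simulation with a synthetic signal and an assumed 60 Hz sampling rate (not EEG data):

```python
import numpy as np

fs, dur = 60.0, 60.0                  # assumed sampling rate; 60 s stimulation
t = np.arange(0, dur, 1.0 / fs)
rng = np.random.default_rng(1)
# synthetic "EEG": base response at 6 Hz, individuation response at 1.2 Hz, noise
sig = (np.sin(2 * np.pi * 6.0 * t)
       + 0.3 * np.sin(2 * np.pi * 1.2 * t)
       + 0.1 * rng.standard_normal(t.size))
amp = 2.0 * np.abs(np.fft.rfft(sig)) / t.size   # single-sided amplitude spectrum
freqs = np.fft.rfftfreq(t.size, 1.0 / fs)
base_amp = amp[np.argmin(np.abs(freqs - 6.0))]    # recovers the 6 Hz response
indiv_amp = amp[np.argmin(np.abs(freqs - 1.2))]   # recovers the 1.2 Hz response
```

With a 60 s recording the spectral resolution is 1/60 Hz, so both 6 Hz and 1.2 Hz fall exactly on frequency bins, which is why such long stimulation sequences make the individuation response easy to isolate.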
Prosopagnosia is a cognitive disorder in which facial recognition is severely impaired despite normal vision and intelligence. Prosopagnosia was first reported in the 1800s, but its cause remains unclear. Although other neurological symptoms are often present, some patients have pure prosopagnosia. The bilateral occipital lobes are believed to be associated with the symptoms. Recent brain imaging techniques have identified the right fusiform gyrus (rFG), located at the junction of the right occipital and temporal lobes, as the affected region. In this report, we present a case of associative prosopagnosia with no concomitant symptoms in a 76-year-old man. Brain magnetic resonance imaging detected a subcortical hemorrhage in the right temporal lobe. Using tractography based on diffusion tensor imaging, we visualized atrophy of the right inferior longitudinal fasciculus (ILF). This is the first time tractography has been used to show a clear association between associative prosopagnosia and damage to the ILF projecting from the rFG.
Face recognition difficulties are common in autism and could be a consequence of perceptual atypicalities that disrupt the ability to integrate current and prior information. We tested this theory by measuring the strength of serial dependence for faces (i.e., how likely current perception of a face is to be biased towards a previously seen face) across the broader autism phenotype. Although serial dependence was not weaker in individuals with more autistic traits, more autistic traits were associated with greater integration of less similar faces. These results suggest that serial dependence is less specialized, and may not operate optimally, in individuals with more autistic traits, and could therefore be a contributing factor to autism-linked face recognition difficulties.
Pain is an unpleasant feeling that has been shown to be an important factor in patient recovery. Since manual pain assessment is costly in human resources and difficult to perform objectively, automatic systems are needed to measure it. In this paper, contrary to current state-of-the-art techniques in pain assessment, which are based on facial features only, we suggest that performance can be enhanced by feeding raw frames to deep learning models, outperforming the latest state-of-the-art results while also directly addressing the problem of imbalanced data. As a baseline, our approach first uses convolutional neural networks (CNNs) to learn facial features from VGG_Faces, which are then linked to a long short-term memory network to exploit the temporal relation between video frames. We further compare the performance of the popular schema based on the canonically normalized appearance with that of using the whole image. As a result, we outperform the current state-of-the-art area-under-the-curve performance on the UNBC-McMaster Shoulder Pain Expression Archive Database. In addition, to evaluate the generalization of our proposed methodology to facial motion recognition, we also report competitive results on the Cohn-Kanade+ (CK+) facial expression database.
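The feature-then-temporal pipeline described above can be sketched with a single hand-rolled LSTM step aggregating per-frame feature vectors. This is a minimal NumPy illustration with random weights standing in for learned CNN features and LSTM parameters, not the paper's trained model:

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gates stacked in z as (input, forget, output, cell)."""
    z = W @ x + U @ h + b
    n = h.size
    sig = lambda v: 1.0 / (1.0 + np.exp(-v))
    i, f, o = sig(z[:n]), sig(z[n:2*n]), sig(z[2*n:3*n])
    g = np.tanh(z[3*n:])
    c = f * c + i * g          # updated cell state
    h = o * np.tanh(c)         # updated hidden state
    return h, c

rng = np.random.default_rng(0)
feat_dim, hid = 16, 8          # per-frame "CNN feature" size, hidden size
W = 0.1 * rng.standard_normal((4 * hid, feat_dim))
U = 0.1 * rng.standard_normal((4 * hid, hid))
b = np.zeros(4 * hid)
h = c = np.zeros(hid)
for frame_feat in rng.standard_normal((10, feat_dim)):   # 10 video frames
    h, c = lstm_step(frame_feat, h, c, W, U, b)
# h now summarizes the temporal sequence and could feed a pain-score head
```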