
Search results

hits: 654
1.
  • Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
    Carreira, Joao; Zisserman, Andrew 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 07/2017
    Conference Proceeding
    Open access

    The paucity of videos in current action classification datasets (UCF-101 and HMDB-51) has made it difficult to identify good video architectures, as most methods obtain similar performance on ...
3.
  • Learning to lip read words by watching videos
    Chung, Joon Son; Zisserman, Andrew Computer vision and image understanding, August 2018, Volume: 173
    Journal Article
    Peer reviewed
    Open access

    • Fully automated collection of a large-scale lip reading dataset from TV broadcasts. • Two-stream CNN for lip synchronization and active speaker detection. • Deep learning architectures to lip read ...
4.
  • Convolutional Two-Stream Network Fusion for Video Action Recognition
    Feichtenhofer, Christoph; Pinz, Axel; Zisserman, Andrew 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 06/2016
    Conference Proceeding
    Open access

    Recent applications of Convolutional Neural Networks (ConvNets) for human action recognition in videos have proposed different solutions for incorporating the appearance and motion information. We ...
5.
  • Symbiotic Segmentation and Part Localization for Fine-Grained Categorization
    Yuning Chai; Lempitsky, Victor; Zisserman, Andrew 2013 IEEE International Conference on Computer Vision, 12/2013
    Conference Proceeding, Journal Article

    We propose a new method for the task of fine-grained visual categorization. The method builds a model of the base-level category that can be fitted to images, producing high-quality foreground ...
6.
  • Utterance-level Aggregation for Speaker Recognition in the Wild
    Xie, Weidi; Nagrani, Arsha; Chung, Joon Son ... ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 05/2019
    Conference Proceeding
    Open access

    The objective of this paper is speaker recognition 'in the wild' - where utterances may be of variable length and also contain irrelevant signals. Crucial elements in the design of deep networks for ...
7.
  • A Statistical Approach to Texture Classification from Single Images
    Varma, Manik; Zisserman, Andrew International journal of computer vision, 04/2005, Volume: 62, Issue: 1-2
    Journal Article
    Peer reviewed

    Issue Title: Special Issue on Texture Analysis and Synthesis. We investigate texture classification from single images obtained under unknown viewpoint and illumination. A statistical approach is ...
8.
  • Flowing ConvNets for Human Pose Estimation in Videos
    Pfister, Tomas; Charles, James; Zisserman, Andrew 2015 IEEE International Conference on Computer Vision (ICCV), 12/2015
    Conference Proceeding, Journal Article

    The objective of this work is human pose estimation in videos, where multiple frames are available. We investigate a ConvNet architecture that is able to benefit from temporal context by combining ...
9.
  • ASR is All You Need: Cross-Modal Distillation for Lip Reading
    Afouras, Triantafyllos; Chung, Joon Son; Zisserman, Andrew ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 05/2020
    Conference Proceeding
    Open access

    The goal of this work is to train strong models for visual speech recognition without requiring human annotated ground truth data. We achieve this by distilling from an Automatic Speech Recognition ...
10.
  • Triangulation Embedding and Democratic Aggregation for Image Search
    Jegou, Herve; Zisserman, Andrew 2014 IEEE Conference on Computer Vision and Pattern Recognition, 06/2014
    Conference Proceeding
    Open access

    We consider the design of a single vector representation for an image that embeds and aggregates a set of local patch descriptors such as SIFT. More specifically we aim to construct a dense ...