Andrew Zisserman

Researcher at University of Oxford

Publications -  808
Citations -  312028

Andrew Zisserman is an academic researcher at the University of Oxford. He has contributed to research on topics including convolutional neural networks and real images. He has an h-index of 167 and has co-authored 808 publications receiving 261717 citations. His previous affiliations include the University of Edinburgh and Microsoft.

Papers
Proceedings ArticleDOI

Temporal Query Networks for Fine-grained Video Understanding

TL;DR: Temporal Query Network (TQN), as discussed by the authors, attends to the relevant video segments for each query with an attention mechanism, and can be trained using only the labels for each queried segment.
Proceedings ArticleDOI

"Here's looking at you, kid": Detecting people looking at each other in videos

TL;DR: Standard cinematography practice is to first establish which characters are looking at each other using a medium or wide shot, and then edit subsequent close-up shots so that the eyelines match the point of view of the characters.
Posted Content

You said that

TL;DR: In this article, an encoder-decoder CNN model was proposed that uses a joint embedding of the face and audio to synthesize talking-face video frames.
Proceedings ArticleDOI

Disentangled Speech Embeddings Using Cross-Modal Self-Supervision

TL;DR: In this paper, a self-supervised learning objective that exploits the natural cross-modal synchrony between faces and audio in video is developed to learn representations of speaker identity without access to manually annotated data.
Book ChapterDOI

Matching and Reconstruction from Widely Separated Views

TL;DR: The objective of this work is to automatically estimate the trifocal tensor and feature correspondences over image triplets, under unrestricted camera motions and changes in internal parameters between views, by extending a previous wide-baseline 2-view algorithm to 3 views.