scispace - formally typeset
Author

Andrea Cavallaro

Bio: Andrea Cavallaro is an academic researcher from Queen Mary University of London. The author has contributed to research in topics: Video tracking & Object detection. The author has an h-index of 46, has co-authored 345 publications, and has received 8,945 citations. Previous affiliations of Andrea Cavallaro include Tel Aviv University and Dalhousie University.


Papers
Journal ArticleDOI
TL;DR: This paper provides a comprehensive analysis of facial representations by uncovering their advantages and limitations; it elaborates on the type of information they encode and on how they deal with the key challenges of illumination variations, registration errors, head-pose variations, occlusions, and identity bias.
Abstract: Automatic affect analysis has attracted great interest in various contexts, including the recognition of action units and basic or non-basic emotions. In spite of major efforts, there are several open questions on what the important cues to interpret facial expressions are and how to encode them. In this paper, we review the progress across a range of affect recognition applications to shed light on these fundamental questions. We analyse the state-of-the-art solutions by decomposing their pipelines into fundamental components, namely face registration, representation, dimensionality reduction and recognition. We discuss the role of these components and highlight the models and new trends that are followed in their design. Moreover, we provide a comprehensive analysis of facial representations by uncovering their advantages and limitations; we elaborate on the type of information they encode and discuss how they deal with the key challenges of illumination variations, registration errors, head-pose variations, occlusions, and identity bias. This survey allows us to identify open issues and to define future directions for designing real-world affect recognition systems.
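The four-component decomposition used in the survey (face registration, representation, dimensionality reduction, recognition) can be sketched as a simple function chain. Every stage body below is a hypothetical placeholder of my own, not a method from the paper; only the stage names and their order come from the abstract.

```python
from functools import reduce

# Schematic affect-recognition pipeline mirroring the four components
# named in the survey. Each stage body is an illustrative stand-in.
def register(face):                 # e.g. landmark-based alignment
    return face

def represent(face):                # e.g. appearance features (LBP, Gabor)
    return [ord(c) for c in face]

def reduce_dim(features):           # e.g. PCA down to a few components
    return features[:2]

def recognize(features):            # e.g. a classifier over reduced features
    return "happy" if sum(features) % 2 else "neutral"

stages = [register, represent, reduce_dim, recognize]
label = reduce(lambda x, stage: stage(x), stages, "face_image")
```

The point of the composition is that each component can be swapped independently, which is exactly how the survey compares design choices per stage.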

601 citations

Journal ArticleDOI
TL;DR: An overview of the state of the art in atmospheric correction algorithms is provided, recent advances are highlighted, and the potential for hyperspectral data to address the current challenges is discussed.
Abstract: Accurate correction of the corrupting effects of the atmosphere and the water’s surface is essential in order to obtain the optical, biological and biogeochemical properties of the water from satellite-based multi- and hyper-spectral sensors. The major challenges now for atmospheric correction are the conditions of turbid coastal and inland waters and areas in which there are strongly-absorbing aerosols. Here, we outline how these issues can be addressed, with a focus on the potential of new sensor technologies and the opportunities for the development of novel algorithms and aerosol models. We review hardware developments, which will provide qualitative and quantitative increases in spectral, spatial, radiometric and temporal data of the Earth, as well as measurements from other sources, such as the Aerosol Robotic Network for Ocean Color (AERONET-OC) stations, bio-optical sensors on Argo (Bio-Argo) floats and polarimeters. We provide an overview of the state of the art in atmospheric correction algorithms, highlight recent advances and discuss the potential for hyperspectral data to address the current challenges.

490 citations

Journal ArticleDOI
TL;DR: A new cast shadow segmentation algorithm is proposed that exploits the spectral and geometrical properties of shadows in a scene; it is robust and efficient in detecting shadows for a large class of scenes.

408 citations

Proceedings ArticleDOI
01 Oct 2019
TL;DR: Zhou et al. design a residual block composed of multiple convolutional feature streams, each detecting features at a certain scale, and introduce a novel unified aggregation gate to dynamically fuse multi-scale features with input-dependent channel-wise weights.
Abstract: As an instance-level recognition problem, person re-identification (ReID) relies on discriminative features, which not only capture different spatial scales but also encapsulate an arbitrary combination of multiple scales. We call features of both homogeneous and heterogeneous scales omni-scale features. In this paper, a novel deep ReID CNN is designed, termed Omni-Scale Network (OSNet), for omni-scale feature learning. This is achieved by designing a residual block composed of multiple convolutional feature streams, each detecting features at a certain scale. Importantly, a novel unified aggregation gate is introduced to dynamically fuse multi-scale features with input-dependent channel-wise weights. To efficiently learn spatial-channel correlations and avoid overfitting, the building block uses both pointwise and depthwise convolutions. By stacking such blocks layer-by-layer, our OSNet is extremely lightweight and can be trained from scratch on existing ReID benchmarks. Despite its small model size, our OSNet achieves state-of-the-art performance on six person-ReID datasets. Code and models are available at: https://github.com/KaiyangZhou/deep-person-reid.
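The unified aggregation gate described above can be sketched in a few lines of NumPy: each scale stream receives channel-wise weights in (0, 1) from a shared sigmoid gate, and the gated streams are summed. The single-layer gate, the shapes, and all variable names here are illustrative assumptions; OSNet's actual gate is a small learned sub-network inside each residual block.

```python
import numpy as np

def aggregation_gate(streams, w, b):
    """Fuse multi-scale feature streams with input-dependent,
    channel-wise weights from a sigmoid gate shared across streams.
    streams: list of (C,) feature vectors, one per scale stream.
    w, b: parameters of a single-layer gate (illustrative stand-in)."""
    fused = np.zeros_like(streams[0])
    for x in streams:
        gate = 1.0 / (1.0 + np.exp(-(w @ x + b)))  # (C,) weights in (0, 1)
        fused += gate * x                          # gated sum over scales
    return fused

rng = np.random.default_rng(0)
C, T = 8, 4                                        # channels, scale streams
streams = [rng.standard_normal(C) for _ in range(T)]
w, b = 0.1 * rng.standard_normal((C, C)), np.zeros(C)
out = aggregation_gate(streams, w, b)
```

Because the gate weights depend on the input features themselves, the fusion adapts per image rather than using fixed per-scale weights, which is the key design choice the abstract highlights.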

390 citations

Posted Content
TL;DR: A novel deep ReID CNN, termed Omni-Scale Network (OSNet), is designed for omni-scale feature learning via a residual block composed of multiple convolutional feature streams, each detecting features at a certain scale.
Abstract: As an instance-level recognition problem, person re-identification (ReID) relies on discriminative features, which not only capture different spatial scales but also encapsulate an arbitrary combination of multiple scales. We call features of both homogeneous and heterogeneous scales omni-scale features. In this paper, a novel deep ReID CNN is designed, termed Omni-Scale Network (OSNet), for omni-scale feature learning. This is achieved by designing a residual block composed of multiple convolutional streams, each detecting features at a certain scale. Importantly, a novel unified aggregation gate is introduced to dynamically fuse multi-scale features with input-dependent channel-wise weights. To efficiently learn spatial-channel correlations and avoid overfitting, the building block uses pointwise and depthwise convolutions. By stacking such blocks layer-by-layer, our OSNet is extremely lightweight and can be trained from scratch on existing ReID benchmarks. Despite its small model size, OSNet achieves state-of-the-art performance on six person ReID datasets, outperforming most large-sized models, often by a clear margin. Code and models are available at: \url{this https URL}.

371 citations


Cited by
Journal ArticleDOI
08 Dec 2001 - BMJ
TL;DR: There is, I think, something ethereal about i, the square root of minus one: at first it seemed an odd beast, an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

Christopher M. Bishop1
01 Jan 2006
TL;DR: This book presents probability distributions and linear models for regression and classification, along with a discussion of combining models in the context of machine learning.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

01 Jan 2004
TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance, and describes numerous important application areas such as image-based rendering and digital libraries.
Abstract: From the Publisher: The accessible presentation of this book gives both a general view of the entire computer vision enterprise and also offers sufficient detail to be able to build useful applications. Users learn techniques that have proven to be useful by first-hand experience and a wide range of mathematical methods. A CD-ROM with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image-based rendering and digital libraries. Many important algorithms are broken down and illustrated in pseudocode. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.

3,627 citations

Journal ArticleDOI
TL;DR: A novel tracking framework (TLD) is proposed that explicitly decomposes the long-term tracking task into tracking, learning, and detection, together with a novel learning method (P-N learning) that estimates the errors by a pair of “experts”: the P-expert estimates missed detections, and the N-expert estimates false alarms.
Abstract: This paper investigates long-term tracking of unknown objects in a video stream. The object is defined by its location and extent in a single frame. In every frame that follows, the task is to determine the object's location and extent or indicate that the object is not present. We propose a novel tracking framework (TLD) that explicitly decomposes the long-term tracking task into tracking, learning, and detection. The tracker follows the object from frame to frame. The detector localizes all appearances that have been observed so far and corrects the tracker if necessary. The learning estimates the detector's errors and updates it to avoid these errors in the future. We study how to identify the detector's errors and learn from them. We develop a novel learning method (P-N learning) which estimates the errors by a pair of “experts”: (1) P-expert estimates missed detections, and (2) N-expert estimates false alarms. The learning process is modeled as a discrete dynamical system and the conditions under which the learning guarantees improvement are found. We describe our real-time implementation of the TLD framework and the P-N learning. We carry out an extensive quantitative evaluation which shows a significant improvement over state-of-the-art approaches.
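One P-N learning step described above can be sketched with sets of patch identifiers. Representing patches as plain ids and using exact set membership instead of a spatial-overlap test are simplifications assumed here for illustration; the paper operates on image patches and models the process as a discrete dynamical system.

```python
def pn_step(detections, trajectory, positives, negatives):
    """One schematic P-N learning step.
    P-expert: patches on the tracker trajectory that the detector
    missed become positive training examples (missed detections).
    N-expert: detections off the tracked path become negative
    examples (false alarms). Patches are plain ids here."""
    positives |= trajectory - detections   # P-expert corrections
    negatives |= detections - trajectory   # N-expert corrections
    return positives, negatives

pos, neg = pn_step(
    detections={"p1", "p4"},     # what the detector fired on
    trajectory={"p1", "p2"},     # where the tracker followed the object
    positives=set(),
    negatives=set(),
)
```

The two experts make complementary, independent error estimates, which is what lets the framework bound the detector's error growth under the conditions derived in the paper.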

3,137 citations