scispace - formally typeset
Search or ask a question
Proceedings Article

Identification of great apes using face recognition

01 Aug 2011-pp 922-926
TL;DR: This paper presents a technique for the identification of great apes, in particular chimpanzees, using state-of-the-art algorithms for human face recognition in combination with several classification schemes.
Abstract: In recent years, thousands of species populations declined catastrophically leaving many species on the brink of extinction. Several biological studies have shown that especially primates like chimpanzees and gorillas are threatened. An essential part of effective biodiversity conservation management is population monitoring using remote camera devices. However, due to the large amount of data, the manual analysis of video recordings is extremely time consuming and highly cost intensive. Consequently, there is a high demand for automatic analytical routine procedures using computer vision techniques to overcome this issue. In this paper we present a technique for the identification of great apes, in particular chimpanzees, using state-of-the-art algorithms for human face recognition in combination with several classification schemes. For benchmark purposes we provide a publicly available dataset of captive chimpanzees. In our experiments we applied several common techniques like the well known Eigenfaces, Fisherfaces, Laplacianfaces and Randomfaces approaches to identify individuals. We compare all of these methods in combination with the classification approaches Nearest Neighbor (NN), Support Vector Machine (SVM) and a new concept for face recognition, Sparse Representation Classification (SRC) based on Compressive Sensing (CS).
Citations
More filters
Journal ArticleDOI
TL;DR: The proposed convolutional neural network (CNN)-based method has a single and generic trained architecture with promising performance for fish species identification.

171 citations

Journal ArticleDOI
TL;DR: An automated framework for photo identification of chimpanzees including face detection, face alignment, and face recognition is presented, which can be used by biologists, researchers, and gamekeepers to estimate population sizes faster and more precisely than the current frameworks.
Abstract: Due to the ongoing biodiversity crisis, many species including great apes like chimpanzees are on the brink of extinction. Consequently, there is an urgent need to protect the remaining populations of threatened species. To overcome the catastrophic decline of biodiversity, biologists and gamekeepers recently started to use remote cameras and recording devices for wildlife monitoring in order to estimate the size of remaining populations. However, the manual analysis of the resulting image and video material is extremely tedious, time consuming, and cost intensive. To overcome the burden of time-consuming routine work, we have recently started to develop computer vision algorithms for automated chimpanzee detection and identification of individuals. Based on the assumption that humans and great apes share similar properties of the face, we proposed to adapt and extend face detection and recognition algorithms, originally developed to recognize humans, for chimpanzee identification. In this paper we do not only summarize our earlier work in the field, we also extend our previous approaches towards a more robust system which is less prone to difficult lighting situations, various poses, and expressions as well as partial occlusion by branches, leafs, or other individuals. To overcome the limitations of our previous work, we combine holistic global features and locally extracted descriptors using a decision fusion scheme. We present an automated framework for photo identification of chimpanzees including face detection, face alignment, and face recognition. We thoroughly evaluate our proposed algorithms on two datasets of captive and free-living chimpanzee individuals which were annotated by experts. In three experiments we show that the presented framework outperforms previous approaches in the field of great ape identification and achieves promising results. Therefore, our system can be used by biologists, researchers, and gamekeepers to estimate population sizes faster and more precisely than the current frameworks. Thus, the proposed framework for chimpanzee identification has the potential to open up new venues in efficient wildlife monitoring and can help researches to develop innovative protection schemes in the future.

69 citations

Proceedings ArticleDOI
01 Oct 2017
TL;DR: In this article, a system for automatic interpretation of sightings of individual western lowland gorillas (Gorilla gorilla gorilla) as captured in facial field photography in the wild is described.
Abstract: In this paper we report on the context and evaluation of a system for an automatic interpretation of sightings of individual western lowland gorillas (Gorilla gorilla gorilla) as captured in facial field photography in the wild. This effort aligns with a growing need for effective and integrated monitoring approaches for assessing the status of biodiversity at high spatio-temporal scales. Manual field photography and the utilisation of autonomous camera traps have already transformed the way ecological surveys are conducted. In principle, many environments can now be monitored continuously, and with a higher spatio-temporal resolution than ever before. Yet, the manual effort required to process photographic data to derive relevant information delimits any large scale application of this methodology. The described system applies existing computer vision techniques including deep convolutional neural networks to cover the tasks of detection and localisation, as well as individual identification of gorillas in a practically relevant setup. We evaluate the approach on a relatively large and challenging data corpus of 12,765 field images of 147 individual gorillas with image-level labels (i.e. missing bounding boxes) photographed at Mbeli Bai at the Nouabal-Ndoki National Park, Republic of Congo. Results indicate a facial detection rate of 90.8% AP and an individual identification accuracy for ranking within the Top 5 set of 80.3%. We conclude that, whilst keeping the human in the loop is critical, this result is practically relevant as it exemplifies model transferability and has the potential to assist manual identification efforts. We argue further that there is significant need towards integrating computer vision deeper into ecological sampling methodologies and field practice to move the discipline forward and open up new research horizons.

58 citations

Journal ArticleDOI
TL;DR: In this paper, the authors used a mark-recapture method to estimate the number of mountain gorillas in Bwindi Impenetrable National Park, Uganda, and found that a notable proportion of gorillas were missed in either of the two sweeps (minimum 35% and 31%, respectively).

45 citations

Posted Content
TL;DR: A system for identifying elephants in the face of a large number of individuals with only few training images per individual is presented, combining object part localization, off-the-shelf CNN features, and support vector machine classification to provide field researches with proposals of possible individuals given new images of an elephant.
Abstract: Identifying animals from a large group of possible individuals is very important for biodiversity monitoring and especially for collecting data on a small number of particularly interesting individuals, as these have to be identified first before this can be done. Identifying them can be a very time-consuming task. This is especially true, if the animals look very similar and have only a small number of distinctive features, like elephants do. In most cases the animals stay at one place only for a short period of time during which the animal needs to be identified for knowing whether it is important to collect new data on it. For this reason, a system supporting the researchers in identifying elephants to speed up this process would be of great benefit. In this paper, we present such a system for identifying elephants in the face of a large number of individuals with only few training images per individual. For that purpose, we combine object part localization, off-the-shelf CNN features, and support vector machine classification to provide field researches with proposals of possible individuals given new images of an elephant. The performance of our system is demonstrated on a dataset comprising a total of 2078 images of 276 individual elephants, where we achieve 56% top-1 test accuracy and 80% top-10 accuracy. To deal with occlusion, varying viewpoints, and different poses present in the dataset, we furthermore enable the analysts to provide the system with multiple images of the same elephant to be identified and aggregate confidence values generated by the classifier. With that, our system achieves a top-1 accuracy of 74% and a top-10 accuracy of 88% on the held-out test dataset.

34 citations

References
More filters
Journal ArticleDOI
TL;DR: A face recognition algorithm which is insensitive to large variation in lighting direction and facial expression is developed, based on Fisher's linear discriminant and produces well separated classes in a low-dimensional subspace, even under severe variations in lighting and facial expressions.
Abstract: We develop a face recognition algorithm which is insensitive to large variation in lighting direction and facial expression. Taking a pattern classification approach, we consider each pixel in an image as a coordinate in a high-dimensional space. We take advantage of the observation that the images of a particular face, under varying illumination but fixed pose, lie in a 3D linear subspace of the high dimensional image space-if the face is a Lambertian surface without shadowing. However, since faces are not truly Lambertian surfaces and do indeed produce self-shadowing, images will deviate from this linear subspace. Rather than explicitly modeling this deviation, we linearly project the image into a subspace in a manner which discounts those regions of the face with large deviation. Our projection method is based on Fisher's linear discriminant and produces well separated classes in a low-dimensional subspace, even under severe variation in lighting and facial expressions. The eigenface technique, another method based on linearly projecting the image space to a low dimensional subspace, has similar computational requirements. Yet, extensive experimental results demonstrate that the proposed "Fisherface" method has error rates that are lower than those of the eigenface technique for tests on the Harvard and Yale face databases.

11,674 citations

Journal ArticleDOI
TL;DR: This work considers the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise, and proposes a general classification algorithm for (image-based) object recognition based on a sparse representation computed by C1-minimization.
Abstract: We consider the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise. We cast the recognition problem as one of classifying among multiple linear regression models and argue that new theory from sparse signal representation offers the key to addressing this problem. Based on a sparse representation computed by C1-minimization, we propose a general classification algorithm for (image-based) object recognition. This new framework provides new insights into two crucial issues in face recognition: feature extraction and robustness to occlusion. For feature extraction, we show that if sparsity in the recognition problem is properly harnessed, the choice of features is no longer critical. What is critical, however, is whether the number of features is sufficiently large and whether the sparse representation is correctly computed. Unconventional features such as downsampled images and random projections perform just as well as conventional features such as eigenfaces and Laplacianfaces, as long as the dimension of the feature space surpasses certain threshold, predicted by the theory of sparse representation. This framework can handle errors due to occlusion and corruption uniformly by exploiting the fact that these errors are often sparse with respect to the standard (pixel) basis. The theory of sparse representation helps predict how much occlusion the recognition algorithm can handle and how to choose the training images to maximize robustness to occlusion. We conduct extensive experiments on publicly available databases to verify the efficacy of the proposed algorithm and corroborate the above claims.

9,658 citations


"Identification of great apes using ..." refers background or methods in this paper

  • ...For Randomfaces, we used a feature dimension of 540 as suggested by [10] for high recognition results....

    [...]

  • ...Recently, SRC has been successfully applied to face recognition and promising results were obtained even under difficult lighting conditions and partial occlusion [10]....

    [...]

  • ...Recently, also randomly generated projection matrices, so called Randomfaces, in combination with a Compressive Sensing (CS) framework for classification, also known as Sparse Representation Classification (SRC), achieved excellent results even under difficult conditions like varying lighting or occlusion [10]....

    [...]

  • ...A detailed description of the Compressive Sensing based face recognition algorithm and Sparse Representation Classification in general can be found in [10]....

    [...]

Journal ArticleDOI
TL;DR: If the objects of interest are sparse in a fixed basis or compressible, then it is possible to reconstruct f to within very high accuracy from a small number of random measurements by solving a simple linear program.
Abstract: Suppose we are given a vector f in a class FsubeRopfN , e.g., a class of digital signals or digital images. How many linear measurements do we need to make about f to be able to recover f to within precision epsi in the Euclidean (lscr2) metric? This paper shows that if the objects of interest are sparse in a fixed basis or compressible, then it is possible to reconstruct f to within very high accuracy from a small number of random measurements by solving a simple linear program. More precisely, suppose that the nth largest entry of the vector |f| (or of its coefficients in a fixed basis) obeys |f|(n)lesRmiddotn-1p/, where R>0 and p>0. Suppose that we take measurements yk=langf# ,Xkrang,k=1,...,K, where the Xk are N-dimensional Gaussian vectors with independent standard normal entries. Then for each f obeying the decay estimate above for some 0

6,342 citations

Posted Content
TL;DR: In this article, it was shown that if the objects of interest are sparse or compressible in the sense that the reordered entries of a signal $f \in {\cal F}$ decay like a power-law, then it is possible to reconstruct $f$ to within very high accuracy from a small number of random measurements.
Abstract: Suppose we are given a vector $f$ in $\R^N$. How many linear measurements do we need to make about $f$ to be able to recover $f$ to within precision $\epsilon$ in the Euclidean ($\ell_2$) metric? Or more exactly, suppose we are interested in a class ${\cal F}$ of such objects--discrete digital signals, images, etc; how many linear measurements do we need to recover objects from this class to within accuracy $\epsilon$? This paper shows that if the objects of interest are sparse or compressible in the sense that the reordered entries of a signal $f \in {\cal F}$ decay like a power-law (or if the coefficient sequence of $f$ in a fixed basis decays like a power-law), then it is possible to reconstruct $f$ to within very high accuracy from a small number of random measurements.

5,693 citations

Proceedings ArticleDOI
03 Jun 1991
TL;DR: An approach to the detection and identification of human faces is presented, and a working, near-real-time face recognition system which tracks a subject's head and then recognizes the person by comparing characteristics of the face to those of known individuals is described.
Abstract: An approach to the detection and identification of human faces is presented, and a working, near-real-time face recognition system which tracks a subject's head and then recognizes the person by comparing characteristics of the face to those of known individuals is described. This approach treats face recognition as a two-dimensional recognition problem, taking advantage of the fact that faces are normally upright and thus may be described by a small set of 2-D characteristic views. Face images are projected onto a feature space ('face space') that best encodes the variation among known face images. The face space is defined by the 'eigenfaces', which are the eigenvectors of the set of faces; they do not necessarily correspond to isolated features such as eyes, ears, and noses. The framework provides the ability to learn to recognize new faces in an unsupervised manner. >

5,489 citations


"Identification of great apes using ..." refers methods in this paper

  • ...The famous Eigenfaces approach, introduced by Turk and Pentland in the early nineties [9], is one of the approaches for dimensionality reduction....

    [...]

  • ...The most famous methods for this purpose are Principle Component Analysis (Eigenfaces) [9], Linear Discriminant Analysis (Fisherfaces) [2] and Locality Preserving Projections (Laplacianfaces) [7]....

    [...]