scispace - formally typeset
Author

Sílvio Filipe

Bio: Sílvio Filipe is an academic researcher from the University of Beira Interior. The author has contributed to research in topics: Image segmentation & Biometrics. The author has an h-index of 7, and has co-authored 11 publications receiving 633 citations.

Papers
Journal ArticleDOI
TL;DR: The main purpose of this paper is to announce the availability of the UBIRIS.v2 database, a multisession iris image database which singularly contains data captured in the visible wavelength, at-a-distance and on-the-move.
Abstract: The iris is regarded as one of the most useful traits for biometric recognition, and the dissemination of nationwide iris-based recognition systems is imminent. However, currently deployed systems rely on heavy imaging constraints to capture near-infrared images with enough quality. Moreover, all of the publicly available iris image databases contain data corresponding to such imaging constraints and are therefore suitable only for evaluating methods designed to operate in this type of environment. The main purpose of this paper is to announce the availability of the UBIRIS.v2 database, a multisession iris image database which singularly contains data captured in the visible wavelength, at-a-distance (between four and eight meters) and on-the-move. This database is freely available to researchers concerned with visible-wavelength iris recognition and will be useful in assessing the feasibility and specifying the constraints of this type of biometric recognition.

482 citations

Proceedings Article
01 Jan 2014
TL;DR: This paper describes and evaluates the keypoint detectors available in a publicly available point cloud library, performs a comparative evaluation on 3D point clouds of real objects, and evaluates the invariance of the 3D keypoint detectors to rotations, scale changes and translations.
Abstract: When processing 3D point cloud data, features must be extracted from a small set of points, usually called keypoints. This is done to avoid the computational complexity required to extract features from all points in a point cloud. There are many keypoint detectors, and this suggests the need for a comparative evaluation. When keypoint detectors are applied to 3D objects, the aim is to detect a few salient structures which can be used, instead of the whole object, for applications like object registration, retrieval and data simplification. In this paper, we describe and evaluate the keypoint detectors available in a publicly available point cloud library, and perform a comparative evaluation on 3D point clouds of real objects. We evaluate the invariance of the 3D keypoint detectors to rotations, scale changes and translations. The evaluation criteria used are the absolute and the relative repeatability rate. Using these criteria, we evaluate the robustness of the detectors with respect to changes of point of view. In our experiments, the method that achieved the best repeatability rate was ISS3D.
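The repeatability criteria described in the abstract can be illustrated with a short sketch (a minimal NumPy version under my own naming and assumptions; the paper's exact distance threshold and handling of scale changes may differ): a keypoint counts as repeated when, after undoing the applied transformation, it falls within a small distance of a keypoint detected on the original cloud.

```python
import numpy as np

def repeatability(orig_kps, trans_kps, transform, epsilon=0.01):
    """Absolute and relative repeatability of a keypoint detector.

    orig_kps:  (N, 3) keypoints detected on the original cloud
    trans_kps: (M, 3) keypoints detected on the transformed cloud
    transform: (4, 4) homogeneous matrix applied to the cloud
    epsilon:   distance below which a keypoint counts as repeated
    """
    # Map the transformed keypoints back into the original frame.
    inv = np.linalg.inv(transform)
    homog = np.hstack([trans_kps, np.ones((len(trans_kps), 1))])
    back = (homog @ inv.T)[:, :3]

    # A keypoint is repeated if some original keypoint lies within epsilon.
    dists = np.linalg.norm(back[:, None, :] - orig_kps[None, :, :], axis=2)
    absolute = int((dists.min(axis=1) <= epsilon).sum())
    return absolute, absolute / len(trans_kps)
```

Here the absolute repeatability is the raw count of repeated keypoints, while the relative repeatability normalizes that count by the number of keypoints detected on the transformed cloud.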

94 citations

Journal ArticleDOI
TL;DR: This article included text and ideas taken by the first author, without acknowledgement, from the following published article: “State-of-the-art in visual attention modeling”, Ali Borji, Laurent Itti, IEEE Transactions on Pattern Analysis and Machine Intelligence 35(1) (2013) 185–207, published online 05/04/12.
Abstract: This article has been retracted by the authors. The article included text and ideas taken by the first author, without acknowledgement, from the following published article: “State-of-the-art in visual attention modeling”, Ali Borji, Laurent Itti, IEEE Transactions on Pattern Analysis and Machine Intelligence 35(1) (2013) 185–207, published online 05/04/12. Most notably:
• In Sect. 3.1 (Biological plausible methods) the following paragraphs or sentences largely derive from the Borji and Itti article: “Rosenholtz (1999), Rosenholtz et al. (2004) designed a model ...”; “In Gu et al. (2005), a saliency map ...”; “Le Meur et al. (2006) proposed ...”; “Kootstra et al. (2008) developed ...”; “Marat et al. (2009) proposed ...”; “Chikkerur et al. (2010) proposed ...”; and “Murray et al. (2011) introduced ...”.
• In Sect. 3.2 (Computational methods) the following paragraphs or sentences largely derive from the Borji and Itti article: “Salah et al. (2002) proposed ...”; “Ramstrom and Christensen (2002) introduced ...”; “In Rao et al. (2002) and Rao (2005), they proposed ...”; “Jodogne and Piater (2007) introduced ...”; “Boccignone (2008) presented ...”; “Rosin (2009) proposed ...”; “Mahadevan and Vasconcelos (2010) presented ...”; and “Wang et al. (2011) introduced ...”.
• In Sect. 3.3 (Hybrid methods) the following paragraphs or sentences largely derive from the Borji and Itti article: “Lee and Yu (1999) proposed ...”; “Peters et al. (2005), Peters and Itti (2007a,b, 2008) trained ...”; “Weights between two nodes ...”; “The model consists of a nonlinear ...”; “Zhang et al. (2007, 2008) proposed ...”; “Pang et al. (2008) presented ...”; “Zhang et al. (2009) extended ...”; and “Li et al. (2010a) presented ...”.
• Section 6 (Discussion) largely derives from, or summarizes ideas presented in, Sects. 2.1, 2.2, 2.4, 2.6 and 3.1–3.8 of the Borji and Itti article.
The first author apologizes for his action.

40 citations

Journal ArticleDOI
TL;DR: A new method for the detection of 3D keypoints on point clouds is presented. The keypoint detector is inspired by the behavior and neural architecture of the primate visual system, and outperforms the other eight 3D keypoint detectors evaluated, achieving the best result in 32 of the evaluated metrics in the category and object recognition experiments.
Abstract: One of the major problems found when developing a 3D recognition system involves the choice of keypoint detector and descriptor. To help solve this problem, we present a new method for the detection of 3D keypoints on point clouds, and we perform benchmarking between each pair of 3D keypoint detector and 3D descriptor to evaluate their performance on object and category recognition. These evaluations are done on a public database of real 3D objects. Our keypoint detector is inspired by the behavior and neural architecture of the primate visual system. The 3D keypoints are extracted based on a bottom-up 3D saliency map, that is, a map that encodes the saliency of objects in the visual environment. The saliency map is determined by computing conspicuity maps (a combination across different modalities) of the orientation, intensity, and color information in a bottom-up and purely stimulus-driven manner. These three conspicuity maps are fused into a 3D saliency map and, finally, the focus of attention (or keypoint location) is sequentially directed to the most salient points in this map. Inhibiting this location automatically allows the system to attend to the next most salient location. The main conclusions are: with a similar average number of keypoints, our 3D keypoint detector outperforms the other eight 3D keypoint detectors evaluated, achieving the best result in 32 of the evaluated metrics in the category and object recognition experiments, whereas the second best detector obtained the best result in only eight of these metrics. The only drawback is the computational time, since biologically inspired 3D keypoint detection based on bottom-up saliency is slower than the other detectors. Given that there are large differences in terms of recognition performance, size and time requirements, the selection of the keypoint detector and descriptor has to be matched to the desired task, and we give some directions to facilitate this choice.
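The sequential focus-of-attention mechanism described in the abstract — pick the most salient location, inhibit it, move on to the next — can be sketched on a 2D map (a hypothetical minimal version; the paper operates on 3D point-cloud saliency maps fused from conspicuity maps, and its inhibition scheme may be more elaborate):

```python
import numpy as np

def select_keypoints(saliency, n_keypoints=5, inhibition_radius=2):
    """Sequentially attend to the most salient cells of a saliency map,
    suppressing a neighbourhood around each pick so that attention moves
    on to the next most salient location (inhibition of return)."""
    s = saliency.astype(float).copy()
    picks = []
    for _ in range(n_keypoints):
        r, c = np.unravel_index(np.argmax(s), s.shape)
        picks.append((int(r), int(c)))
        # Inhibit the chosen location and its surround.
        s[max(0, r - inhibition_radius):r + inhibition_radius + 1,
          max(0, c - inhibition_radius):c + inhibition_radius + 1] = -np.inf
    return picks
```

Because each selected neighbourhood is suppressed, a second peak adjacent to the strongest one is skipped and attention jumps to the next distinct salient region.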

24 citations

Book ChapterDOI
07 Sep 2015
TL;DR: This paper proposes a fully automated surveillance system for human recognition purposes, attained by combining human detection and tracking, further enhanced by a PTZ camera that delivers data with enough quality to perform biometric recognition.
Abstract: Efforts in biometrics are being made to extend robust recognition techniques to in-the-wild scenarios. Nonetheless, and despite being a very attractive goal, human identification in the surveillance context remains an open problem. In this paper, we introduce a novel biometric system – Quis-Campi – that effectively bridges the gap between surveillance and biometric recognition while having a minimum amount of operational restrictions. We propose a fully automated surveillance system for human recognition purposes, attained by combining human detection and tracking, further enhanced by a PTZ camera that delivers data with enough quality to perform biometric recognition. Along with the system concept, implementation details for both hardware and software modules are provided, as well as preliminary results over a real scenario.

21 citations


Cited by
01 Jan 1998
TL;DR: Neurons in the lateral intraparietal area (LIP) have visual responses to stimuli appearing abruptly at particular retinal locations (their receptive fields); the visual representation in LIP is shown to be sparse, with only the most salient or behaviourally relevant objects being strongly represented.
Abstract: When natural scenes are viewed, a multitude of objects that are stable in their environments are brought in and out of view by eye movements. The posterior parietal cortex is crucial for the analysis of space, visual attention and movement. Neurons in one of its subdivisions, the lateral intraparietal area (LIP), have visual responses to stimuli appearing abruptly at particular retinal locations (their receptive fields). We have tested the responses of LIP neurons to stimuli that entered their receptive field by saccades. Neurons had little or no response to stimuli brought into their receptive field by saccades, unless the stimuli were behaviourally significant. We established behavioural significance in two ways: either by making a stable stimulus task-relevant, or by taking advantage of the attentional attraction of an abruptly appearing stimulus. Our results show that under ordinary circumstances the entire visual world is only weakly represented in LIP. The visual representation in LIP is sparse, with only the most salient or behaviourally relevant objects being strongly represented.

1,007 citations

Journal ArticleDOI
TL;DR: This paper compares ten popular local feature descriptors in the contexts of 3D object recognition, 3D shape retrieval, and 3D modeling, and presents the performance results of these descriptors when combined with different 3D keypoint detection methods.
Abstract: A number of 3D local feature descriptors have been proposed in the literature. It is, however, unclear which descriptors are more appropriate for a particular application. A good descriptor should be descriptive, compact, and robust to a set of nuisances. This paper compares ten popular local feature descriptors in the contexts of 3D object recognition, 3D shape retrieval, and 3D modeling. We first evaluate the descriptiveness of these descriptors on eight popular datasets which were acquired using different techniques. We then analyze their compactness using the recall of feature matching per float value in the descriptor. We also test the robustness of the selected descriptors with respect to support radius variations, Gaussian noise, shot noise, varying mesh resolution, distance to the mesh boundary, keypoint localization error, occlusion, clutter, and dataset size. Moreover, we present the performance results of these descriptors when combined with different 3D keypoint detection methods. We finally analyze the computational efficiency for generating each descriptor.
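The recall and compactness measures mentioned in the abstract can be sketched as follows (an illustrative reading of the criteria, with hypothetical names; the paper's exact matching protocol may differ): recall is the fraction of ground-truth keypoint correspondences recovered by nearest-neighbour matching in descriptor space, and compactness normalizes recall by the number of floats in the descriptor.

```python
import numpy as np

def matching_recall(desc_scene, desc_model, correspondences):
    """Fraction of ground-truth keypoint correspondences recovered by
    nearest-neighbour matching in descriptor space."""
    correct = 0
    for i, j in correspondences:
        nearest = np.argmin(np.linalg.norm(desc_model - desc_scene[i], axis=1))
        correct += int(nearest == j)
    return correct / len(correspondences)

def compactness(recall, descriptor_length):
    """Recall obtained per float stored in the descriptor:
    higher means more information packed into fewer values."""
    return recall / descriptor_length
```

Under this reading, a short descriptor that matches nearly as well as a long one scores higher on compactness, which is the trade-off the survey quantifies.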

503 citations

Journal ArticleDOI
20 Apr 2017-Sensors
TL;DR: A simulated deep convolutional neural network for yield estimation in robotic agriculture that counts fruits efficiently even when they are under shadow, occluded by foliage or branches, or overlapping to some degree.
Abstract: Recent years have witnessed significant advancement in computer vision research based on deep learning. Success in these tasks largely depends on the availability of a large amount of training samples, and labeling the training samples is an expensive process. In this paper, we present a simulated deep convolutional neural network for yield estimation. Knowing the exact number of fruits, flowers, and trees helps farmers make better decisions on cultivation practices, plant disease prevention, and the size of the harvest labor force. The current practice of yield estimation based on the manual counting of fruits or flowers by workers is a very time-consuming and expensive process, and it is not practical for big fields. Automatic yield estimation based on robotic agriculture provides a viable solution in this regard. Our network is trained entirely on synthetic data and tested on real data. To capture features on multiple scales, we used a modified version of the Inception-ResNet architecture. Our algorithm counts efficiently even if fruits are under shadow, occluded by foliage or branches, or if there is some degree of overlap amongst fruits. Experimental results show a 91% average test accuracy on real images and 93% on synthetic images.

375 citations

Journal ArticleDOI
TL;DR: An overview of soft biometrics is provided, along with some of the techniques that have been proposed to extract them from image and video data; a taxonomy for organizing and classifying soft biometric attributes is introduced, and their strengths and limitations are enumerated.
Abstract: Recent research has explored the possibility of extracting ancillary information from primary biometric traits, viz., face, fingerprints, hand geometry, and iris. This ancillary information includes personal attributes such as gender, age, ethnicity, hair color, height, weight, and so on. Such attributes are known as soft biometrics and have applications in surveillance and in indexing biometric databases. These attributes can be used in a fusion framework to improve the matching accuracy of a primary biometric system (e.g., fusing face with gender information), or can be used to generate qualitative descriptions of an individual (e.g., young Asian female with dark eyes and brown hair). The latter is particularly useful in bridging the semantic gap between human and machine descriptions of biometric data. In this paper, we provide an overview of soft biometrics and discuss some of the techniques that have been proposed to extract them from image and video data. We also introduce a taxonomy for organizing and classifying soft biometric attributes, and enumerate the strengths and limitations of these attributes in the context of an operational biometric system. Finally, we discuss open research problems in this field. This survey is intended for researchers and practitioners in the field of biometrics.

355 citations

Journal ArticleDOI
TL;DR: It is concluded that systems employing 2D views of 3D data typically surpass voxel-based (3D) deep models, which, however, can perform better with more layers and heavy data augmentation; therefore, larger-scale datasets and increased resolutions are required.
Abstract: Deep learning has recently gained popularity, achieving state-of-the-art performance in tasks involving text, sound, or image processing. Due to its outstanding performance, there have been efforts to apply it in more challenging scenarios, for example, 3D data processing. This article surveys methods applying deep learning on 3D data and provides a classification based on how they exploit it. From the results of the examined works, we conclude that systems employing 2D views of 3D data typically surpass voxel-based (3D) deep models, which, however, can perform better with more layers and heavy data augmentation. Therefore, larger-scale datasets and increased resolutions are required.

269 citations