Author

Marc Lalonde

Bio: Marc Lalonde is an academic researcher from McGill University. The author has contributed to research in the topics of image processing and video tracking. The author has an h-index of 9, and has co-authored 27 publications receiving 876 citations.

Papers
Journal ArticleDOI
TL;DR: Reports on the design and test of an image processing algorithm for localizing the optic disk (OD) in low-resolution (about 20 µm/pixel) color fundus images; a confidence level associated with the final detection indicates the "level of difficulty" the detector had in identifying the OD position and shape.
Abstract: Reports on the design and test of an image processing algorithm for the localization of the optic disk (OD) in low-resolution (about 20 µm/pixel) color fundus images. The design relies on the combination of two procedures: 1) a Hausdorff-based template matching technique on edge maps, guided by 2) a pyramidal decomposition for large-scale object tracking. The two approaches are tested against a database of 40 images of various visual quality and retinal pigmentation, as well as of normal and small pupils. An average error of 7% on OD center positioning is reached with no false detection. In addition, a confidence level is associated with the final detection that indicates the "level of difficulty" the detector had in identifying the OD position and shape.

413 citations
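The Hausdorff-based matching named above compares a template's edge points to an image's edge points. A minimal sketch of the symmetric Hausdorff distance on two point sets (an illustration only, not the paper's implementation, which combines this measure with a pyramidal search over the full fundus image):

```python
import numpy as np

def directed_hausdorff(A, B):
    """Max over points of A of the distance to the nearest point of B."""
    # Pairwise distance matrix between the two point sets (|A| x |B|).
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    return d.min(axis=1).max()

def hausdorff(A, B):
    """Symmetric Hausdorff distance between edge-point sets A and B."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))

template = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
edges    = np.array([[0.1, 0.0], [1.0, 0.1], [0.0, 0.9]])
print(hausdorff(template, edges))  # small value -> good template match
```

A template position is scored by the distance between its edge points and the nearby image edges; low distance means a likely OD location.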

Proceedings ArticleDOI
03 Jul 2001
TL;DR: An overview of the design and test of an image processing procedure for detecting all important anatomical structures (optic disk, macula, and retinal vessel network) in color fundus images.
Abstract: We present an overview of the design and test of an image processing procedure for detecting all important anatomical structures in color fundus images. These structures are the optic disk, the macula and the retinal network. The algorithm proceeds through five main steps: (1) automatic mask generation using pixel value statistics and color thresholds, (2) visual image quality assessment using histogram matching and Canny edge distribution modeling, (3) optic disk localization using pyramidal decomposition, Hausdorff-based template matching and confidence assignment, (4) macula localization using pyramidal decomposition and (5) vessel network tracking using recursive dual edge tracking and connectivity recovering. The procedure has been tested on a database of about 40 color fundus images acquired from a digital non-mydriatic fundus camera. The database is composed of images of various types (macula- and optic disk-centered) and of various visual quality (with or without abnormal bright or dark regions, blurred, etc.). © 2001 SPIE--The International Society for Optical Engineering.

191 citations
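Steps (3) and (4) both rely on pyramidal decomposition: localize coarsely at a reduced scale, then refine. A toy sketch of the coarse-to-fine idea using a simple box-filter pyramid (illustrative only; the paper's actual decomposition and matching differ):

```python
import numpy as np

def downsample(img):
    """Average 2x2 blocks and keep every second pixel (one pyramid step)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2] +
            img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def pyramid(img, levels):
    """List of images from full resolution down to the coarsest level."""
    out = [img]
    for _ in range(levels - 1):
        out.append(downsample(out[-1]))
    return out

img = np.zeros((16, 16))
img[4:8, 4:8] = 1.0                    # a bright square standing in for the OD
coarse = pyramid(img, 3)[-1]           # 4x4 coarsest level
y, x = np.unravel_index(coarse.argmax(), coarse.shape)
print((y * 4, x * 4))                  # coarse estimate mapped to full res
```

Searching the coarsest level first shrinks the search space; the estimate is then refined at finer levels around the mapped-back position.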

Proceedings ArticleDOI
28 May 2007
TL;DR: This paper reports on the implementation of a GPU-based, real-time eye blink detector on very low contrast images acquired under near-infrared illumination that is part of a multi-sensor data acquisition and analysis system for driver performance assessment and training.
Abstract: This paper reports on the implementation of a GPU-based, real-time eye blink detector on very low contrast images acquired under near-infrared illumination. This detector is part of a multi-sensor data acquisition and analysis system for driver performance assessment and training. Eye blinks are detected inside regions of interest that are aligned with the subject's eyes at initialization. Alignment is maintained through time by tracking SIFT feature points that are used to estimate the affine transformation between the initial face pose and the pose in subsequent frames. The GPU implementation of the SIFT feature point extraction algorithm ensures real-time processing. An eye blink detection rate of 97% is obtained on a video dataset of 33,000 frames showing 237 blinks from 22 subjects.

123 citations
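The pose tracking described above amounts to estimating an affine transformation from matched feature points. A hedged least-squares sketch of that step (the GPU SIFT extraction itself is not reproduced; point sets here are made up):

```python
import numpy as np

def fit_affine(src, dst):
    """Solve dst ~= A @ src + t in least squares; return the 2x3 map [A | t]."""
    n = len(src)
    X = np.hstack([src, np.ones((n, 1))])   # homogeneous source points, n x 3
    # One least-squares solve covers both output coordinates at once.
    M, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return M.T                              # 2 x 3 affine matrix

src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
dst = src * 2.0 + np.array([3.0, 5.0])      # a known scale + translation
M = fit_affine(src, dst)
print(np.round(M, 3))                       # recovers [[2, 0, 3], [0, 2, 5]]
```

Applying the fitted transform to the initial eye regions keeps them aligned with the face in later frames.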

Proceedings ArticleDOI
27 Apr 2007
TL;DR: The paper reports on the development of a software module for autonomous object detection, recognition and tracking in outdoor urban environments, and on its operational uses within the commercial system.
Abstract: The paper reports on the development of a software module that allows autonomous object detection, recognition and tracking in an outdoor urban environment. The purpose of the project was to endow a commercial PTZ (pan-tilt-zoom) camera with object tracking and recognition capability to automate some surveillance tasks. The module can discriminate between various moving objects, identify the presence of pedestrians or vehicles, track them, and zoom on them, in near real-time. The paper gives an overview of the module characteristics and its operational uses within the commercial system.

47 citations

Proceedings ArticleDOI
25 Aug 1996
TL;DR: A system that allows the user to input maps into a geographic information system (GIS) by using automatic symbol and line recognition based on the Hausdorff distance and neural networks is presented.
Abstract: We present a system that allows the user to input maps into a geographic information system (GIS) by using automatic symbol and line recognition. The system is composed of a user interface, a symbol recognition engine, a knowledge base and a database. The recognition is based on the Hausdorff distance and neural networks, where our main contribution is to make the recognition efficient and robust for handling very large maps and many symbols of different scales and orientations. The system allows for efficient and coherent management of maps, recognition processes and recognition results.

28 citations


Cited by
Journal ArticleDOI
TL;DR: A method is presented for automated segmentation of vessels in two-dimensional color images of the retina based on extraction of image ridges, which coincide approximately with vessel centerlines, which is compared with two recently published rule-based methods.
Abstract: A method is presented for automated segmentation of vessels in two-dimensional color images of the retina. This method can be used in computer analyses of retinal images, e.g., in automated screening for diabetic retinopathy. The system is based on extraction of image ridges, which coincide approximately with vessel centerlines. The ridges are used to compose primitives in the form of line elements. With the line elements an image is partitioned into patches by assigning each image pixel to the closest line element. Every line element constitutes a local coordinate frame for its corresponding patch. For every pixel, feature vectors are computed that make use of properties of the patches and the line elements. The feature vectors are classified using a kNN-classifier and sequential forward feature selection. The algorithm was tested on a database consisting of 40 manually labeled images. The method achieves an area under the receiver operating characteristic curve of 0.952. The method is compared with two recently published rule-based methods of Hoover et al. and Jiang et al. The results show that our method is significantly better than the two rule-based methods (p<0.01). The accuracy of our method is 0.944 versus 0.947 for a second observer.

3,416 citations
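The per-pixel kNN classification step above can be sketched in a few lines. A toy illustration with two hypothetical feature dimensions (the paper's actual ridge-derived features are richer and selected by sequential forward selection):

```python
import numpy as np

def knn_predict(train_X, train_y, query, k=3):
    """Label a query feature vector by majority vote of its k nearest samples."""
    d = np.linalg.norm(train_X - query, axis=1)   # Euclidean distances
    nearest = train_y[np.argsort(d)[:k]]          # labels of k closest points
    return np.bincount(nearest).argmax()          # majority vote

# Made-up 2-D features, e.g. a ridge-strength-like and a contrast-like value.
train_X = np.array([[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]])
train_y = np.array([1, 1, 0, 0])                  # 1 = vessel, 0 = background
print(knn_predict(train_X, train_y, np.array([0.85, 0.85])))  # -> 1
```

Running this vote for every pixel's feature vector yields the vessel/background segmentation map.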

Proceedings ArticleDOI
29 Sep 2007
TL;DR: This paper uses a bag of words approach to represent videos, and presents a method to discover relationships between spatio-temporal words in order to better describe the video data.
Abstract: In this paper we introduce a 3-dimensional (3D) SIFT descriptor for video or 3D imagery such as MRI data. We also show how this new descriptor is able to better represent the 3D nature of video data in the application of action recognition. This paper will show how 3D SIFT is able to outperform previously used description methods in an elegant and efficient manner. We use a bag of words approach to represent videos, and present a method to discover relationships between spatio-temporal words in order to better describe the video data.

1,757 citations
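The bag-of-words video representation mentioned above quantizes local descriptors against a codebook and counts word occurrences. A minimal sketch with toy 2-D stand-ins for 3D SIFT descriptors:

```python
import numpy as np

def bag_of_words(descriptors, codebook):
    """L1-normalized histogram of nearest-codeword assignments."""
    # Distance from every descriptor to every codeword (n x k matrix).
    d = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = d.argmin(axis=1)                      # hard assignment
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

codebook = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
descs = np.array([[0.1, 0.0], [0.9, 1.1], [1.1, 0.9], [2.0, 0.1]])
print(bag_of_words(descs, codebook))  # [0.25, 0.5, 0.25]
```

Each video becomes one such histogram, so clips of different lengths can be compared with a single fixed-size vector.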

Journal ArticleDOI
TL;DR: A supervised blood vessel detection method that uses a neural network scheme for pixel classification, representing each pixel by a 7-D vector of gray-level and moment invariants-based features; it is suitable for retinal image computer analyses such as automated screening for early diabetic retinopathy detection.
Abstract: This paper presents a new supervised method for blood vessel detection in digital retinal images. This method uses a neural network (NN) scheme for pixel classification and computes a 7-D vector composed of gray-level and moment invariants-based features for pixel representation. The method was evaluated on the publicly available DRIVE and STARE databases, widely used for this purpose, since they contain retinal images where the vascular structure has been precisely marked by experts. Method performance on both sets of test images is better than other existing solutions in the literature. The method proves especially accurate for vessel detection in STARE images. Its application to this database (even when the NN was trained on the DRIVE database) outperforms all analyzed segmentation approaches. Its effectiveness and robustness with different image conditions, together with its simplicity and fast implementation, make this blood vessel segmentation proposal suitable for retinal image computer analyses such as automated screening for early diabetic retinopathy detection.

913 citations

Book
20 Apr 2009
TL;DR: This book and the accompanying website focus on template matching, a subset of object recognition techniques of wide applicability, which has proved to be particularly effective for face recognition applications.
Abstract: The detection and recognition of objects in images is a key research topic in the computer vision community. Within this area, face recognition and interpretation has attracted increasing attention owing to the possibility of unveiling human perception mechanisms, and for the development of practical biometric systems. This book and the accompanying website focus on template matching, a subset of object recognition techniques of wide applicability, which has proved to be particularly effective for face recognition applications. Using examples from face processing tasks throughout the book to illustrate more general object recognition approaches, Roberto Brunelli: examines the basics of digital image formation, highlighting points critical to the task of template matching; presents basic and advanced template matching techniques, targeting grey-level images, shapes and point sets; discusses recent pattern classification paradigms from a template matching perspective; illustrates the development of a real face recognition system; explores the use of advanced computer graphics techniques in the development of computer vision algorithms. Template Matching Techniques in Computer Vision is primarily aimed at practitioners working on the development of systems for effective object recognition such as biometrics, robot navigation, multimedia retrieval and landmark detection. It is also of interest to graduate students undertaking studies in these areas.

721 citations

Book ChapterDOI
06 Sep 2014
TL;DR: A Convolutional Neural Network classifier for text spotting in natural images, combined with a method for automated data mining of Flickr that generates word- and character-level annotations, is used to form an end-to-end, state-of-the-art text spotting system.
Abstract: The goal of this work is text spotting in natural images. This is divided into two sequential tasks: detecting word regions in the image, and recognizing the words within these regions. We make the following contributions: first, we develop a Convolutional Neural Network (CNN) classifier that can be used for both tasks. The CNN has a novel architecture that enables efficient feature sharing (by using a number of layers in common) for text detection, character case-sensitive and insensitive classification, and bigram classification. It exceeds the state-of-the-art performance for all of these. Second, we make a number of technical changes over traditional CNN architectures, including no downsampling for a per-pixel sliding window, and multi-mode learning with a mixture of linear models (maxout). Third, we develop a method for automated data mining of Flickr that generates word- and character-level annotations. Finally, these components are used together to form an end-to-end, state-of-the-art text spotting system. We evaluate the text-spotting system on two standard benchmarks, the ICDAR Robust Reading data set and the Street View Text data set, and demonstrate improvements over the state-of-the-art on multiple measures.

681 citations
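The "mixture of linear models (maxout)" mentioned above takes the elementwise maximum over several affine maps of the input. A small sketch with arbitrary illustrative shapes (not the paper's trained network):

```python
import numpy as np

def maxout(x, W, b):
    """Maxout unit: W has shape (pieces, out, in), b has shape (pieces, out).
    Applies one affine map per piece, then takes the max across pieces."""
    z = np.einsum('poi,i->po', W, x) + b   # (pieces, out) pre-activations
    return z.max(axis=0)                   # elementwise max over pieces

# With two opposite-signed linear pieces, maxout reproduces |x|.
W = np.zeros((2, 1, 1))
W[0, 0, 0], W[1, 0, 0] = 1.0, -1.0
b = np.zeros((2, 1))
print(maxout(np.array([-2.0]), W, b))  # [2.]
```

Because the max of affine pieces is a learned piecewise-linear activation, maxout subsumes common fixed nonlinearities such as ReLU and absolute value.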