
Showing papers on "Object (computer science)" published in 2006


Journal ArticleDOI
TL;DR: The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends; it also discusses important issues related to tracking, including the use of appropriate image features, selection of motion models, and detection of objects.
Abstract: The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends. Object tracking, in general, is a challenging problem. Difficulties in tracking objects can arise due to abrupt object motion, changing appearance patterns of both the object and the scene, nonrigid object structures, object-to-object and object-to-scene occlusions, and camera motion. Tracking is usually performed in the context of higher-level applications that require the location and/or shape of the object in every frame. Typically, assumptions are made to constrain the tracking problem in the context of a particular application. In this survey, we categorize the tracking methods on the basis of the object and motion representations used, provide detailed descriptions of representative methods in each category, and examine their pros and cons. Moreover, we discuss the important issues related to tracking including the use of appropriate image features, selection of motion models, and detection of objects.

5,318 citations


Journal ArticleDOI
TL;DR: It is found that on a database of more than 100 categories, the Bayesian approach produces informative models when the number of training examples is too small for other methods to operate successfully.
Abstract: Learning visual models of object categories notoriously requires hundreds or thousands of training examples. We show that it is possible to learn much information about a category from just one, or a handful, of images. The key insight is that, rather than learning from scratch, one can take advantage of knowledge coming from previously learned categories, no matter how different these categories might be. We explore a Bayesian implementation of this idea. Object categories are represented by probabilistic models. Prior knowledge is represented as a probability density function on the parameters of these models. The posterior model for an object category is obtained by updating the prior in the light of one or more observations. We test a simple implementation of our algorithm on a database of 101 diverse object categories. We compare category models learned by an implementation of our Bayesian approach to models learned by maximum likelihood (ML) and maximum a posteriori (MAP) methods. We find that on a database of more than 100 categories, the Bayesian approach produces informative models when the number of training examples is too small for other methods to operate successfully.
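
The Bayesian updating described in the abstract can be summarized by two standard relations; the sketch below uses generic notation (theta for category model parameters, X for the observed training images, x_new for a test image), which is illustrative rather than the paper's own.

```latex
% Posterior over category parameters after observing the training images X;
% the prior p(\theta) carries knowledge from previously learned categories.
\begin{align}
  p(\theta \mid X) &\propto p(X \mid \theta)\, p(\theta) \\
  % Prediction for a new image integrates over the posterior, unlike ML/MAP,
  % which plug in a single point estimate \hat{\theta}.
  p(x_{\mathrm{new}} \mid X) &= \int p(x_{\mathrm{new}} \mid \theta)\, p(\theta \mid X)\, \mathrm{d}\theta
\end{align}
```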

2,976 citations


01 Jul 2006
TL;DR: This memo provides information for the Internet community on JSON, a lightweight, text-based, language-independent data interchange format derived from the ECMAScript Programming Language Standard.
Abstract: JavaScript Object Notation (JSON) is a lightweight, text-based, language-independent data interchange format. It was derived from the ECMAScript Programming Language Standard. JSON defines a small set of formatting rules for the portable representation of structured data. This memo provides information for the Internet community.
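
The small set of formatting rules mentioned above covers objects, arrays, strings, numbers, booleans, and null. A minimal sketch using Python's standard json module; the field names are invented for illustration.

```python
import json

# A JSON text exercising the basic value types: object, array, string, number, boolean, null.
text = '{"id": 101, "name": "object", "tags": ["image", "scene"], "score": 0.87, "public": true, "parent": null}'

record = json.loads(text)          # parse JSON text into Python dict/list/str/float/bool/None
print(record["tags"][0])           # -> "image"

# Serialize back to JSON; object keys and strings are always double-quoted in JSON.
print(json.dumps(record, indent=2))
```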

1,119 citations


Journal ArticleDOI
TL;DR: The object-recognition task has been used to study mutant mice, aging deficits, early developmental influences, nootropic manipulations, teratological drug exposure and novelty seeking.
Abstract: Rats and mice have a tendency to interact more with a novel object than with a familiar object. This tendency has been used by behavioral pharmacologists and neuroscientists to study learning and memory. A popular protocol for such research is the object-recognition task. Animals are first placed in an apparatus and allowed to explore an object. After a prescribed interval, the animal is returned to the apparatus, which now contains the familiar object and a novel object. Object recognition is distinguished by more time spent interacting with the novel object. Although the exact processes that underlie this 'recognition memory' require further elucidation, this method has been used to study mutant mice, aging deficits, early developmental influences, nootropic manipulations, teratological drug exposure and novelty seeking.
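
Recognition in this task is typically quantified from the exploration times of the two objects; the discrimination index below is one commonly used summary, shown here as an illustrative sketch rather than a measure prescribed by the article.

```python
def discrimination_index(t_novel: float, t_familiar: float) -> float:
    """Return a value in [-1, 1]; positive values indicate more time spent on the
    novel object, i.e. evidence of recognition memory.
    (Illustrative formula; the article itself does not prescribe a specific index.)"""
    total = t_novel + t_familiar
    if total == 0:
        raise ValueError("no exploration recorded")
    return (t_novel - t_familiar) / total

# Example: 32 s exploring the novel object vs. 18 s exploring the familiar one.
print(discrimination_index(32.0, 18.0))  # 0.28
```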

1,029 citations


Journal ArticleDOI
17 Jun 2006
TL;DR: This paper provides a framework for placing local object detection in the context of the overall 3D scene by modeling the interdependence of objects, surface orientations, and camera viewpoint, allowing probabilistic object hypotheses to refine geometry and vice versa.
Abstract: Image understanding requires not only individually estimating elements of the visual world but also capturing the interplay among them. In this paper, we provide a framework for placing local object detection in the context of the overall 3D scene by modeling the interdependence of objects, surface orientations, and camera viewpoint. Most object detection methods consider all scales and locations in the image as equally likely. We show that with probabilistic estimates of 3D geometry, both in terms of surfaces and world coordinates, we can put objects into perspective and model the scale and location variance in the image. Our approach reflects the cyclical nature of the problem by allowing probabilistic object hypotheses to refine geometry and vice-versa. Our framework allows painless substitution of almost any object detector and is easily extended to include other aspects of image understanding. Our results confirm the benefits of our integrated approach.

929 citations


Proceedings ArticleDOI
17 Jun 2006
TL;DR: This work computes multiple segmentations of each image, then learns the object classes and chooses the correct segmentations, and demonstrates that such an algorithm succeeds in automatically discovering many familiar objects in a variety of image datasets, including those from Caltech, MSRC and LabelMe.
Abstract: Given a large dataset of images, we seek to automatically determine the visually similar object and scene classes together with their image segmentation. To achieve this we combine two ideas: (i) that a set of segmented objects can be partitioned into visual object classes using topic discovery models from statistical text analysis; and (ii) that visual object classes can be used to assess the accuracy of a segmentation. To tie these ideas together we compute multiple segmentations of each image and then: (i) learn the object classes; and (ii) choose the correct segmentations. We demonstrate that such an algorithm succeeds in automatically discovering many familiar objects in a variety of image datasets, including those from Caltech, MSRC and LabelMe.
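
As a rough sketch of the two ingredients above, the snippet below runs a topic model over bag-of-visual-word histograms of candidate segments and scores each candidate segmentation by how strongly its segments commit to a single topic. The scoring rule is a simplified stand-in, not the paper's actual criterion, and the input is random placeholder data.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)

# Placeholder data: each row is a bag-of-visual-words histogram for one candidate segment.
vocab_size, n_segments = 200, 60
segment_histograms = rng.integers(0, 5, size=(n_segments, vocab_size))

# (i) Discover visual "object classes" as topics over the segment histograms.
lda = LatentDirichletAllocation(n_components=8, random_state=0)
lda.fit(segment_histograms)
topic_mix = lda.transform(segment_histograms)   # per-segment topic distributions

# (ii) Score a candidate segmentation (a set of segment indices) by how peaked each of its
# segments' topic distributions is -- a simplified proxy for "looks like one object class".
def segmentation_score(segment_ids):
    return float(np.mean(topic_mix[segment_ids].max(axis=1)))

candidate_a = [0, 1, 2, 3]       # segments from one segmentation of an image
candidate_b = [4, 5, 6, 7, 8]    # an alternative segmentation of the same image
best = max([candidate_a, candidate_b], key=segmentation_score)
print("preferred segmentation:", best)
```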

737 citations


Patent
08 Mar 2006
TL;DR: In this paper, a distributed, web-services based storage system is described, which includes a web service interface configured to receive, according to a web services protocol, a given client request for access to a given data object, the request including a key value corresponding to the object.
Abstract: A distributed, web-services based storage system. A system may include a web services interface configured to receive, according to a web services protocol, a given client request for access to a given data object, the request including a key value corresponding to the object. The system may also include storage nodes configured to store replicas of the objects, where each replica is accessible via a respective unique locator value, and a keymap instance configured to store a respective keymap entry for each object. For the given object, the respective keymap entry includes the key value and each locator value corresponding to replicas of the object. A coordinator may receive the given client request from the web services interface, responsively access the keymap instance to identify locator values corresponding to the key value and, for a particular locator value, retrieve a corresponding replica from a corresponding storage node.
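
A toy sketch of the key/locator indirection described in the abstract: keys map to the locator values of replicas, and a coordinator resolves a key to a locator and then fetches the replica from a storage node. The class and method names are invented for illustration; this is not the patented system.

```python
class StorageNode:
    def __init__(self):
        self.replicas = {}                      # locator value -> object bytes

    def put(self, locator, data):
        self.replicas[locator] = data

    def get(self, locator):
        return self.replicas[locator]


class Keymap:
    def __init__(self):
        self.entries = {}                       # key value -> list of (locator, node)

    def add_replica(self, key, locator, node):
        self.entries.setdefault(key, []).append((locator, node))

    def locators_for(self, key):
        return self.entries[key]


class Coordinator:
    def __init__(self, keymap):
        self.keymap = keymap

    def handle_get(self, key):
        # Resolve the key to its replica locators, then fetch one replica.
        locator, node = self.keymap.locators_for(key)[0]
        return node.get(locator)


node_a, node_b = StorageNode(), StorageNode()
keymap = Keymap()
node_a.put("loc-1", b"hello")
keymap.add_replica("photos/cat.jpg", "loc-1", node_a)
node_b.put("loc-2", b"hello")
keymap.add_replica("photos/cat.jpg", "loc-2", node_b)

print(Coordinator(keymap).handle_get("photos/cat.jpg"))   # b'hello'
```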

704 citations


Proceedings ArticleDOI
17 Jun 2006
TL;DR: A biologically inspired model of visual object recognition is applied to the multiclass object categorization problem; the model modifies that of Serre, Wolf, and Poggio and demonstrates the value of retaining some position and scale information above the intermediate feature level.
Abstract: We apply a biologically inspired model of visual object recognition to the multiclass object categorization problem. Our model modifies that of Serre, Wolf, and Poggio. As in that work, we first apply Gabor filters at all positions and scales; feature complexity and position/scale invariance are then built up by alternating template matching and max pooling operations. We refine the approach in several biologically plausible ways, using simple versions of sparsification and lateral inhibition. We demonstrate the value of retaining some position and scale information above the intermediate feature level. Using feature selection we arrive at a model that performs better with fewer features. Our final model is tested on the Caltech 101 object categories and the UIUC car localization task, in both cases achieving state-of-the-art performance. The results strengthen the case for using this class of model in computer vision.
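
A compressed sketch of the "Gabor filtering followed by max pooling" pipeline mentioned above, using plain NumPy. Filter parameters and pooling sizes are arbitrary placeholders, and none of the paper's refinements (sparsification, lateral inhibition, feature selection) are included.

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(size=11, theta=0.0, wavelength=4.0, sigma=3.0, gamma=0.5):
    """Real part of a Gabor filter at orientation theta (radians)."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    xr = xs * np.cos(theta) + ys * np.sin(theta)
    yr = -xs * np.sin(theta) + ys * np.cos(theta)
    return np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / wavelength)

def max_pool(x, s):
    """Non-overlapping s x s max pooling (crops to a multiple of s)."""
    h, w = (x.shape[0] // s) * s, (x.shape[1] // s) * s
    return x[:h, :w].reshape(h // s, s, w // s, s).max(axis=(1, 3))

image = np.random.rand(64, 64)              # placeholder grayscale image

# S1-like stage: Gabor responses at several orientations.
# C1-like stage: local max pooling, which builds position/scale tolerance
# while still retaining coarse position information.
responses = [np.abs(convolve2d(image, gabor_kernel(theta=t), mode="same"))
             for t in (0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)]
pooled = [max_pool(r, 4) for r in responses]
print(pooled[0].shape)                      # (16, 16)
```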

539 citations


Patent
Bas Ording1, Scott Forstall1, Greg Christie1, Stephen O. Lemay1, Imran Chaudhri1 
29 Dec 2006
TL;DR: In this article, a portable communication device with multi-touch input detects one or more multi-touch contacts and motions and performs one or more operations on an object based on those contacts and/or motions, such that the object has a resolution less than a pre-determined threshold while an operation is performed on it and a resolution greater than the threshold at other times.
Abstract: A portable communication device with multi-touch input detects one or more multi-touch contacts and motions and performs one or more operations on an object based on the one or more multi-touch contacts and/or motions. The object has a resolution that is less than a pre-determined threshold when the operation is performed on the object, and the object has a resolution that is greater than the pre-determined threshold at other times.

539 citations


Patent
13 Sep 2006
TL;DR: In this paper, the authors present a system for providing an improved three-dimensional graphical user interface, in which computing output can be presented as two or more objects within a 3D virtual space displayed to the user.
Abstract: Methods and systems are provided for providing an improved three-dimensional graphical user interface. In one embodiment, the method generally comprises: receiving an input from an end user, and capturing computing output from at least one computer source in response to the received end-user input. The computing output can be presented as two or more objects within a three-dimensional virtual space displayed to the end user. In one embodiment, the method further comprises generating a timeline that includes an icon for each object presented within the virtual space. In another embodiment, the method further comprises providing a database for storing and categorizing data regarding each object presented within the virtual space.

426 citations


Patent
17 Nov 2006
TL;DR: In this article, a method for detecting direction when interfacing with a computer program is presented, which includes capturing an image presented in front of an image capture device, identifying an object held by the person in the image, and assigning the object an object location in coordinate space.
Abstract: A method for detecting direction when interfacing with a computer program is provided. The method includes capturing an image presented in front of an image capture device. The image capture device has a capture location in a coordinate space. When a person is captured in the image, the method includes identifying a human head in the image and assigning the human head a head location in the coordinate space. The method also includes identifying an object held by the person in the image and assigning the object an object location in coordinate space. The method further includes identifying a relative position in coordinate space between the head location and the object location when viewed from the capture location. The relative position includes a dimension of depth. The method may be practiced on a computer system, such as one used in the gaming field.
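
The geometric core of the method above reduces to comparing the head and object locations in the capture device's coordinate space, including the depth dimension. A minimal sketch; the coordinate conventions and the "pointing direction" interpretation are assumptions for illustration.

```python
import numpy as np

def pointing_direction(head_xyz, object_xyz):
    """Unit vector from the head location to the held-object location in the
    capture device's coordinate space; the z component carries the depth dimension."""
    head = np.asarray(head_xyz, dtype=float)
    obj = np.asarray(object_xyz, dtype=float)
    v = obj - head
    return v / np.linalg.norm(v)

# Example: head at (0.0, 1.6, 2.0) metres from the camera, held object at (0.3, 1.4, 1.5).
print(pointing_direction((0.0, 1.6, 2.0), (0.3, 1.4, 1.5)))
```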

Journal ArticleDOI
TL;DR: The performance of a detection algorithm is illustrated intuitively by performance graphs which present object level precision and recall depending on constraints on detection quality, and a representative single performance value is computed from the graphs.
Abstract: Evaluation of object detection algorithms is a non-trivial task: a detection result is usually evaluated by comparing the bounding box of the detected object with the bounding box of the ground truth object. The commonly used precision and recall measures are computed from the overlap area of these two rectangles. However, these measures have several drawbacks: they don't give intuitive information about the proportion of the correctly detected objects and the number of false alarms, and they cannot be accumulated across multiple images without creating ambiguity in their interpretation. Furthermore, quantitative and qualitative evaluation is often mixed resulting in ambiguous measures. In this paper we propose a new approach which tackles these problems. The performance of a detection algorithm is illustrated intuitively by performance graphs which present object level precision and recall depending on constraints on detection quality. In order to compare different detection algorithms, a representative single performance value is computed from the graphs. The influence of the test database on the detection performance is illustrated by performance/generality graphs. The evaluation method can be applied to different types of object detection algorithms. It has been tested on different text detection algorithms, among which are the participants of the ICDAR 2003 text detection competition.
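
A small sketch of the bounding-box comparison that underlies the precision and recall measures discussed above. Here detections are matched to ground truth by area overlap with a fixed threshold, which is a simplified stand-in for the quality constraints the paper parameterizes.

```python
def overlap_area(a, b):
    """Intersection area of two boxes given as (x1, y1, x2, y2)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)

def iou(a, b):
    """Overlap ratio (intersection over union) of two boxes."""
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    inter = overlap_area(a, b)
    return inter / (area(a) + area(b) - inter)

def precision_recall(detections, ground_truth, min_iou=0.5):
    matched_gt = set()
    true_positives = 0
    for det in detections:
        best_j, best_v = None, 0.0
        for j, gt in enumerate(ground_truth):
            v = iou(det, gt)
            if j not in matched_gt and v > best_v:
                best_j, best_v = j, v
        if best_j is not None and best_v >= min_iou:
            matched_gt.add(best_j)
            true_positives += 1
    precision = true_positives / len(detections) if detections else 1.0
    recall = true_positives / len(ground_truth) if ground_truth else 1.0
    return precision, recall

dets = [(10, 10, 50, 50), (100, 100, 140, 150)]   # one correct detection, one false alarm
gts = [(12, 8, 48, 52), (200, 200, 240, 240)]     # one detected object, one missed object
print(precision_recall(dets, gts))                # (0.5, 0.5)
```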

Patent
28 Jun 2006
TL;DR: In this paper, the authors describe a method for executing a gesture with a user-manipulated physical object in the vicinity of a device, generating data descriptive of the presence of the user-manipulated object during the gesture, and interpreting that data as pertaining to at least one object, such as an object displayed by the device.
Abstract: A method includes executing a gesture with a user-manipulated physical object in the vicinity of a device; generating data that is descriptive of the presence of the user-manipulated object when executing the gesture; and interpreting the data as pertaining to at least one object, such as an object displayed by the device.

Journal ArticleDOI
TL;DR: The proposed method has the robust ability to track the moving object across consecutive frames under several kinds of real-world complex situations, such as the moving object disappearing totally or partially due to occlusion by other objects, fast object motion, changing lighting, changes in the direction and orientation of the moving object, and sudden changes in its velocity.

Patent
Juha Henrik Arrasvuori1
19 Sep 2006
TL;DR: In this paper, the authors present a system that facilitates shopping for a tangible object via a network using a mobile device: a graphical representation of a scene of the local environment is obtained using a sensor of the mobile device, and graphical object data for the tangible object is obtained via the network.
Abstract: Facilitating shopping for a tangible object via a network using a mobile device involves obtaining a graphical representation of a scene of a local environment using a sensor of the mobile device. Graphical object data that enables a three-dimensional representation of the tangible object to be rendered on the mobile device is obtained via the network, in response to a shopping selection. The three-dimensional representation of the tangible object is displayed with the graphical representation of the scene via the mobile device so that the appearance of the tangible object in the scene is simulated.


Journal ArticleDOI
TL;DR: The Fedora architecture is an extensible framework for the storage, management, and dissemination of complex objects and the relationships among them, providing the foundation for a variety of end-user applications for digital libraries, archives, institutional repositories, and learning object systems.
Abstract: The Fedora architecture is an extensible framework for the storage, management, and dissemination of complex objects and the relationships among them. Fedora accommodates the aggregation of local and distributed content into digital objects and the association of services with objects. This allows an object to have several accessible representations, some of them dynamically produced. The architecture includes a generic Resource Description Framework (RDF)-based relationship model that represents relationships among objects and their components. Queries against these relationships are supported by an RDF triple store. The architecture is implemented as a web service, with all aspects of the complex object architecture and related management functions exposed through REST and SOAP interfaces. The implementation is available as open-source software, providing the foundation for a variety of end-user applications for digital libraries, archives, institutional repositories, and learning object systems.
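
The RDF relationship queries mentioned above can be pictured with a generic triple store. The sketch below uses the rdflib library with an invented example namespace and predicate; it is not Fedora's actual relationship ontology or its REST/SOAP interfaces.

```python
from rdflib import Graph, Namespace, URIRef

EX = Namespace("http://example.org/rel#")        # illustrative namespace, not Fedora's
g = Graph()

# Assert relationships among digital objects as (subject, predicate, object) triples.
g.add((URIRef("info:demo/object-1"), EX.isMemberOf, URIRef("info:demo/collection-A")))
g.add((URIRef("info:demo/object-2"), EX.isMemberOf, URIRef("info:demo/collection-A")))

# Query the triple store for all members of the collection.
query = """
    SELECT ?member WHERE { ?member ex:isMemberOf <info:demo/collection-A> . }
"""
for row in g.query(query, initNs={"ex": EX}):
    print(row.member)
```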

Patent
11 Sep 2006
TL;DR: In this article, a method for determining an intensity value of an interaction with a computer program is described, which includes capturing an image of a capture zone, identifying an input object in the image, identifying the initial value of a parameter of the input object, capturing a second image of the capture zone and identifying a second value of the parameter.
Abstract: A method for determining an intensity value of an interaction with a computer program is described. The method and device includes capturing an image of a capture zone, identifying an input object in the image, identifying an initial value of a parameter of the input object, capturing a second image of the capture zone, and identifying a second value of the parameter of the input object. The parameter identifies one or more of a shape, color, or brightness of the input object and is affected by human manipulation of the input object. The extent of change in the parameter is calculated, which is the difference between the second value and the first value. An activity input is provided to the computer program, the activity input including an intensity value representing the extent of change of the parameter. A method for detecting an intensity value from sound generating input objects, and a computer video game are also described. A game controller having LEDs, sound capture and generation, or an accelerometer is also described.
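
As a toy illustration of the "extent of change of a parameter" idea above, the snippet below uses mean brightness of an image region as the tracked parameter and reports the change between two captured frames. The choice of parameter and scaling is illustrative only, not the patented method.

```python
import numpy as np

def region_brightness(frame, region):
    """Mean brightness of a rectangular region (x1, y1, x2, y2) in a grayscale frame."""
    x1, y1, x2, y2 = region
    return float(frame[y1:y2, x1:x2].mean())

def intensity_of_change(frame1, frame2, region):
    """Extent of change of the parameter between two captures of the capture zone."""
    return abs(region_brightness(frame2, region) - region_brightness(frame1, region))

frame_a = np.zeros((120, 160))                  # placeholder frames
frame_b = np.zeros((120, 160))
frame_b[40:80, 50:100] = 0.9                    # the input object brightens in the second capture
print(intensity_of_change(frame_a, frame_b, (50, 40, 100, 80)))   # 0.9
```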

Proceedings ArticleDOI
17 Jun 2006
TL;DR: The Implicit Shape Model for object class detection is combined with the multi-view specific object recognition system of Ferrari et al. to detect object instances from arbitrary viewpoints.
Abstract: We present a novel system for generic object class detection. In contrast to most existing systems which focus on a single viewpoint or aspect, our approach can detect object instances from arbitrary viewpoints. This is achieved by combining the Implicit Shape Model for object class detection proposed by Leibe and Schiele with the multi-view specific object recognition system of Ferrari et al. After learning single-view codebooks, these are interconnected by so-called activation links, obtained through multi-view region tracks across different training views of individual object instances. During recognition, these integrated codebooks work together to determine the location and pose of the object. Experimental results demonstrate the viability of the approach and compare it to a bank of independent single-view detectors

Patent
08 Feb 2006
TL;DR: In this paper, a new class of metrics known as "interestingness" is proposed to rank media objects based on the quantity of user-entered metadata concerning the media object.
Abstract: Media objects, such as images or soundtracks, may be ranked according to a new class of metrics known as “interestingness.” These rankings may be based at least in part on the quantity of user-entered metadata concerning the media object, the number of users who have assigned metadata to the media object, access patterns related to the media object, and/or a lapse of time related to the media object.
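
A hedged sketch of how such a ranking could combine the listed signals (metadata quantity, number of annotating users, access patterns, and elapsed time). The weights and decay are arbitrary illustrative choices, not the patent's actual formula.

```python
import math
from dataclasses import dataclass

@dataclass
class MediaObject:
    title: str
    metadata_count: int     # quantity of user-entered metadata (tags, comments, ...)
    annotating_users: int   # number of distinct users who assigned metadata
    accesses: int           # access-pattern signal, e.g. view count
    age_days: float         # lapse of time since the object appeared

def interestingness(m: MediaObject) -> float:
    # Illustrative weighted combination with a time decay; not the patented metric.
    signal = (1.0 * m.metadata_count
              + 2.0 * m.annotating_users
              + 0.1 * math.log1p(m.accesses))
    return signal / (1.0 + m.age_days / 30.0)

items = [
    MediaObject("sunset.jpg", metadata_count=12, annotating_users=5, accesses=900, age_days=10),
    MediaObject("cat.wav", metadata_count=3, annotating_users=1, accesses=15000, age_days=400),
]
for m in sorted(items, key=interestingness, reverse=True):
    print(round(interestingness(m), 2), m.title)
```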

Patent
14 Mar 2006
TL;DR: In this paper, a system consisting of an illuminating unit and an imaging unit is presented for real-time reconstruction of a three-dimensional map of an object: a generator of a random speckle pattern is accommodated in the optical path of illuminating light propagating from a coherent light source towards the object, thereby projecting a coherent random speckle pattern onto it.
Abstract: A system and method are presented for use in object reconstruction. The system comprises an illuminating unit and an imaging unit (see figure 1). The illuminating unit comprises a coherent light source and a generator of a random speckle pattern accommodated in the optical path of illuminating light propagating from the light source towards an object, thereby projecting onto the object a coherent random speckle pattern. The imaging unit is configured for detecting a light response of an illuminated region and generating image data. The image data is indicative of the object with the projected speckle pattern and thus indicative of a shift of the pattern in the image of the object relative to a reference image of said pattern. This enables real-time reconstruction of a three-dimensional map of the object.
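
The "shift of the pattern relative to a reference image" step can be pictured as per-patch block matching. The sketch below estimates a horizontal shift for one patch by normalized cross-correlation on synthetic data; the conversion from shift to depth is left out, since that mapping depends on the system's triangulation geometry.

```python
import numpy as np

def normalized_correlation(a, b):
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom else 0.0

def estimate_shift(reference, image, y, x, patch=15, max_shift=20):
    """Horizontal shift (in pixels) of the speckle patch at (y, x) relative to the reference."""
    tpl = image[y:y + patch, x:x + patch]
    best_shift, best_score = 0, -1.0
    for s in range(-max_shift, max_shift + 1):
        xs = x + s
        if xs < 0 or xs + patch > reference.shape[1]:
            continue
        score = normalized_correlation(tpl, reference[y:y + patch, xs:xs + patch])
        if score > best_score:
            best_shift, best_score = s, score
    return best_shift

rng = np.random.default_rng(1)
reference = rng.random((100, 120))                   # stand-in for the projected speckle pattern
image = np.roll(reference, 6, axis=1)                # simulate a 6-pixel shift caused by the object
print(estimate_shift(reference, image, y=40, x=50))  # -6: patch matches the reference 6 px to the left
```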

Reference BookDOI
01 Dec 2006
TL;DR: In this article, an object-oriented approach for image analysis is presented, using multispectral remote sensing and multi-scale image analysis techniques, where the parent-child object relations are explored using semantic relations.
Abstract: (Book table of contents) Introduction; Background; Objects and Human Interpretation Process; Object-Oriented Paradigm; Organization of the Book; Multispectral Remote Sensing; Spatial Resolution; Spectral Resolution; Radiometric Resolution; Temporal Resolution; Multispectral Image Analysis; Why an Object-Oriented Approach?; Object Properties; Advantages of Object-Oriented Approach; Creating Objects; Image Segmentation Techniques; Creating and Classifying Objects at Multiple Scales; Object Classification; Creating Multiple Levels; Creating Class Hierarchy and Classifying Objects; Final Classification Using Object Relationships between Levels; Object-Based Image Analysis; Image Analysis Techniques; Supervised Classification Using Multispectral Information; Exploring the Spatial Dimension; Using Contextual Information; Taking Advantage of Morphology Parameters; Taking Advantage of Texture; Adding Temporal Dimension; Advanced Object Image Analysis; Techniques to Control Image Segmentation within eCognition; Multi-Scale Approach for Image Analysis; Objects vs. Spatial Resolution; Exploring the Parent-Child Object Relationships; Using Semantic Relationships; Taking Advantage of Ancillary Data; Accuracy Assessment; Sample Selection; Sampling Techniques; Ground Truth Collection; Accuracy Assessment Measures; References; Index

Patent
10 Oct 2006
TL;DR: In this paper, a system and method for monitoring events derived from a computer target application presentation layer including the steps of providing, independent of recompiling the target application's source code, a script running at a level within a target application.
Abstract: Presented is a system and method for monitoring events derived from a computer target application presentation layer including the steps of providing, independent of recompiling the target application's source code, a script running at a level within the target application. The script scans run-time instantiations of objects of the target application, and allocates structures in real-time to the object instantiations. These allocated structures are adapted to create a reflection of the target application structure, which is used along with detected object instantiations that match a predetermined object structure to capture a portion of an environmental spectrum of the detected object. Further, the system can process state machine events occurring on at least one of a server machine and a client/localized machine, correlate the state machine events with the environmental spectrum, and deduce a user experience based on the correlated state machine events.

Dissertation
01 Jan 2006
TL;DR: The approach couples topic models originally developed for text analysis with spatial transformations, and thus consistently accounts for geometric constraints by building integrated scene models, which may discover contextual relationships, and better exploit partially labeled training images.
Abstract: We develop statistical methods which allow effective visual detection, categorization, and tracking of objects in complex scenes. Such computer vision systems must be robust to wide variations in object appearance; the often small size of training databases, and ambiguities induced by articulated or partially occluded objects. Graphical models provide a powerful framework for encoding the statistical structure of visual scenes, and developing corresponding learning and inference algorithms. In this thesis, we describe several models which integrate graphical representations with nonparametric statistical methods. This approach leads to inference algorithms which tractably recover high-dimensional, continuous object pose variations, and learning procedures which transfer knowledge among related recognition tasks. Motivated by visual tracking problems, we first develop a nonparametric extension of the belief propagation (BP) algorithm. Using Monte Carlo methods, we provide general procedures for recursively updating particle-based approximations of continuous sufficient statistics. Efficient multiscale sampling methods then allow this nonparametric BP algorithm to be flexibly adapted to many different applications. As a particular example, we consider a graphical model describing the hand's three-dimensional (3D) structure, kinematics, and dynamics. This graph encodes global hand pose via the 3D position and orientation of several rigid components, and thus exposes local structure in a high-dimensional articulated model. Applying nonparametric BP, we recover a hand tracking algorithm which is robust to outliers and local visual ambiguities. Via a set of latent occupancy masks, we also extend our approach to consistently infer occlusion events in a distributed fashion. In the second half of this thesis, we develop methods for learning hierarchical models of objects, the parts composing them, and the scenes surrounding them. Our approach couples topic models originally developed for text analysis with spatial transformations, and thus consistently accounts for geometric constraints. By building integrated scene models, we may discover contextual relationships, and better exploit partially labeled training images. We first consider images of isolated objects, and show that sharing parts among object categories improves accuracy when learning from few examples. Turning to multiple object scenes, we propose nonparametric models which use Dirichlet processes to automatically learn the number of parts underlying each object category, and objects composing each scene. Adapting these transformed Dirichlet processes to images taken with a binocular stereo camera, we learn integrated, 3D models of object geometry and appearance. This leads to a Monte Carlo algorithm which automatically infers 3D scene structure from the predictable geometry of known object categories.

Proceedings ArticleDOI
23 May 2006
TL;DR: The ADABU prototype for JAVA has successfully mined models of undocumented behavior from the AspectJ compiler and the Columba email client; the models tend to be small and easily understandable.
Abstract: To learn what constitutes correct program behavior, one can start with normal behavior. We observe actual program executions to construct state machines that summarize object behavior. These state machines, called object behavior models, capture the relationships between two kinds of methods: mutators that change the state (such as add()) and inspectors that keep the state unchanged (such as isEmpty()): "A Vector object initially is in isEmpty() state; after add(), it goes into ¬isEmpty() state". Our ADABU prototype for JAVA has successfully mined models of undocumented behavior from the AspectJ compiler and the Columba email client; the models tend to be small and easily understandable.
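
A toy re-creation of the mutator/inspector idea in Python rather than Java: abstract the object's state with a predicate such as "isEmpty", record which methods change that abstract state, and collect the resulting transitions as a small state machine. The class and helper names are invented; this is not the ADABU tool itself.

```python
class BehaviorModelMiner:
    """Observe calls on an object and summarize them as an abstract state machine."""

    def __init__(self, target, abstraction):
        self.target = target
        self.abstraction = abstraction        # maps the concrete object to an abstract state
        self.transitions = set()              # (state_before, method, state_after)
        self.mutators, self.inspectors = set(), set()

    def call(self, method, *args):
        before = self.abstraction(self.target)
        result = getattr(self.target, method)(*args)
        after = self.abstraction(self.target)
        self.transitions.add((before, method, after))
        (self.mutators if before != after else self.inspectors).add(method)
        return result


# Abstract state of a list-like object: "isEmpty" vs. "notEmpty".
abstraction = lambda obj: "isEmpty" if len(obj) == 0 else "notEmpty"

miner = BehaviorModelMiner([], abstraction)
miner.call("append", 42)      # mutator: isEmpty -> notEmpty
miner.call("count", 42)       # inspector: state unchanged
miner.call("pop")             # mutator: notEmpty -> isEmpty

print("mutators:", miner.mutators)          # {'append', 'pop'}
print("inspectors:", miner.inspectors)      # {'count'}
print("transitions:", miner.transitions)
```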

Patent
Robert M. Morris1, Leet E. Denton1
10 Aug 2006
TL;DR: In this paper, the authors present a system that allows objects to be graphically inserted into the program under development by dragging and dropping associated icons into one of the four views of Output, Map, Multitrack, and Workform.
Abstract: A computer implemented application development (authoring) system permits objects (such as VBX custom controls) to be graphically inserted into the program under development by dragging and dropping associated icons into one of four views. The properties associated with the object may then be assigned settings. Development of a complete application is accomplished by visually arranging, ordering, and interconnecting the objects without the necessity of writing any code. The four views of Output, Map, Multitrack, and Workform may be synchronized so that changes made to the program in one view are simultaneously reflected in all other views. The system generates as output a script listing the objects and their properties which is then executed by a separate run time program. The system permits use of objects written to a standard specification and the addition at any time of additional objects written to that specification. Integration of the objects into the system is achieved by wrapping each object in an “envelope” of system specific properties.

Journal Article
TL;DR: In this paper, a weakly supervised approach is proposed to learn both a model of local part appearance and a model for the spatial relations between those parts, and the results show that the effect on performance depends substantially on the particular object class and on the difficulty of the test dataset.
Abstract: In this paper we investigate a new method of learning part-based models for visual object recognition, from training data that only provides information about class membership (and not object location or configuration). This method learns both a model of local part appearance and a model of the spatial relations between those parts. In contrast, other work using such a weakly supervised learning paradigm has not considered the problem of simultaneously learning appearance and spatial models. Some of these methods use a bag model where only part appearance is considered whereas other methods learn spatial models but only given the output of a particular feature detector. Previous techniques for learning both part appearance and spatial relations have instead used a highly supervised learning process that provides substantial information about object part location. We show that our weakly supervised technique produces better results than these previous highly supervised methods. Moreover, we investigate the degree to which both richer spatial models and richer appearance models are helpful in improving recognition performance. Our results show that while both spatial and appearance information can be useful, the effect on performance depends substantially on the particular object class and on the difficulty of the test dataset.

Patent
Suman Nath1
26 Oct 2006
TL;DR: In this article, a search is conducted on a keyword string of one or more keywords descriptive or otherwise representative of a geographically-relevant object; if a location is identified, geographic-related semantic information of the location is associated with the geographically-relevant object.
Abstract: Techniques for associating geographic-related information with objects are described. In one implementation, a search is conducted on a keyword string of one or more keywords descriptive or otherwise representative of a geographically-relevant object. If a location is identified, geographic-related semantic information of the location is associated with the geographically-relevant object. In some cases, multiple possible locations may be identified as a result of searching the keyword string. If multiple locations are identified, a probable location is determined and then geographic-related semantic information of the probable location is associated with the geographically-relevant object described by the keyword string.

Patent
04 May 2006
TL;DR: In this paper, an agent and gateway together assist a web browser in fetching HTTP contents faster from Internet Web sites over long-latency data links, by coordinating the fetching of selective embedded objects in such a way that an object is ready and available on a host platform before the resident browser requires it.
Abstract: The invention increases performance of HTTP over long-latency links by pre-fetching objects concurrently via aggregated and flow-controlled channels. An agent and gateway together assist a Web browser in fetching HTTP contents faster from Internet Web sites over long-latency data links. The gateway and the agent coordinate the fetching of selective embedded objects in such a way that an object is ready and available on a host platform before the resident browser requires it. The seemingly instantaneous availability of objects to a browser enables it to complete processing the object to request the next object without much wait. Without this instantaneous availability of an embedded object, a browser waits for its request and the corresponding response to traverse a long delay link.
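
The idea of fetching embedded objects concurrently so they are already local when the browser asks for them can be sketched with standard-library tools. This is a generic illustration, not the agent/gateway protocol of the patent, and the URLs are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

def fetch(url: str) -> bytes:
    # One embedded object (image, script, stylesheet, ...).
    with urlopen(url, timeout=10) as resp:
        return resp.read()

def prefetch(urls):
    """Fetch embedded objects concurrently and cache them locally so a later
    request for any of them can be answered without another long-latency round trip."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        return dict(zip(urls, pool.map(fetch, urls)))

# Example with placeholder pages: the cache is populated before the browser needs the objects.
cache = prefetch([
    "https://example.com/",
    "https://example.org/",
])
print({url: len(body) for url, body in cache.items()})
```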

01 Jan 2006
TL;DR: This paper presents a set of metrics and algorithms for performance evaluation of object tracking systems, drawn from statistical detection and estimation theory and tailored to object detection and tracking tasks, using frame-based as well as object-based evaluation paradigms.
Abstract: This paper presents a set of metrics and algorithms for performance evaluation of object tracking systems. Our emphasis is on wide-ranging, robust metrics which can be used for evaluation purposes without inducing any bias towards the evaluation results. The goal is to report a set of unbiased metrics and to leave the final evaluation of the evaluation process to the research community analyzing the results, keeping the human in the loop. We propose metrics from statistical detection and estimation theory tailored to object detection and tracking tasks using frame-based as well as object-based evaluation paradigms. Object correspondences between multiple ground truth objects to multiple tracker result objects are established from a correspondence matrix. The correspondence matrix is built using three different methods of distance computation between trajectories. Results on PETS 2001 data set are presented in terms of 1st and 2nd order statistical descriptors of these metrics.
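
A minimal sketch of the correspondence-matrix step described above: each entry holds a distance between one ground-truth trajectory and one tracker-result trajectory (here, mean Euclidean distance over their common frames, which is just one of the distance choices the paper considers), and correspondences are then read off the matrix.

```python
import numpy as np

def trajectory_distance(gt, tr):
    """Mean Euclidean distance over frames present in both trajectories.
    Trajectories are dicts: frame index -> (x, y) position."""
    common = sorted(set(gt) & set(tr))
    if not common:
        return np.inf
    diffs = [np.subtract(gt[f], tr[f]) for f in common]
    return float(np.mean(np.linalg.norm(diffs, axis=1)))

def correspondence_matrix(gt_tracks, tr_tracks):
    return np.array([[trajectory_distance(g, t) for t in tr_tracks] for g in gt_tracks])

# Two ground-truth objects and two tracker results (toy data).
gt_tracks = [{0: (0, 0), 1: (1, 0), 2: (2, 0)},
             {0: (10, 10), 1: (10, 11)}]
tr_tracks = [{0: (0.5, 0), 1: (1.5, 0), 2: (2.5, 0)},
             {1: (10, 11.2), 2: (10, 12.1)}]

M = correspondence_matrix(gt_tracks, tr_tracks)
print(M)
# Simple greedy reading: each ground-truth object corresponds to its nearest tracker result.
print("correspondences:", M.argmin(axis=1))
```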