Showing papers on "Object (computer science)" published in 2005


Journal ArticleDOI
TL;DR: A computationally efficient framework for part-based modeling and recognition of objects, motivated by the pictorial structure models introduced by Fischler and Elschlager, that allows for qualitative descriptions of visual appearance and is suitable for generic recognition problems.
Abstract: In this paper we present a computationally efficient framework for part-based modeling and recognition of objects. Our work is motivated by the pictorial structure models introduced by Fischler and Elschlager. The basic idea is to represent an object by a collection of parts arranged in a deformable configuration. The appearance of each part is modeled separately, and the deformable configuration is represented by spring-like connections between pairs of parts. These models allow for qualitative descriptions of visual appearance, and are suitable for generic recognition problems. We address the problem of using pictorial structure models to find instances of an object in an image as well as the problem of learning an object model from training examples, presenting efficient algorithms in both cases. We demonstrate the techniques by learning models that represent faces and human bodies and using the resulting models to locate the corresponding objects in novel images.

2,514 citations


Patent
30 Sep 2005
TL;DR: Proximity based systems and methods that are implemented on an electronic device are disclosed. The method includes sensing an object spaced away and in close proximity to the electronic device.
Abstract: Proximity based systems and methods that are implemented on an electronic device are disclosed. The method includes sensing an object spaced away and in close proximity to the electronic device. The method also includes performing an action in the electronic device when an object is sensed.

1,337 citations


Journal ArticleDOI
TL;DR: This paper presents an online feature selection mechanism for evaluating multiple features while tracking and adjusting the set of features used to improve tracking performance, and notes susceptibility of the variance ratio feature selection method to distraction by spatially correlated background clutter.
Abstract: This paper presents an online feature selection mechanism for evaluating multiple features while tracking and adjusting the set of features used to improve tracking performance. Our hypothesis is that the features that best discriminate between object and background are also best for tracking the object. Given a set of seed features, we compute log likelihood ratios of class conditional sample densities from object and background to form a new set of candidate features tailored to the local object/background discrimination task. The two-class variance ratio is used to rank these new features according to how well they separate sample distributions of object and background pixels. This feature evaluation mechanism is embedded in a mean-shift tracking system that adaptively selects the top-ranked discriminative features for tracking. Examples are presented that demonstrate how this method adapts to changing appearances of both tracked object and scene background. We note susceptibility of the variance ratio feature selection method to distraction by spatially correlated background clutter and develop an additional approach that seeks to minimize the likelihood of distraction.
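
The ranking step described above can be sketched as follows. This is a minimal illustration, assuming each candidate feature is summarized by a pair of histograms (object pixels and background pixels); the function name `variance_ratio`, the clamp `delta`, and the small constant added to the denominator are assumptions made for the example, not details given in the abstract.

```python
import math

def variance_ratio(obj_hist, bg_hist, delta=1e-3):
    """Score one feature by the two-class variance ratio of its log likelihood ratio."""
    # Normalize the raw histograms into discrete densities p (object) and q (background).
    p_total, q_total = sum(obj_hist), sum(bg_hist)
    p = [h / p_total for h in obj_hist]
    q = [h / q_total for h in bg_hist]
    # Log likelihood ratio per bin; delta (assumed) guards against log(0) and division by zero.
    llr = [math.log(max(pi, delta) / max(qi, delta)) for pi, qi in zip(p, q)]

    def var(weights):
        # Variance of the log likelihood ratio under the given bin weights.
        mean = sum(w * l for w, l in zip(weights, llr))
        return sum(w * l * l for w, l in zip(weights, llr)) - mean ** 2

    mixed = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    # Large spread across both classes, small spread within each class, gives a high score.
    return var(mixed) / (var(p) + var(q) + 1e-12)

# Rank two hypothetical features: one separates the classes, one does not.
separating = variance_ratio([8, 2, 0, 0], [0, 0, 2, 8])
uninformative = variance_ratio([5, 5, 5, 5], [5, 5, 5, 5])
```

A tracker built along these lines would recompute the scores as new frames arrive and keep the top-ranked features.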

1,279 citations


Proceedings ArticleDOI
17 Oct 2005
TL;DR: This work treats object categories as topics, so that an image containing instances of several categories is modeled as a mixture of topics, and applies a model developed in the statistical text literature: probabilistic latent semantic analysis (pLSA).
Abstract: We seek to discover the object categories depicted in a set of unlabelled images. We achieve this using a model developed in the statistical text literature: probabilistic latent semantic analysis (pLSA). In text analysis, this is used to discover topics in a corpus using the bag-of-words document representation. Here we treat object categories as topics, so that an image containing instances of several categories is modeled as a mixture of topics. The model is applied to images by using a visual analogue of a word, formed by vector quantizing SIFT-like region descriptors. The topic discovery approach successfully translates to the visual domain: for a small set of objects, we show that both the object categories and their approximate spatial layout are found without supervision. Performance of this unsupervised method is compared to the supervised approach of Fergus et al. (2003) on a set of unseen images containing only one object per image. We also extend the bag-of-words vocabulary to include 'doublets' which encode spatially local co-occurring regions. It is demonstrated that this extended vocabulary gives a cleaner image segmentation. Finally, the classification and segmentation methods are applied to a set of images containing multiple objects per image. These results demonstrate that we can successfully build object class models from an unsupervised analysis of images.
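
For readers unfamiliar with pLSA, the sketch below fits the model by EM on a small count matrix, where rows play the role of images and columns the role of visual words. It is a generic textbook-style illustration, not the authors' implementation; the function name `plsa`, the random initialization, and the tiny smoothing constant are assumptions.

```python
import random

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM on a document-by-word count matrix given as a list of lists."""
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalized(row):
        total = sum(row)
        return [v / total for v in row]

    # Random initialization of P(z|d) and P(w|z).
    p_z_d = [normalized([rng.random() + 1e-3 for _ in range(n_topics)]) for _ in range(n_docs)]
    p_w_z = [normalized([rng.random() + 1e-3 for _ in range(n_words)]) for _ in range(n_topics)]

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                if counts[d][w] == 0:
                    continue
                # E-step: posterior P(z|d,w) is proportional to P(z|d) * P(w|z).
                post = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                norm = sum(post)
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    r = counts[d][w] * post[z] / norm
                    new_z_d[d][z] += r
                    new_w_z[z][w] += r
        # M-step: renormalize expected counts (tiny smoothing keeps every entry positive).
        p_z_d = [normalized([v + 1e-12 for v in row]) for row in new_z_d]
        p_w_z = [normalized([v + 1e-12 for v in row]) for row in new_w_z]
    return p_z_d, p_w_z

# Hypothetical toy data: two "images" use words 0-1, two use words 2-3.
toy_counts = [[4, 3, 0, 0], [5, 2, 0, 0], [0, 0, 3, 4], [0, 0, 2, 5]]
doc_topics, topic_words = plsa(toy_counts, n_topics=2, n_iter=20)
```

Each row of `doc_topics` is the topic mixture of one image; in the setting of the abstract, the dominant topic would be read as the discovered object category.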

1,129 citations


Journal ArticleDOI
TL;DR: While network analysis has been studied in depth in particular areas such as social network analysis, hypertext mining, and web analysis, only recently has there been a cross-fertilization of ideas among these different communities.
Abstract: Many datasets of interest today are best described as a linked collection of interrelated objects. These may represent homogeneous networks, in which there is a single-object type and link type, or richer, heterogeneous networks, in which there may be multiple object and link types (and possibly other semantic information). Examples of homogeneous networks include single mode social networks, such as people connected by friendship links, or the WWW, a collection of linked web pages. Examples of heterogeneous networks include those in medical domains describing patients, diseases, treatments and contacts, or in bibliographic domains describing publications, authors, and venues. Link mining refers to data mining techniques that explicitly consider these links when building predictive or descriptive models of the linked data. Commonly addressed link mining tasks include object ranking, group detection, collective classification, link prediction and subgraph discovery. While network analysis has been studied in depth in particular areas such as social network analysis, hypertext mining, and web analysis, only recently has there been a cross-fertilization of ideas among these different communities. This is an exciting, rapidly expanding area. In this article, we review some of the common emerging themes.

1,067 citations


Proceedings ArticleDOI
20 Jun 2005
TL;DR: The performance of the approach constitutes a suggestive plausibility proof for a class of feedforward models of object recognition in cortex and exhibits excellent recognition performance and outperforms several state-of-the-art systems on a variety of image datasets including many different object categories.
Abstract: We introduce a novel set of features for robust object recognition. Each element of this set is a complex feature obtained by combining position- and scale-tolerant edge-detectors over neighboring positions and multiple orientations. Our system's architecture is motivated by a quantitative model of visual cortex. We show that our approach exhibits excellent recognition performance and outperforms several state-of-the-art systems on a variety of image datasets including many different object categories. We also demonstrate that our system is able to learn from very few examples. The performance of the approach constitutes a suggestive plausibility proof for a class of feedforward models of object recognition in cortex.

969 citations


Proceedings ArticleDOI
17 Oct 2005
TL;DR: A new model, TSI-pLSA, is developed, which extends pLSA (as applied to visual words) to include spatial information in a translation and scale invariant manner, and can handle the high intra-class variability and large proportion of unrelated images returned by search engines.
Abstract: Current approaches to object category recognition require datasets of training images to be manually prepared, with varying degrees of supervision. We present an approach that can learn an object category from just its name, by utilizing the raw output of image search engines available on the Internet. We develop a new model, TSI-pLSA, which extends pLSA (as applied to visual words) to include spatial information in a translation and scale invariant manner. Our approach can handle the high intra-class variability and large proportion of unrelated images returned by search engines. We evaluate the models on standard test sets, showing performance competitive with existing methods trained on hand prepared datasets.

807 citations


Patent
04 Mar 2005
TL;DR: An improved human-computer interface system is described that provides a graphic representation of a hierarchy populated with naturally classified objects and includes at least one associated object having a distinct classification.
Abstract: An improved human user computer interface system, providing a graphic representation of a hierarchy populated with naturally classified objects, having included therein at least one associated object having a distinct classification. Preferably, a collaborative filter is employed to define the appropriate associated object. The associated object preferably comprises a sponsored object, generating a subsidy or revenue.

607 citations


Book ChapterDOI
15 Jun 2005
TL;DR: Text2Onto is a framework for ontology learning from textual resources, in which the learned knowledge is represented at a meta-level in the form of instantiated modeling primitives within a so-called Probabilistic Ontology Model (POM).
Abstract: In this paper we present Text2Onto, a framework for ontology learning from textual resources. Three main features distinguish Text2Onto from our earlier framework TextToOnto as well as other state-of-the-art ontology learning frameworks. First, by representing the learned knowledge at a meta-level in the form of instantiated modeling primitives within a so-called Probabilistic Ontology Model (POM), we remain independent of a concrete target language while being able to translate the instantiated primitives into any (reasonably expressive) knowledge representation formalism. Second, user interaction is a core aspect of Text2Onto, and the fact that the system calculates a confidence for each learned object makes it possible to design sophisticated visualizations of the POM. Third, by incorporating strategies for data-driven change discovery, we avoid processing the whole corpus from scratch each time it changes, only selectively updating the POM according to the corpus changes instead. Besides increasing efficiency, this also allows a user to trace the evolution of the ontology with respect to the changes in the underlying corpus.

597 citations


Patent
28 Oct 2005
TL;DR: A plurality of permissions associated with a cloud customer is created: a first set, associated with one or more objects, describes actions performed on an object, and a second set, associated with one or more users, describes actions to be performed by those users.
Abstract: A cloud computing environment having a plurality of computing nodes is described. A plurality of permissions associated with a cloud customer is created. A first set of permissions from the plurality of permissions is associated with one or more objects. Each of the first set of permissions describes an action performed on an object. A second set of permissions from the plurality of permissions is associated with one or more users. Each of the second set of permissions describes an action to be performed by one or more users.

593 citations


25 Feb 2005
TL;DR: Given a set of images containing multiple object categories, this work seeks to discover those categories and their image locations without supervision using generative models from the statistical text literature: probabilistic Latent Semantic Analysis (pLSA), and Latent Dirichlet Allocation (LDA).
Abstract: Given a set of images containing multiple object categories, we seek to discover those categories and their image locations without supervision. We achieve this using generative models from the statistical text literature: probabilistic Latent Semantic Analysis (pLSA), and Latent Dirichlet Allocation (LDA). In text analysis these are used to discover topics in a corpus using the bag-of-words document representation. Here we discover topics as object categories, so that an image containing instances of several categories is modelled as a mixture of topics. The models are applied to images by using a visual analogue of a word, formed by vector quantizing SIFT-like region descriptors. We investigate a set of increasingly demanding scenarios, starting with image sets containing only two object categories through to sets containing multiple categories (including airplanes, cars, faces, motorbikes, spotted cats) and background clutter. The object categories sample both intra-class and scale variation, and both the categories and their approximate spatial layout are found without supervision. We also demonstrate classification of unseen images and images containing multiple objects. Performance of the proposed unsupervised method is compared to the semi-supervised approach of [7]. (This work was sponsored in part by the EU Project CogViSys, the University of Oxford, Shell Oil, and the National Geospatial-Intelligence Agency.)

Patent
12 Jan 2005
TL;DR: A method for obtaining information about objects in the environment outside of and around a vehicle, and for preventing collisions involving the vehicle, includes directing a laser beam from the vehicle into the environment, receiving from an object in the path of the laser beam a reflection of the laser beam at a location on the vehicle, and analyzing the received laser beam reflections to obtain information about the object from which the laser beam is being reflected.
Abstract: Method and system for obtaining information about objects in the environment outside of and around a vehicle and preventing collisions involving the vehicle includes directing a laser beam from the vehicle into the environment, receiving from an object in the path of the laser beam a reflection of the laser beam at a location on the vehicle, and analyzing the received laser beam reflections to obtain information about the object from which the laser beam is being reflected. Analysis of the laser beam reflections preferably entails range gating the received laser beam reflections to limit analysis of the received laser beam reflections to only those received from an object within a defined (distance) range such that objects at distances within the range are isolated from surrounding objects.
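
The range-gating idea can be illustrated with a short sketch. It assumes each reflection is available as a round-trip time of flight in seconds, which the abstract does not specify; the function name `range_gate` and the example gate limits are hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

def range_gate(echo_times_s, min_range_m, max_range_m):
    """Keep only echoes whose round-trip time maps to a distance inside the gate."""
    kept = []
    for t in echo_times_s:
        # The beam travels to the object and back, so halve the path length.
        distance_m = SPEED_OF_LIGHT_M_PER_S * t / 2.0
        if min_range_m <= distance_m <= max_range_m:
            kept.append((t, distance_m))
    return kept

# Hypothetical echoes from objects at roughly 10 m, 50 m and 120 m.
echoes = [2 * d / SPEED_OF_LIGHT_M_PER_S for d in (10.0, 50.0, 120.0)]
in_gate = range_gate(echoes, min_range_m=40.0, max_range_m=60.0)
```

Only the echo from the object near 50 m survives the 40 m to 60 m gate, which is how an object within the defined range is isolated from its surroundings.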

Proceedings ArticleDOI
18 Oct 2005
TL;DR: A sequence of increasingly powerful probabilistic graphical models for activity recognition is presented, concluding with a model that can reason tractably about aggregated object instances and that generalizes gracefully from object instances to their classes by using abstraction smoothing.
Abstract: In this paper we present results related to achieving fine-grained activity recognition for context-aware computing applications. We examine the advantages and challenges of reasoning with globally unique object instances detected by an RFID glove. We present a sequence of increasingly powerful probabilistic graphical models for activity recognition. We show the advantages of adding additional complexity and conclude with a model that can reason tractably about aggregated object instances and gracefully generalizes from object instances to their classes by using abstraction smoothing. We apply these models to data collected from a morning household routine.

Proceedings ArticleDOI
17 Oct 2005
TL;DR: Applied to a database of images of isolated objects, the sharing of parts among objects improves detection accuracy when few training examples are available and this hierarchical probabilistic model is extended to scenes containing multiple objects.
Abstract: We describe a hierarchical probabilistic model for the detection and recognition of objects in cluttered, natural scenes. The model is based on a set of parts which describe the expected appearance and position, in an object centered coordinate frame, of features detected by a low-level interest operator. Each object category then has its own distribution over these parts, which are shared between objects. We learn the parameters of this model via a Gibbs sampler which uses the graphical model's structure to analytically average over many parameters. Applied to a database of images of isolated objects, the sharing of parts among objects improves detection accuracy when few training examples are available. We also extend this hierarchical framework to scenes containing multiple objects.

Journal ArticleDOI
TL;DR: In this paper, a multiscale object-specific segmentation (MOSS) approach is presented for automatically delineating image-objects (i.e., segments) at multiple scales from a high-spatial resolution remotely sensed forest scene.

Proceedings ArticleDOI
20 Jun 2005
TL;DR: A "parts and structure" model for object category recognition is presented that can be learnt efficiently and in a semi-supervised manner from example images containing category instances, without requiring segmentation from background clutter.
Abstract: We present a "parts and structure" model for object category recognition that can be learnt efficiently and in a semi-supervised manner: the model is learnt from example images containing category instances, without requiring segmentation from background clutter. The model is a sparse representation of the object, and consists of a star topology configuration of parts modeling the output of a variety of feature detectors. The optimal choice of feature types (whose repertoire includes interest points, curves and regions) is made automatically. In recognition, the model may be applied efficiently in an exhaustive manner, bypassing the need for feature detectors, to give the globally optimal match within a query image. The approach is demonstrated on a wide variety of categories, and delivers both successful classification and localization of the object within the image.

Journal ArticleDOI
TL;DR: Object and face representations in ventral temporal (VT) cortex were investigated by combining object confusability data from a computational model of object classification with neural response confusability data from a functional neuroimaging experiment.
Abstract: Object and face representations in ventral temporal (VT) cortex were investigated by combining object confusability data from a computational model of object classification with neural response confusability data from a functional neuroimaging experiment. A pattern-based classification algorithm learned to categorize individual brain maps according to the object category being viewed by the subject. An identical algorithm learned to classify an image-based, view-dependent representation of the stimuli. High correlations were found between the confusability of object categories and the confusability of brain activity maps. This occurred even with the inclusion of multiple views of objects, and when the object classification model was tested with high spatial frequency "line drawings" of the stimuli. Consistent with a distributed representation of objects in VT cortex, the data indicate that object categories with shared image-based attributes have shared neural structure.

Proceedings ArticleDOI
10 May 2005
TL;DR: This paper introduces PopRank, a domain-independent object-level link analysis model, proposes efficient approaches to automatically decide its popularity propagation factors, and shows experimentally that PopRank achieves significantly better ranking results than naively applying PageRank on the object graph.
Abstract: In contrast with the current Web search methods that essentially do document-level ranking and retrieval, we are exploring a new paradigm to enable Web search at the object level. We collect Web information for objects relevant for a specific application domain and rank these objects in terms of their relevance and popularity to answer user queries. Traditional PageRank model is no longer valid for object popularity calculation because of the existence of heterogeneous relationships between objects. This paper introduces PopRank, a domain-independent object-level link analysis model to rank the objects within a specific domain. Specifically we assign a popularity propagation factor to each type of object relationship, study how different popularity propagation factors for these heterogeneous relationships could affect the popularity ranking, and propose efficient approaches to automatically decide these factors. Our experiments are done using 1 million CS papers, and the experimental results show that PopRank can achieve significantly better ranking results than naively applying PageRank on the object graph.
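
To make the role of per-relationship propagation factors concrete, the sketch below runs a PageRank-style iteration in which each link type scales the popularity it passes on. It is a simplified illustration and should not be read as the paper's exact formulation; the function name, the damping value, the relation names, and the factor values are all assumptions.

```python
def propagate_popularity(nodes, links, factors, damping=0.85, n_iter=50):
    """nodes: list of ids; links: list of (src, dst, relation); factors: relation -> weight."""
    # Count each source's out-links per relation so its popularity is split evenly.
    out_count = {}
    for src, _, rel in links:
        out_count[(src, rel)] = out_count.get((src, rel), 0) + 1

    base = 1.0 / len(nodes)
    score = {n: base for n in nodes}
    for _ in range(n_iter):
        # Every object keeps a small baseline, then receives weighted popularity over links.
        nxt = {n: (1.0 - damping) * base for n in nodes}
        for src, dst, rel in links:
            share = score[src] / out_count[(src, rel)]
            nxt[dst] += damping * factors[rel] * share
        score = nxt
    return score

# Hypothetical object graph: two papers cite p1, and author a1 wrote p2.
graph_nodes = ["p1", "p2", "p3", "a1"]
graph_links = [("p2", "p1", "cites"), ("p3", "p1", "cites"), ("a1", "p2", "writes")]
scores = propagate_popularity(graph_nodes, graph_links, {"cites": 1.0, "writes": 0.5})
```

Changing the entry for `"cites"` or `"writes"` changes how strongly each relationship type influences the final ranking, which is the effect the abstract sets out to study.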

Proceedings Article
30 Aug 2005
TL;DR: The semantics of skylines are investigated, the subspace skyline analysis is proposed, and a novel notion of skyline group is introduced which essentially is a group of objects that are coincidentally in the skyline of some subspaces.
Abstract: The skyline operator is important for multi-criteria decision making applications. Although many recent studies developed efficient methods to compute skyline objects in a specific space, the fundamental problem on the semantics of skylines remains open: Why and in which subspaces is (or is not) an object in the skyline? Practically, users may also be interested in the skylines in any subspaces. Then, what is the relationship between the skylines in the subspaces and those in the super-spaces? How can we effectively analyze the subspace skylines? Can we efficiently compute skylines in various subspaces? In this paper, we investigate the semantics of skylines, propose the subspace skyline analysis, and extend the full-space skyline computation to subspace skyline computation. We introduce a novel notion of skyline group which essentially is a group of objects that are coincidentally in the skylines of some subspaces. We identify the decisive subspaces that qualify skyline groups in the subspace skylines. The new notions concisely capture the semantics and the structures of skylines in various subspaces. Multidimensional roll-up and drilldown analysis is introduced. We also develop an efficient algorithm, Skyey, to compute the set of skyline groups and, for each subspace, the set of objects that are in the subspace skyline. A performance study is reported to evaluate our approach.
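
As a small illustration of what a subspace skyline is, the sketch below computes it by brute force with pairwise dominance checks. This is not the Skyey algorithm mentioned in the abstract; it assumes smaller values are preferred on every dimension, and the function names and sample points are hypothetical.

```python
def dominates(a, b, dims):
    """True if a is no worse than b on every chosen dimension and strictly better on one."""
    return all(a[d] <= b[d] for d in dims) and any(a[d] < b[d] for d in dims)

def subspace_skyline(points, dims):
    """Return the points that no other point dominates in the subspace `dims`."""
    return [p for p in points if not any(dominates(q, p, dims) for q in points)]

# Hypothetical objects described by three attributes (smaller is better on each).
objects = [(1, 9, 5), (5, 5, 5), (9, 1, 5), (6, 6, 1)]
full_space = subspace_skyline(objects, dims=(0, 1, 2))
first_two = subspace_skyline(objects, dims=(0, 1))
```

Here `(6, 6, 1)` is in the full-space skyline because of its third attribute, but drops out of the skyline of the subspace formed by the first two attributes, where `(5, 5, 5)` dominates it. This is the kind of "in which subspaces is an object in the skyline" question the abstract raises.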

Journal ArticleDOI
TL;DR: This account predicts that answering questions about object manipulation should activate brain regions previously identified as components of the distributed sensory-motor system involved in object use, whereas answering questions about object function should activate regions identified as components of the systems supporting verbal-declarative features.

Patent
28 Jul 2005
TL;DR: In this paper, a first detection device (e.g., a camera) is used to capture images of the objects, which are then used to compute location data of the object in a first two-dimensional plane.
Abstract: Methods and apparatus for determining an object's three-dimensional location (i.e. real world coordinates) using the audio-video infrastructure of a 3G cellular phone or a 3C (Computer, Communications, Consumer) electronic device. A first detection device (e.g. a camera) is used to capture images of the objects. The captured image data is used to compute location data of the object in a first two-dimensional plane. A second detection device (e.g. microphone or infrared detector) may be used to collect additional location data in a second plane, which when combined with image data from the captured images allows the determination of the real world coordinates (x, y, z) of the object. The real-world coordinate data may be used in various applications. If the size of an object of interest is known or can be calculated, and the size of the projected image does not vary due to rotation of the object, a single camera (e.g. the camera in a 3G or 3C mobile device) may be used to obtain three-dimensional coordinate data for the applications.
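
The single-camera case at the end of the abstract can be sketched with a standard pinhole camera model, which is an assumption here since the abstract does not name a camera model. The function name, parameter names, and the numeric values in the example are hypothetical.

```python
def locate_from_known_size(u_px, v_px, width_px, real_width_m, focal_px, cx_px, cy_px):
    """Estimate (x, y, z) in metres from one image, given the object's true width."""
    # Similar triangles: apparent width shrinks in proportion to distance.
    z = focal_px * real_width_m / width_px
    # Back-project the image position (relative to the principal point) at depth z.
    x = (u_px - cx_px) * z / focal_px
    y = (v_px - cy_px) * z / focal_px
    return x, y, z

# A 0.2 m wide object that appears 80 px wide with an 800 px focal length.
position = locate_from_known_size(
    u_px=420, v_px=240, width_px=80, real_width_m=0.2,
    focal_px=800, cx_px=320, cy_px=240,
)
```

As the abstract notes, this only works when the projected size does not vary with the object's rotation, for example for a sphere or a disc facing the camera.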

Journal ArticleDOI
TL;DR: A general and robust methodology computes a family of increasingly detailed curve-skeletons by computing a repulsive force field over a discretization of the 3D object and using topological characteristics of the resulting vector field, such as critical points and critical curves, to extract the curve-skeleton.
Abstract: A curve-skeleton of a 3D object is a stick-like figure or centerline representation of that object. It is used for diverse applications, including virtual colonoscopy and animation. In this paper, we introduce the concept of hierarchical curve-skeletons and describe a general and robust methodology that computes a family of increasingly detailed curve-skeletons. The algorithm is based upon computing a repulsive force field over a discretization of the 3D object and using topological characteristics of the resulting vector field, such as critical points and critical curves, to extract the curve-skeleton. We demonstrate this method on many different types of 3D objects (volumetric, polygonal and scattered point sets) and discuss various extensions of this approach.

Proceedings ArticleDOI
18 Oct 2005
TL;DR: This paper illustrates the ideas of ContextL by providing different UI views on the same object while, at the same time, keeping the conceptual simplicity of object-oriented programming that objects know by themselves how to behave, in this case how to display themselves.
Abstract: ContextL is an extension to the Common Lisp Object System that allows for Context-oriented Programming. It provides means to associate partial class and method definitions with layers and to activate and deactivate such layers in the control flow of a running program. When a layer is activated, the partial definitions become part of the program until this layer is deactivated. This has the effect that the behavior of a program can be modified according to the context of its use without the need to mention such context dependencies in the affected base program. We illustrate these ideas by providing different UI views on the same object while, at the same time, keeping the conceptual simplicity of object-oriented programming that objects know by themselves how to behave, in our case how to display themselves. These seemingly contradictory goals can be achieved by separating class definitions into distinct layers instead of factoring out the display code into different classes.

Journal ArticleDOI
TL;DR: An extensible event and object ontology expressed in VERL is presented and a detailed example of applying VERL and VEML to the description of a "tailgating" event in surveillance video is discussed.
Abstract: The notion of "events" is extremely important in characterizing the contents of video. An event is typically triggered by some kind of change of state captured in the video, such as when an object starts moving. The ability to reason with events is a critical step toward video understanding. This article describes the findings of a recent workshop series that has produced an ontology framework for representing video events, called Video Event Representation Language (VERL), and a companion annotation framework, called Video Event Markup Language (VEML). One of the key concepts in this work is the modeling of events as composable, whereby complex events are constructed from simpler events by operations such as sequencing, iteration, and alternation. The article presents an extensible event and object ontology expressed in VERL and discusses a detailed example of applying VERL and VEML to the description of a "tailgating" event in surveillance video.

Patent
19 Jan 2005
TL;DR: In this paper, a method of estimating a time to collision (TTC) of a vehicle with an object comprising: acquiring a plurality of images of the object; and determining a TTC from the images that is responsive to a relative velocity and relative acceleration between the vehicle and the object.
Abstract: A method of estimating a time to collision (TTC) of a vehicle with an object comprising: acquiring a plurality of images of the object; and determining a TTC from the images that is responsive to a relative velocity and relative acceleration between the vehicle and the object.
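
To show why relative acceleration matters for a TTC estimate, the sketch below solves the constant-acceleration kinematics directly. It assumes the range, relative velocity, and relative acceleration are already known, whereas the patent derives the TTC from images; the function name and sign convention (negative velocity means the gap is closing) are assumptions.

```python
import math

def time_to_collision(distance_m, rel_velocity_mps, rel_accel_mps2=0.0):
    """Smallest t > 0 with distance + v*t + 0.5*a*t^2 = 0, or None if the gap never closes."""
    a, v, z = rel_accel_mps2, rel_velocity_mps, distance_m
    if abs(a) < 1e-12:
        # Constant relative velocity: the gap closes only if v is negative.
        return -z / v if v < 0 else None
    disc = v * v - 2.0 * a * z
    if disc < 0:
        return None  # the relative motion reverses before the gap reaches zero
    root = math.sqrt(disc)
    candidates = [t for t in ((-v - root) / a, (-v + root) / a) if t > 0]
    return min(candidates) if candidates else None

constant_speed = time_to_collision(20.0, -10.0)       # closing at a steady 10 m/s
speeding_up = time_to_collision(20.0, -10.0, -2.0)    # closing speed increasing
braking = time_to_collision(20.0, -10.0, 5.0)         # closing speed decreasing
```

With the same gap and closing speed, the estimate shortens when the closing speed is increasing, and there is no collision at all when braking is strong enough, so ignoring acceleration can mislead in either direction.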

Patent
20 Jun 2005
TL;DR: A process and facility supports device-specific delivery of a multimedia object to an end user's device as a function of the device's capabilities, the transport interface to the device, and/or the viewing state and/or access privileges of the device's user with respect to the object or the user's relationship to an owner of the device and/or multimedia object.
Abstract: A process and facility supports device-specific delivery of a multimedia object to an end user's device as a function of the device's capabilities, the transport interface to the device, and/or the viewing state and/or access privileges of the device's user with respect to the object or the user's relationship to an owner of the device and/or multimedia object.

Journal ArticleDOI
TL;DR: Algorithms are presented that modify an initial road-network representation so that it works better as a basis for predicting an object's position; the paper also proposes using known movement patterns of the object, in the form of routes, and using acceleration profiles together with the routes.
Abstract: With the continued advances in wireless communications, geo-positioning, and consumer electronics, an infrastructure is emerging that enables location-based services that rely on the tracking of the continuously changing positions of entire populations of service users, termed moving objects. This scenario is characterized by large volumes of updates, for which reason location update technologies become important. A setting is assumed in which a central database stores a representation of each moving object's current position. This position is to be maintained so that it deviates from the user's real position by at most a given threshold. To do so, each moving object stores locally the central representation of its position. Then, an object updates the database whenever the deviation between its actual position (as obtained from a GPS device) and the database position exceeds the threshold. The main issue considered is how to represent the location of a moving object in a database so that tracking can be done with as few updates as possible. The paper proposes to use the road network within which the objects are assumed to move for predicting their future positions. The paper presents algorithms that modify an initial road-network representation, so that it works better as a basis for predicting an object's position; it proposes to use known movement patterns of the object, in the form of routes; and, it proposes to use acceleration profiles together with the routes. Using real GPS-data and a corresponding real road network, the paper offers empirical evaluations and comparisons that include three existing approaches and all the proposed approaches.
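
The threshold-based update policy can be sketched in one dimension. The example below is a simplification that ignores the road network and routes proposed in the paper; it only contrasts a database that assumes the object stays put with one that extrapolates at the last reported speed. All function names and the sample trajectory are hypothetical.

```python
def count_updates(positions, threshold, predict):
    """positions: true 1-D positions sampled once per time step."""
    last_pos, last_speed, last_t = positions[0], 0.0, 0
    updates = 0
    for t in range(1, len(positions)):
        # The object mirrors the database's prediction of its own position.
        predicted = predict(last_pos, last_speed, t - last_t)
        if abs(positions[t] - predicted) > threshold:
            # Deviation exceeds the threshold: report position and current speed.
            last_speed = positions[t] - positions[t - 1]
            last_pos, last_t = positions[t], t
            updates += 1
    return updates

def static_prediction(pos, speed, dt):
    return pos  # database assumes the object has not moved

def linear_prediction(pos, speed, dt):
    return pos + speed * dt  # database extrapolates at the last reported speed

# Hypothetical object moving 10 units per step, tracked with a threshold of 25.
trajectory = [10 * t for t in range(20)]
static_updates = count_updates(trajectory, 25, static_prediction)
linear_updates = count_updates(trajectory, 25, linear_prediction)
```

On this steady trajectory the static representation needs an update every third step, while the extrapolating one needs a single update. Better prediction meaning fewer updates is the motivation for the richer road-network, route, and acceleration-profile models the paper proposes.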

Patent
16 Mar 2005
TL;DR: In this paper, a multi-resolution object location system and method for locating objects is presented, which uses a long-range object locator together with a more precise RFID locator to efficiently and accurately determine the location of objects.
Abstract: A multi-resolution object location system and method for locating objects is provided. The multi-resolution system and method uses a long-range object locator together with a more precise RFID locator to efficiently and accurately determine the location of objects that include an RFID tag. The long-range object locator has a relatively longer range and can cover a relatively large area to determine the general location of the object within the large area. The RFID locator has a relatively shorter range, but is able to locate the object more precisely. The object location system uses the long-range locator to first determine the general location of the object, and then the RFID locator is used to determine a more accurate location of the object. Thus, the multi-resolution object location system is able to provide both a long range location of objects over a large area and a precise location of objects.

Patent
15 Mar 2005
TL;DR: In this article, a mobile object tracking system is provided for tracking the removal and use of specific objects within a group checked out from storage between the time the objects are checked out and the replacement of the objects in the storage.
Abstract: A mobile object tracking system is provided for tracking the removal and use of specific objects within a group checked out from storage between the time the objects are checked out and the replacement of the objects in the storage. The system includes a system controller and a storage unit that receive and store a series of object carriers therein. Each object carrier includes a series of object holders in which id tags for the group of objects checked out of the storage are received. The object carrier monitors the time each object is removed from the object carrier, which information is thereafter communicated to the system controller.

Patent
05 Aug 2005
TL;DR: In this article, a method for monitoring a container having an RFID device interior comprises periodically transmitting messages including unique identifying information and receiving messages transmitted by the RFID devices interior the container by a receiving device exterior the container.
Abstract: A method for monitoring a container having an RFID device interior comprises periodically transmitting messages including unique identifying information and receiving messages transmitted by the RFID device interior the container by a receiving device exterior the container. Some messages transmitted when the container is closed are not received by the receiving device, and some messages transmitted when the container is not closed are received. The method comprises storing and/or relaying the received messages and determining from the number of messages received from the RFID device that are stored and/or relayed whether the container has or has not been opened. The method may also comprise transmitting the messages at two or more power levels and selecting messages that are at the lowest power level that produces at least two corresponding stored and/or relayed messages for determining the location of the RFID device.