
Showing papers on "Object (computer science)" published in 2005


Journal ArticleDOI
TL;DR: A computationally efficient framework for part-based modeling and recognition of objects, motivated by the pictorial structure models introduced by Fischler and Elschlager, that allows for qualitative descriptions of visual appearance and is suitable for generic recognition problems.
Abstract: In this paper we present a computationally efficient framework for part-based modeling and recognition of objects. Our work is motivated by the pictorial structure models introduced by Fischler and Elschlager. The basic idea is to represent an object by a collection of parts arranged in a deformable configuration. The appearance of each part is modeled separately, and the deformable configuration is represented by spring-like connections between pairs of parts. These models allow for qualitative descriptions of visual appearance, and are suitable for generic recognition problems. We address the problem of using pictorial structure models to find instances of an object in an image as well as the problem of learning an object model from training examples, presenting efficient algorithms in both cases. We demonstrate the techniques by learning models that represent faces and human bodies and using the resulting models to locate the corresponding objects in novel images.

2,514 citations
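
The part-based model in the abstract above can be read as an energy minimization: each part pays an appearance cost for where it is placed, and each spring-like connection pays a deformation cost. The sketch below is only an illustration of that objective. It uses brute-force search rather than the efficient algorithms the paper presents, and the names `best_configuration`, `match_cost`, `spring`, and `edges` are hypothetical.

```python
from itertools import product

def best_configuration(match_cost, edges, spring, locations):
    """Exhaustively find the lowest-energy placement of parts.

    match_cost: dict mapping part name -> function(location) -> appearance cost
    edges:      list of (part_a, part_b, rest) spring-like connections
    spring:     function(loc_a, loc_b, rest) -> deformation cost
    locations:  candidate locations shared by all parts (assumed small)
    """
    parts = sorted(match_cost)
    best = None
    for combo in product(locations, repeat=len(parts)):
        placement = dict(zip(parts, combo))
        # Appearance term: how well each part matches at its location.
        energy = sum(match_cost[p](placement[p]) for p in parts)
        # Deformation term: how far each connected pair is stretched.
        energy += sum(spring(placement[a], placement[b], rest)
                      for a, b, rest in edges)
        if best is None or energy < best[0]:
            best = (energy, placement)
    return best
```

Brute force is exponential in the number of parts, so it is only usable on toy inputs; it is shown here because it makes the objective explicit.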


Patent
30 Sep 2005
TL;DR: Proximity based systems and methods implemented on an electronic device are disclosed; the method includes sensing an object spaced away from and in close proximity to the electronic device.
Abstract: Proximity based systems and methods that are implemented on an electronic device are disclosed. The method includes sensing an object spaced away and in close proximity to the electronic device. The method also includes performing an action in the electronic device when an object is sensed.

1,337 citations


Journal ArticleDOI
TL;DR: This paper presents an online feature selection mechanism for evaluating multiple features while tracking and adjusting the set of features used to improve tracking performance, and notes susceptibility of the variance ratio feature selection method to distraction by spatially correlated background clutter.
Abstract: This paper presents an online feature selection mechanism for evaluating multiple features while tracking and adjusting the set of features used to improve tracking performance. Our hypothesis is that the features that best discriminate between object and background are also best for tracking the object. Given a set of seed features, we compute log likelihood ratios of class conditional sample densities from object and background to form a new set of candidate features tailored to the local object/background discrimination task. The two-class variance ratio is used to rank these new features according to how well they separate sample distributions of object and background pixels. This feature evaluation mechanism is embedded in a mean-shift tracking system that adaptively selects the top-ranked discriminative features for tracking. Examples are presented that demonstrate how this method adapts to changing appearances of both tracked object and scene background. We note susceptibility of the variance ratio feature selection method to distraction by spatially correlated background clutter and develop an additional approach that seeks to minimize the likelihood of distraction.

1,279 citations
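
The feature-ranking step in the abstract above can be sketched as follows, assuming the object and background samples are already summarized as normalized histograms `p` and `q` over the same bins. The clamp `delta` and the small constant in the denominator are assumptions added to avoid division by zero; the function names are hypothetical.

```python
import math

def log_likelihood_ratio(p, q, delta=1e-3):
    """Per-bin log likelihood ratio of object (p) versus background (q)."""
    return [math.log(max(pi, delta) / max(qi, delta)) for pi, qi in zip(p, q)]

def variance(values, weights):
    """Variance of `values` under the discrete distribution `weights`."""
    mean = sum(w * v for w, v in zip(weights, values))
    return sum(w * v * v for w, v in zip(weights, values)) - mean * mean

def variance_ratio(p, q, delta=1e-3):
    """Two-class variance ratio of the log likelihood ratio feature.

    High values mean the two classes are spread far apart overall while
    each class is tightly clustered on its own.
    """
    L = log_likelihood_ratio(p, q, delta)
    mixed = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    within = variance(L, p) + variance(L, q)
    return variance(L, mixed) / (within + 1e-12)
```

Candidate features would then be sorted by `variance_ratio` and the top-ranked ones handed to the tracker.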


Proceedings ArticleDOI
17 Oct 2005
TL;DR: This work treats object categories as topics, so that an image containing instances of several categories is modeled as a mixture of topics, and uses a model developed in the statistical text literature: probabilistic latent semantic analysis (pLSA).
Abstract: We seek to discover the object categories depicted in a set of unlabelled images. We achieve this using a model developed in the statistical text literature: probabilistic latent semantic analysis (pLSA). In text analysis, this is used to discover topics in a corpus using the bag-of-words document representation. Here we treat object categories as topics, so that an image containing instances of several categories is modeled as a mixture of topics. The model is applied to images by using a visual analogue of a word, formed by vector quantizing SIFT-like region descriptors. The topic discovery approach successfully translates to the visual domain: for a small set of objects, we show that both the object categories and their approximate spatial layout are found without supervision. Performance of this unsupervised method is compared to the supervised approach of Fergus et al. (2003) on a set of unseen images containing only one object per image. We also extend the bag-of-words vocabulary to include 'doublets' which encode spatially local co-occurring regions. It is demonstrated that this extended vocabulary gives a cleaner image segmentation. Finally, the classification and segmentation methods are applied to a set of images containing multiple objects per image. These results demonstrate that we can successfully build object class models from an unsupervised analysis of images.

1,129 citations
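
A minimal sketch of fitting pLSA with EM, assuming each image has already been reduced to a bag of visual-word counts (`counts[d][w]`). The random initialization, the iteration count, and the small smoothing constant are assumptions of this sketch, not details taken from the paper.

```python
import random

def plsa(counts, n_topics, n_iters=50, seed=0):
    """Fit P(word | topic) and P(topic | image) by EM.

    counts[d][w] is the number of times visual word w occurs in image d.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        total = sum(row)
        return [x / total for x in row]

    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]

    for _ in range(n_iters):
        # Accumulators start slightly above zero to keep every row normalizable.
        new_w_z = [[1e-12] * n_words for _ in range(n_topics)]
        new_z_d = [[1e-12] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                if counts[d][w] == 0:
                    continue
                # E-step: posterior over topics for this (image, word) pair.
                post = normalize([p_w_z[z][w] * p_z_d[d][z]
                                  for z in range(n_topics)])
                # M-step accumulation, weighted by the observed count.
                for z in range(n_topics):
                    new_w_z[z][w] += counts[d][w] * post[z]
                    new_z_d[d][z] += counts[d][w] * post[z]
        p_w_z = [normalize(row) for row in new_w_z]
        p_z_d = [normalize(row) for row in new_z_d]
    return p_w_z, p_z_d
```

Each row of the returned `p_z_d` is the topic mixture for one image; in the setting described above, the dominant topic would be read as the discovered object category.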


Journal ArticleDOI
TL;DR: While network analysis has been studied in depth in particular areas such as social network analysis, hypertext mining, and web analysis, only recently has there been a cross-fertilization of ideas among these different communities.
Abstract: Many datasets of interest today are best described as a linked collection of interrelated objects. These may represent homogeneous networks, in which there is a single-object type and link type, or richer, heterogeneous networks, in which there may be multiple object and link types (and possibly other semantic information). Examples of homogeneous networks include single mode social networks, such as people connected by friendship links, or the WWW, a collection of linked web pages. Examples of heterogeneous networks include those in medical domains describing patients, diseases, treatments and contacts, or in bibliographic domains describing publications, authors, and venues. Link mining refers to data mining techniques that explicitly consider these links when building predictive or descriptive models of the linked data. Commonly addressed link mining tasks include object ranking, group detection, collective classification, link prediction and subgraph discovery. While network analysis has been studied in depth in particular areas such as social network analysis, hypertext mining, and web analysis, only recently has there been a cross-fertilization of ideas among these different communities. This is an exciting, rapidly expanding area. In this article, we review some of the common emerging themes.

1,067 citations


Proceedings ArticleDOI
20 Jun 2005
TL;DR: The approach exhibits excellent recognition performance, outperforming several state-of-the-art systems on a variety of image datasets including many different object categories, and its performance constitutes a suggestive plausibility proof for a class of feedforward models of object recognition in cortex.
Abstract: We introduce a novel set of features for robust object recognition. Each element of this set is a complex feature obtained by combining position- and scale-tolerant edge-detectors over neighboring positions and multiple orientations. Our system's architecture is motivated by a quantitative model of visual cortex. We show that our approach exhibits excellent recognition performance and outperforms several state-of-the-art systems on a variety of image datasets including many different object categories. We also demonstrate that our system is able to learn from very few examples. The performance of the approach constitutes a suggestive plausibility proof for a class of feedforward models of object recognition in cortex.

969 citations


Proceedings ArticleDOI
17 Oct 2005
TL;DR: A new model, TSI-pLSA, is developed, which extends pLSA (as applied to visual words) to include spatial information in a translation and scale invariant manner, and can handle the high intra-class variability and large proportion of unrelated images returned by search engines.
Abstract: Current approaches to object category recognition require datasets of training images to be manually prepared, with varying degrees of supervision. We present an approach that can learn an object category from just its name, by utilizing the raw output of image search engines available on the Internet. We develop a new model, TSI-pLSA, which extends pLSA (as applied to visual words) to include spatial information in a translation and scale invariant manner. Our approach can handle the high intra-class variability and large proportion of unrelated images returned by search engines. We evaluate the models on standard test sets, showing performance competitive with existing methods trained on hand-prepared datasets.

807 citations


Patent
04 Mar 2005
TL;DR: An improved human user computer interface system that provides a graphic representation of a hierarchy populated with naturally classified objects and includes at least one associated object having a distinct classification.
Abstract: An improved human user computer interface system, providing a graphic representation of a hierarchy populated with naturally classified objects, having included therein at least one associated object having a distinct classification. Preferably, a collaborative filter is employed to define the appropriate associated object. The associated object preferably comprises a sponsored object, generating a subsidy or revenue.

607 citations


Book ChapterDOI
15 Jun 2005
TL;DR: Text2Onto is a framework for ontology learning from textual resources, where the learned knowledge is represented at a meta-level in the form of instantiated modeling primitives within a so-called Probabilistic Ontology Model (POM).
Abstract: In this paper we present Text2Onto, a framework for ontology learning from textual resources. Three main features distinguish Text2Onto from our earlier framework TextToOnto as well as other state-of-the-art ontology learning frameworks. First, by representing the learned knowledge at a meta-level in the form of instantiated modeling primitives within a so-called Probabilistic Ontology Model (POM), we remain independent of a concrete target language while being able to translate the instantiated primitives into any (reasonably expressive) knowledge representation formalism. Second, user interaction is a core aspect of Text2Onto, and the fact that the system calculates a confidence for each learned object allows the design of sophisticated visualizations of the POM. Third, by incorporating strategies for data-driven change discovery, we avoid processing the whole corpus from scratch each time it changes, only selectively updating the POM according to the corpus changes instead. Besides increasing efficiency, this also allows a user to trace the evolution of the ontology with respect to the changes in the underlying corpus.

597 citations


Patent
28 Oct 2005
TL;DR: A plurality of permissions associated with a cloud customer is created; a first set of these permissions, each describing an action performed on an object, is associated with one or more objects, and a second set, each describing an action to be performed by one or more users, is associated with one or more users.
Abstract: A cloud computing environment having a plurality of computing nodes is described. A plurality of permissions associated with a cloud customer is created. A first set of permissions from the plurality of permissions is associated with one or more objects. Each of the first set of permissions describes an action performed on an object. A second set of permissions from the plurality of permissions is associated with one or more users. Each of the second set of permissions describes an action to be performed by one or more users.

593 citations


25 Feb 2005
TL;DR: Given a set of images containing multiple object categories, this work seeks to discover those categories and their image locations without supervision using generative models from the statistical text literature: probabilistic Latent Semantic Analysis (pLSA), and Latent Dirichlet Allocation (LDA).
Abstract: Given a set of images containing multiple object categories, we seek to discover those categories and their image locations without supervision. We achieve this using generative models from the statistical text literature: probabilistic Latent Semantic Analysis (pLSA), and Latent Dirichlet Allocation (LDA). In text analysis these are used to discover topics in a corpus using the bag-of-words document representation. Here we discover topics as object categories, so that an image containing instances of several categories is modelled as a mixture of topics. The models are applied to images by using a visual analogue of a word, formed by vector quantizing SIFT-like region descriptors. We investigate a set of increasingly demanding scenarios, starting with image sets containing only two object categories through to sets containing multiple categories (including airplanes, cars, faces, motorbikes, spotted cats) and background clutter. The object categories sample both intra-class and scale variation, and both the categories and their approximate spatial layout are found without supervision. We also demonstrate classification of unseen images and images containing multiple objects. Performance of the proposed unsupervised method is compared to the semi-supervised approach of [7]. (This work was sponsored in part by the EU Project CogViSys, the University of Oxford, Shell Oil, and the National Geospatial-Intelligence Agency.)

Proceedings ArticleDOI
18 Oct 2005
TL;DR: A sequence of increasingly powerful probabilistic graphical models for activity recognition is presented, concluding with a model that can reason tractably about aggregated object instances and gracefully generalizes from object instances to their classes by using abstraction smoothing.
Abstract: In this paper we present results related to achieving fine-grained activity recognition for context-aware computing applications. We examine the advantages and challenges of reasoning with globally unique object instances detected by an RFID glove. We present a sequence of increasingly powerful probabilistic graphical models for activity recognition. We show the advantages of adding additional complexity and conclude with a model that can reason tractably about aggregated object instances and gracefully generalizes from object instances to their classes by using abstraction smoothing. We apply these models to data collected from a morning household routine.

Proceedings ArticleDOI
17 Oct 2005
TL;DR: Applied to a database of images of isolated objects, the sharing of parts among objects improves detection accuracy when few training examples are available and this hierarchical probabilistic model is extended to scenes containing multiple objects.
Abstract: We describe a hierarchical probabilistic model for the detection and recognition of objects in cluttered, natural scenes. The model is based on a set of parts which describe the expected appearance and position, in an object centered coordinate frame, of features detected by a low-level interest operator. Each object category then has its own distribution over these parts, which are shared between objects. We learn the parameters of this model via a Gibbs sampler which uses the graphical model's structure to analytically average over many parameters. Applied to a database of images of isolated objects, the sharing of parts among objects improves detection accuracy when few training examples are available. We also extend this hierarchical framework to scenes containing multiple objects.

Proceedings ArticleDOI
20 Jun 2005
TL;DR: A "parts and structure" model for object category recognition is presented that can be learnt efficiently and in a semi-supervised manner from example images containing category instances, without requiring segmentation from background clutter.
Abstract: We present a "parts and structure" model for object category recognition that can be learnt efficiently and in a semi-supervised manner: the model is learnt from example images containing category instances, without requiring segmentation from background clutter. The model is a sparse representation of the object, and consists of a star topology configuration of parts modeling the output of a variety of feature detectors. The optimal choice of feature types (whose repertoire includes interest points, curves and regions) is made automatically. In recognition, the model may be applied efficiently in an exhaustive manner, bypassing the need for feature detectors, to give the globally optimal match within a query image. The approach is demonstrated on a wide variety of categories, and delivers both successful classification and localization of the object within the image.

Journal ArticleDOI
TL;DR: Object and face representations in ventral temporal (VT) cortex were investigated by combining object confusability data from a computational model of object classification with neural response confusability data from a functional neuroimaging experiment.
Abstract: Object and face representations in ventral temporal (VT) cortex were investigated by combining object confusability data from a computational model of object classification with neural response confusability data from a functional neuroimaging experiment. A pattern-based classification algorithm learned to categorize individual brain maps according to the object category being viewed by the subject. An identical algorithm learned to classify an image-based, view-dependent representation of the stimuli. High correlations were found between the confusability of object categories and the confusability of brain activity maps. This occurred even with the inclusion of multiple views of objects, and when the object classification model was tested with high spatial frequency "line drawings" of the stimuli. Consistent with a distributed representation of objects in VT cortex, the data indicate that object categories with shared image-based attributes have shared neural structure.

Proceedings ArticleDOI
10 May 2005
TL;DR: This paper introduces PopRank, which assigns a popularity propagation factor to each type of object relationship and proposes efficient approaches to decide these factors automatically; the experimental results show that PopRank can achieve significantly better ranking results than naively applying PageRank on the object graph.
Abstract: In contrast with the current Web search methods that essentially do document-level ranking and retrieval, we are exploring a new paradigm to enable Web search at the object level. We collect Web information for objects relevant for a specific application domain and rank these objects in terms of their relevance and popularity to answer user queries. Traditional PageRank model is no longer valid for object popularity calculation because of the existence of heterogeneous relationships between objects. This paper introduces PopRank, a domain-independent object-level link analysis model to rank the objects within a specific domain. Specifically we assign a popularity propagation factor to each type of object relationship, study how different popularity propagation factors for these heterogeneous relationships could affect the popularity ranking, and propose efficient approaches to automatically decide these factors. Our experiments are done using 1 million CS papers, and the experimental results show that PopRank can achieve significantly better ranking results than naively applying PageRank on the object graph.

Journal ArticleDOI
TL;DR: This account predicts that answering questions about object manipulation should activate brain regions previously identified as components of the distributed sensory-motor system involved in object use, whereas answering questions about object function should activate regions identified as components of the systems supporting verbal-declarative features.

Patent
28 Jul 2005
TL;DR: In this paper, a first detection device (e.g., a camera) is used to capture images of the objects, which are then used to compute location data of the object in a first two-dimensional plane.
Abstract: Methods and apparatus for determining an object's three-dimensional location (i.e. real world coordinates) using the audio-video infrastructure of a 3G cellular phone or a 3C (Computer, Communications, Consumer) electronic device. A first detection device (e.g. a camera) is used to capture images of the objects. The captured image data is used to compute location data of the object in a first two-dimensional plane. A second detection device (e.g. microphone or infrared detector) may be used to collect additional location data in a second plane, which when combined with image data from the captured images allows the determination of the real world coordinates (x, y, z) of the object. The real-world coordinate data may be used in various applications. If the size of an object of interest is known or can be calculated, and the size of the projected image does not vary due to rotation of the object, a single camera (e.g. the camera in a 3G or 3C mobile device) may be used to obtain three-dimensional coordinate data for the applications.
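
The single-camera case at the end of the abstract above can be illustrated with the standard pinhole camera model: if an object's real width is known, its apparent width in pixels gives its depth, and the pixel position then gives the other two coordinates. This is a minimal sketch under that model, not the patent's method; `focal_px`, `cx`, and `cy` (focal length and principal point in pixels) are assumed to come from a prior camera calibration, and the function name is hypothetical.

```python
def locate_from_known_size(real_width_m, pixel_width, u, v, focal_px, cx, cy):
    """Estimate (x, y, z) in metres of an object of known width.

    real_width_m: true width of the object in metres
    pixel_width:  width of the object's image in pixels
    u, v:         pixel coordinates of the object's centre
    focal_px:     focal length expressed in pixels (assumed calibrated)
    cx, cy:       principal point in pixels (assumed calibrated)
    """
    # Similar triangles: pixel_width / focal_px == real_width_m / z.
    z = focal_px * real_width_m / pixel_width
    # Back-project the pixel offset from the principal point at depth z.
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return x, y, z
```

As the abstract notes, this only works when the projected size does not vary with the object's rotation, for example with a roughly spherical object.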

Journal ArticleDOI
TL;DR: A general and robust methodology is described that computes a family of increasingly detailed curve-skeletons, based upon computing a repulsive force field over a discretization of the 3D object and using topological characteristics of the resulting vector field, such as critical points and critical curves, to extract the curve-skeleton.
Abstract: A curve-skeleton of a 3D object is a stick-like figure or centerline representation of that object. It is used for diverse applications, including virtual colonoscopy and animation. In this paper, we introduce the concept of hierarchical curve-skeletons and describe a general and robust methodology that computes a family of increasingly detailed curve-skeletons. The algorithm is based upon computing a repulsive force field over a discretization of the 3D object and using topological characteristics of the resulting vector field, such as critical points and critical curves, to extract the curve-skeleton. We demonstrate this method on many different types of 3D objects (volumetric, polygonal and scattered point sets) and discuss various extensions of this approach.

Proceedings ArticleDOI
18 Oct 2005
TL;DR: This paper illustrates the ideas of ContextL by providing different UI views on the same object while, at the same time, keeping the conceptual simplicity of object-oriented programming that objects know by themselves how to behave, in this case how to display themselves.
Abstract: ContextL is an extension to the Common Lisp Object System that allows for Context-oriented Programming. It provides means to associate partial class and method definitions with layers and to activate and deactivate such layers in the control flow of a running program. When a layer is activated, the partial definitions become part of the program until this layer is deactivated. This has the effect that the behavior of a program can be modified according to the context of its use without the need to mention such context dependencies in the affected base program. We illustrate these ideas by providing different UI views on the same object while, at the same time, keeping the conceptual simplicity of object-oriented programming that objects know by themselves how to behave, in our case how to display themselves. These seemingly contradictory goals can be achieved by separating class definitions into distinct layers instead of factoring out the display code into different classes.

Journal ArticleDOI
TL;DR: An extensible event and object ontology expressed in VERL is presented and a detailed example of applying VERL and VEML to the description of a "tailgating" event in surveillance video is discussed.
Abstract: The notion of "events" is extremely important in characterizing the contents of video. An event is typically triggered by some kind of change of state captured in the video, such as when an object starts moving. The ability to reason with events is a critical step toward video understanding. This article describes the findings of a recent workshop series that has produced an ontology framework for representing video events, called Video Event Representation Language (VERL), and a companion annotation framework, called Video Event Markup Language (VEML). One of the key concepts in this work is the modeling of events as composable, whereby complex events are constructed from simpler events by operations such as sequencing, iteration, and alternation. The article presents an extensible event and object ontology expressed in VERL and discusses a detailed example of applying VERL and VEML to the description of a "tailgating" event in surveillance video.

Patent
19 Jan 2005
TL;DR: In this paper, a method of estimating a time to collision (TTC) of a vehicle with an object comprising: acquiring a plurality of images of the object; and determining a TTC from the images that is responsive to a relative velocity and relative acceleration between the vehicle and the object.
Abstract: A method of estimating a time to collision (TTC) of a vehicle with an object comprising: acquiring a plurality of images of the object; and determining a TTC from the images that is responsive to a relative velocity and relative acceleration between the vehicle and the object.
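
To show why relative acceleration matters for the estimate, the sketch below solves the constant-acceleration kinematics for the first time the gap closes. It assumes the distance, closing speed, and closing acceleration are already known; the patent derives the TTC from images, which this sketch does not attempt. The function and parameter names are hypothetical.

```python
import math

def time_to_collision(distance, closing_speed, closing_accel=0.0):
    """Smallest t > 0 with closing_speed*t + 0.5*closing_accel*t**2 == distance.

    distance:      current gap to the object (> 0)
    closing_speed: rate at which the gap is shrinking (positive = approaching)
    closing_accel: rate of change of the closing speed
    Returns None when the gap never closes under these assumptions.
    """
    if abs(closing_accel) < 1e-12:
        # Constant-velocity case: a collision needs a positive closing speed.
        return distance / closing_speed if closing_speed > 0 else None
    disc = closing_speed ** 2 + 2.0 * closing_accel * distance
    if disc < 0:
        return None  # the vehicle decelerates to a stop before contact
    t = (-closing_speed + math.sqrt(disc)) / closing_accel
    return t if t > 0 else None
```

For the same gap and closing speed, a positive closing acceleration shortens the estimate relative to the constant-velocity value `distance / closing_speed`, and sufficiently strong braking removes the collision altogether.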

Patent
20 Jun 2005
TL;DR: A process and facility supports device-specific delivery of a multimedia object to an end user's device as a function of the device's capabilities, the transport interface to the device, and/or the viewing state and/or access privileges of the device's user with respect to the object or the user's relationship to an owner of the device and/or multimedia object.
Abstract: A process and facility supports device-specific delivery of a multimedia object to an end user's device as a function of the device's capabilities, the transport interface to the device, and/or the viewing state and/or access privileges of the device's user with respect to the object or the user's relationship to an owner of the device and/or multimedia object.

Journal ArticleDOI
TL;DR: Algorithms are presented that modify an initial road-network representation so that it works better as a basis for predicting an object's position; the paper also proposes using known movement patterns of the object, in the form of routes, and using acceleration profiles together with the routes.
Abstract: With the continued advances in wireless communications, geo-positioning, and consumer electronics, an infrastructure is emerging that enables location-based services that rely on the tracking of the continuously changing positions of entire populations of service users, termed moving objects. This scenario is characterized by large volumes of updates, for which reason location update technologies become important. A setting is assumed in which a central database stores a representation of each moving object's current position. This position is to be maintained so that it deviates from the user's real position by at most a given threshold. To do so, each moving object stores locally the central representation of its position. Then, an object updates the database whenever the deviation between its actual position (as obtained from a GPS device) and the database position exceeds the threshold. The main issue considered is how to represent the location of a moving object in a database so that tracking can be done with as few updates as possible. The paper proposes to use the road network within which the objects are assumed to move for predicting their future positions. The paper presents algorithms that modify an initial road-network representation, so that it works better as a basis for predicting an object's position; it proposes to use known movement patterns of the object, in the form of routes; and, it proposes to use acceleration profiles together with the routes. Using real GPS-data and a corresponding real road network, the paper offers empirical evaluations and comparisons that include three existing approaches and all the proposed approaches.
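
The threshold-based update policy described above can be sketched as a small simulation: the object and the server share a prediction function, and the object reports only when its real position drifts further than the threshold from what the server would predict. The default predictor (the object stays where it last reported) and all names here are assumptions of this sketch; the paper's road-network and route-based predictors would plug in as other `predict` functions.

```python
import math

def track(positions, threshold, predict=None):
    """Return the time steps at which the object must send an update.

    positions: list of (x, y) true positions, one per time step
    threshold: maximum tolerated deviation between true and predicted position
    predict:   function(last_pos, last_t, t) -> predicted (x, y) at time t
    """
    if predict is None:
        # Point-based policy: assume the object stays where it last reported.
        predict = lambda last_pos, last_t, t: last_pos
    updates = [0]  # the initial position is always reported
    last_pos, last_t = positions[0], 0
    for t in range(1, len(positions)):
        px, py = predict(last_pos, last_t, t)
        x, y = positions[t]
        if math.hypot(x - px, y - py) > threshold:
            updates.append(t)
            last_pos, last_t = positions[t], t
    return updates
```

A better predictor means fewer updates for the same accuracy guarantee, which is the quantity the paper's evaluations compare.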

Patent
16 Mar 2005
TL;DR: In this paper, a multi-resolution object location system and method for locating objects is presented, which uses a long-range object locator together with a more precise RFID locator to efficiently and accurately determine the location of objects.
Abstract: A multi-resolution object location system and method for locating objects is provided. The multi-resolution system and method uses a long-range object locator together with a more precise RFID locator to efficiently and accurately determine the location of objects that include an RFID tag. The long-range object locator has a relatively longer range and can cover a relatively large area to determine the general location of the object within the large area. The RFID locator has a relatively shorter range, but is able to locate the object more precisely. The object location system uses the long-range locator to first determine the general location of the object, and then the RFID locator is used to determine a more accurate location of the object. Thus, the multi-resolution object location system is able to provide both a long range location of objects over a large area and a precise location of objects.

Patent
15 Mar 2005
TL;DR: In this article, a mobile object tracking system is provided for tracking the removal and use of specific objects within a group checked out from storage between the time the objects are checked out and the replacement of the objects in the storage.
Abstract: A mobile object tracking system is provided for tracking the removal and use of specific objects within a group checked out from storage between the time the objects are checked out and the replacement of the objects in the storage. The system includes a system controller and a storage unit that receive and store a series of object carriers therein. Each object carrier includes a series of object holders in which id tags for the group of objects checked out of the storage are received. The object carrier monitors the time each object is removed from the object carrier, which information is thereafter communicated to the system controller.

Patent
08 Dec 2005
TL;DR: A method of transferring data objects over a network comprises intercepting a network transfer message with a passing object, creating a unique identifier for the object using a predetermined function, the same function having been used to provide identifiers for objects stored at predetermined nodes of said network, removing the object, and sending on the network transfer message with the unique identifier in place of the object.
Abstract: A method of transferring data objects over a network comprises intercepting a network transfer message with a passing object, creating a unique identifier for the object using a predetermined function, the same function having been used to provide identifiers for objects stored at predetermined nodes of said network, removing the object and sending on the network transfer message with the unique identifier in place of the object. Then, at the recipient end it is possible to obtain the unique identifier and use it as a key to search for a corresponding object in the local nodes. The search starts with a node closest to the recipient and steadily spreads outwards. The object when found is reattached for the benefit of the recipient and network bandwidth has been saved by the avoidance of redundant transfer since the object is brought to the recipient from the node which is the closest to him.
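
The flow in the abstract above can be sketched as follows. SHA-256 stands in for the unspecified "predetermined function", messages are modelled as plain dictionaries, and each node's store is a dictionary keyed by identifier; all of these, along with the function names, are assumptions of this sketch rather than details from the patent.

```python
import hashlib

def object_id(data: bytes) -> str:
    """Identifier derived from the object's content (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()

def strip_object(message: dict) -> dict:
    """Sender side: replace the attached object with its identifier."""
    slim = dict(message)  # leave the caller's message untouched
    slim["object_id"] = object_id(slim.pop("object"))
    return slim

def restore_object(message: dict, nodes_by_distance: list):
    """Recipient side: search stores from nearest to farthest for the object.

    Returns (restored_message, index_of_node_used), or (None, None) when no
    node holds the object and a full transfer would still be needed.
    """
    key = message["object_id"]
    for hops, store in enumerate(nodes_by_distance):
        if key in store:
            restored = {k: v for k, v in message.items() if k != "object_id"}
            restored["object"] = store[key]
            return restored, hops
    return None, None
```

Because the same function produces identifiers on both sides, an object that is already cached near the recipient never has to cross the network again.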

Patent
29 Apr 2005
TL;DR: The invention pertains to the field of computer software and relates to one or more of populating, indexing, and searching a database of fine-grained web objects or object specifications.
Abstract: The present invention pertains to the field of computer software. More specifically, the present invention relates to one or more of populating, indexing, and searching a database of fine-grained web objects or object specifications.

Patent
20 May 2005
TL;DR: In this paper, a shared closure of data objects, which consists of the first data object and a transitive closure of the referenced data objects is identified, and a determination is made as to whether the shared closure is usable in a second runtime system.
Abstract: Methods and apparatus, including computer systems and program products, for sharing data objects in runtime systems. An identification of a first data object in a first runtime system is received. The first data object references zero or more referenced data objects. A shared closure of data objects, which consists of the first data object and a transitive closure of the referenced data objects, is identified, and a determination is made as to whether the shared closure of data objects is usable in a second runtime system. In some implementations, determining whether a shared closure is usable in a second runtime system includes determining whether each data object in the shared closure is serializable without execution of custom code, or determining whether the runtime class of each object instance in the shared closure is shareable. Using shared closures to share objects between runtime systems can provide isolation between user sessions.
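
The shared closure described above is the first object plus everything reachable from it through references, so it can be computed with an ordinary graph traversal. In this sketch the object graph is modelled as a dictionary from an object identifier to the identifiers it references, and `is_shareable` is a hypothetical predicate standing in for the serializability or class-shareability checks the abstract mentions.

```python
def shared_closure(root, references):
    """Return the root object plus the transitive closure of its references.

    references: dict mapping an object id to an iterable of referenced ids
    """
    closure, stack = set(), [root]
    while stack:
        obj = stack.pop()
        if obj in closure:
            continue  # already visited; this also handles reference cycles
        closure.add(obj)
        stack.extend(references.get(obj, ()))
    return closure

def is_usable(closure, is_shareable):
    """A closure is usable elsewhere only if every member passes the check."""
    return all(is_shareable(obj) for obj in closure)
```

A single non-shareable object anywhere in the closure makes the whole closure unusable in the second runtime, which is why the check runs over every member rather than only the root.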

Patent
20 Dec 2005
TL;DR: In this article, the authors present a system and method that facilitates management and navigation of various data objects by making use of a unique time-line based navigation tool, which allows a user to navigate or browse through the bands and objects according to a desired time parameter or range of time.
Abstract: The subject invention provides a unique system and method that facilitates management and navigation of various data objects by making use of a unique time-line based navigation tool. In particular, objects can be organized into a plurality of bands based on their respective subject matter. Each band can be created to designate a particular topic. Objects are organized within the appropriate bands based in part on a time parameter such as a time or date that the object was created, for example. The navigation tool allows a user to navigate or browse through the bands and objects according to a desired time parameter or range of time. Zooming and other browsing options are available to the user to view objects of interest at varying levels of detail. The objects are represented as ASCII thumbnails that are operational. Thus, the content of any object can be modified directly via the thumbnail.