Apparatus and method performing audio-video sensor fusion for object localization, tracking, and separation

Patent

TLDR

An audio likelihood module determines an audio likelihood for each of a plurality of sounds received from different directions, each audio likelihood indicating a likelihood that the sound is an object to be tracked.

Abstract:

An apparatus for tracking and identifying objects includes an audio likelihood module which determines corresponding audio likelihoods for each of a plurality of sounds received from corresponding different directions, each audio likelihood indicating a likelihood that the sound is an object to be tracked; a video likelihood module which receives a video and determines video likelihoods for each of a plurality of images disposed in corresponding different directions in the video, each video likelihood indicating a likelihood that the image is an object to be tracked; and an identification and tracking module which determines correspondences between the audio likelihoods and the video likelihoods and, if a correspondence is determined to exist between one of the audio likelihoods and one of the video likelihoods, identifies and tracks a corresponding one of the objects using each determined pair of audio and video likelihoods.
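
The pairing step in the abstract can be illustrated with a minimal sketch. The abstract does not say how correspondences are determined, so everything specific here is an assumption made for illustration: the names `angular_gap` and `fuse_likelihoods`, the angular gate `max_gap_deg`, the threshold `min_joint`, and the choice to multiply the two likelihoods as if the audio and video cues were independent.

```python
from typing import Dict, List, Tuple


def angular_gap(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two bearings, in degrees."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def fuse_likelihoods(
    audio: Dict[float, float],
    video: Dict[float, float],
    max_gap_deg: float = 10.0,   # hypothetical gate: how close two directions must be
    min_joint: float = 0.25,     # hypothetical threshold on the fused likelihood
) -> List[Tuple[float, float, float]]:
    """Pair per-direction audio and video likelihoods that point the same way.

    `audio` and `video` map a direction in degrees to a likelihood in [0, 1].
    Returns (audio_direction, video_direction, joint_likelihood) tuples,
    sorted from most to least likely.
    """
    matches = []
    for a_dir, a_like in audio.items():
        # Collect every video candidate with its angular distance to this sound.
        candidates = [
            (angular_gap(a_dir, v_dir), v_dir, v_like)
            for v_dir, v_like in video.items()
        ]
        if not candidates:
            continue
        gap, v_dir, v_like = min(candidates)  # nearest video direction
        if gap > max_gap_deg:
            continue  # no correspondence for this sound
        # Assumption: treat the cues as independent and multiply them.
        joint = a_like * v_like
        if joint >= min_joint:
            matches.append((a_dir, v_dir, joint))
    return sorted(matches, key=lambda m: m[2], reverse=True)


if __name__ == "__main__":
    audio = {30.0: 0.9, 120.0: 0.7, 250.0: 0.2}
    video = {28.0: 0.8, 200.0: 0.9, 355.0: 0.6}
    for a_dir, v_dir, joint in fuse_likelihoods(audio, video):
        print(f"audio {a_dir:.0f} deg ~ video {v_dir:.0f} deg, joint {joint:.2f}")
```

In this sketch only directions where both modalities report a candidate survive, which mirrors the abstract's condition that tracking proceeds when a correspondence exists between an audio likelihood and a video likelihood. A real system would likely replace the fixed gate and product rule with a probabilistic model.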

Citations

Patent

Intelligent Automated Assistant

Thomas R. Gruber, +7 more

TL;DR: An intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions.

Patent

Using context information to facilitate processing of commands in a virtual assistant

Thomas R. Gruber, +4 more

TL;DR: A virtual assistant uses context information to supplement natural language or gestural input from a user. This helps clarify the user's intent, reduces the number of candidate interpretations of the user's input, and reduces the need for the user to provide excessive clarification input.

Patent

Method and apparatus for building an intelligent automated assistant

Adam Cheyer, +1 more

TL;DR: A method for building an automated assistant includes interfacing a service-oriented architecture that includes a plurality of remote services to an active ontology, where the active ontology includes at least one active processing element that models a domain.

Patent

Electronic Devices with Voice Command and Contextual Data Processing Capabilities

Aram Lindahl

TL;DR: An electronic device may capture a voice command from a user and store contextual information about the state of the electronic device when the voice command is received. The device may transmit the voice command and the contextual information to computing equipment such as a desktop computer or a remote server.

Patent

Automatically adapting user interfaces for hands-free interaction

Thomas R. Gruber, +1 more

TL;DR: A method that automatically makes a determination without regard to whether a user has separately invoked a digital assistant application.

References

Proceedings Article

Rapid object detection using a boosted cascade of simple features

Paul A. Viola, +1 more

TL;DR: Describes a machine learning approach for visual object detection that processes images extremely rapidly while achieving high detection rates, and introduces a new image representation called the "integral image", which allows the features used by the detector to be computed very quickly.

Journal Article

Multiple emitter location and signal parameter estimation

R. O. Schmidt

- 01 Mar 1986 -

IEEE Transactions on Antennas and Propag...

TL;DR: Describes the multiple signal classification (MUSIC) algorithm, which provides asymptotically unbiased estimates of (1) the number of incident wavefronts present; (2) directions of arrival (DOA), or emitter locations; (3) strengths and cross correlations among the incident waveforms; and (4) noise/interference strength.

Journal Article

Kernel-based object tracking

Dorin Comaniciu, +2 more

- 01 May 2003 -

IEEE Transactions on Pattern Analysis an...

TL;DR: A new approach toward target representation and localization, the central component in visual tracking of nonrigid objects, is proposed, which employs a metric derived from the Bhattacharyya coefficient as similarity measure, and uses the mean shift procedure to perform the optimization.

Journal Article

Detecting Pedestrians Using Patterns of Motion and Appearance

Paul A. Viola, +2 more

TL;DR: This paper describes a pedestrian detection system that integrates image intensity information with motion information, and is the first to combine both sources of information in a single detector.

FastSLAM: a factored solution to the simultaneous localization and mapping problem with unknown data association

Michael Montemerlo, +2 more

TL;DR: This paper presents FastSLAM, an algorithm that recursively estimates the full posterior distribution over robot pose and landmark locations, yet scales logarithmically with the number of landmarks in the map.

Related Papers (5)

Telepresence system and method for video teleconferencing

Sound localization system for teleconferencing using self-steering microphone arrays

Microphone actuation control system suitable for teleconference systems

Audio Channel Assignment for Audio Output in a Movable Device

Portable telephone with integrated heads-up display and data terminal functions