Patent

Ordering recognition results produced by an automatic speech recognition engine for a multimodal application

TL;DR: In this article, a method is described for ordering recognition results produced by an automatic speech recognition (ASR) engine for a multimodal application implemented with a grammar of the multimodal application in the ASR engine.
Abstract: A method is described for ordering recognition results produced by an automatic speech recognition (ASR) engine for a multimodal application implemented with a grammar of the multimodal application in the ASR engine, with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine through a VoiceXML interpreter. The method includes: receiving, in the VoiceXML interpreter from the multimodal application, a voice utterance; determining, by the VoiceXML interpreter using the ASR engine, a plurality of recognition results in dependence upon the voice utterance and the grammar; determining, by the VoiceXML interpreter according to semantic interpretation scripts of the grammar, a weight for each recognition result; and sorting, by the VoiceXML interpreter, the plurality of recognition results in dependence upon the weight for each recognition result.
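The last two steps of the abstract (assigning a weight to each recognition result, then sorting by that weight) can be illustrated with a minimal Python sketch. This is not the patent's implementation: the `RecognitionResult` type, the `order_results` function, and the `example_weigh` callback are hypothetical names introduced here, and the callback merely stands in for whatever a grammar's semantic interpretation scripts would compute inside a VoiceXML interpreter. Sorting highest weight first is also an assumption.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class RecognitionResult:
    """Hypothetical container for one ASR hypothesis."""
    utterance: str      # recognized text
    confidence: float   # engine confidence, assumed to lie in [0, 1]
    weight: float = 0.0 # filled in by the weighting step


def order_results(
    results: List[RecognitionResult],
    weigh: Callable[[RecognitionResult], float],
) -> List[RecognitionResult]:
    """Assign a weight to every result, then sort by weight, highest first."""
    weighted = [replace(r, weight=weigh(r)) for r in results]
    # sorted() is stable, so results with equal weights keep the engine's order.
    return sorted(weighted, key=lambda r: r.weight, reverse=True)


def example_weigh(result: RecognitionResult) -> float:
    """Stand-in for a semantic interpretation script (assumed logic):
    boost utterances the application prefers, then add engine confidence."""
    preferred = {"boston": 2.0, "austin": 1.0}
    return preferred.get(result.utterance.lower(), 0.0) + result.confidence


if __name__ == "__main__":
    candidates = [
        RecognitionResult("Austin", 0.9),
        RecognitionResult("Boston", 0.6),
        RecognitionResult("Houston", 0.95),
    ]
    for r in order_results(candidates, example_weigh):
        print(r.utterance)
```

In this sketch the application-supplied weight can outrank raw engine confidence: "Houston" has the highest confidence of the three candidates but no preference boost, so it sorts last.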
Citations
Patent
11 Jan 2011
TL;DR: In this article, an intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions.
Abstract: An intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions. The system can be implemented using any of a number of different platforms, such as the web, email, smartphone, and the like, or any combination thereof. In one embodiment, the system is based on sets of interrelated domains and tasks, and employs additional functionality powered by external services with which the system can interact.

1,462 citations

Patent
28 Sep 2012
TL;DR: In this article, a virtual assistant uses context information to supplement natural language or gestural input from a user; the context helps to clarify the user's intent, reduces the number of candidate interpretations of the user's input, and reduces the need for the user to provide excessive clarification input.
Abstract: A virtual assistant uses context information to supplement natural language or gestural input from a user. Context helps to clarify the user's intent and to reduce the number of candidate interpretations of the user's input, and reduces the need for the user to provide excessive clarification input. Context can include any available information that is usable by the assistant to supplement explicit user input to constrain an information-processing problem and/or to personalize results. Context can be used to constrain solutions during various phases of processing, including, for example, speech recognition, natural language processing, task flow processing, and dialog generation.

593 citations

Patent
08 Sep 2006
TL;DR: In this paper, a method for building an automated assistant includes interfacing a service-oriented architecture that includes a plurality of remote services to an active ontology, where the active ontology includes at least one active processing element that models a domain.
Abstract: A method and apparatus are provided for building an intelligent automated assistant. Embodiments of the present invention rely on the concept of “active ontologies” (e.g., execution environments constructed in an ontology-like manner) to build and run applications for use by intelligent automated assistants. In one specific embodiment, a method for building an automated assistant includes interfacing a service-oriented architecture that includes a plurality of remote services to an active ontology, where the active ontology includes at least one active processing element that models a domain. At least one of the remote services is then registered for use in the domain.

389 citations

Patent
Aram Lindahl
24 May 2012
TL;DR: In this paper, an electronic device may capture a voice command from a user, store contextual information about the state of the electronic device when the voice command is received, and transmit both to computing equipment such as a desktop computer or a remote server.
Abstract: An electronic device may capture a voice command from a user. The electronic device may store contextual information about the state of the electronic device when the voice command is received. The electronic device may transmit the voice command and the contextual information to computing equipment such as a desktop computer or a remote server. The computing equipment may perform a speech recognition operation on the voice command and may process the contextual information. The computing equipment may respond to the voice command. The computing equipment may also transmit information to the electronic device that allows the electronic device to respond to the voice command.

385 citations

Patent
30 Sep 2011
TL;DR: In this article, the authors present a method for automatically determining, without user input and without regard to whether a digital assistant application has been separately invoked by a user, that an electronic device is in a vehicle and, in response, invoking a listening mode of a virtual assistant.
Abstract: The method includes automatically, without user input and without regard to whether a digital assistant application has been separately invoked by a user, determining that the electronic device is in a vehicle. In some implementations, determining that the electronic device is in a vehicle comprises detecting that the electronic device is in communication with the vehicle (e.g., via wired or wireless communication techniques and/or protocols). The method also includes, responsive to the determining, invoking a listening mode of a virtual assistant implemented by the electronic device. In some implementations, the method also includes limiting the ability of a user to view visual output presented by the electronic device, provide typed input to the electronic device, and the like.

367 citations

References
Patent
02 Sep 2009
TL;DR: In this paper, the authors present systems and methods for navigating hypermedia using multiple coordinated input/output device sets, allowing a user and/or an author to control what resources are presented on which device sets (whether they are integrated or not), and provide for coordinating browsing activities to enable such a user interface to be employed across multiple independent systems.
Abstract: Systems and methods for navigating hypermedia using multiple coordinated input/output device sets. Disclosed systems and methods allow a user and/or an author to control what resources are presented on which device sets (whether they are integrated or not), and provide for coordinating browsing activities to enable such a user interface to be employed across multiple independent systems. Disclosed systems and methods also support new and enriched aspects and applications of hypermedia browsing and related business activities.

1,974 citations

Patent
06 Jan 2014
TL;DR: In this article, the authors present systems and methods for navigating hypermedia using multiple coordinated input/output device sets, allowing a user and/or an author to control what resources are presented on which device sets (whether they are integrated or not), and provide for coordinating browsing activities to enable such a user interface to be employed across multiple independent systems.
Abstract: Systems and methods for navigating hypermedia using multiple coordinated input/output device sets. Disclosed systems and methods allow a user and/or an author to control what resources are presented on which device sets (whether they are integrated or not), and provide for coordinating browsing activities to enable such a user interface to be employed across multiple independent systems. Disclosed systems and methods also support new and enriched aspects and applications of hypermedia browsing and related business activities.

1,344 citations

Patent
TL;DR: In this paper, a system is presented for receiving speech and non-speech communications of natural language questions and commands, transcribing the speech and non-speech communications to textual messages, and executing the questions and/or commands.
Abstract: Systems and methods are provided for receiving speech and non-speech communications of natural language questions and/or commands, transcribing the speech and non-speech communications to textual messages, and executing the questions and/or commands. The invention applies context, prior information, domain knowledge, and user specific profile data to achieve a natural environment for one or more users presenting questions or commands across multiple domains. The systems and methods create, store, and use extensive personal profile information for each user, thereby improving the reliability of determining the context of the speech and non-speech communications and presenting the expected results for a particular question or command.

1,164 citations

Patent
15 Sep 2011
TL;DR: In this article, a method and apparatus are described for creation, distribution, assembly and verification of media; in one embodiment, media is transmitted to a receiver where the receiver assembles the media into programming.
Abstract: A method and apparatus for creation, distribution, assembly and verification of media. In one embodiment, media is transmitted to a receiver where the receiver assembles the media into programming. In another embodiment, media is transmitted to the receiver from a plurality of sources. In a further embodiment, a source of media performs a tagging operation to associate sets of tags with elements of the stream of media. In various embodiments, different combinations of look-and-feel, content and other tags are associated with the media stream. In an additional embodiment, tagging of the media stream is performed at the receiver. A user at the receiver may also provide data about the user to the receiver. In yet another embodiment, the receiver uses the tags to assemble the media into a program. In still further embodiments of the invention, various Royalty Only Aggregate Revenues or “ROAR” models and apparatus are disclosed.

683 citations

Patent
18 Jan 2002
TL;DR: In this paper, a system and method are provided for visually building multi-channel and multi-modal applications, including an interactive development/design environment (IDE) whose graphical user interface lets a developer visually interact with and operate the design modules.
Abstract: A system and method are provided for visually building multi-channel and multi-modal applications. The system includes a process design module for designing application workflow; an integration design module for integrating data sources into the application; a presentation design module for designing application views; a media library; and a componentization module for packaging designed workflow into reusable components. The system further includes an interactive development/design environment (IDE). The IDE provides a graphical user interface for allowing a developer to visually interact with and operate modules. The system allows a developer to design a single application that can operate across multiple network standards, devices, browsers and languages, and that operates in one or more modes, such as real-time, off-line and asynchronous modes.

518 citations