Open Access · Posted Content

Toward Natural Gesture/Speech Control of a Large Display

TLDR
In this article, a structured approach for studying patterns of multimodal language in the context of 2D-display control is proposed, in which gestures are analyzed systematically, from observable kinematic primitives to their semantics, as elements of a linguistic structure.
Abstract
In recent years, because of advances in computer vision research, free-hand gestures have been explored as a means of human-computer interaction (HCI). Together with improved speech processing technology, this is an important step toward natural multimodal HCI. However, the inclusion of non-predefined continuous gestures in a multimodal framework is a challenging problem. In this paper, we propose a structured approach for studying patterns of multimodal language in the context of 2D-display control. We treat the systematic analysis of gestures, from observable kinematic primitives to their semantics, as pertinent to a linguistic structure. The proposed semantic classification of co-verbal gestures distinguishes six categories based on their spatio-temporal deixis. We discuss the evolution of a computational framework for gesture and speech integration, which was used to develop an interactive testbed (iMAP). The testbed enabled elicitation of adequate, non-sequential, multimodal patterns in a narrative mode of HCI. The user studies we conducted illustrate the significance of accounting for the temporal alignment of gesture and speech parts in semantic mapping. Furthermore, co-occurrence analysis of gesture/speech production suggests a syntactic organization of gestures at the lexical level.
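To make the abstract's point about temporal alignment concrete, here is a minimal Python sketch. It is not taken from the paper or the iMAP testbed: the Interval structure, the longest-overlap rule, and the 0.1 s threshold are all assumptions. It pairs each time-stamped gesture stroke with the spoken phrase it co-occurs with, the kind of gesture/speech alignment on which the semantic mapping depends.

from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Interval:
    label: str     # a gesture category or a spoken phrase
    start: float   # seconds
    end: float     # seconds

def overlap(a: Interval, b: Interval) -> float:
    # Length of the temporal intersection, in seconds (0 if disjoint).
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))

def align(gestures: List[Interval], phrases: List[Interval],
          min_overlap: float = 0.1) -> List[Tuple[str, str]]:
    # Map each gesture stroke to the phrase sharing the longest temporal
    # overlap, if that overlap exceeds min_overlap seconds (threshold assumed).
    pairs = []
    for g in gestures:
        best = max(phrases, key=lambda p: overlap(g, p), default=None)
        if best is not None and overlap(g, best) >= min_overlap:
            pairs.append((g.label, best.label))
    return pairs

# A deictic stroke overlapping the phrase "this lake":
gestures = [Interval("point", 1.2, 1.9)]
phrases = [Interval("zoom in", 0.2, 0.9), Interval("this lake", 1.0, 1.8)]
print(align(gestures, phrases))   # [('point', 'this lake')]

Under this rule a stroke is simply attached to the utterance it overlaps most; the paper's finding that alignment matters suggests a real system would need a more careful treatment, since co-expressive gesture and speech are not strictly simultaneous.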


Citations
Patent

Controlling objects via gesturing

TL;DR: This patent describes a system and process for controlling a group of networked electronic components using a multimodal integration scheme: inputs from a speech recognition subsystem, a gesture recognition subsystem employing a wireless pointing device, and a pointing analysis subsystem employing the same device are combined to determine which component the user wants to control and which control action is desired.
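As a rough illustration of such an integration scheme (the fusion rule below is invented for this sketch and is not taken from the patent), the outputs of the speech, gesture, and pointing subsystems can be fused into a single command:

from typing import Optional, Tuple

def fuse_command(speech_action: Optional[str],
                 pointed_component: Optional[str],
                 gesture_action: Optional[str] = None) -> Optional[Tuple[str, str]]:
    # Combine subsystem outputs into a (component, action) command.
    # The pointing-analysis subsystem names the target; the speech
    # recognizer supplies the action, with a recognized gesture as fallback.
    if pointed_component is None:
        return None                      # no target to control
    action = speech_action or gesture_action
    return (pointed_component, action) if action else None

print(fuse_command("turn on", "lamp-3"))            # ('lamp-3', 'turn on')
print(fuse_command(None, "stereo-1", "volume_up"))  # ('stereo-1', 'volume_up')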
Patent

Providing a user interface experience based on inferred vehicle state

TL;DR: This patent describes a mobile device that provides a user interface experience to a user operating the device within a vehicle; its mode functionality infers the vehicle state by receiving inference-input information from one or more input sources.
Proceedings ArticleDOI

Visual and linguistic information in gesture classification

TL;DR: An empirical study is described showing that removing auditory information significantly impairs human raters' ability to classify gestures, and an automatic gesture classification system is presented that is based solely on an n-gram model of linguistic context.
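A minimal version of such a classifier might look like the following Python sketch; the bigram features, add-one smoothing, and toy data are assumptions rather than details from the cited paper. Each gesture class gets its own bigram language model over the words spoken around the gesture, and classification picks the class whose model best explains that context.

from collections import defaultdict
import math

class BigramGestureClassifier:
    # Classifies a gesture from the words spoken around it, using one
    # bigram language model per gesture class with add-one smoothing.

    def __init__(self):
        self.bigrams = defaultdict(lambda: defaultdict(int))   # class -> bigram counts
        self.unigrams = defaultdict(lambda: defaultdict(int))  # class -> unigram counts
        self.vocab = set()

    def train(self, gesture_class, words):
        for w1, w2 in zip(words, words[1:]):
            self.bigrams[gesture_class][(w1, w2)] += 1
            self.unigrams[gesture_class][w1] += 1
        self.vocab.update(words)

    def score(self, gesture_class, words):
        # Log-likelihood of the word sequence under the class's bigram model.
        v = len(self.vocab)
        total = 0.0
        for w1, w2 in zip(words, words[1:]):
            num = self.bigrams[gesture_class][(w1, w2)] + 1   # add-one smoothing
            den = self.unigrams[gesture_class][w1] + v
            total += math.log(num / den)
        return total

    def classify(self, words):
        return max(self.bigrams, key=lambda c: self.score(c, words))

clf = BigramGestureClassifier()
clf.train("deictic", "look at this spot here".split())
clf.train("iconic", "the shape curves like this".split())
print(clf.classify("at this spot".split()))   # 'deictic'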
Patent

Determining a position of a pointing device

TL;DR: This patent describes a system and process for controlling a group of networked electronic components using a multimodal integration scheme: inputs from a speech recognition subsystem, a gesture recognition subsystem employing a wireless pointing device, and a pointing analysis subsystem employing the same device are combined to determine which component the user wants to control and which control action is desired.
References
Book ChapterDOI

Ecological Interfaces: Extending the Pointing Paradigm by Visual Context

TL;DR: The proposal emphasises the role of visual context in gestural communication and extends the concept of affordances to explain referring-gesture variability in multimodal systems.