Journal Article

SpeechSkimmer: a system for interactively skimming recorded speech

TL;DR: SpeechSkimmer uses speech-processing techniques to allow a user to hear recorded sounds quickly and at several levels of detail, and provides continuous real-time control of the speed and detail level of the audio presentation.
Abstract
Listening to a speech recording is much more difficult than visually scanning a document because of the transient and temporal nature of audio. Audio recordings capture the richness of speech, yet it is difficult to directly browse the stored information. This article describes techniques for structuring, filtering, and presenting recorded speech, allowing a user to navigate and interactively find information in the audio domain. This article describes the SpeechSkimmer system for interactively skimming speech recordings. SpeechSkimmer uses speech-processing techniques to allow a user to hear recorded sounds quickly, and at several levels of detail. User interaction, through a manual input device, provides continuous real-time control of the speed and detail level of the audio presentation. SpeechSkimmer reduces the time needed to listen by incorporating time-compressed speech, pause shortening, automatic emphasis detection, and nonspeech audio feedback. This article also presents a multilevel structural approach to auditory skimming and user interface techniques for interacting with recorded speech. An observational usability test of SpeechSkimmer is discussed, as well as a redesign and reimplementation of the user interface based on the results of this usability test.
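
Of the techniques listed in the abstract, pause shortening is the simplest to illustrate. The following is a minimal sketch, not SpeechSkimmer's actual implementation: it assumes a plain energy threshold for detecting silence, and the function name `shorten_pauses` and the parameters `frame_len`, `threshold`, and `max_pause_frames` are hypothetical names chosen for this example.

```python
from typing import List


def shorten_pauses(
    samples: List[float],
    frame_len: int = 160,        # assumed frame size (e.g. 20 ms at 8 kHz)
    threshold: float = 0.01,     # assumed mean-square energy cutoff for "silence"
    max_pause_frames: int = 3,   # assumed longest pause to keep, in frames
) -> List[float]:
    """Keep at most `max_pause_frames` consecutive low-energy frames.

    Speech frames are copied unchanged; any run of silent frames longer
    than the limit is truncated to the limit.
    """
    out: List[float] = []
    silent_run = 0
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        # Mean-square energy of the frame.
        energy = sum(s * s for s in frame) / len(frame)
        if energy < threshold:
            silent_run += 1
            if silent_run > max_pause_frames:
                continue  # drop the excess part of the pause
        else:
            silent_run = 0
        out.extend(frame)
    return out


# Two frames of "speech", ten frames of silence, two frames of "speech".
signal = [0.5] * 320 + [0.0] * 1600 + [0.5] * 320
shortened = shorten_pauses(signal)
```

With the defaults above, the ten silent frames in `signal` are cut to three, while the speech frames pass through untouched. A real system would likely use a more robust speech detector than a fixed energy threshold.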


Citations
Journal Article

An overview of audio information retrieval

TL;DR: The state of the art in audio information retrieval is reviewed, and recent advances in automatic speech recognition, word spotting, speaker and music identification, and audio similarity are presented with a view towards making audio less “opaque”.
Journal Article

MARSYAS: a framework for audio analysis

TL;DR: This paper describes MARSYAS, a framework for experimenting, evaluating and integrating techniques for audio content analysis in restricted domains and a new method for temporal segmentation based on audio texture that is combined with audio analysis techniques and used for hierarchical browsing, classification and annotation of audio files.
Proceedings Article

Automatic audio segmentation using a measure of audio novelty

TL;DR: This method can find individual note boundaries or even natural segment boundaries such as verse/chorus or speech/music transitions, even in the absence of cues such as silence, by analyzing local self-similarity.
Journal Article

Nomadic radio: speech and audio interaction for contextual messaging in nomadic environments

TL;DR: Iterative design and a preliminary user evaluation suggest that audio is an appropriate medium for mobile messaging, but that care must be taken to minimally intrude on the wearer's social and physical environment.
Proceedings Article

Auto-summarization of audio-video presentations

TL;DR: A user study is reported that compares automatically generated summaries, 20%-25% the length of the full presentations, with author-generated summaries, and finds that users learn from the computer-generated summaries, although less than from the authors' summaries.
References
Journal Article

A tutorial on hidden Markov models and selected applications in speech recognition

TL;DR: In this paper, the authors provide an overview of the basic theory of hidden Markov models (HMMs) as originated by L.E. Baum and T. Petrie (1966) and give practical details on methods of implementation of the theory along with a description of selected applications of HMMs to distinct problems in speech recognition.
Book

Usability Engineering

Jakob Nielsen
TL;DR: This guide to the methods of usability engineering provides cost-effective methods that will help developers improve their user interfaces immediately and shows you how to avoid the four most frequently listed reasons for delay in software projects.
Journal Article

Protocol Analysis: Verbal Reports as Data.

TL;DR: This article reviewed major advances in verbal reports over the past decade, including new evidence on how giving verbal reports affects subjects' cognitive processes, and on the validity and completeness of such reports.
Book

Protocol Analysis: Verbal Reports as Data

TL;DR: In this article, the authors reviewed major advances in verbal reports over the past decade, including new evidence on how giving verbal reports affects subjects' cognitive processes, and on the validity and completeness of such reports.