Proceedings Article

SoundSense: scalable sound sensing for people-centric applications on mobile phones

Hong Lu1, Wei Pan1, Nicholas D. Lane1, Tanzeem Choudhury1, Andrew T. Campbell1 
22 Jun 2009, pp. 165-178
TL;DR: This paper proposes SoundSense, a scalable framework for modeling sound events on mobile phones that represents the first general purpose sound sensing system specifically designed to work on resource limited phones and demonstrates that SoundSense is capable of recognizing meaningful sound events that occur in users' everyday lives.
Abstract: Top-end mobile phones include a number of specialized (e.g., accelerometer, compass, GPS) and general-purpose sensors (e.g., microphone, camera) that enable new people-centric sensing applications. Perhaps the most ubiquitous and unexploited sensor on mobile phones is the microphone - a powerful sensor that is capable of making sophisticated inferences about human activity, location, and social events from sound. In this paper, we exploit this untapped sensor not in the context of human communications but as an enabler of new sensing applications. We propose SoundSense, a scalable framework for modeling sound events on mobile phones. SoundSense is implemented on the Apple iPhone and represents the first general-purpose sound sensing system specifically designed to work on resource-limited phones. The architecture and algorithms are designed for scalability, and SoundSense uses a combination of supervised and unsupervised learning techniques to classify both general sound types (e.g., music, voice) and discover novel sound events specific to individual users. The system runs solely on the mobile phone with no back-end interactions. Through implementation and evaluation of two proof-of-concept people-centric sensing applications, we demonstrate that SoundSense is capable of recognizing meaningful sound events that occur in users' everyday lives.


Citations
Christopher M. Bishop1
01 Jan 2006
TL;DR: This book covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal Article
TL;DR: This article surveys existing mobile phone sensing algorithms, applications, and systems, and discusses the emerging sensing paradigms, and formulates an architectural framework for discussing a number of the open issues and challenges emerging in the new area of mobile phone sensing research.
Abstract: Mobile phones or smartphones are rapidly becoming the central computer and communication device in people's lives. Application delivery channels such as the Apple AppStore are transforming mobile phones into App Phones, capable of downloading a myriad of applications in an instant. Importantly, today's smartphones are programmable and come with a growing set of cheap powerful embedded sensors, such as an accelerometer, digital compass, gyroscope, GPS, microphone, and camera, which are enabling the emergence of personal, group, and community-scale sensing applications. We believe that sensor-equipped mobile phones will revolutionize many sectors of our economy, including business, healthcare, social networks, environmental monitoring, and transportation. In this article we survey existing mobile phone sensing algorithms, applications, and systems. We discuss the emerging sensing paradigms, and formulate an architectural framework for discussing a number of the open issues and challenges emerging in the new area of mobile phone sensing research.

2,316 citations


Cites background or methods from "SoundSense: scalable sound sensing ..."

  • ...In SoundSense [ 11 ] a general-purpose sound classification system for mobile phones is developed using a combination of supervised and unsupervised learning....

    [...]

  • ...By continuously collecting audio from the phone’s microphone, for example, it is possible to classify a diverse set of distinctive sounds associated with a particular context or activity in a person’s life, such as using an automatic teller machine (ATM), being in a particular coffee shop, having a conversation, listening to music, making coffee, and driving [ 11 ]....

    [...]

  • ...These operations occur either directly on the phone, in the mobile cloud, or with some Figure 4. Raw audio data captured from mobile phones is transformed into features allowing learning algorithms to identify classes of behavior (e.g., driving, in conversation, making coffee) occurring in a stream of sensor data, for example, by SoundSense [ 11 ]....

    [...]

  • ...Alternatively, an effective approach for some systems has been sensor sampling routines with admission control stages that do not process data that is low-quality, saving resources, and reducing errors (e.g., SoundSense [ 11 ])....

    [...]

  • ...SoundSense [ 11 ] adopts this strategy: all the audio data is processed on the phone, and raw audio is never stored....

    [...]

Journal Article
TL;DR: In this paper, the authors provide a comprehensive hands-on introduction for newcomers to the field of human activity recognition using on-body inertial sensors and describe the concept of an Activity Recognition Chain (ARC) as a general-purpose framework for designing and evaluating activity recognition systems.
Abstract: The last 20 years have seen ever-increasing research activity in the field of human activity recognition. With activity recognition having considerably matured, so has the number of challenges in designing, implementing, and evaluating activity recognition systems. This tutorial aims to provide a comprehensive hands-on introduction for newcomers to the field of human activity recognition. It specifically focuses on activity recognition using on-body inertial sensors. We first discuss the key research challenges that human activity recognition shares with general pattern recognition and identify those challenges that are specific to human activity recognition. We then describe the concept of an Activity Recognition Chain (ARC) as a general-purpose framework for designing and evaluating activity recognition systems. We detail each component of the framework, provide references to related research, and introduce the best practice methods developed by the activity recognition research community. We conclude with the educational example problem of recognizing different hand gestures from inertial sensors attached to the upper and lower arm. We illustrate how each component of this framework can be implemented for this specific activity recognition problem and demonstrate how different implementations compare and how they impact overall recognition performance.

1,214 citations

01 Jan 2014
TL;DR: This tutorial aims to provide a comprehensive hands-on introduction for newcomers to the field of human activity recognition using on-body inertial sensors and describes the concept of an Activity Recognition Chain (ARC) as a general-purpose framework for designing and evaluating activity recognition systems.
Abstract: The last 20 years have seen ever-increasing research activity in the field of human activity recognition. With activity recognition having considerably matured, so has the number of challenges in designing, implementing, and evaluating activity recognition systems. This tutorial aims to provide a comprehensive hands-on introduction for newcomers to the field of human activity recognition. It specifically focuses on activity recognition using on-body inertial sensors. We first discuss the key research challenges that human activity recognition shares with general pattern recognition and identify those challenges that are specific to human activity recognition. We then describe the concept of an Activity Recognition Chain (ARC) as a general-purpose framework for designing and evaluating activity recognition systems. We detail each component of the framework, provide references to related research, and introduce the best practice methods developed by the activity recognition research community. We conclude with the educational example problem of recognizing different hand gestures from inertial sensors attached to the upper and lower arm. We illustrate how each component of this framework can be implemented for this specific activity recognition problem and demonstrate how different implementations compare and how they impact overall recognition performance.

1,078 citations


Cites background from "SoundSense: scalable sound sensing ..."

  • ...…EOB Speaker recognition, localisation by ambient sounds, activity detection, object self-localisation [Amft et al. 2005; Clarkson et al. 2000; Lu et al. 2009] Accelerometers or gyroscopes EOB Detection of body movement patterns, object use, ambient infrastructure [Godfrey et al. 2008;…...

    [...]

  • ...For example, long-term acceleration data recorded on a mobile phone can be segmented using GPS traces [Ashbrook and Starner 2003] or sound recorded using the internal microphone [Lu et al. 2009]....

    [...]

Patent
23 Feb 2011
TL;DR: A smart phone senses audio, imagery, and/or other stimulus from a user's environment and acts autonomously to fulfill inferred or anticipated user desires; it can apply more or fewer resources to an image processing task depending on how successfully the task is proceeding or on the user's apparent interest in the task.
Abstract: A smart phone senses audio, imagery, and/or other stimulus from a user's environment, and acts autonomously to fulfill inferred or anticipated user desires. In one aspect, the detailed technology concerns phone-based cognition of a scene viewed by the phone's camera. The image processing tasks applied to the scene can be selected from among various alternatives by reference to resource costs, resource constraints, other stimulus information (e.g., audio), task substitutability, etc. The phone can apply more or less resources to an image processing task depending on how successfully the task is proceeding, or based on the user's apparent interest in the task. In some arrangements, data may be referred to the cloud for analysis, or for gleaning. Cognition, and identification of appropriate device response(s), can be aided by collateral information, such as context. A great number of other features and arrangements are also detailed.

1,056 citations

References
Book
Christopher M. Bishop1
17 Aug 2006
TL;DR: This book covers probability distributions, linear models for regression and classification, neural networks, graphical models, mixture models and EM, sampling methods, continuous latent variables, and sequential data.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

22,840 citations


"SoundSense: scalable sound sensing ..." refers methods in this paper

  • ...We use a simple Bayes classifier [ 8 ] with equal priors for each class to represent different ambient sound events (e.g., using a washing machine, driving a car)....

    [...]
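
The excerpt above mentions a simple Bayes classifier with equal priors for each class. The following is a minimal sketch of that idea, not the SoundSense implementation: it assumes Gaussian class-conditional densities with diagonal covariance, and the two-dimensional feature vectors, the class labels, and the function names (`fit_gaussians`, `log_likelihood`, `classify`) are all hypothetical. With equal priors the prior term is the same for every class, so picking the most probable class reduces to picking the class with the highest likelihood.

```python
import math
from collections import defaultdict


def fit_gaussians(samples):
    """Fit one diagonal Gaussian per class from (feature_vector, label) pairs."""
    by_label = defaultdict(list)
    for features, label in samples:
        by_label[label].append(features)

    models = {}
    for label, rows in by_label.items():
        count = len(rows)
        dims = len(rows[0])
        means = [sum(r[d] for r in rows) / count for d in range(dims)]
        # Floor the variance so a constant feature cannot cause division by zero.
        variances = [
            max(sum((r[d] - means[d]) ** 2 for r in rows) / count, 1e-6)
            for d in range(dims)
        ]
        models[label] = (means, variances)
    return models


def log_likelihood(features, means, variances):
    """Log density of a feature vector under a diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(features, means, variances)
    )


def classify(features, models):
    """Equal priors: the prior is constant across classes, so maximize likelihood."""
    return max(models, key=lambda label: log_likelihood(features, *models[label]))


# Hypothetical training data: two made-up features per audio window.
training = [
    ((0.9, 0.20), "washing_machine"),
    ((1.1, 0.30), "washing_machine"),
    ((1.0, 0.25), "washing_machine"),
    ((0.20, 0.8), "driving"),
    ((0.30, 1.0), "driving"),
    ((0.25, 0.9), "driving"),
]
models = fit_gaussians(training)
print(classify((1.0, 0.3), models))
print(classify((0.25, 0.85), models))
```

Working in log space avoids numerical underflow when many features are multiplied together, which matters more as the feature vector grows.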

Christopher M. Bishop1
01 Jan 2006
TL;DR: This book covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Book
01 Aug 2006
TL;DR: Pattern Recognition and Machine Learning (Information Science and Statistics).

8,923 citations

Book
01 Jan 1993
TL;DR: This book covers the fundamentals of speech recognition, including the speech signal, signal processing and analysis methods, pattern comparison techniques, system design and implementation, hidden Markov models, connected word models, large vocabulary continuous speech recognition, and task-oriented applications.
Abstract: 1. Fundamentals of Speech Recognition. 2. The Speech Signal: Production, Perception, and Acoustic-Phonetic Characterization. 3. Signal Processing and Analysis Methods for Speech Recognition. 4. Pattern Comparison Techniques. 5. Speech Recognition System Design and Implementation Issues. 6. Theory and Implementation of Hidden Markov Models. 7. Speech Recognition Based on Connected Word Models. 8. Large Vocabulary Continuous Speech Recognition. 9. Task-Oriented Applications of Automatic Speech Recognition.

8,442 citations

Journal Article
01 Jan 1978
TL;DR: A comprehensive catalog of data windows along with their significant performance parameters from which the different windows can be compared is included, and an example demonstrates the use and value of windows to resolve closely spaced harmonic signals characterized by large differences in amplitude.
Abstract: This paper makes available a concise review of data windows and their effect on the detection of harmonic signals in the presence of broad-band noise, and in the presence of nearby strong harmonic interference. We also call attention to a number of common errors in the application of windows when used with the fast Fourier transform. This paper includes a comprehensive catalog of data windows along with their significant performance parameters from which the different windows can be compared. Finally, an example demonstrates the use and value of windows to resolve closely spaced harmonic signals characterized by large differences in amplitude.

7,130 citations
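
The abstract above concerns how data windows affect the detection of harmonic signals. As a rough illustration only (it does not reproduce the paper's example), the sketch below compares spectral leakage for a rectangular window and a Hann window. It assumes a single complex tone placed halfway between two frequency bins, uses a direct DFT from the standard library instead of an FFT, and the function names, the 64-sample length, and the chosen bins are all hypothetical.

```python
import cmath
import math


def dft_magnitude(signal, k):
    """Magnitude of DFT bin k, computed directly (O(n) per bin, fine for small n)."""
    n = len(signal)
    return abs(
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal))
    )


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def leakage_ratio(window, tone_bin, probe_bin):
    """Energy seen at a distant bin relative to the spectral peak of a windowed tone."""
    n = len(window)
    tone = [cmath.exp(2j * math.pi * tone_bin * i / n) for i in range(n)]
    windowed = [w * x for w, x in zip(window, tone)]
    peak = max(dft_magnitude(windowed, k) for k in range(n))
    return dft_magnitude(windowed, probe_bin) / peak


N = 64
# A tone at bin 10.5 does not complete a whole number of cycles in the frame,
# so its energy spreads into other bins; bin 30 is far from the tone.
rect_ratio = leakage_ratio([1.0] * N, 10.5, 30)
hann_ratio = leakage_ratio(hann(N), 10.5, 30)
print(f"rectangular window leakage at bin 30: {rect_ratio:.2e}")
print(f"Hann window leakage at bin 30:        {hann_ratio:.2e}")
```

The lower leakage of the tapered window is what lets a weak tone be seen near a strong one, at the cost of a wider main lobe around the strong tone.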