Author

Nobuyuki Miyake

Bio: Nobuyuki Miyake is an academic researcher at Kobe University. The author has contributed to research on the topics of noise and noise reduction. The author has an h-index of 4 and has co-authored 10 publications receiving 49 citations.

Topics: Noise, Noise reduction, Noise power, Voice activity detection, Background noise ...

Papers

Proceedings Article•DOI•

Human-Robot Interface Using System Request Utterance Detection Based on Acoustic Features

Tetsuya Takiguchi¹, Atsushi Sako¹, Jerome Revaud, Tomoyuki Yamagata¹, Nobuyuki Miyake, Yasuo Ariki¹•Institutions (1)

Kobe University¹

24 Apr 2008

TL;DR: A new mobile robot with hands-free speech recognition that can determine whether a user's utterances are commands for the robot, where commands are discriminated from human-human conversations by acoustic features.

Abstract: For a mobile robot to serve people in actual environments, such as a living room or a party room, it must be easy to control because some users might not even be capable of operating a computer keyboard. For nonexpert users, speech recognition is one of the most effective communication tools when it comes to a hands-free (human-robot) interface. This paper describes a new mobile robot with hands-free speech recognition. For a hands-free speech interface, it is important to detect commands for a robot in spontaneous utterances. Our system can determine whether a user's utterances are commands for the robot, where commands are discriminated from human-human conversations by acoustic features. The robot can then move according to the user's voice (command). In order to capture the user's voice only, a robust voice detection system with AdaBoost is also described.

13 citations

Journal Article•DOI•

Sudden Noise Reduction Based on GMM with Noise Power Estimation

Nobuyuki Miyake, Tetsuya Takiguchi¹, Yasuo Ariki•Institutions (1)

Kobe University¹

30 Apr 2010-Journal of Software Engineering and Applications

TL;DR: This article describes a method for reducing sudden noise using noise detection and classification together with noise power estimation; the method achieved good performance in recognizing utterances overlapped by sudden noises.

Abstract: This paper describes a method for reducing sudden noise using noise detection and classification methods, and noise power estimation. Sudden noise detection and classification have been dealt with in our previous study. In this paper, GMM-based noise reduction is performed using the detection and classification results. As a result of classification, we can determine the kind of noise we are dealing with, but the power is unknown. In this paper, this problem is solved by combining an estimation of noise power with the noise reduction method. In our experiments, the proposed method achieved good performance for recognition of utterances overlapped by sudden noises.

12 citations

Patent•

Noise detecting device and noise detecting method

Yasuo Ariki, Kentaro Koga, Nobuyuki Miyake, Tetsuya Takiguchi

13 Dec 2006

TL;DR: For each predetermined sound source, a final discriminator is held that makes a binary decision on whether the noise superposed on a speech section of noise-superposed speech data was generated by that sound source.

Abstract: PROBLEM TO BE SOLVED: To discriminate the kind (sound source) of a noise. SOLUTION: For each predetermined sound source, a final discriminator is held that outputs a binary value indicating whether the noise superposed on a speech section of noise-superposed speech data was generated by that sound source. Input noise-superposed speech data is discriminated with each of the held final discriminators, and the final discriminator with the highest score for one of the binary values is selected; the sound source of the noise in the data is detected as the predetermined sound source indicated by the selected discriminator. Further, a plurality of data including noise-superposed speech data are held as data for learning, and boosting is used to derive the final discriminator for each predetermined sound source from the held learning data. COPYRIGHT: (C)2008,JPO&INPIT
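The selection step in this abstract (score the input with one discriminator per sound source, then pick the source whose discriminator scores highest) can be sketched as below. This is a minimal illustration, not the patented implementation: the function names, the sound-source labels, and the linear stand-in discriminators with made-up weights are all assumptions; in the patent the discriminators are derived by boosting from learning data.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical type: a "final discriminator" maps a feature vector to a score,
# where a larger score means "more likely generated by this sound source".
Discriminator = Callable[[Sequence[float]], float]


def linear_discriminator(weights: Sequence[float], bias: float) -> Discriminator:
    """Stand-in discriminator (assumption): a plain linear score."""
    def score(features: Sequence[float]) -> float:
        return sum(w * x for w, x in zip(weights, features)) + bias
    return score


def detect_noise_source(
    features: Sequence[float],
    discriminators: Dict[str, Discriminator],
) -> Tuple[str, Dict[str, float]]:
    """Score the input with every per-source discriminator and return the
    label of the highest-scoring one, together with all scores."""
    if not discriminators:
        raise ValueError("at least one discriminator is required")
    scores = {source: d(features) for source, d in discriminators.items()}
    best_source = max(scores, key=scores.get)
    return best_source, scores


# Illustrative labels and weights only; they do not come from the patent.
toy_discriminators = {
    "door_slam": linear_discriminator([1.0, -0.5], 0.0),
    "phone_ring": linear_discriminator([-0.5, 1.0], 0.0),
}

source, all_scores = detect_noise_source([2.0, 0.5], toy_discriminators)
print(source, all_scores)
```

Keeping one binary discriminator per source means a new noise type can be added by training one more discriminator without retraining the others.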

8 citations

Proceedings Article•

Sudden noise reduction based on GMM with noise power estimation.

Nobuyuki Miyake, Tetsuya Takiguchi¹, Yasuo Ariki¹•Institutions (1)

Kobe University¹

01 Jan 2008

TL;DR: This paper describes a method for reducing sudden noise using noise detection and classification methods, and noise power estimation, and achieves good performance for recognition of utterances overlapped by sudden noises.

6 citations

Proceedings Article•DOI•

Noise Detection and Classification in Speech Signals with Boosting

Nobuyuki Miyake¹, Tetsuya Takiguchi¹, Yasuo Ariki¹•Institutions (1)

Kobe University¹

26 Aug 2007

TL;DR: A novel method to detect and classify sudden noises in speech signals using Boosting, which can create a complex, non-linear boundary that determines whether the observed signal is speech, noise1, noise2, or so on.

Abstract: This paper presents a novel method to detect and classify sudden noises in speech signals. There are many sudden and short-period noises in natural environments, such as inside a car. If a speech recognition system can detect sudden noises, the system can ask the speaker to repeat the same utterance so that the speech data will be clean. If clean speech data can be input, it will help prevent system operation errors. In this paper, we tried to detect and classify sudden noises in a user's utterances using Boosting. Boosting can create a complex, non-linear boundary that determines whether the observed signal is speech, noise1, noise2, and so on. In our experiments, the proposed method achieved good performance in comparison to a conventional method based on the GMM (Gaussian Mixture Model).

4 citations

Cited by

Dissertation•DOI•

Sound event recognition in unstructured environments using spectrogram image processing

Jonathan J. Dennis

01 Jan 2014

TL;DR: The approach taken is to interpret the sound event as a two-dimensional spectrogram image, with time and frequency as the two axes; this enables novel methods for SER to be developed based on spectrogram image processing, inspired by techniques from the field of image processing.

Abstract: The objective of this research is to develop feature extraction and classification techniques for the task of sound event recognition (SER) in unstructured environments. Although this field is traditionally overshadowed by the popular field of automatic speech recognition (ASR), an SER system that can achieve human-like sound recognition performance opens up a range of novel application areas. These include acoustic surveillance, bio-acoustical monitoring, environmental context detection, healthcare applications and more generally the rich transcription of acoustic environments. The challenges in such environments are adverse effects such as noise, distortion and multiple sources, which are more likely to occur with distant microphones compared to the close-talking microphones that are more common in ASR. In addition, the characteristics of acoustic events are less well defined than those of speech, and there is no sub-word dictionary available like the phonemes in speech. Therefore, the performance of ASR systems typically degrades dramatically in these challenging unstructured environments, and it is important to develop new methods that can perform well for this challenging task. In this thesis, the approach taken is to interpret the sound event as a two-dimensional spectrogram image, with the two axes as the time and frequency dimensions. This enables novel methods for SER to be developed based on spectrogram image processing, which are inspired by techniques from the field of image processing. The motivation for such an approach is based on finding an automatic approach to “spectrogram reading”, where it is possible for humans to visually recognise the different sound event signatures in the spectrogram. The advantages of such an approach are twofold. Firstly, the sound event image representation makes it possible to naturally capture the sound information in a two-dimensional feature. This has advantages over conventional one-dimensional frame-based features, which capture only a slice of spectral information

62 citations

Patent•

Noise suppression method, device, and program

Akihiko Sugiyama¹•Institutions (1)

NEC¹

05 Mar 2008

TL;DR: A noise suppression device with two components: a shock noise detection unit, which receives an input signal including a shock noise and detects the shock noise from changes in the input signal, and a shock sound suppression unit, which receives the detection result and the input signal and suppresses the shock sound.

Abstract: The noise suppression device includes: a shock noise detection unit which receives an input signal including a shock noise and detects a shock noise according to a change of the input signal; and a shock sound suppression unit which receives the shock sound detection result and the input signal so as to suppress the shock sound.

42 citations

Patent•

Apparatus and method for detecting speech

Chiyoun Park¹, Namhoon Kim¹, Jeong-mi Cho¹•Institutions (1)

Samsung¹

16 Apr 2010

TL;DR: A speech detection apparatus and method that determine whether a frame is speech using feature information extracted from an input signal; the apparatus may estimate the situation of an input frame and determine which feature information is required for speech detection in that situation.

Abstract: A speech detection apparatus and method are provided. The speech detection apparatus and method determine whether a frame is speech or not using feature information extracted from an input signal. The speech detection apparatus may estimate a situation related to an input frame and determine which feature information is required for speech detection for the input frame in the estimated situation. The speech detection apparatus may detect a speech signal using dynamic feature information that may be more suitable to the situation of a particular frame, instead of using the same feature information for each and every frame.

15 citations

Proceedings Article•DOI•

Robot-directed speech detection using Multimodal Semantic Confidence based on speech, image, and motion

Xiang Zuo, Naoto Iwahashi, Ryo Taguchi, Shigeki Matsuda¹, Komei Sugiura¹, Kotaro Funakoshi², Mikio Nakano², Natsuki Oka³•Institutions (3)

National Institute of Information and Communications Technology¹, Honda², Kyoto Institute of Technology³

14 Mar 2010

TL;DR: A novel method to detect robot-directed (RD) speech that adopts the Multimodal Semantic Confidence (MSC) measure, calculated by integrating speech, image, and motion confidence measures with weightings that are optimized by logistic regression.

Abstract: In this paper, we propose a novel method to detect robot-directed (RD) speech that adopts the Multimodal Semantic Confidence (MSC) measure. The MSC measure is used to decide whether the speech can be interpreted as a feasible action under the current physical situation in an object manipulation task. This measure is calculated by integrating speech, image, and motion confidence measures with weightings that are optimized by logistic regression. Experimental results show that, compared with a baseline method that uses speech confidence only, MSC achieved an absolute increase of 5% for clean speech and 12% for noisy speech in terms of average maximum F-measure.
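One plausible reading of "integrating speech, image, and motion confidence measures with weightings that are optimized by logistic regression" is a weighted sum of the three confidences passed through a logistic function and then thresholded. The sketch below illustrates that reading only; the function names, weights, bias, and threshold are hypothetical placeholders, not values or formulas taken from the paper.

```python
import math
from typing import Sequence


def msc_score(
    speech_conf: float,
    image_conf: float,
    motion_conf: float,
    weights: Sequence[float],
    bias: float,
) -> float:
    """Combine three confidence measures into one value in (0, 1).

    Assumed form: logistic(bias + w_s*speech + w_i*image + w_m*motion).
    In practice the weights and bias would be fitted by logistic
    regression on labelled utterances."""
    w_speech, w_image, w_motion = weights
    z = bias + w_speech * speech_conf + w_image * image_conf + w_motion * motion_conf
    return 1.0 / (1.0 + math.exp(-z))


def is_robot_directed(score: float, threshold: float = 0.5) -> bool:
    """Decide robot-directed vs. other speech; the threshold is an assumption."""
    return score >= threshold


# Illustrative numbers only.
example_weights = (1.0, 1.0, 1.0)
score = msc_score(0.8, 0.6, 0.7, example_weights, bias=-1.0)
print(score, is_robot_directed(score))
```

With all-zero inputs and zero bias the score is exactly 0.5, which is why 0.5 is a natural default threshold in this form.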

12 citations