Author

Keisuke Nakamura

Bio: Keisuke Nakamura is an academic researcher from Honda. The author has contributed to research on the topics of Acoustic source localization & Microphone array. The author has an h-index of 18 and has co-authored 143 publications receiving 1,324 citations. Previous affiliations of Keisuke Nakamura include Tokyo Institute of Technology & Centre national de la recherche scientifique.


Papers
Proceedings ArticleDOI
10 Oct 2009
TL;DR: A new localization system, the “Selective Attention System”, is implemented on a humanoid robot and experimentally validated even when the robot’s microphones move dynamically, addressing sound source localization in dynamic environments for robots.
Abstract: As robotic technology plays an increasing role in human lives, “robot audition” for human-robot communication is of great interest, and robot audition needs to be robust and adaptable to dynamic environments. This paper addresses sound source localization for robots working in dynamic environments. Previously, noise robustness and dynamic selection of localized sounds have been major issues for practical use. To address these issues, a new localization system, the “Selective Attention System”, is proposed. The system has four new functions: localization with Generalized EigenValue Decomposition of correlation matrices for noise robustness (“Localization with GEVD”), sound source cancellation and focus (“Target Source Selection”), human-like dynamic Focus of Attention (“Dynamic FoA”), and correlation matrix estimation for robotic head rotation (“Correlation Matrix Estimation”). All are achieved by the dynamic design of correlation matrices. The system is implemented on a humanoid robot and experimentally validated, even when the robot’s microphones move dynamically.
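As a rough illustration of the “Localization with GEVD” idea above, the following is a minimal sketch in which a noise correlation matrix whitens the signal correlation matrix before a MUSIC-style subspace scan. The function name gevd_music_spectrum, the toy steering-vector interface, and the regularization constant are assumptions for illustration, not the paper’s implementation.

```python
# Minimal sketch of MUSIC localization with GEVD of correlation matrices.
import numpy as np
from scipy.linalg import eigh

def gevd_music_spectrum(X, N, steering, n_sources=1):
    """X: (M, T) multichannel STFT snapshots at one frequency bin (M microphones).
       N: (M, T) noise-only snapshots used to estimate the noise correlation matrix.
       steering: (M, D) steering vectors for D candidate source directions."""
    M = X.shape[0]
    R = X @ X.conj().T / X.shape[1]        # signal correlation matrix
    K = N @ N.conj().T / N.shape[1]        # noise correlation matrix
    K = K + 1e-6 * np.eye(M)               # regularize so K stays positive definite
    # Generalized eigenvalue decomposition R e = lambda K e (eigenvalues ascending)
    _, E = eigh(R, K)
    En = E[:, : M - n_sources]             # generalized eigenvectors spanning the noise subspace
    # MUSIC-style pseudo-spectrum: peaks where steering vectors are nearly
    # orthogonal to the noise subspace
    num = np.sum(np.abs(steering) ** 2, axis=0)
    den = np.sum(np.abs(En.conj().T @ steering) ** 2, axis=0)
    return num / np.maximum(den, 1e-12)
```

Scanning all candidate directions and picking the peaks of the returned spectrum gives the estimated source directions; using noise-only frames for K is what provides robustness against loud, structured noise.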

99 citations

Proceedings ArticleDOI
24 Dec 2012
TL;DR: A prototype system for auditory scene analysis based on the proposed MUltiple SIgnal Classification based on incremental Generalized EigenValue Decomposition (iGEVD-MUSIC) shows that dynamically-changing noise is properly suppressed and that multiple human voice sources can be localized even while the AR.Drone is moving in an outdoor environment.
Abstract: This paper addresses auditory scene analysis, in particular sound source localization, using an aerial vehicle with a microphone array in an outdoor environment. Since such a vehicle is able to search for sound sources quickly and over a wide area, it is useful for detecting outdoor sound sources, for instance, to find distressed people in a disaster situation. In such an environment, noise is quite loud and dynamically changing, and conventional microphone array techniques studied in the field of indoor robot audition are of limited use. We therefore propose MUltiple SIgnal Classification based on incremental Generalized EigenValue Decomposition (iGEVD-MUSIC). It can deal with high-power noise by introducing a noise correlation matrix and GEVD, even when the signal-to-noise ratio is less than 0 dB. In addition, the noise correlation matrix is incrementally estimated to adapt to dynamic changes in noise. We developed a prototype system for auditory scene analysis based on the proposed method using the Parrot AR.Drone with a microphone array and a Kinect device. Experimental results using the prototype system showed that dynamically-changing noise is properly suppressed with the proposed method and that multiple human voice sources can be localized even when the AR.Drone is moving in an outdoor environment.
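The “incremental” part of iGEVD-MUSIC described above can be sketched as a frame-by-frame update of the noise correlation matrix with exponential forgetting, so the estimate tracks dynamically-changing noise. The class name, the forgetting factor of 0.95, and the identity initialization below are assumptions for illustration rather than the paper’s exact formulation.

```python
# Minimal sketch of an incrementally estimated noise correlation matrix.
import numpy as np

class IncrementalNoiseCorrelation:
    """Frame-by-frame noise correlation estimate with exponential forgetting."""
    def __init__(self, n_mics, forgetting=0.95):
        self.K = np.eye(n_mics, dtype=complex)   # start from an identity (white-noise) model
        self.alpha = forgetting

    def update(self, x):
        """x: (M,) one STFT snapshot of a noise-dominated frame at a given frequency bin."""
        x = np.asarray(x, dtype=complex).reshape(-1, 1)
        # Older frames decay geometrically, so the estimate adapts to changing noise
        self.K = self.alpha * self.K + (1.0 - self.alpha) * (x @ x.conj().T)
        return self.K
```

The updated K can then be fed into a GEVD-MUSIC step such as the sketch shown earlier.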

95 citations

Journal ArticleDOI
TL;DR: State-of-the-art human-centered RL algorithms are described as a starting point for researchers who are initiating their endeavors in human-centered RL, with references to the most interesting and successful works.
Abstract: Human-centered reinforcement learning (RL), in which an agent learns how to perform a task from evaluative feedback delivered by a human observer, has become more and more popular in recent years. The ability to learn from human feedback gives an RL agent increasing applicability to real-life problems. This paper describes the state-of-the-art human-centered RL algorithms and aims to serve as a starting point for researchers who are initiating their endeavors in human-centered RL. Moreover, the objective of this paper is to present a comprehensive survey of the recent breakthroughs in this field and provide references to the most interesting and successful works. After starting with an introduction of the concepts of RL from environmental reward, this paper discusses the origins of human-centered RL and its difference from traditional RL. Then we describe different interpretations of human evaluative feedback, which have produced many human-centered RL algorithms in the past decade. In addition, we describe research on agents learning from both human evaluative feedback and environmental rewards as well as on improving the efficiency of human-centered RL. Finally, we conclude with an overview of application areas and a discussion of future work and open questions.
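As a toy illustration of one interpretation of human evaluative feedback surveyed above, the sketch below treats the observer’s +1/-1 signal as the reward in a tabular Q-learning style update. It is not any specific algorithm from the paper; the class and parameter names are assumptions for illustration.

```python
# Minimal sketch of an agent learning from human evaluative feedback.
import random
from collections import defaultdict

class HumanFeedbackAgent:
    """Toy agent that learns action values from a human's evaluative signal."""
    def __init__(self, actions, lr=0.1, gamma=0.9, epsilon=0.1):
        self.Q = defaultdict(float)          # Q[(state, action)] -> estimated value
        self.actions = actions
        self.lr, self.gamma, self.epsilon = lr, gamma, epsilon

    def act(self, state):
        # Epsilon-greedy action selection over the current value estimates
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.Q[(state, a)])

    def learn(self, state, action, human_feedback, next_state):
        """human_feedback: evaluative signal from the observer, e.g. +1 (good) or -1 (bad)."""
        best_next = max(self.Q[(next_state, a)] for a in self.actions)
        target = human_feedback + self.gamma * best_next
        self.Q[(state, action)] += self.lr * (target - self.Q[(state, action)])
```

Other interpretations discussed in the survey (e.g., feedback as a policy label or as a value estimate rather than a reward) lead to different update rules built on the same interaction loop.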

82 citations

Proceedings ArticleDOI
06 Nov 2014
TL;DR: Experimental results showed that the combination of iGSVD-MUSIC and CMS improves sound source detection performance drastically and achieves real-time processing.
Abstract: This paper addresses sound source detection in an outdoor environment using a quadrotor with a microphone array. Since the previously reported method has a high computational cost, we propose a sound source detection algorithm called MUltiple SIgnal Classification based on incremental Generalized Singular Value Decomposition (iGSVD-MUSIC), which detects sound source location and temporal activity with low computational cost. In addition, to relax the over-estimation problem of the noise correlation matrix used in iGSVD-MUSIC, we propose Correlation Matrix Scaling (CMS), which realizes soft whitening of noise. A prototype system based on the proposed methods was evaluated with two types of microphone arrays in an outdoor environment. Experimental results showed that the combination of iGSVD-MUSIC and CMS drastically improves sound source detection performance and achieves real-time processing.
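The “soft whitening” idea behind Correlation Matrix Scaling can be pictured as blending the estimated noise correlation matrix toward its diagonal before the GSVD/GEVD step, so that an over-estimated noise model whitens the observation less aggressively. The blending form and the parameter name scale below are assumptions for illustration, not the paper’s exact definition of CMS.

```python
# Minimal sketch of a soft-whitening adjustment of the noise correlation matrix.
import numpy as np

def correlation_matrix_scaling(K, scale=0.5):
    """K: (M, M) estimated noise correlation matrix.
       scale=0.0 keeps K unchanged (full spatial whitening later);
       scale=1.0 keeps only the diagonal (almost no spatial whitening)."""
    diag_part = np.diag(np.diag(K))
    return (1.0 - scale) * K + scale * diag_part
```

Choosing the scaling parameter trades off noise suppression against the risk of also whitening away weak target sources.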

64 citations

Proceedings ArticleDOI
24 Dec 2012
TL;DR: This work proposes two methods, MUSIC based on Generalized Singular Value Decomposition (GSVD-MUSIC) and Hierarchical SSL (H-SSL), which drastically reduce the computational cost while maintaining noise robustness in localization.
Abstract: Sound Source Localization (SSL) is an essential function for robot audition and yields the location and number of sound sources, which are utilized in post-processes such as sound source separation. SSL for a robot in a real environment mainly requires noise robustness, high resolution and real-time processing. A microphone array processing technique, MUltiple SIgnal Classification based on Standard EigenValue Decomposition (SEVD-MUSIC), is commonly used for localization. We previously improved its robustness against high-power noise by incorporating Generalized EigenValue Decomposition (GEVD). However, GEVD-based MUSIC (GEVD-MUSIC) has two main issues: 1) the resolution of the pre-measured Transfer Functions (TFs) determines the resolution of SSL, and 2) its computational cost is too expensive for real-time processing. For the first issue, we propose a TF interpolation method integrating time-domain and frequency-domain interpolation. The interpolation achieves super-resolution SSL, whose resolution is higher than that of the pre-measured TFs. For the second issue, we propose two methods: MUSIC based on Generalized Singular Value Decomposition (GSVD-MUSIC) and Hierarchical SSL (H-SSL). GSVD-MUSIC drastically reduces the computational cost while maintaining noise robustness in localization. H-SSL also reduces the computational cost by introducing a hierarchical search algorithm instead of a greedy search in localization. These techniques are integrated into an SSL system using a robot-embedded microphone array. The experimental results showed that the proposed interpolation achieves approximately 1-degree resolution even though TFs are measured only at 30-degree intervals, that GSVD-MUSIC requires only 46.4% and 40.6% of the computational cost of SEVD-MUSIC and GEVD-MUSIC, respectively, and that H-SSL reduces the computational cost of localizing a single sound source by 59.2%.
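The coarse-to-fine idea behind Hierarchical SSL (H-SSL) can be sketched as a two-level search over candidate directions: scan a coarse grid, then refine only around the best coarse direction instead of evaluating every fine direction. The grid spacings and the spectrum_fn interface below are assumptions for illustration, not the paper’s exact algorithm.

```python
# Minimal sketch of a hierarchical (coarse-to-fine) direction search.
import numpy as np

def hierarchical_search(spectrum_fn, coarse_step=30.0, fine_step=1.0):
    """spectrum_fn(azimuth_deg) -> spatial-spectrum value (e.g. a MUSIC pseudo-spectrum)."""
    # Level 1: coarse scan over the full azimuth range
    coarse = np.arange(0.0, 360.0, coarse_step)
    best_coarse = coarse[int(np.argmax([spectrum_fn(a) for a in coarse]))]
    # Level 2: fine scan only around the best coarse direction
    fine = np.arange(best_coarse - coarse_step, best_coarse + coarse_step, fine_step) % 360.0
    return float(fine[int(np.argmax([spectrum_fn(a) for a in fine]))])
```

Compared with an exhaustive fine-grid scan, the number of spectrum evaluations drops from the full fine-grid size to one coarse pass plus one small neighborhood, which is where the reported cost savings come from.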

64 citations


Cited by
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

01 Jan 1979
TL;DR: This special issue aims to gather recent advances in learning-with-shared-information methods and their applications in computer vision and multimedia analysis, and especially encourages papers addressing interesting real-world computer vision and multimedia applications.
Abstract: In the real world, a realistic setting for computer vision or multimedia recognition problems is that some classes contain a lot of training data while many classes contain only a small amount. How to use frequent classes to help learn rare classes, for which it is harder to collect training data, is therefore an open question. Learning with Shared Information is an emerging topic in machine learning, computer vision and multimedia analysis. Different levels of components can be shared during the concept modeling and machine learning stages, such as generic object parts, attributes, transformations, regularization parameters and training examples. Regarding specific methods, multi-task learning, transfer learning and deep learning can be seen as using different strategies to share information. These learning-with-shared-information methods are very effective in solving real-world large-scale problems. This special issue aims at gathering the recent advances in learning-with-shared-information methods and their applications in computer vision and multimedia analysis. Both state-of-the-art works and literature reviews are welcome for submission. Papers addressing interesting real-world computer vision and multimedia applications are especially encouraged. Topics of interest include, but are not limited to:
• Multi-task learning or transfer learning for large-scale computer vision and multimedia analysis
• Deep learning for large-scale computer vision and multimedia analysis
• Multi-modal approaches for large-scale computer vision and multimedia analysis
• Different sharing strategies, e.g., sharing generic object parts, attributes, transformations, regularization parameters and training examples
• Real-world computer vision and multimedia applications based on learning with shared information, e.g., event detection, object recognition, object detection, action recognition, human head pose estimation, object tracking, location-based services, semantic indexing
• New datasets and metrics to evaluate the benefit of the proposed sharing ability for a specific computer vision or multimedia problem
• Survey papers on the topic of learning with shared information
Authors who are unsure whether their planned submission is in scope may contact the guest editors prior to the submission deadline with an abstract, in order to receive feedback.

1,758 citations

Patent
Jong Hwan Kim1
13 Mar 2015
TL;DR: A mobile terminal is described that includes a body; a touchscreen provided on the front and extending to the side of the body, configured to display content; and a controller configured to detect when one side of the body comes into contact with a side of an external terminal and to display a first area on the touchscreen corresponding to the contact area of the body and the external terminal and a second area including the content.
Abstract: A mobile terminal including: a body; a touchscreen provided on the front and extending to the side of the body, configured to display content; and a controller configured to detect when one side of the body comes into contact with one side of an external terminal, display a first area on the touchscreen corresponding to the contact area of the body and the external terminal and a second area including the content, receive an input moving the content displayed in the second area to the first area, display the content in the first area, and share the content in the first area with the external terminal.

1,441 citations