
Showing papers by "Kazuya Takeda published in 2015"


Journal ArticleDOI
TL;DR: An open platform using commodity vehicles and sensors is introduced to facilitate the development of autonomous vehicles and presents algorithms, software libraries, and datasets required for scene recognition, path planning, and vehicle control.
Abstract: Autonomous vehicles are an emerging application of automotive technology. They can recognize the scene, plan the path, and control the motion by themselves while interacting with drivers. Although they receive considerable attention, components of autonomous vehicles are not accessible to the public but instead are developed as proprietary assets. To facilitate the development of autonomous vehicles, this article introduces an open platform using commodity vehicles and sensors. Specifically, the authors present algorithms, software libraries, and datasets required for scene recognition, path planning, and vehicle control. This open platform allows researchers and developers to study the basis of autonomous vehicles, design new algorithms, and test their performance using the common interface.
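The three components can be illustrated with a toy perception-planning-control loop; every name and behavior below is a hypothetical stand-in, not the platform's actual API:

```python
from dataclasses import dataclass

@dataclass
class VehicleState:
    x: float
    y: float
    heading: float

def recognize_scene(sensor_frame):
    # Placeholder: a real module would fuse LiDAR/camera data into obstacles.
    return {"obstacles": sensor_frame.get("obstacles", [])}

def plan_path(state, scene, goal):
    # Placeholder: step straight toward the goal unless something is detected.
    if scene["obstacles"]:
        return []  # stop when any obstacle is ahead
    return [(state.x + 1.0, state.y)] if state.x < goal[0] else []

def control(state, path):
    # Placeholder: jump to the next waypoint (a real controller would
    # compute steering and throttle commands instead).
    if path:
        state.x, state.y = path[0]
    return state

state = VehicleState(0.0, 0.0, 0.0)
for _ in range(3):
    scene = recognize_scene({"obstacles": []})
    path = plan_path(state, scene, goal=(3.0, 0.0))
    state = control(state, path)
print(state.x)  # 3.0
```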

432 citations


Proceedings ArticleDOI
19 Apr 2015
TL;DR: Experimental results show that certain multi-channel features outperform both a monaural DAE and a conventional time-frequency-mask-based speech enhancement method.
Abstract: This paper investigates a multi-channel denoising autoencoder (DAE)-based speech enhancement approach. In recent years, deep neural network (DNN)-based monaural speech enhancement and robust automatic speech recognition (ASR) approaches have attracted much attention due to their high performance. Although multi-channel speech enhancement usually outperforms single channel approaches, there has been little research on the use of multi-channel processing in the context of DAE. In this paper, we explore the use of several multi-channel features as DAE input to confirm whether multi-channel information can improve performance. Experimental results show that certain multi-channel features outperform both a monaural DAE and a conventional time-frequency-mask-based speech enhancement method.
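The input representation being compared can be sketched as follows: per-channel log-magnitude spectra are concatenated per frame so a DAE can exploit inter-channel cues. The array sizes, the tied-weight layer, and its untrained parameters are illustrative, not the paper's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-channel noisy signal (e.g. left/right microphones).
n_fft, frames, channels = 64, 10, 2
noisy = rng.standard_normal((channels, frames, n_fft))

# Monaural DAE input: log-magnitude spectrum of a single channel.
mono_feat = np.log1p(np.abs(np.fft.rfft(noisy[0], axis=-1)))

# Multi-channel DAE input: per-channel spectra concatenated per frame,
# so the network can exploit inter-channel (spatial) cues.
multi_feat = np.concatenate(
    [np.log1p(np.abs(np.fft.rfft(noisy[c], axis=-1))) for c in range(channels)],
    axis=-1,
)

# A single tied-weight DAE layer (illustrative, untrained).
def dae_forward(x, w, b_enc, b_dec):
    h = np.tanh(x @ w + b_enc)          # encoder
    return h @ w.T + b_dec              # decoder with tied weights

d_in = multi_feat.shape[-1]
w = rng.standard_normal((d_in, 32)) * 0.1
out = dae_forward(multi_feat, w, np.zeros(32), np.zeros(d_in))
print(mono_feat.shape, multi_feat.shape, out.shape)
```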

101 citations


Proceedings ArticleDOI
06 Sep 2015
TL;DR: This paper proposes a method of integrating DBNFs using multi-stream HMMs in order to improve the performance of AVSR systems under both clean and noisy conditions, and evaluates the method using a continuously spoken Japanese digit recognition task under matched and mismatched conditions.
Abstract: Recent interest in “deep learning”, which can be defined as the use of algorithms to model high-level abstractions in data, using models composed of multiple non-linear transformations, has resulted in an increase in the number of studies investigating the use of deep learning with automatic speech recognition (ASR) systems. Some of these studies have found that bottleneck features extracted from deep neural networks (DNNs), sometimes called “deep bottleneck features” (DBNFs), can reduce the word error rates of ASR systems. However, there has been little research on audio-visual speech recognition (AVSR) systems using DBNFs. In this paper, we propose a method of integrating DBNFs using multi-stream HMMs in order to improve the performance of AVSRs under both clean and noisy conditions. We evaluate our method using a continuously spoken, Japanese digit recognition task under matched and mismatched conditions. Relative word error reduction rates of roughly 68.7%, 47.4%, and 51.9% were achieved, compared with an audio-only ASR system and two feature-fusion models, which employed DBNFs and single-stream HMMs, respectively.
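The multi-stream HMM combination can be sketched as a weighted sum of per-stream log-likelihoods, with the stream weights summing to one; the state scores and weights below are toy values, not figures from the paper:

```python
import numpy as np

# Multi-stream observation score: per-state log-likelihoods from the audio
# and visual streams are combined with exponent weights that sum to one.
# In noisy conditions the visual weight is typically raised.
def multistream_loglik(ll_audio, ll_visual, w_audio):
    return w_audio * ll_audio + (1.0 - w_audio) * ll_visual

# Toy per-state log-likelihoods for one frame (values are illustrative).
ll_a = np.array([-2.0, -5.0, -1.0])   # audio DBNF stream
ll_v = np.array([-3.0, -1.5, -4.0])   # visual DBNF stream

clean = multistream_loglik(ll_a, ll_v, w_audio=0.8)
noisy = multistream_loglik(ll_a, ll_v, w_audio=0.3)
# The best-scoring state changes as the weighting shifts toward the
# visual stream, which is the point of tuning weights per condition.
print(int(np.argmax(clean)), int(np.argmax(noisy)))
```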

49 citations


Proceedings ArticleDOI
01 Dec 2015
TL;DR: It is found that VAD is useful in both the audio and visual modalities, improving both lipreading and AVSR, and the effectiveness of voice activity detection in the visual modality is investigated.
Abstract: This paper develops an Audio-Visual Speech Recognition (AVSR) method by (1) exploring high-performance visual features, (2) applying audio and visual deep bottleneck features to improve AVSR performance, and (3) investigating the effectiveness of voice activity detection (VAD) in the visual modality. In our approach, many kinds of visual features are incorporated and subsequently converted into bottleneck features using deep learning. Using the proposed features, we achieved 73.66% lipreading accuracy in the speaker-independent open condition, and about 90% AVSR accuracy on average in noisy environments. In addition, extracting speech segments from the visual features yielded 77.80% lipreading accuracy. We find that VAD is useful in both the audio and visual modalities, improving both lipreading and AVSR.
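A visual VAD step of this kind can be sketched as thresholding a frame-level activity measure and keeping sufficiently long runs. The paper derives its segments from visual features; the energy values, threshold, and minimum duration below are invented stand-ins:

```python
import numpy as np

# Hypothetical visual VAD: threshold frame-level mouth-motion energy and
# keep only runs at least min_frames long. Real visual features would be
# deep bottleneck features; plain numbers stand in for them here.
def visual_vad(energy, threshold, min_frames):
    active = energy > threshold
    segments, start = [], None
    for i, a in enumerate(active):
        if a and start is None:
            start = i                      # run of speech frames begins
        elif not a and start is not None:
            if i - start >= min_frames:
                segments.append((start, i))
            start = None                   # run ends (kept only if long enough)
    if start is not None and len(active) - start >= min_frames:
        segments.append((start, len(active)))
    return segments

energy = np.array([0.1, 0.2, 0.9, 0.8, 0.7, 0.1, 0.95, 0.1, 0.9, 0.9, 0.9])
print(visual_vad(energy, threshold=0.5, min_frames=2))  # [(2, 5), (8, 11)]
```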

44 citations


Proceedings ArticleDOI
28 Dec 2015
TL;DR: The proposed method outperformed the SVM-based method when an additional "Other" activity category was included and it is demonstrated that DNNs are a robust method of daily activity recognition.
Abstract: We propose a new method of recognizing daily human activities based on a Deep Neural Network (DNN), using multimodal signals such as environmental sound and subject acceleration. We conduct recognition experiments to compare the proposed method to other methods such as a Support Vector Machine (SVM), using real-world data recorded continuously over 72 hours. Our proposed method achieved a frame accuracy rate of 85.5% and a sample accuracy rate of 91.7% when identifying nine different types of daily activities. Furthermore, the proposed method outperformed the SVM-based method when an additional "Other" activity category was included. Therefore, we demonstrate that DNNs are a robust method of daily activity recognition.
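The difference between the two reported accuracy rates (per-frame versus per-sample) can be illustrated as follows; the predictions, labels, and sample groupings are synthetic:

```python
import numpy as np

# Frame accuracy scores every frame; sample accuracy first majority-votes
# the frames belonging to each recording, which is why the two rates differ.
def frame_accuracy(pred, truth):
    return float(np.mean(pred == truth))

def sample_accuracy(pred, truth, sample_ids):
    correct = 0
    samples = np.unique(sample_ids)
    for s in samples:
        frames = pred[sample_ids == s]
        vote = np.bincount(frames).argmax()       # majority vote per sample
        correct += int(vote == truth[sample_ids == s][0])
    return correct / len(samples)

pred      = np.array([0, 0, 1, 1, 1, 2, 2, 0])    # toy frame-level outputs
truth     = np.array([0, 0, 0, 1, 1, 2, 2, 2])
sample_id = np.array([0, 0, 0, 1, 1, 2, 2, 2])
print(frame_accuracy(pred, truth), sample_accuracy(pred, truth, sample_id))
# 0.75 1.0  (voting recovers the right label despite frame errors)
```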

41 citations


Proceedings ArticleDOI
27 Aug 2015
TL;DR: This paper proposes a method of automatically extracting lane change situations from large-scale driving corpora by applying an unsupervised symbolization method and topic representation to driving data, and shows the effectiveness of symbols with topic proportions for representing the characteristics of driving situations.
Abstract: This paper proposes a method of automatically extracting lane change situations from large-scale driving corpora. Naturalistic driving data stored in large-scale corpora has the potential to contribute to the development of novel advanced driver-assistance systems based on estimated information about driver intent and/or the potential risk of accidents. However, directly estimating such information from stream data is difficult. To address this issue, we apply an unsupervised symbolization method and topic representation to driving data. Driving stream data is converted into sequences of discrete symbols by a non-parametric symbolization method, and the symbols are then characterized by topics representing the typical distribution of driving behavior observed during each symbol. Because these symbols are segmented at the changing points of driving behavior, similar driving situations can be retrieved effectively from the symbol sequences. To evaluate the effectiveness of the symbolization approach, we extract lane change situations based on the topic proportions and their temporal patterns. Distinctive elements of topic proportions and their temporal patterns for lane change situations are extracted by an AdaBoost classifier. As a result, the proposed approach outperforms baselines that use neither topic proportions nor their temporal patterns in extracting lane change situations. This result shows the effectiveness of symbols with topic proportions for representing the characteristics of driving situations.
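The extraction step can be sketched as boosting weak learners over topic-proportion features. Below is a single decision stump (one AdaBoost round, one threshold polarity) on toy topic proportions; the data and labels are invented:

```python
import numpy as np

# Each driving symbol is summarized by a topic-proportion vector; a weak
# learner (decision stump) separates lane-change symbols from others on a
# single distinctive topic, as one round of AdaBoost would.
def best_stump(X, y):
    best = (None, None, 1.0)                      # (topic, threshold, error)
    for t in range(X.shape[1]):
        for thr in np.unique(X[:, t]):
            pred = (X[:, t] >= thr).astype(int)
            err = np.mean(pred != y)
            if err < best[2]:
                best = (t, thr, err)
    return best

# Toy topic proportions over 3 topics; label 1 = lane change (illustrative).
X = np.array([[0.7, 0.2, 0.1],
              [0.6, 0.3, 0.1],
              [0.1, 0.2, 0.7],
              [0.2, 0.1, 0.7]])
y = np.array([0, 0, 1, 1])
topic, thr, err = best_stump(X, y)
print(topic, err)  # 2 0.0  (topic 2 alone separates the toy classes)
```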

16 citations


Proceedings ArticleDOI
27 Aug 2015
TL;DR: Experimental results show that drivers who pay less attention to the road ahead during automated driving tend to be less sensitive to risk factors in the surrounding environment and also tend to make inconsistent lane change decisions during automateddriving.
Abstract: We investigate a possible method for detecting a driver's negative adaptation to an automated driving system by analyzing consistency of driver decision making and driver gaze behavior during automated driving. We focus on an automated driving system equivalent to Level 2 automation per the NHTSA's definition. At this level of automation, drivers must be ready to take control of the vehicle in critical situations by monitoring the driving environment and vehicle behavior. Since drivers are not required to operate the pedals or steering wheel during automated driving, a driver's negative adaptation to an automated system needs to be detected from behavior other than vehicle operation. In this study, we focus on driver gaze behavior. We conduct a simulator study to compare the gaze behavior of fifteen drivers during conventional and automated driving. We also analyze the consistency of driver decision making when changing lanes during conventional and automated driving. Experimental results show that drivers who pay less attention to the road ahead during automated driving tend to be less sensitive to risk factors in the surrounding environment and also tend to make inconsistent lane change decisions during automated driving.

15 citations


Journal ArticleDOI
TL;DR: A physical model is proposed to model airflow patterns in the physiological system in order to represent the process of speech production under psychological stress, and physical parameters characterizing airflow variations in the vocal folds, the vocal tract, and laryngeal ventricle are explored.
Abstract: This letter presents a method to perform the classification of speech under stress based on physical characteristics. A physical model is proposed to model airflow patterns in the physiological system in order to represent the process of speech production under psychological stress, and physical parameters characterizing airflow variations in the vocal folds, the vocal tract, and laryngeal ventricle are explored. Experimental evaluations show that the physical parameters are effective for the classification of stressed speech.

10 citations



Proceedings ArticleDOI
01 Jun 2015
TL;DR: Experimental results show that the predictions made with the categorized traffic trajectory history contain fewer errors than the predictions made with road shape curvature.
Abstract: This paper proposes a novel approach for extracting a traffic trajectory history, using GPS data collected over a certain period of time, to be used as an input for driver models. In this approach, driving curvature is distinguished from actual road shape curvature with the use of real driving data. After a sufficient amount of driving data has been collected, high-degree polynomials are fitted to the GPS point cloud. The traffic trajectory history consists of the tangential unit vectors and curvature values calculated from these polynomials. A single driver's driving path is then predicted using both the traffic trajectory history and the road shape curvature, for comparison and validation. Experimental results show that the predictions made with the categorized traffic trajectory history contain fewer errors than the predictions made with road shape curvature.
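The tangent and curvature computation from a fitted polynomial can be sketched as follows, with a synthetic point cloud standing in for real GPS tracks; the polynomial degree, noise level, and curve are illustrative:

```python
import numpy as np

# Fit a polynomial to a GPS point cloud and derive the tangent and
# curvature used as the trajectory representation.
# Curvature of y(x):  kappa = |y''| / (1 + y'^2)**1.5
x = np.linspace(0.0, 10.0, 200)
y_true = 0.05 * x**2                      # stand-in for accumulated GPS tracks
y_noisy = y_true + np.random.default_rng(1).normal(0.0, 0.01, x.size)

coeffs = np.polyfit(x, y_noisy, deg=4)    # high-degree polynomial fit
p = np.poly1d(coeffs)
dp, ddp = p.deriv(1), p.deriv(2)

def curvature(xq):
    return abs(ddp(xq)) / (1.0 + dp(xq) ** 2) ** 1.5

def unit_tangent(xq):
    t = np.array([1.0, dp(xq)])
    return t / np.linalg.norm(t)

# For the toy parabola the true curvature at x = 0 is 0.1; the fitted
# value should land close to it despite the added noise.
print(curvature(0.0))
```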

5 citations


Journal ArticleDOI
TL;DR: In this paper, the authors identify differences in the characteristics of drivers' decision-making when driving a vehicle with manual operation or with an automatic driving assistance system, using a high-fidelity driving simulator.

Proceedings ArticleDOI
01 Dec 2015
TL;DR: Experimental results showed that the proposed method can improve recognition accuracy compared to a conventional method, demonstrating the effectiveness of estimating acceleration signals with a Gaussian process for recognizing daily activities.
Abstract: We have created a corpus of daily activities using wearable sensors. The corpus consists of sound and image data from a camera and motion signals from a smartphone, covering both indoor and outdoor activities over 72 continuous hours. We propose a method that interpolates acceleration signals at arbitrary sample points with a Gaussian process in order to recognize daily activities. We conducted daily activity recognition experiments using our corpus. Experimental results showed that the proposed method improves recognition accuracy compared to a conventional method, demonstrating the effectiveness of estimating acceleration signals with a Gaussian process for recognizing daily activities.
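Gaussian-process interpolation of an irregularly sampled signal can be sketched as follows; the RBF kernel, lengthscale, and toy signal are assumptions, not the paper's settings:

```python
import numpy as np

# Irregularly sampled acceleration values are interpolated to arbitrary
# time points with an RBF-kernel GP, analogous to resampling a smartphone
# signal onto a fixed frame rate.
def rbf(a, b, length=1.0):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length**2)

def gp_interpolate(t_obs, y_obs, t_query, noise=1e-4):
    K = rbf(t_obs, t_obs) + noise * np.eye(t_obs.size)  # jitter for stability
    K_star = rbf(t_query, t_obs)
    return K_star @ np.linalg.solve(K, y_obs)           # GP posterior mean

t_obs = np.array([0.0, 0.3, 0.9, 1.4, 2.0])    # irregular sample times (s)
y_obs = np.sin(t_obs)                          # toy acceleration values
t_query = np.linspace(0.0, 2.0, 9)             # regular query grid
y_hat = gp_interpolate(t_obs, y_obs, t_query)
print(float(np.max(np.abs(y_hat - np.sin(t_query)))))  # small interpolation error
```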

01 Jan 2015
TL;DR: This study proposes a method to extract the unique driving signatures of individual drivers from sensor data and suggests that drivers with similar driving signatures can be categorized into driving style classes such as aggressive or careful driving.
Abstract: This study proposes a method to extract the unique driving signatures of individual drivers. We assume that each driver has a unique driving signature that can be represented in a k-dimensional principal driving component (PDC) space. We propose a method to extract this signature from sensor data. Furthermore, we suggest that drivers with similar driving signatures can be categorized into driving style classes such as aggressive or careful driving. In our experiments, 122 different drivers drove the same path on the Nagoya city expressway in the same instrumented car. GPS, speed, acceleration, steering wheel position, and pedal operations were recorded. Clustering methods were used to identify driving signatures.
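The PDC projection and style grouping can be sketched with a plain SVD-based PCA; the per-driver feature vectors below are synthetic, and k = 2 is an arbitrary choice:

```python
import numpy as np

# Per-driver feature vectors (e.g. statistics of speed, acceleration, and
# pedal/steering signals) are projected onto the top-k principal components;
# drivers with nearby projections fall into the same style group.
rng = np.random.default_rng(0)
aggressive = rng.normal([1.0, 1.0, 1.0], 0.1, (5, 3))    # high speed/accel
careful    = rng.normal([-1.0, -1.0, -1.0], 0.1, (5, 3)) # low speed/accel
X = np.vstack([aggressive, careful])

Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pdc = Xc @ Vt[:2].T                 # k = 2 principal driving components

# Drivers on opposite sides of the first PDC split into the two styles.
labels = (pdc[:, 0] > 0).astype(int)
print(labels[:5].sum() in (0, 5) and labels[5:].sum() in (0, 5))  # True
```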

Journal ArticleDOI
01 Sep 2015-Medicine
TL;DR: The rate of adverse events at DxWBS was significantly higher in patients who had adverse events at RRA than in those who did not, and strong neck accumulation of 131I is a significant independent predictor of incomplete low-dose RRA.

01 Jan 2015
TL;DR: The behavioural changes that the use of an automated driving system brings about in drivers are analysed, focusing on lane change behaviour, and a correlation between risk sensitivity and gaze behaviour is shown.
Abstract: This paper analyses the behavioural changes that the use of an automated driving system brings about in drivers, focusing on behaviour when changing lanes. In particular, the relation between drivers' sensitivity to risk factors in the surrounding environment and their gaze behaviour is analysed. In this research, we assume automated driving at Level 2 of the NHTSA's definition. At this level of automation, drivers are required to monitor the driving situation and, when necessary, interrupt the system's automatic control to recover driving safety. We conducted a simulation experiment with fifteen drivers and compared their behaviour under two conditions: conventional manual driving, and driving in which the automated driving system changes lanes automatically. By collectively analysing the risk factors at lane changes, shifts in each driver's sensitivity to risk when changing lanes were estimated. The experimental data show a correlation between risk sensitivity and gaze behaviour.

Proceedings ArticleDOI
01 Dec 2015
TL;DR: The correlation between similarity in speaker characteristics and information transmission quality is investigated using a map task dialogue corpus and a linear regression prediction model.
Abstract: We investigate the correlation between similarity in speaker characteristics and information transmission quality using a map task dialogue corpus. Similarities between the prosodic features and lexical styles of different speakers are analyzed, and most of these similarity measurements are shown to have significant correlations with information transmission quality, as measured by a direction-following task. We also combine these similarity measurements using a linear regression prediction model to assess information transmission quality. The predictions show a significant correlation coefficient of 0.37 between the combined similarity measurement and information transmission quality scores.
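Combining several similarity measurements by linear regression and checking the correlation of the combined predictor can be sketched as follows, on synthetic data (the weights and noise level are invented):

```python
import numpy as np

# Several speaker-similarity measurements are combined by linear regression
# to predict an information-transmission-quality score, then the correlation
# between prediction and score is computed.
rng = np.random.default_rng(0)
n = 40
sims = rng.standard_normal((n, 3))          # toy prosodic/lexical similarities
quality = sims @ np.array([0.5, 0.3, 0.0]) + 0.2 * rng.standard_normal(n)

A = np.hstack([sims, np.ones((n, 1))])      # add an intercept column
w, *_ = np.linalg.lstsq(A, quality, rcond=None)
pred = A @ w

r = np.corrcoef(pred, quality)[0, 1]        # Pearson correlation coefficient
print(r > 0.37)                             # True on this synthetic data
```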

Proceedings ArticleDOI
01 Dec 2015
TL;DR: A method is investigated for identifying objects observed by drivers and tracking the driver's observation of signage while driving; driver and signage location information are used to limit the candidate signboards, reducing the computational cost of image matching.
Abstract: We investigate a method for identifying objects observed by drivers. Here we focus on roadside signage as an example, and track the driver's observation of signage while driving. A gaze tracking system and a forward-directed video camera are used to determine the driver's region of interest (ROI). The driver's observation of signage is detected by tracking the driver's ROI using optical flow, and by matching the driver's ROI against template images of signboards in a signage database using local feature matching. Driver and signage location information are used to limit the candidate signboards, reducing the computational cost of image matching. We conduct an experiment to evaluate our method and achieve a 66.2% detection rate of drivers' signboard observation, with a false positive rate of 6.6%.
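The candidate-limiting step can be sketched as a simple radius filter on signboard positions before any image matching runs; the coordinates and the 50 m radius below are illustrative:

```python
import math

# Discard signboards far from the vehicle's current GPS position so that
# only nearby templates are passed to the image-matching stage.
def haversine_m(lat1, lon1, lat2, lon2):
    r = 6371000.0                                  # mean Earth radius (m)
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Toy signage database: name -> (lat, lon).
signboards = {
    "A": (35.1700, 136.8810),
    "B": (35.1703, 136.8812),
    "C": (35.2000, 136.9000),   # several kilometres away
}
driver = (35.1701, 136.8811)    # current vehicle position

candidates = [
    name for name, (lat, lon) in signboards.items()
    if haversine_m(driver[0], driver[1], lat, lon) < 50.0
]
print(sorted(candidates))  # ['A', 'B']
```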