Home
/
Authors
/
Masanobu Abe

Author

Masanobu Abe

Other affiliations: Nippon Telegraph and Telephone, Spacelabs Healthcare

Bio: Masanobu Abe is an academic researcher from Okayama University. The author has contributed to research in topics: Speech synthesis & Speech processing. The author has an hindex of 20, co-authored 167 publications receiving 1914 citations. Previous affiliations of Masanobu Abe include Nippon Telegraph and Telephone & Spacelabs Healthcare.

Papers published on a yearly basis

2022
2021
2020
2019
2018
2017
2016
2015
2014
2012
2011
2010
2009
2008
2004
2003
2002
2001
2000
1999
1998
1997
1996
1995
1994
1993
1992
1991
1990
1989
1988

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

Voice conversion through vector quantization

[...]

Masanobu Abe, Satoshi Nakamura, Kiyohiro Shikano, Hisao Kuwabara

11 Apr 1988

TL;DR: The authors propose a new voice conversion technique through vector quantization and spectrum mapping which makes it possible to precisely control voice individuality.

...read moreread less

Abstract: The authors propose a new voice conversion technique through vector quantization and spectrum mapping. The basic idea of this technique is to make mapping codebooks which represent the correspondence between different speakers' codebooks. The mapping codebooks for spectrum parameters, power values and pitch frequencies are separately generated using training utterances. This technique makes it possible to precisely control voice individuality. To evaluate the performance of this technique, hearing tests are carried out on two kinds of voice conversions. One is a conversion between male and female speakers, the other is a conversion between male speakers. In the male-to-female conversion experiment, all converted utterances are judged as female, and in the male-to-male conversion, 65% of them are identified as the target speaker. >

...read moreread less

554 citations

Patent•DOI•

Reconstruction of wideband speech from narrowband speech using codebooks

[...]

Masanobu Abe¹, Yuki Yoshida¹•Institutions (1)

Nippon Telegraph and Telephone¹

29 Sep 1993-Journal of the Acoustical Society of America

TL;DR: In this article, a wideband speech signal (8 kHz) of high quantity is reconstructed from a narrowband speech signals (300 Hz to 3.4 kHz) by LPC-analyzing to obtain spectrum information parameters.

...read moreread less

Abstract: A wideband speech signal (8 kHz, for example) of high quantity is reconstructed from a narrowband speech signal (300 Hz to 3.4 kHz). The input narrowband speech signal is LPC-analyzed to obtain spectrum information parameters, and the parameters are vector-quantized using a narrowband speech signal codebook. For each code number of the narrowband speech signal codebook, the wideband speech waveform corresponding to the codevector concerned is extracted by one pitch for voiced speech and by one frame for unvoiced speech and prestored in a representative waveform codebook. Representative waveform segments corresponding to the respective output codevector numbers of the quantizer are extracted from the representative waveform codebook. Voiced speech is synthesized by pitch-synchronous overlapping of the extracted representative waveform segments and unvoiced speech is synthesized by randomly using waveforms of one frame length. By this, a wideband speech signal is produced. Then, frequency components below 300 Hz and above 3.4 kHz are extracted from the wideband speech signal and are added to an up-sampled version of the input narrowband speech signal to thereby reconstruct the wideband speech signal.

...read moreread less

219 citations

Journal Article•DOI•

Voice conversion through vector quantization

[...]

Masanobu Abe, Satoshi Nakamura, Kiyohiro Shikano, Hisao Kuwabara

01 Mar 1990-The Journal of The Acoustical Society of Japan (e)

TL;DR: In this article, a new voice conversion technique through vector quantization and spectrum mapping is proposed, which is based on mapping codebooks which represent the correspondence between different speakers' codebooks.

...read moreread less

Abstract: A new voice conversion technique through vector quantization and spectrum mapping is proposed. This technique is based on mapping codebooks which represent the cor respondencebetween different speakers' codebooks. The mapping codebooks for spectrum parameters, power values, and pitch frequencies are separately generated using training utterances. This technique makes it possible to precisely control voice individuality. The performance of this technique is confirmed by spectrum distortion and pitch frequency difference. To evaluate the overall performance of this technique, listening tests are carried out on two kinds of voice conversions: one between male and female speakers, the other between male speakers. In the male-to-female conversion experiment, all converted utterances are judged as female, and in the male-to-male conversion, 57% of them are identified as the target speaker.

...read moreread less

116 citations

Journal Article•DOI•

Voice conversion algorithm based on piecewise linear conversion rules of formant frequency and spectrum tilt

[...]

Hideyuki Mizuno, Masanobu Abe

01 Feb 1995-Speech Communication

TL;DR: A new algorithm used in order to convert the speech of one speaker so that it sounds like that of another speaker using piecewise linear voice conversion rules that provides the ability to produce speech with the desired formant structure by controlling formant frequencies, formant bandwidths and spectral intensity.

...read moreread less

66 citations

Proceedings Article•

An algorithm to reconstruct wideband speech from narrowband speech based on codebook mapping

[...]

Yuki Yoshida, Masanobu Abe

01 Jan 1994

53 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34

Collapse

Cited by

PDF

Open Access

More filters

Patent•

Intelligent Automated Assistant

[...]

Thomas R. Gruber¹, Adam Cheyer¹, Dag Kittlaus¹, Didier Rene Guzzoni¹, Christopher Dean Brigham¹, Richard Donald Giuli¹, Marcello Bastea-Forte¹, Harry J. Saddler¹ - Show less +4 more•Institutions (1)

Apple Inc.¹

11 Jan 2011

TL;DR: In this article, an intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions.

...read moreread less

Abstract: An intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions. The system can be implemented using any of a number of different platforms, such as the web, email, smartphone, and the like, or any combination thereof. In one embodiment, the system is based on sets of interrelated domains and tasks, and employs additional functionally powered by external services with which the system can interact.

...read moreread less

1,462 citations

Journal Article•DOI•

Continuous probabilistic transform for voice conversion

[...]

Yannis Stylianou¹, Olivier Cappé², Eric Moulines²•Institutions (2)

Bell Labs¹, École Normale Supérieure²

01 Mar 1998-IEEE Transactions on Speech and Audio Processing

TL;DR: The design of a new methodology for representing the relationship between two sets of spectral envelopes and the proposed transform greatly improves the quality and naturalness of the converted speech signals compared with previous proposed conversion methods.

...read moreread less

Abstract: Voice conversion, as considered in this paper, is defined as modifying the speech signal of one speaker (source speaker) so that it sounds as if it had been pronounced by a different speaker (target speaker). Our contribution includes the design of a new methodology for representing the relationship between two sets of spectral envelopes. The proposed method is based on the use of a Gaussian mixture model of the source speaker spectral envelopes. The conversion itself is represented by a continuous parametric function which takes into account the probabilistic classification provided by the mixture model. The parameters of the conversion function are estimated by least squares optimization on the training data. This conversion method is implemented in the context of the HNM (harmonic+noise model) system, which allows high-quality modifications of speech signals. Compared to earlier methods based on vector quantization, the proposed conversion scheme results in a much better match between the converted envelopes and the target envelopes. Evaluation by objective tests and formal listening tests shows that the proposed transform greatly improves the quality and naturalness of the converted speech signals compared with previous proposed conversion methods.

...read moreread less

1,109 citations

Journal Article•DOI•

Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory

[...]

Tomoki Toda¹, Alan W. Black², Keiichi Tokuda³•Institutions (3)

Nara Institute of Science and Technology¹, Carnegie Mellon University², Nagoya Institute of Technology³

01 Nov 2007-IEEE Transactions on Audio, Speech, and Language Processing

TL;DR: In this article, a Gaussian mixture model (GMM) of the joint probability density of source and target features is employed for performing spectral conversion between speakers, and a conversion method based on the maximum-likelihood estimation of a spectral parameter trajectory is proposed.

...read moreread less

Abstract: In this paper, we describe a novel spectral conversion method for voice conversion (VC). A Gaussian mixture model (GMM) of the joint probability density of source and target features is employed for performing spectral conversion between speakers. The conventional method converts spectral parameters frame by frame based on the minimum mean square error. Although it is reasonably effective, the deterioration of speech quality is caused by some problems: 1) appropriate spectral movements are not always caused by the frame-based conversion process, and 2) the converted spectra are excessively smoothed by statistical modeling. In order to address those problems, we propose a conversion method based on the maximum-likelihood estimation of a spectral parameter trajectory. Not only static but also dynamic feature statistics are used for realizing the appropriate converted spectrum sequence. Moreover, the oversmoothing effect is alleviated by considering a global variance feature of the converted spectra. Experimental results indicate that the performance of VC can be dramatically improved by the proposed method in view of both speech quality and conversion accuracy for speaker individuality.

...read moreread less

914 citations

Proceedings Article•DOI•

Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory.

[...]

Takashi Muramatsu¹, Yamato Ohtani¹, Tomoki Toda¹, Hiroshi Saruwatari¹, Kiyohiro Shikano¹ - Show less +1 more•Institutions (1)

Nara Institute of Science and Technology¹

22 Sep 2008

TL;DR: The 9th Annual Conference of the International Speech Communication Association, September 22-26, 2008, Brisbane, Australia as discussed by the authors, was held at the University of Queensland, Queensland, Australia.

...read moreread less

Abstract: INTERSPEECH2008: 9th Annual Conference of the International Speech Communication Association, September 22-26, 2008, Brisbane, Australia.

...read moreread less

796 citations

Patent•

Integration of music from a personal library with real-time information

[...]

Max Abecassis

19 Oct 1999

TL;DR: An apparatus capable of, and a method of, playing audio, the apparatus comprising communicating, processing, and playing means for, and the method comprising the steps of: communicating a user's information preferences to an information provider; receiving, from the information provider, informational items that are responsive to the user references; interleaving and sequencing, for the user, a playing of the received informational items with a plurality of musical items included in an audio library of the user; and playing, for a user and responsive to interleaved and sequenceto the interleaves, the

...read moreread less

Abstract: An apparatus capable of, and a method of, playing audio, the apparatus comprising communicating, processing, and playing means for, and the method comprising the steps of: communicating a user's information preferences to an information provider; receiving, from the information provider, informational items that are responsive to the user's information references; interleaving and sequencing, for the user, a playing of the received informational items with a playing of a plurality of musical items included in an audio library of the user; and playing, for the user and responsive to the interleaving and sequencing, the received informational items within a playing of the plurality of musical items; and wherein the playing comprises a voice synthesizing of an at least one of informational item; wherein the playing is responsive to a schedule preferences of the user; wherein a verified apparent listening of a playing of an informational item is associated with a credit; and/or wherein a user's reception of a communication unrelated to the informational items is integrated within a playing of musical items.

...read moreread less

735 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse