Author

Richard J. Mammone

Other affiliations: Iowa State University
Bio: Richard J. Mammone is an academic researcher from Rutgers University. The author has contributed to research in topics: Speaker recognition & Artificial neural network. The author has an h-index of 35 and has co-authored 163 publications receiving 4,127 citations. Previous affiliations of Richard J. Mammone include Iowa State University.


Papers
Journal ArticleDOI
TL;DR: Linear predictive (LP) analysis, the first step of feature extraction, is discussed, and various robust cepstral features derived from LP coefficients are described, including the affine transform, a feature-transformation approach that models the mismatch in order to combat channel and noise distortion simultaneously.
Abstract: The future commercialization of speaker- and speech-recognition technology is impeded by the large degradation in system performance due to environmental differences between training and testing conditions. This is known as the "mismatched condition." Studies have shown [1] that most contemporary systems achieve good recognition performance if the conditions during training are similar to those during operation (matched conditions). Frequently, mismatched conditions are present, in which performance is dramatically degraded compared to the ideal matched conditions. A common example of this mismatch is when training is done on clean speech and testing is performed on noise- or channel-corrupted speech. Robust speech techniques [2] attempt to maintain the performance of a speech processing system under such diverse conditions of operation. This article presents an overview of current speaker-recognition systems and the problems encountered in operation, and it focuses on the front-end feature extraction process of robust speech techniques as a method of improvement. Linear predictive (LP) analysis, the first step of feature extraction, is discussed, and various robust cepstral features derived from LP coefficients are described. Also described is the affine transform, a feature-transformation approach that models the mismatch in order to combat both channel and noise distortion simultaneously.
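The LP-to-cepstrum pipeline the abstract outlines can be made concrete. Below is a minimal sketch of the two steps, assuming a pre-emphasized, windowed speech frame; the function names and the order-12 default are our illustration, not the article's code.

```python
import numpy as np

def lp_coefficients(frame, order=12):
    """Autocorrelation-method LP analysis via the Levinson-Durbin recursion."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + a[1:i] @ r[i - 1:0:-1]
        k = -acc / err                       # reflection coefficient
        a[1:i + 1] = a[1:i + 1] + k * a[i - 1::-1]
        err *= 1.0 - k * k
    return -a[1:]    # alpha_k, so that s[n] ~ sum_k alpha_k * s[n-k]

def lp_cepstrum(alpha, n_ceps=None):
    """Cepstrum from LP coefficients via the standard recursion
    c_n = alpha_n + sum_{k=1}^{n-1} (k/n) c_k alpha_{n-k}."""
    p = len(alpha)
    n_ceps = n_ceps or p
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        c[n] = alpha[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if 1 <= n - k <= p:
                c[n] += (k / n) * c[k] * alpha[n - k - 1]
    return c[1:]
```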

344 citations

Journal ArticleDOI
TL;DR: The modified neural tree network (MNTN) is a hierarchical classifier that combines the properties of decision trees and feedforward neural networks that is found to perform better than full-search VQ classifiers for both of these applications.
Abstract: An evaluation of various classifiers for text-independent speaker recognition is presented. In addition, a new classifier is examined for this application. The new classifier is called the modified neural tree network (MNTN). The MNTN is a hierarchical classifier that combines the properties of decision trees and feedforward neural networks. The MNTN differs from the standard NTN in both the new learning rule used and the pruning criteria. The MNTN is evaluated for several speaker recognition experiments. These include closed- and open-set speaker identification and speaker verification. The database used is a subset of the TIMIT database consisting of 38 speakers from the same dialect region. The MNTN is compared with nearest neighbor classifiers, full-search, and tree-structured vector quantization (VQ) classifiers, multilayer perceptrons (MLPs), and decision trees. For closed-set speaker identification experiments, the full-search VQ classifier and MNTN demonstrate comparable performance. Both methods perform significantly better than the other classifiers for this task. The MNTN and full-search VQ classifiers are also compared for several speaker verification and open-set speaker-identification experiments. The MNTN is found to perform better than full-search VQ classifiers for both of these applications. In addition to matching or exceeding the performance of the VQ classifier for these applications, the MNTN also provides a logarithmic saving for retrieval.
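To make the "logarithmic saving for retrieval" concrete, here is a minimal sketch of how a neural tree network scores a feature vector: internal perceptron nodes route the input, and a leaf returns a confidence. The class layout and names are illustrative assumptions; the paper's learning rule and pruning criteria are not reproduced here.

```python
import numpy as np

class NTNNode:
    """One node of a (modified) neural tree network: internal nodes hold a
    perceptron that routes the input; leaves hold a speaker confidence."""
    def __init__(self, w=None, b=0.0, left=None, right=None, confidence=None):
        self.w, self.b = w, b
        self.left, self.right = left, right
        self.confidence = confidence       # set only at leaves

def ntn_score(root, x):
    """Route a feature vector to a leaf. The cost is O(depth), i.e. roughly
    logarithmic in the number of leaves, versus a full-search VQ codebook
    that compares the vector against every codeword."""
    node = root
    while node.confidence is None:
        node = node.left if float(node.w @ x + node.b) < 0.0 else node.right
    return node.confidence
```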

295 citations

Proceedings Article
01 Jan 2000
TL;DR: The authors describe the IBM Statistical Question Answering system for TREC-9 in detail, examine several examples and errors, and present results at the 250-byte and 50-byte levels for the overall system as well as for each subcomponent.
Abstract: Abraham Ittycheriah, Martin Franz, Wei-Jing Zhu, Adwait Ratnaparkhi, P.O. Box 218, Yorktown Heights, NY 10598, {abei,franzm,wjzhu,adwaitr}@watson.ibm.com; Richard J. Mammone, Dept. of Electrical Engineering, Rutgers University, Piscataway, NJ 08854, mammone@caip.rutgers.edu. We describe the IBM Statistical Question Answering for TREC-9 system in detail and look at several examples and errors. The system is an application of maximum entropy classification for question/answer type prediction and named entity marking. We describe our system for information retrieval, which in the first step did document retrieval from a local encyclopedia, in the second step performed an expansion of the query words, and finally did passage retrieval from the TREC collection. We will also discuss the answer selection algorithm, which determines the best sentence given both the question and the occurrence of a phrase belonging to the answer class desired by the question. Results at the 250 byte and 50 byte levels for the overall system, as well as results on each subcomponent, are presented. 1 System Description. Systems that perform question answering automatically by computer have been around for some time, as described by (Green et al., 1963). Only recently, though, have systems been developed to handle huge databases and a slightly richer set of questions. The types of questions that can be dealt with today are restricted to short-answer, fact-based questions. In TREC-8, a number of sites participated in the first question-answering evaluation (Voorhees and Tice, 1999), and the best systems identified four major subcomponents: question/answer type classification; query expansion/information retrieval; named entity marking; and answer selection. Our system architecture for this year was built around these four major components, as shown in Fig. 1. Here, the question is input and classified as asking for an answer whose category is one of the named entity classes to be described below. Additionally, the question is presented to the information retrieval (IR) engine for query expansion and document retrieval. This engine, given the query, looks at the database of documents and outputs the best documents or passages annotated with the named entities. The final stage is to select the exact answer, given the information about the answer class and the top-scoring passages. Minimizing various distance metrics applied over phrases or windows of text yields the best-scoring section containing a phrase of the answer class; this then represents the best-scoring answer.
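As a rough illustration of the answer-selection step, the sketch below scores each candidate phrase of the desired answer class by its distance to question-word matches in a passage and keeps the closest one. This is a deliberate simplification of the paper's distance metrics; all names are ours.

```python
def select_answer(question_words, passages):
    """question_words: lowercased set of content words from the question.
    passages: list of (tokens, candidate_spans), where candidate_spans are
    (start, end) token indices of phrases tagged with the answer class."""
    best, best_dist = None, float("inf")
    for tokens, candidate_spans in passages:
        hits = [i for i, t in enumerate(tokens) if t.lower() in question_words]
        if not hits:
            continue
        for start, end in candidate_spans:
            # Average token distance from the candidate phrase to the
            # question-word occurrences in this passage.
            dist = sum(min(abs(i - start), abs(i - end)) for i in hits) / len(hits)
            if dist < best_dist:
                best, best_dist = " ".join(tokens[start:end + 1]), dist
    return best
```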

224 citations

Proceedings ArticleDOI
07 Jun 1992
TL;DR: A novel method for training neural networks using an additional observing neural network called a meta-neural network (MNN) to direct the training of the basic neural network and the MNN is shown to help solve the problem of sensitivity to initial weight vectors.
Abstract: A novel method for training neural networks is introduced. The method uses an additional observing neural network called a meta-neural network (MNN) to direct the training of the basic neural network. The MNN provides the basic neural network with a step size and a direction vector which is optimal based on successful training strategies learned from problems solved previously. The combination of the MNN with the basic neural network is shown to improve learning rates for several problems when the MNN is trained on a similar problem. The MNN is shown to help solve the problem of sensitivity to initial weight vectors. In addition, computer simulations demonstrate the improvement in the learning rate of the enhanced neural network on a 4-bit parity problem, when it has been trained on a different nonlinear Boolean function.
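The update scheme the abstract describes, with the observing network supplying a step size and a direction vector, might look like the sketch below. The MNN's actual inputs and architecture are not given here; `naive_mnn` is a stand-in placeholder, not the trained meta-network.

```python
import numpy as np

def mnn_directed_step(weights, grad, mnn):
    """One update of the basic network: the meta-neural network maps the
    current training observation to a step size and a direction vector."""
    step_size, direction = mnn(grad)
    return weights - step_size * direction

def naive_mnn(grad):
    """Placeholder: fixed step along the normalized gradient. A trained MNN
    would instead predict both quantities from strategies learned on
    previously solved problems."""
    return 0.1, grad / (np.linalg.norm(grad) + 1e-12)
```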

212 citations

Patent
29 Jan 1998
TL;DR: In this article, a user speaks into a microphone (600) and the input speech is analyzed in an automatic speaker recognition system to extract parameters (25A-25D). Comparisons of multiple input patterns and recorded reference patterns are conducted (610, 620) to detect whether the input came from a recorded source (150) or not (20).
Abstract: A user speaks into a microphone (600) and the input speech is analyzed in an automatic speaker recognition system to extract parameters (25A-25D). Comparisons of multiple input patterns and recorded reference patterns are conducted (610, 620) to detect whether the input was from a recorded source (150) or not (20).
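The core idea, that a live speaker never repeats a phrase exactly while a recording does, can be sketched as a near-duplicate check against previously stored utterances. The feature choice, distance, and threshold below are illustrative assumptions, not the patent's claims.

```python
import numpy as np

def looks_recorded(new_feats, stored_feats, threshold=0.05):
    """Flag the input as a likely playback if it matches an earlier
    utterance almost exactly (natural speech varies between repetitions)."""
    for ref in stored_feats:
        if ref.shape == new_feats.shape:
            if np.mean((new_feats - ref) ** 2) < threshold:
                return True    # suspiciously exact repeat -> recorded source
    return False
```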

185 citations


Cited by
Proceedings Article
06 Aug 2017
TL;DR: An algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning is proposed.
Abstract: We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning. The goal of meta-learning is to train a model on a variety of learning tasks, such that it can solve new learning tasks using only a small number of training samples. In our approach, the parameters of the model are explicitly trained such that a small number of gradient steps with a small amount of training data from a new task will produce good generalization performance on that task. In effect, our method trains the model to be easy to fine-tune. We demonstrate that this approach leads to state-of-the-art performance on two few-shot image classification benchmarks, produces good results on few-shot regression, and accelerates fine-tuning for policy gradient reinforcement learning with neural network policies.
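A compact sketch of the inner/outer loop follows, using the first-order approximation (the full algorithm backpropagates through the inner step); the names and the use of plain numpy arrays are our simplifications.

```python
import numpy as np

def maml_step(theta, tasks, loss_grad, alpha=0.01, beta=0.001):
    """One meta-update. tasks: list of (support, query) data;
    loss_grad(theta, data) returns dLoss/dtheta. The inner step adapts to
    each task; the outer step moves the shared initialization so the
    adapted parameters generalize to the task's query data."""
    meta_grad = np.zeros_like(theta)
    for support, query in tasks:
        theta_prime = theta - alpha * loss_grad(theta, support)  # inner adaptation
        meta_grad += loss_grad(theta_prime, query)               # first-order outer grad
    return theta - beta * meta_grad / len(tasks)
```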

7,027 citations

Journal ArticleDOI
TL;DR: The multi-objective optimal design of a liquid rocket injector is presented to highlight the state of the art and to help guide future efforts.

2,152 citations

Journal ArticleDOI
TL;DR: The bias-variance decomposition of the error is provided in this paper, suggesting that the success of GASEN may lie in its ability to significantly reduce both the bias and the variance.
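For reference, this is the standard squared-error form of the decomposition the TL;DR invokes; the paper's ensemble-specific treatment differs in detail.

```latex
\mathbb{E}\big[(\hat{f}(x)-y)^2\big]
 = \underbrace{\big(\mathbb{E}[\hat{f}(x)]-f(x)\big)^2}_{\text{bias}^2}
 + \underbrace{\mathbb{E}\big[(\hat{f}(x)-\mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
 + \underbrace{\sigma^2}_{\text{irreducible noise}}
```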

1,898 citations

Journal ArticleDOI
R. Reed
TL;DR: The approach taken by the methods described here is to train a network that is larger than necessary and then remove the parts that are not needed.
Abstract: A rule of thumb for obtaining good generalization in systems trained by examples is that one should use the smallest system that will fit the data. Unfortunately, it usually is not obvious what size is best; a system that is too small will not be able to learn the data while one that is just big enough may learn very slowly and be very sensitive to initial conditions and learning parameters. This paper is a survey of neural network pruning algorithms. The approach taken by the methods described here is to train a network that is larger than necessary and then remove the parts that are not needed.
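One simple instance of the train-then-prune strategy the survey covers is magnitude pruning; the sketch below removes a fixed fraction of the smallest weights (the survey also reviews sensitivity- and penalty-based criteria). The names and the 50% default are illustrative.

```python
import numpy as np

def magnitude_prune(weights, fraction=0.5):
    """Zero out the smallest-magnitude weights after training; the returned
    mask can be used to keep them frozen during any retraining pass."""
    cutoff = np.quantile(np.abs(weights), fraction)  # prune below this magnitude
    mask = np.abs(weights) >= cutoff
    return weights * mask, mask
```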

1,705 citations

Journal ArticleDOI
01 Sep 1997
TL;DR: A tutorial on the design and development of automatic speaker-recognition systems is presented, along with a new automatic speaker-recognition system that performs with 98.9% correct identification.
Abstract: A tutorial on the design and development of automatic speaker-recognition systems is presented. Automatic speaker recognition is the use of a machine to recognize a person from a spoken phrase. These systems can operate in two modes: to identify a particular person or to verify a person's claimed identity. Speech processing and the basic components of automatic speaker-recognition systems are shown, and design tradeoffs are discussed. Then, a new automatic speaker-recognition system is given. This recognizer performs with 98.9% correct identification. Last, the performances of various systems are compared.
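The two operating modes the tutorial distinguishes reduce to two different decision rules, sketched below with illustrative names; real systems normalize scores before thresholding.

```python
def identify(scores):
    """Closed-set identification: choose the enrolled speaker whose model
    matches the utterance best. scores: {speaker_id: match_score}."""
    return max(scores, key=scores.get)

def verify(score, threshold):
    """Verification: accept the claimed identity only if the match score
    clears a threshold chosen to trade off false accepts vs. false rejects."""
    return score >= threshold
```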

1,686 citations