Home
/
Authors
/
John J. Miller

Author

John J. Miller

Other affiliations: United States Department of Veterans Affairs, Scripps Health, Baidu ...read more

Bio: John J. Miller is an academic researcher from University of California, Berkeley. The author has contributed to research in topics: Medicine & Ophthalmology. The author has an hindex of 34, co-authored 109 publications receiving 4265 citations. Previous affiliations of John J. Miller include United States Department of Veterans Affairs & Scripps Health.

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2015
2014
2013
2003
1998
1996
1992
1990
1988
1987
1986
1985
1984
1983
1982
1980
1979
1977
1976
1975
1974
1973
1972
1971
1970
1968
1967
1966
1964
1951
1950
1949
1948
1944
1943
1942
1941
1939
1934

Papers

PDF

Open Access

More filters

Proceedings Article•

Deep Voice: Real-time Neural Text-to-Speech

[...]

Sercan O. Arik¹, Mike Chrzanowski², Adam Coates³, Gregory Diamos², Andrew Gibiansky², Yongguo Kang, Xian Li, John J. Miller², Andrew Y. Ng¹, Jonathan Raiman², Shubho Sengupta², Mohammad Shoeybi² - Show less +8 more•Institutions (3)

Stanford University¹, Baidu², Manchester Metropolitan University³

25 Feb 2017

TL;DR: Deep Voice lays the groundwork for truly end-to-end neural speech synthesis and shows that inference with the system can be performed faster than real time and describes optimized WaveNet inference kernels on both CPU and GPU that achieve up to 400x speedups over existing implementations.

...read moreread less

Abstract: We present Deep Voice, a production-quality text-to-speech system constructed entirely from deep neural networks. Deep Voice lays the groundwork for truly end-to-end neural speech synthesis. The system comprises five major building blocks: a segmentation model for locating phoneme boundaries, a grapheme-to-phoneme conversion model, a phoneme duration prediction model, a fundamental frequency prediction model, and an audio synthesis model. For the segmentation model, we propose a novel way of performing phoneme boundary detection with deep neural networks using connectionist temporal classification (CTC) loss. For the audio synthesis model, we implement a variant of WaveNet that requires fewer parameters and trains faster than the original. By using a neural network for each component, our system is simpler and more flexible than traditional text-to-speech systems, where each component requires laborious feature engineering and extensive domain expertise. Finally, we show that inference with our system can be performed faster than real time and describe optimized WaveNet inference kernels on both CPU and GPU that achieve up to 400x speedups over existing implementations.

...read moreread less

423 citations

Posted Content•

Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning

[...]

Wei Ping¹, Kainan Peng¹, Andrew Gibiansky¹, Sercan O. Arik¹, Ajay Kannan², Sharan Narang¹, Jonathan Raiman¹, John J. Miller³ - Show less +4 more•Institutions (3)

Baidu¹, College of Engineering, Guindy², University of California, Berkeley³

20 Oct 2017-arXiv: Sound

TL;DR: Deep Voice 3 is presented, a fully-convolutional attention-based neural text-to-speech (TTS) system that matches state-of-the-art neural speech synthesis systems in naturalness while training ten times faster.

...read moreread less

Abstract: We present Deep Voice 3, a fully-convolutional attention-based neural text-to-speech (TTS) system. Deep Voice 3 matches state-of-the-art neural speech synthesis systems in naturalness while training ten times faster. We scale Deep Voice 3 to data set sizes unprecedented for TTS, training on more than eight hundred hours of audio from over two thousand speakers. In addition, we identify common error modes of attention-based speech synthesis networks, demonstrate how to mitigate them, and compare several different waveform synthesis methods. We also describe how to scale inference to ten million queries per day on one single-GPU server.

...read moreread less

279 citations

Proceedings Article•

Deep Voice 2: Multi-Speaker Neural Text-to-Speech.

[...]

Andrew Gibiansky¹, Sercan O. Arik¹, Gregory Diamos¹, John J. Miller¹, Kainan Peng¹, Wei Ping¹, Jonathan Raiman¹, Yanqi Zhou² - Show less +4 more•Institutions (2)

Baidu¹, Princeton University²

01 Jan 2017

TL;DR: In this paper, a technique for augmenting neural text-to-speech (TTS) with low-dimensional trainable speaker embeddings to generate different voices from a single model was proposed.

...read moreread less

Abstract: We introduce a technique for augmenting neural text-to-speech (TTS) with low-dimensional trainable speaker embeddings to generate different voices from a single model. As a starting point, we show improvements over the two state-of-the-art approaches for single-speaker neural TTS: Deep Voice 1 and Tacotron. We introduce Deep Voice 2, which is based on a similar pipeline with Deep Voice 1, but constructed with higher performance building blocks and demonstrates a significant audio quality improvement over Deep Voice 1. We improve Tacotron by introducing a post-processing neural vocoder, and demonstrate a significant audio quality improvement. We then demonstrate our technique for multi-speaker speech synthesis for both Deep Voice 2 and Tacotron on two multi-speaker TTS datasets. We show that a single neural TTS system can learn hundreds of unique voices from less than half an hour of data per speaker, while achieving high audio quality synthesis and preserving the speaker identities almost perfectly.

...read moreread less

278 citations

Posted Content•

Traversing Knowledge Graphs in Vector Space

[...]

Kelvin Guu¹, John J. Miller¹, Percy Liang¹•Institutions (1)

Stanford University¹

03 Jun 2015-arXiv: Computation and Language

TL;DR: It is demonstrated that compositional training acts as a novel form of structural regularization, reliably improving performance across all base models (reducing errors by up to 43%) and achieving new state-of-the-art results.

...read moreread less

Abstract: Path queries on a knowledge graph can be used to answer compositional questions such as "What languages are spoken by people living in Lisbon?". However, knowledge graphs often have missing facts (edges) which disrupts path queries. Recent models for knowledge base completion impute missing facts by embedding knowledge graphs in vector spaces. We show that these models can be recursively applied to answer path queries, but that they suffer from cascading errors. This motivates a new "compositional" training objective, which dramatically improves all models' ability to answer path queries, in some cases more than doubling accuracy. On a standard knowledge base completion task, we also demonstrate that compositional training acts as a novel form of structural regularization, reliably improving performance across all base models (reducing errors by up to 43%) and achieving new state-of-the-art results.

...read moreread less

254 citations

Proceedings Article•DOI•

Traversing Knowledge Graphs in Vector Space

[...]

Kelvin Guu¹, John J. Miller¹, Percy Liang¹•Institutions (1)

Stanford University¹

01 Sep 2015

TL;DR: This paper proposed a compositional training objective for knowledge base completion, which can be recursively applied to answer path queries, and achieved state-of-the-art results on knowledge base completeness.

...read moreread less

Abstract: Path queries on a knowledge graph can be used to answer compositional questions such as “What languages are spoken by people living in Lisbon?”. However, knowledge graphs often have missing facts (edges) which disrupts path queries. Recent models for knowledge base completion impute missing facts by embedding knowledge graphs in vector spaces. We show that these models can be recursively applied to answer path queries, but that they suffer from cascading errors. This motivates a new “compositional” training objective, which dramatically improves all models’ ability to answer path queries, in some cases more than doubling accuracy. On a standard knowledge base completion task, we also demonstrate that compositional training acts as a novel form of structural regularization, reliably improving performance across all base models (reducing errors by up to 43%) and achieving new state-of-the-art results.

...read moreread less

251 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29

Collapse

Cited by

PDF

Open Access

More filters

Book Chapter•DOI•

Modeling Relational Data with Graph Convolutional Networks

[...]

Michael Sejr Schlichtkrull¹, Thomas Kipf¹, Peter Bloem², Rianne van den Berg¹, Ivan Titov³, Ivan Titov¹, Max Welling⁴, Max Welling¹ - Show less +4 more•Institutions (4)

University of Amsterdam¹, VU University Amsterdam², University of Edinburgh³, Canadian Institute for Advanced Research⁴

03 Jun 2018

TL;DR: It is shown that factorization models for link prediction such as DistMult can be significantly improved through the use of an R-GCN encoder model to accumulate evidence over multiple inference steps in the graph, demonstrating a large improvement of 29.8% on FB15k-237 over a decoder-only baseline.

...read moreread less

Abstract: Knowledge graphs enable a wide variety of applications, including question answering and information retrieval. Despite the great effort invested in their creation and maintenance, even the largest (e.g., Yago, DBPedia or Wikidata) remain incomplete. We introduce Relational Graph Convolutional Networks (R-GCNs) and apply them to two standard knowledge base completion tasks: Link prediction (recovery of missing facts, i.e. subject-predicate-object triples) and entity classification (recovery of missing entity attributes). R-GCNs are related to a recent class of neural networks operating on graphs, and are developed specifically to handle the highly multi-relational data characteristic of realistic knowledge bases. We demonstrate the effectiveness of R-GCNs as a stand-alone model for entity classification. We further show that factorization models for link prediction such as DistMult can be significantly improved through the use of an R-GCN encoder model to accumulate evidence over multiple inference steps in the graph, demonstrating a large improvement of 29.8% on FB15k-237 over a decoder-only baseline.

...read moreread less

3,168 citations

Posted Content•

An Overview of Multi-Task Learning in Deep Neural Networks

[...]

Sebastian Ruder

15 Jun 2017-arXiv: Learning

TL;DR: This article seeks to help ML practitioners apply MTL by shedding light on how MTL works and providing guidelines for choosing appropriate auxiliary tasks, particularly in deep neural networks.

...read moreread less

Abstract: Multi-task learning (MTL) has led to successes in many applications of machine learning, from natural language processing and speech recognition to computer vision and drug discovery. This article aims to give a general overview of MTL, particularly in deep neural networks. It introduces the two most common methods for MTL in Deep Learning, gives an overview of the literature, and discusses recent advances. In particular, it seeks to help ML practitioners apply MTL by shedding light on how MTL works and providing guidelines for choosing appropriate auxiliary tasks.

...read moreread less

2,202 citations

Reference Entry•DOI•

IEEE Transactions on Pattern Analysis and Machine Intelligence

[...]

King-Sun Fu

15 Oct 2004

2,118 citations

Journal Article•DOI•

Knowledge Graph Embedding: A Survey of Approaches and Applications

[...]

Quan Wang¹, Zhendong Mao¹, Bin Wang¹, Li Guo¹•Institutions (1)

Chinese Academy of Sciences¹

01 Dec 2017-IEEE Transactions on Knowledge and Data Engineering

TL;DR: This article provides a systematic review of existing techniques of Knowledge graph embedding, including not only the state-of-the-arts but also those with latest trends, based on the type of information used in the embedding task.

...read moreread less

Abstract: Knowledge graph (KG) embedding is to embed components of a KG including entities and relations into continuous vector spaces, so as to simplify the manipulation while preserving the inherent structure of the KG. It can benefit a variety of downstream tasks such as KG completion and relation extraction, and hence has quickly gained massive attention. In this article, we provide a systematic review of existing techniques, including not only the state-of-the-arts but also those with latest trends. Particularly, we make the review based on the type of information used in the embedding task. Techniques that conduct embedding using only facts observed in the KG are first introduced. We describe the overall framework, specific model design, typical training procedures, as well as pros and cons of such techniques. After that, we discuss techniques that further incorporate additional information besides facts. We focus specifically on the use of entity types, relation paths, textual descriptions, and logical rules. Finally, we briefly introduce how KG embedding can be applied to and benefit a wide variety of downstream tasks such as KG completion, relation extraction, question answering, and so forth.

...read moreread less

1,905 citations

DOI•

Sarcoidosis

[...]

陶仲为

05 Nov 2009

TL;DR: 结节病易误诊,据王洪武等~([1])收集国内18篇关于此第一印象中拟诊结核5例,为此应引起临床对本病诊

...read moreread less

Abstract: 结节病易误诊,据王洪武等~([1])收集国内18篇关于此病误诊的文献,误诊率高达63.2%,当然有误诊就会有误治,如孙永昌等~([2])报道26例结节病在影像学检查诊断的第一印象中拟诊结核5例,其中就有2例完成规范的抗结核治疗,为此应引起临床对本病诊治的重视。

...read moreread less

1,821 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse