Author

Michael I. Jordan

Other affiliations: Stanford University, Princeton University, Broad Institute ...

Bio: Michael I. Jordan is an academic researcher at the University of California, Berkeley. The author has contributed to research on topics including computer science and inference. The author has an h-index of 176 and has co-authored 1016 publications receiving 216204 citations. Previous affiliations of Michael I. Jordan include Stanford University and Princeton University.

Topics: Computer science, Inference, Cluster analysis, Graphical model, Mathematics ...

Papers published on a yearly basis (chart omitted; years shown: 1986, 1989 to 2023)

Papers

Journal Article

Predicted effects of the introduction of long-acting injectable cabotegravir pre-exposure prophylaxis in sub-Saharan Africa: a modelling study.

Jennifer A. Smith, Loveleen Bansi-Matharu, Valentina Cambiano, Dobromir T. Dimitrov, Anna Bershteyn, David A. M. C. van de Vijver, Katharine Kripke, Paul Revill, Marie-Claude Boily, Gesine Meyer-Rath, Isaac Taramusi, Jens D Lundgren, Joep J. van Oosterhout, Daniel R. Kuritzkes, Robin Maximilian Schaefer, Mark J. Siedner, Jonathan M. Schapiro, Sinead Delany-Moretlwe, Raphael J. Landovitz, Charles Flexner, Michael I. Jordan, Francois Venter, Mopo Radebe, David Ripin, Sarah Jenkins, Danielle Resar, C Amole, Maryam Shahmanesh, Ravindra K. Gupta, Elliot Raizes, Cheryl Johnson, Seth C Inzaule, Robert W. Shafer, Mitchell Warren, Sarah E. Stansfield, Roger Paredes, Andrew N. Phillips

01 Jan 2023-The Lancet HIV

5 citations

Posted Content

Probabilistic Multilevel Clustering via Composite Transportation Distance

Nhat Ho¹, Viet Huynh, Dinh Phung, Michael I. Jordan¹

University of California, Berkeley¹

29 Oct 2018-arXiv: Learning

TL;DR: A novel probabilistic approach to multilevel clustering problems based on composite transportation distance, a variant of transportation distance where the underlying metric is Kullback-Leibler divergence. The authors develop fast and efficient optimization algorithms even for potentially large-scale multilevel datasets.

Abstract: We propose a novel probabilistic approach to multilevel clustering problems based on composite transportation distance, which is a variant of transportation distance where the underlying metric is Kullback-Leibler divergence. Our method involves solving a joint optimization problem over spaces of probability measures to simultaneously discover grouping structures within groups and among groups. By exploiting the connection of our method to the problem of finding composite transportation barycenters, we develop fast and efficient optimization algorithms even for potentially large-scale multilevel datasets. Finally, we present experimental results with both synthetic and real data to demonstrate the efficiency and scalability of the proposed approach.

5 citations

Proceedings Article

Provable Meta-Learning of Linear Representations

Nilesh Tripuraneni¹, Chi Jin², Michael I. Jordan¹

University of California, Berkeley¹, Princeton University²

18 Jul 2021

TL;DR: In this paper, the authors focus on the problem of multi-task linear regression, in which multiple linear regression models share a common, low-dimensional linear representation, and provide provably fast, sample-efficient algorithms to address the dual challenges of learning a common set of features from multiple, related tasks, and transferring this knowledge to new, unseen tasks.

Abstract: Meta-learning, or learning-to-learn, seeks to design algorithms that can utilize previous experience to rapidly learn new skills or adapt to new environments. Representation learning -- a key tool for performing meta-learning -- learns a data representation that can transfer knowledge across multiple tasks, which is essential in regimes where data is scarce. Despite a recent surge of interest in the practice of meta-learning, the theoretical underpinnings of meta-learning algorithms are lacking, especially in the context of learning transferable representations. In this paper, we focus on the problem of multi-task linear regression -- in which multiple linear regression models share a common, low-dimensional linear representation. Here, we provide provably fast, sample-efficient algorithms to address the dual challenges of (1) learning a common set of features from multiple, related tasks, and (2) transferring this knowledge to new, unseen tasks. Both are central to the general problem of meta-learning. Finally, we complement these results by providing information-theoretic lower bounds on the sample complexity of learning these linear features.

5 citations

Journal Article

Individual finger movement decoding using a novel ultra-high-density electroencephalography-based brain-computer interface system

Hyemin S. Lee, Leonhard Schreiner, Seong Hyeon Jo, Sebastian Sieghartsleitner, Michael I. Jordan, Harald Pretl, Christoph Guger, Hyung-Soon Park

19 Oct 2022-Frontiers in neuroscience

TL;DR: Newly proposed flexible electrode grids attached directly to the scalp provided ultra-high-density EEG (uHD EEG). The study explored the performance of the novel system by decoding individual finger movements using a total of 256 channels distributed over the contralateral sensorimotor cortex.

Abstract: Brain-Computer Interface (BCI) technology enables users to operate external devices without physical movement. Electroencephalography (EEG) based BCI systems are being actively studied due to their high temporal resolution, convenient usage, and portability. However, fewer studies have been conducted to investigate the impact of high spatial resolution of EEG on decoding precise body motions, such as finger movements, which are essential in activities of daily living. Low spatial sensor resolution, as found in common EEG systems, can be improved by omitting the conventional standard of EEG electrode distribution (the international 10–20 system) and ordinary mounting structures (e.g., flexible caps). In this study, we used newly proposed flexible electrode grids attached directly to the scalp, which provided ultra-high-density EEG (uHD EEG). We explored the performance of the novel system by decoding individual finger movements using a total of 256 channels distributed over the contralateral sensorimotor cortex. Dense distribution and small-sized electrodes result in an inter-electrode distance of 8.6 mm (uHD EEG), while that of conventional EEG is 60 to 65 mm on average. Five healthy subjects participated in the experiment, performed single finger extensions according to a visual cue, and received avatar feedback. This study exploits mu (8–12 Hz) and beta (13–25 Hz) band power features for classification and topography plots. 3D ERD/S activation plots for each frequency band were generated using the MNI-152 template head. A linear support vector machine (SVM) was used for pairwise finger classification. The topography plots showed regular and focal post-cue activation, especially in subjects with optimal signal quality. The average classification accuracy over subjects was 64.8 (6.3)%, with the middle versus ring finger resulting in the highest average accuracy of 70.6 (9.4)%. Further studies are required using the uHD EEG system with real-time feedback and motor imagery tasks to enhance classification performance and establish the basis for BCI finger movement control of external devices.

5 citations

Posted Content

Distributed Low-rank Subspace Segmentation

Ameet Talwalkar¹, Lester Mackey², Yadong Mu³, Shih-Fu Chang³, Michael I. Jordan¹

University of California, Berkeley¹, Stanford University², Columbia University³

20 Apr 2013-arXiv: Computer Vision and Pattern Recognition

TL;DR: This work introduces novel applications of LRR-based subspace segmentation to large-scale semi-supervised learning for multimedia event detection, concept detection, and image tagging, and proposes a novel divide-and-conquer algorithm that can cope with LRR's non-decomposable constraints and maintains LRR's strong recovery guarantees.

Abstract: Vision problems ranging from image clustering to motion segmentation to semi-supervised learning can naturally be framed as subspace segmentation problems, in which one aims to recover multiple low-dimensional subspaces from noisy and corrupted input data. Low-Rank Representation (LRR), a convex formulation of the subspace segmentation problem, is provably and empirically accurate on small problems but does not scale to the massive sizes of modern vision datasets. Moreover, past work aimed at scaling up low-rank matrix factorization is not applicable to LRR given its non-decomposable constraints. In this work, we propose a novel divide-and-conquer algorithm for large-scale subspace segmentation that can cope with LRR's non-decomposable constraints and maintains LRR's strong recovery guarantees. This has immediate implications for the scalability of subspace segmentation, which we demonstrate on a benchmark face recognition dataset and in simulations. We then introduce novel applications of LRR-based subspace segmentation to large-scale semi-supervised learning for multimedia event detection, concept detection, and image tagging. In each case, we obtain state-of-the-art results and order-of-magnitude speed ups.

4 citations

Cited by

Proceedings Article

Going deeper with convolutions

Christian Szegedy¹, Wei Liu², Yangqing Jia¹, Pierre Sermanet¹, Scott Reed³, Dragomir Anguelov¹, Dumitru Erhan¹, Vincent Vanhoucke¹, Andrew Rabinovich

Google¹, University of North Carolina at Chapel Hill², University of Michigan³

07 Jun 2015

TL;DR: This paper proposes Inception, a deep convolutional neural network architecture that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).

Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.

40,257 citations

Book

Deep Learning

Ian Goodfellow¹, Yoshua Bengio², Aaron Courville²

Google¹, Université de Montréal²

18 Nov 2016

TL;DR: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. It is used in applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.

Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

38,208 citations

Book

Reinforcement Learning: An Introduction

Richard S. Sutton¹, Andrew G. Barto

Massachusetts Institute of Technology¹

01 Jan 1988

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, with discussion ranging from the history of the field's intellectual foundations to the most recent developments and applications.

Abstract: Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability. The book is divided into three parts. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces, and planning; the two final chapters present case studies and consider the future of reinforcement learning.

37,989 citations

Journal Article

Latent Dirichlet allocation

David M. Blei¹, Andrew Y. Ng², Michael I. Jordan¹

University of California, Berkeley¹, Stanford University²

01 Mar 2003-Journal of Machine Learning Research

TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.

Abstract: We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model.

30,570 citations

Proceedings Article

Latent Dirichlet Allocation

David M. Blei¹, Andrew Y. Ng¹, Michael I. Jordan¹

University of California, Berkeley¹

03 Jan 2001

TL;DR: This paper proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).

Abstract: We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms. We present empirical results on applications of this model to problems in text modeling, collaborative filtering, and text classification.

25,546 citations
