Author

Michael I. Jordan

Other affiliations: Stanford University, Princeton University, Broad Institute
Bio: Michael I. Jordan is an academic researcher from the University of California, Berkeley. The author has contributed to research in topics: Computer science & Inference. The author has an h-index of 176 and has co-authored 1,016 publications receiving 216,204 citations. Previous affiliations of Michael I. Jordan include Stanford University & Princeton University.


Papers
Journal Article
TL;DR: Several of the "open problems" identified by the authors have in fact been the subject of a significant literature, a literature that may have been missed because it has been aimed not only at the SVM but at a broader family of algorithms.
Abstract: The support vector machine (SVM) has played an important role in bringing certain themes to the fore in computationally oriented statistics. However, it is important to place the SVM in context as but one member of a class of closely related algorithms for nonlinear classification. As we discuss, several of the "open problems" identified by the authors have in fact been the subject of a significant literature, a literature that may have been missed because it has been aimed not only at the SVM but at a broader family of algorithms. Keeping the broader class of algorithms in mind also helps to make clear that the SVM involves certain specific algorithmic choices, some of which have favorable consequences and others of which have unfavorable consequences, both in theory and in practice. The broader context helps to clarify the ties of the SVM to the surrounding statistical literature. We have at least two broader contexts in mind for the …

2 citations
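
One way to see the SVM as one member of a family of regularized nonlinear classifiers is to hold the kernel feature map fixed and swap only the loss. A minimal sketch, assuming scikit-learn (version 1.1 or later for the "log_loss" name); the dataset and parameters are invented for illustration:

```python
# Same randomized RBF feature map, three closely related losses: the SVM's
# hinge loss is one specific algorithmic choice among several.
from sklearn.datasets import make_moons
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

X, y = make_moons(n_samples=500, noise=0.2, random_state=0)

for loss in ("hinge", "log_loss", "squared_hinge"):
    clf = make_pipeline(
        Nystroem(kernel="rbf", gamma=1.0, n_components=100, random_state=0),
        SGDClassifier(loss=loss, alpha=1e-3, random_state=0),
    )
    clf.fit(X, y)
    print(f"{loss:>13s}: train accuracy = {clf.score(X, y):.3f}")
```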

Journal ArticleDOI
TL;DR: In this article, the authors investigated the factors that control the rate of pore insertion into polymer membranes and analyte translocation through the channels of the Phi29, T3, and T7 viral DNA packaging motors.
Abstract: Biological nanopores for single-pore sensing have the advantages of size homogeneity, structural reproducibility, and channel amenability. In order to translate this to clinical applications, the functional biological nanopore must be inserted into a stable system for high-throughput analysis. Here we report factors that control the rate of pore insertion into polymer membranes and analyte translocation through the channel of the viral DNA packaging motors of Phi29, T3, and T7. The hydrophobicity of the amino or carboxyl termini and its relation to analyte translocation were investigated. It was found that both the size and the hydrophobicity of the pore terminus are critical factors for direct membrane insertion. An N-terminal or C-terminal hydrophobic mutation is crucial for governing insertion orientation and subsequent macromolecule translocation due to the channel's one-way traffic property. The N- or C-terminal modification led to two different modes of application. C-terminal insertion permits translocation of analytes such as peptides, which enter the channel through the N terminus, while N-terminal insertion prevents translocation but offers the measurement of gating as a sensing parameter, thus generating a tool for the detection of markers. A urokinase-type plasminogen activator receptor (uPAR)-binding peptide was fused to the C-terminus of the Phi29 nanopore to serve as a probe for uPAR protein detection. uPAR has proven to be a predictive biomarker in several types of cancer, including breast cancer. With N-terminal insertion, the binding of the uPAR antigen to an individual peptide probe induced discrete steps of current reduction due to the induction of channel gating. The distinctive current signatures enabled us to distinguish uPAR-positive and uPAR-negative tumor cell lines. This finding provides a theoretical basis for a robust biological nanopore sensing system for high-throughput macromolecular sensing and tumor biomarker detection.

2 citations
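
As an illustration of the sensing readout described above (my sketch, not the authors' analysis pipeline): binding events appear as discrete steps of current reduction, which a simple two-window detector can locate in a simulated trace. All signal parameters below are invented:

```python
# Detect discrete current-reduction steps in a simulated single-channel trace.
import numpy as np

rng = np.random.default_rng(0)

# Simulated trace: ~100 pA open-pore current with two binding-induced gating
# steps of ~20 pA each, plus Gaussian noise (all values hypothetical).
trace = np.full(3000, 100.0)
trace[1000:] -= 20.0
trace[2000:] -= 20.0
trace += rng.normal(0.0, 1.5, trace.size)

# Two-window step detector: difference of means over adjacent sliding windows.
w = 50
score = np.array([
    trace[i:i + w].mean() - trace[i - w:i].mean()
    for i in range(w, trace.size - w)
])
hits = np.flatnonzero(score < -10.0) + w              # candidate drop locations
onsets = hits[np.insert(np.diff(hits) > w, 0, True)]  # first index of each run
print("detected current steps near samples:", onsets)
```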

Posted Content
TL;DR: It is proved that the Kendall kernel has exactly two irreducible representations at which the Fourier transform is non-zero and that, moreover, the associated matrices are rank one, which implies that the Kendall kernel is nearly degenerate, with limited expressive and discriminative power.
Abstract: Kernel methods provide an attractive framework for aggregating and learning from ranking data, and so understanding the fundamental properties of kernels over permutations is a question of broad interest. We provide a detailed analysis of the Fourier spectra of the standard Kendall and Mallows kernels, and a new class of polynomial-type kernels. We prove that the Kendall kernel has exactly two irreducible representations at which the Fourier transform is non-zero, and moreover, the associated matrices are rank one. This implies that the Kendall kernel is nearly degenerate, with limited expressive and discriminative power. In sharp contrast, we prove that the Fourier transform of the Mallows kernel is a strictly positive definite matrix at all irreducible representations. This property guarantees that the Mallows kernel is both characteristic and universal. We introduce a family of normalized polynomial kernels of degree p that interpolates between the Kendall (degree one) and Mallows (infinite degree) kernels, and show that for d-dimensional permutations, the p-th degree kernel is characteristic when p is greater than or equal to d - 1, unlike the Euclidean case, in which no finite-degree polynomial kernel is characteristic.

1 citation
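
A minimal sketch of the two kernels under their standard definitions (my own code, not the paper's): both reduce to counting discordant pairs between permutations, which the Kendall kernel normalizes and the Mallows kernel exponentiates.

```python
import math
from itertools import combinations

def n_discordant(sigma, tau):
    """Number of item pairs that sigma and tau rank in opposite order."""
    return sum(
        1
        for i, j in combinations(range(len(sigma)), 2)
        if (sigma[i] - sigma[j]) * (tau[i] - tau[j]) < 0
    )

def kendall_kernel(sigma, tau):
    """Kendall tau correlation: (concordant - discordant) / total pairs."""
    pairs = len(sigma) * (len(sigma) - 1) // 2
    return 1.0 - 2.0 * n_discordant(sigma, tau) / pairs

def mallows_kernel(sigma, tau, lam=1.0):
    """exp(-lambda * number of discordant pairs)."""
    return math.exp(-lam * n_discordant(sigma, tau))

sigma, tau = (0, 1, 2, 3), (1, 0, 2, 3)   # permutations as rank vectors
print(kendall_kernel(sigma, tau))   # 0.666...: one discordant pair out of six
print(mallows_kernel(sigma, tau))   # exp(-1) ~ 0.3679
```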

Book
01 Jun 2002

1 citation


Cited by
Proceedings ArticleDOI
07 Jun 2015
TL;DR: Inception as mentioned in this paper is a deep convolutional neural network architecture that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a network 22 layers deep, the quality of which is assessed in the context of classification and detection.

40,257 citations
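
A schematic version of the block the abstract describes, assuming PyTorch; the channel widths are illustrative, not the exact GoogLeNet configuration. Parallel 1x1, 3x3, and 5x5 convolutions plus pooling widen the network at each depth, while the 1x1 reduction convolutions keep the computational budget in check.

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Parallel conv branches concatenated along the channel dimension."""

    def __init__(self, in_ch, c1, c3_red, c3, c5_red, c5, pool_proj):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, c1, kernel_size=1)
        self.b3 = nn.Sequential(  # 1x1 reduction, then 3x3
            nn.Conv2d(in_ch, c3_red, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(c3_red, c3, kernel_size=3, padding=1),
        )
        self.b5 = nn.Sequential(  # 1x1 reduction, then 5x5
            nn.Conv2d(in_ch, c5_red, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(c5_red, c5, kernel_size=5, padding=2),
        )
        self.bp = nn.Sequential(  # pooling branch with 1x1 projection
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, pool_proj, kernel_size=1),
        )

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.bp(x)], dim=1)

block = InceptionBlock(192, 64, 96, 128, 16, 32, 32)
print(block(torch.randn(1, 192, 28, 28)).shape)  # [1, 256, 28, 28]; 64+128+32+32
```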

Book
18 Nov 2016
TL;DR: Deep learning as mentioned in this paper is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts, and it is used in many applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.
Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

38,208 citations
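
The "hierarchy of concepts" the description refers to is, concretely, a composition of layers. A minimal sketch assuming PyTorch, with invented layer sizes:

```python
import torch
import torch.nn as nn

# Each layer builds slightly more abstract features out of the previous ones,
# so the concept hierarchy is literally a stack of simple transformations.
model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),   # low-level features of the raw input
    nn.Linear(256, 64), nn.ReLU(),    # combinations of low-level features
    nn.Linear(64, 10),                # task-level outputs (class scores)
)
print(model(torch.randn(32, 784)).shape)   # torch.Size([32, 10])
```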

Book
01 Jan 1998
TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Abstract: Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability. The book is divided into three parts. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces, and planning; the two final chapters present case studies and consider the future of reinforcement learning.

37,989 citations
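
A minimal sketch (mine, not the book's code) of one Part II solution method, tabular temporal-difference learning, on an invented chain MDP: each value estimate is nudged toward a bootstrapped one-step target as a random policy generates experience.

```python
import numpy as np

n_states, gamma, alpha = 5, 0.9, 0.1       # chain MDP, terminal at state 4
V = np.zeros(n_states)                     # value estimates (V[4] stays 0)
rng = np.random.default_rng(0)

for _ in range(2000):                      # episodes under a random policy
    s = 0
    while s != n_states - 1:
        s_next = min(s + rng.integers(1, 3), n_states - 1)  # step 1 or 2
        r = 1.0 if s_next == n_states - 1 else 0.0          # reward at goal
        # TD(0): move V[s] toward the bootstrapped target r + gamma * V[s'].
        V[s] += alpha * (r + gamma * V[s_next] - V[s])
        s = s_next

print(np.round(V, 3))
```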

Journal ArticleDOI
TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.
Abstract: We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model.

30,570 citations
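
As a hedged usage sketch: scikit-learn's LatentDirichletAllocation implements the kind of variational Bayes inference the abstract describes (the library's implementation, not the authors' code), and its fit_transform returns the per-document topic mixtures. The toy corpus is invented:

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stock markets fell on inflation news",
    "investors traded stocks and bonds",
]
X = CountVectorizer(stop_words="english").fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)   # each row is a document's topic mixture
print(doc_topics.round(2))          # rows sum to 1
```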

Proceedings Article
03 Jan 2001
TL;DR: This paper proposed a generative model for text and other collections of discrete data that generalizes or improves on several previous models, including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
Abstract: We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms. We present empirical results on applications of this model to problems in text modeling, collaborative filtering, and text classification.

25,546 citations