Top 6 papers published by Jeffrey Dean from Google in 2015

Posted Content

Distilling the Knowledge in a Neural Network

Geoffrey E. Hinton, Oriol Vinyals, Jeffrey Dean

09 Mar 2015-arXiv: Machine Learning

TL;DR: This work shows that distilling the knowledge in an ensemble of models into a single model can significantly improve the acoustic model of a heavily used commercial system. It also introduces a new type of ensemble composed of one or more full models and many specialist models, which learn to distinguish fine-grained classes that the full models confuse.

Abstract: A very simple way to improve the performance of almost any machine learning algorithm is to train many different models on the same data and then to average their predictions. Unfortunately, making predictions using a whole ensemble of models is cumbersome and may be too computationally expensive to allow deployment to a large number of users, especially if the individual models are large neural nets. Caruana and his collaborators have shown that it is possible to compress the knowledge in an ensemble into a single model which is much easier to deploy and we develop this approach further using a different compression technique. We achieve some surprising results on MNIST and we show that we can significantly improve the acoustic model of a heavily used commercial system by distilling the knowledge in an ensemble of models into a single model. We also introduce a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse. Unlike a mixture of experts, these specialist models can be trained rapidly and in parallel.
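
The abstract does not spell out the compression technique. As a minimal sketch, assuming the temperature-softened "soft target" formulation associated with this paper, the idea of averaging ensemble predictions and training a single model to match them might look like the following. The function names and toy logits are hypothetical and chosen only for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature; higher values give softer outputs."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_soft_targets(ensemble_logits, temperature):
    """Average the softened predictions of every ensemble member."""
    probs = [softmax(logits, temperature) for logits in ensemble_logits]
    n_models = len(probs)
    n_classes = len(probs[0])
    return [sum(p[i] for p in probs) / n_models for i in range(n_classes)]

def distillation_loss(student_logits, soft_targets, temperature):
    """Cross-entropy between the soft targets and the student's softened output."""
    student = softmax(student_logits, temperature)
    return -sum(t * math.log(q) for t, q in zip(soft_targets, student))

# Toy data (assumed): two ensemble members scoring three classes.
ensemble_logits = [[4.0, 1.0, 0.5], [3.5, 1.5, 0.2]]
targets = ensemble_soft_targets(ensemble_logits, temperature=2.0)

# A student that agrees with the ensemble versus one that ranks the classes in reverse.
loss_good = distillation_loss([3.8, 1.2, 0.4], targets, temperature=2.0)
loss_poor = distillation_loss([0.4, 1.2, 3.8], targets, temperature=2.0)
```

In this sketch the single model is rewarded for reproducing the ensemble's full output distribution rather than only its top class, so `loss_good` comes out lower than `loss_poor`.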

12,857 citations

Posted Content

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

01 Jan 2015-arXiv: Distributed, Parallel, and Cluster Computing

TL;DR: This paper describes the TensorFlow interface and an implementation of that interface built at Google. TensorFlow has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields.

Abstract: TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards. The system is flexible and can be used to express a wide variety of algorithms, including training and inference algorithms for deep neural network models, and it has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields, including speech recognition, computer vision, robotics, information retrieval, natural language processing, geographic information extraction, and computational drug discovery. This paper describes the TensorFlow interface and an implementation of that interface that we have built at Google. The TensorFlow API and a reference implementation were released as an open-source package under the Apache 2.0 license in November, 2015 and are available at www.tensorflow.org.

10,447 citations

Patent

Analyzing health events using recurrent neural networks

Greg S. Corrado¹, Jeffrey Dean¹

Google¹

27 Jul 2015

TL;DR: In this patent, a first temporal sequence of health events is processed using a recurrent neural network (RNN) to generate health analysis data that characterizes future health events that may occur after the last time step in the sequence.

Abstract: The invention provides methods, systems, and apparatus, including computer programs encoded on computer storage media, for using recurrent neural networks to analyze health events. One of the methods includes obtaining a first temporal sequence of health events, wherein the first temporal sequence comprises respective health-related data associated with a particular patient at each of a plurality of time steps; processing the first temporal sequence of health events using a recurrent neural network to generate a neural network output for the first temporal sequence; and generating, from the neural network output for the first temporal sequence, health analysis data that characterizes future health events that may occur after a last time step in the temporal sequence.

16 citations

Patent

Training distilled machine learning models

Oriol Vinyals¹, Jeffrey Dean¹, Geoffrey E. Hinton¹

Google¹

04 Jun 2015

13 citations

Proceedings Article

The rise of cloud computing systems

Jeffrey Dean¹

Google¹

04 Oct 2015

TL;DR: This talk describes the development of the systems that underlie modern cloud computing, which benefit from the economies of scale of large datacenters and the ability to grow and shrink computing resources on demand across millions of customers.

Abstract: In this talk I will describe the development of systems that underlie modern cloud computing systems. This development shares much of its motivation with the related fields of transaction processing systems and high performance computing, but because of scale, these systems tend to have more emphasis on fault tolerance using software techniques. Important developments in modern cloud systems include very high performance distributed file systems, such as the Google File System (Ghemawat et al., SOSP 2003), reliable computational frameworks such as MapReduce (Dean & Ghemawat, OSDI 2004) and Dryad (Isard et al., 2007), and large scale structured storage systems such as BigTable (Chang et al. 2006), Dynamo (DeCandia et al., 2007), and Spanner (Corbett et al., 2012). Scheduling computations can be done either using virtual machines (exemplified by VMware's products) or as individual processes or containers. The development of public cloud platforms such as AWS, Microsoft Azure, and Google Cloud Platform allows external developers to utilize these large-scale services to build new and interesting services and products, benefiting from the economies of scale of large datacenters and the ability to grow and shrink computing resources on demand across millions of customers.

6 citations

Patent

Translating terms using numeric representations

Ilya Sutskever¹, Tomas Mikolov¹, Jeffrey Dean¹, Quoc V. Le¹

Google¹

17 Sep 2015

4 citations
