scispace - formally typeset
Search or ask a question

Showing papers by "Jeffrey Dean published in 2012"


Proceedings Article
03 Dec 2012
TL;DR: This paper considers the problem of training a deep network with billions of parameters using tens of thousands of CPU cores and develops two algorithms for large-scale distributed training, Downpour SGD and Sandblaster L-BFGS, which increase the scale and speed of deep network training.
Abstract: Recent work in unsupervised feature learning and deep learning has shown that being able to train large models can dramatically improve performance. In this paper, we consider the problem of training a deep network with billions of parameters using tens of thousands of CPU cores. We have developed a software framework called DistBelief that can utilize computing clusters with thousands of machines to train large models. Within this framework, we have developed two algorithms for large-scale distributed training: (i) Downpour SGD, an asynchronous stochastic gradient descent procedure supporting a large number of model replicas, and (ii) Sandblaster, a framework that supports a variety of distributed batch optimization procedures, including a distributed implementation of L-BFGS. Downpour SGD and Sandblaster L-BFGS both increase the scale and speed of deep network training. We have successfully used our system to train a deep network 30x larger than previously reported in the literature, and achieves state-of-the-art performance on ImageNet, a visual object recognition task with 16 million images and 21k categories. We show that these same techniques dramatically accelerate the training of a more modestly- sized deep network for a commercial speech recognition service. Although we focus on and report performance of these methods as applied to training large neural networks, the underlying algorithms are applicable to any gradient-based machine learning algorithm.

3,475 citations


Proceedings ArticleDOI
08 Oct 2012
TL;DR: This article describes how Spanner is structured, its feature set, the rationale underlying various design decisions, and a novel time API that exposes clock uncertainty, critical to supporting external consistency and a variety of powerful features.
Abstract: Spanner is Google's scalable, multi-version, globally-distributed, and synchronously-replicated database. It is the first system to distribute data at global scale and support externally-consistent distributed transactions. This paper describes how Spanner is structured, its feature set, the rationale underlying various design decisions, and a novel time API that exposes clock uncertainty. This API and its implementation are critical to supporting external consistency and a variety of powerful features: nonblocking reads in the past, lock-free read-only transactions, and atomic schema changes, across all of Spanner.

1,366 citations


Proceedings Article
26 Jun 2012
TL;DR: In this paper, a 9-layered locally connected sparse autoencoder with pooling and local contrast normalization was used to learn high-level, class-specific feature detectors from only unlabeled data.
Abstract: We consider the problem of building high-level, class-specific feature detectors from only unlabeled data. For example, is it possible to learn a face detector using only unlabeled images using unlabeled images? To answer this, we train a 9-layered locally connected sparse autoencoder with pooling and local contrast normalization on a large dataset of images (the model has 1 billion connections, the dataset has 10 million 200×200 pixel images downloaded from the Internet). We train this network using model parallelism and asynchronous SGD on a cluster with 1,000 machines (16,000 cores) for three days. Contrary to what appears to be a widely-held intuition, our experimental results reveal that it is possible to train a face detector without having to label images as containing a face or not. Control experiments show that this feature detector is robust not only to translation but also to scaling and out-of-plane rotation. We also find that the same network is sensitive to other high-level concepts such as cat faces and human bodies. Starting with these learned features, we trained our network to obtain 15.8% accuracy in recognizing 20,000 object categories from ImageNet, a leap of 70% relative improvement over the previous state-of-the-art.

786 citations


Patent
Rob Pike1, Sean Quinlan1, Sean Dorward1, Jeffrey Dean1, Sanjay Ghemawat1 
28 Feb 2012
TL;DR: In this paper, a method and system for analyzing data records includes allocating groups of records to respective processes of a first plurality of processes executing in parallel, for each record in the group of records allocated to the respective process, a query is applied to the record so as to produce zero or more values.
Abstract: A method and system for analyzing data records includes allocating groups of records to respective processes of a first plurality of processes executing in parallel. In each respective process of the first plurality of processes, for each record in the group of records allocated to the respective process, a query is applied to the record so as to produce zero or more values. Zero or more emit operators are applied to each of the zero or more produced values so as to add corresponding information to an intermediate data structure. Information from a plurality of the intermediate data structures is aggregated to produce output data.

115 citations



01 Jan 2012
TL;DR: In this paper, the authors discuss more details regarding the algorithm, its implementation, test set for 3D-transformed faces, experimental results for parameter sensitivity, and further visualizations for the learned neurons.
Abstract: In this appendix, we discuss more details regarding the algorithm, its implementation, test set for 3D-transformed faces, experimental results for parameter sensitivity. We also present further visualizations for the learned neurons.

36 citations


Patent
23 Oct 2012
TL;DR: In this paper, an engaging post identifier for identifying and retrieving engaging posts, an extended network post identifier to identify extended posts from an extended social network, and a combining module for creating a combined list of added posts from the engaging post and the extended posts, the combining module generating one or more ranked posts by ranking the list of adding posts by relevance to a user.
Abstract: A system includes: an engaging post identifier for identifying and retrieving engaging posts; an extended network post identifier for identifying extended posts from an extended network; a combining module for creating a combined list of added posts from the engaging post and the extended posts, the combining module generating one or more ranked posts by ranking the list of added posts by relevance to a user; and a user interface module for providing the one or more ranked posts. The disclosure also includes a method for finding and providing engaging posts that includes determining engaging posts; determining extended posts from an extended social network using a social graph of the user; adding the engaging posts and the extended posts to create a combined list of added posts; ranking the added posts by relevance to a user; and providing one or more of the ranked posts.

12 citations


Patent
19 Jul 2012
TL;DR: In this paper, the present disclosure can be embodied in a method that includes identifying a collection of entities from one or more data sources, calculating a score for subsets of entities, assigning the calculated score to the identified entities from the respective subset, and ranking the entities based on the assigned score, so as to identify entities in the collection that are related to the seed entities.
Abstract: In one aspect, the present disclosure can be embodied in a method that includes identifying a collection of entities from one or more data sources, calculating a score for subsets of entities from the collection based on one or more seed entities associated with the collection, identifying one or more entities from each of the subsets based on the calculated score, assigning the calculated score to the identified one or more entities from the respective subset, and ranking the one or more entities based on the assigned score, so as to identify entities in the collection that are related to the one or more seed entities

9 citations


Patent
30 Aug 2012
TL;DR: In this article, the first document in a plurality of documents is selected on the basis of a query independent score associated with a fingerprint that indicates that the document has substantially identical content to every other document in the plurality.
Abstract: Systems and methods for indexing a representative document from a set of duplicate documents are disclosed. Disclosed systems and methods comprise selecting a first document in a plurality of documents on the basis that the first document is associated with a query independent score. Each respective document in the plurality of documents has a fingerprint that indicates that the respective document has substantially identical content to every other document in the plurality of documents. Disclosed systems and methods further comprise indexing, in accordance with the query independent score, the first document thereby producing an indexed first document. With respect to the plurality of documents, only the indexed first document is included in a document index.

3 citations


Patent
26 Nov 2012
TL;DR: In this article, a search engine server system receives from a client system a search query and identifies a set of documents in accordance with the search query, and a content snippet corresponding to content in a respective document of the identified sets of documents is generated, the content snippet associated with at least one query term of the one or more query terms in the query.
Abstract: A search engine server system receives from a client system a search query and identifies a set of documents in accordance with the search query. A content snippet corresponding to content in a respective document of the identified set of documents is generated, the content snippet associated with at least one query term of the one or more query terms in the search query. A response to the search query is returned to the client system, the response including information identifying at least the respective document and including the content snippet. Generating the content snippet includes performing a first decompression operation on first token identifiers, from a compressed document repository, to provide a set of second token identifiers, and performing a second decompression operation on the set of second token identifiers to recover uncompressed content comprising a portion of the respective document.

3 citations


Patent
John Newlin1, Jeffrey Dean1
11 Jul 2012
TL;DR: In this article, the first and second data are stored in the cache in an order based on the relative values of the first retrieval times of the second and second retrieval times, where the first metadata includes a first retrieval time and the second metadata including a second retrieval time.
Abstract: An example method for passive compaction of a cache includes determining first metadata associated with first data and second metadata associated with second data. The first metadata includes a first retrieval time, and the second metadata includes a second retrieval time. The example method further includes obtaining a first metadata key including a first unique identifier and obtaining a second metadata key including a second unique identifier. The example method also includes generating a first data key and generating a second data key. The example method further includes writing, at a client device, the first and second data to the cache. Each of the first and second data occupy one or more contiguous blocks of physical memory in the cache, and the first and second data are stored in the cache in an order based on the relative values of the first and second retrieval times.