Institution
Amazon.com
Company • Seattle, Washington, United States
About: Amazon.com is a company based in Seattle, Washington, United States. It is known for research contributions in the topics Computer science and Service (business). The organization has 13363 authors who have published 17317 publications receiving 266589 citations.
Topics: Computer science, Service (business), Service provider, Context (language use), Virtual machine
Papers
Georgia Institute of Technology, University of Tennessee, Oak Ridge National Laboratory, University of Alabama in Huntsville, Brookhaven National Laboratory, University of L'Aquila, University of Kentucky, Scripps Health, National Center for Computational Sciences, Amazon.com, Nvidia, Argonne National Laboratory, New York City College of Technology
TL;DR: A supercomputer-driven pipeline for in silico drug discovery using enhanced sampling molecular dynamics (MD) and ensemble docking is presented, including the use of quantum mechanical, machine learning, and artificial intelligence methods to cluster MD trajectories and rescore docking poses.
Abstract: We present a supercomputer-driven pipeline for in silico drug discovery using enhanced sampling molecular dynamics (MD) and ensemble docking. Ensemble docking makes use of MD results by docking compound databases into representative protein binding-site conformations, thus taking into account the dynamic properties of the binding sites. We also describe preliminary results obtained for 24 systems involving eight proteins of the proteome of SARS-CoV-2. The MD involves temperature replica exchange enhanced sampling, making use of massively parallel supercomputing to quickly sample the configurational space of protein drug targets. Using the Summit supercomputer at the Oak Ridge National Laboratory, more than 1 ms of enhanced sampling MD can be generated per day. We have ensemble docked repurposing databases to 10 configurations of each of the 24 SARS-CoV-2 systems using AutoDock Vina. Comparison to experiment demonstrates remarkably high hit rates for the top scoring tranches of compounds identified by our ensemble approach. We also demonstrate that, using AutoDock-GPU on Summit, it is possible to perform exhaustive docking of one billion compounds in under 24 h. Finally, we discuss preliminary results and planned improvements to the pipeline, including the use of quantum mechanical (QM), machine learning, and artificial intelligence (AI) methods to cluster MD trajectories and rescore docking poses.
120 citations
TL;DR: In this article, the authors estimated committed carbon emissions from deforestation and fragmentation in Amazonia, using three simulated models of landscape change: a "Rondonia scenario" which mimicked settlement schemes of small farmers in the southern Amazon; a "Para scenario" that imitated large cattle ranches in the eastern Amazon; and a "random scenario" in which forest tracts were cleared randomly.
120 citations
TL;DR: The results show that DistDGL achieves linear speedup without compromising model accuracy and requires only 13 seconds to complete a training epoch for a graph with 100 million nodes and 3 billion edges on a cluster with 16 machines.
Abstract: Graph neural networks (GNN) have shown great success in learning from graph-structured data. They are widely used in various applications, such as recommendation, fraud detection, and search. In these domains, the graphs are typically large, containing hundreds of millions of nodes and several billions of edges. To tackle this challenge, we develop DistDGL, a system for training GNNs in a mini-batch fashion on a cluster of machines. DistDGL is based on the Deep Graph Library (DGL), a popular GNN development framework. DistDGL distributes the graph and its associated data (initial features and embeddings) across the machines and uses this distribution to derive a computational decomposition by following an owner-compute rule. DistDGL follows a synchronous training approach and allows ego-networks forming the mini-batches to include non-local nodes. To minimize the overheads associated with distributed computations, DistDGL uses a high-quality and light-weight min-cut graph partitioning algorithm along with multiple balancing constraints. This allows it to reduce communication overheads and statically balance the computations. It further reduces the communication by replicating halo nodes and by using sparse embedding updates. The combination of these design choices allows DistDGL to train high-quality models while achieving high parallel efficiency and memory scalability. We demonstrate our optimizations on both inductive and transductive GNN models. Our results show that DistDGL achieves linear speedup without compromising model accuracy and requires only 13 seconds to complete a training epoch for a graph with 100 million nodes and 3 billion edges on a cluster with 16 machines.
120 citations
TL;DR: The discovery of a new species of a river dolphin from the Araguaia River basin of Brazil, the first such discovery in nearly 100 years, is reported, which is diagnosable by a series of molecular and morphological characters and diverged from its Amazonian sister taxon 2.08 million years ago.
Abstract: True river dolphins are some of the rarest and most endangered of all vertebrates. They comprise relict evolutionary lineages of high taxonomic distinctness and conservation value, but are afforded little protection. We report the discovery of a new species of a river dolphin from the Araguaia River basin of Brazil, the first such discovery in nearly 100 years. The species is diagnosable by a series of molecular and morphological characters and diverged from its Amazonian sister taxon 2.08 million years ago. The estimated time of divergence corresponds to the separation of the Araguaia-Tocantins basin from the Amazon basin. This discovery highlights the immensity of the deficit in our knowledge of Neotropical biodiversity, as well as vulnerability of biodiversity to anthropogenic actions in an increasingly threatened landscape. We anticipate that this study will provide an impetus for the taxonomic and conservation reanalysis of other taxa shared between the Araguaia and Amazon aquatic ecosystems, as well as stimulate historical biogeographical analyses of the two basins.
119 citations
26 Aug 2013
TL;DR: In this paper, a system, method, and computer readable medium for managing CDN service providers are provided, where a network storage provider storing one or more resources on behalf of a content provider obtains client computing device requests for content.
Abstract: A system, method, and computer readable medium for managing CDN service providers are provided. A network storage provider storing one or more resources on behalf of a content provider obtains client computing device requests for content. The network storage provider processes the client computing device requests and determines whether a subsequent request for the resource should be directed to a CDN service provider as a function of the updated or processed by the network storage provider storage component.
119 citations
Authors
Showing all 13498 results
Name | H-index | Papers | Citations |
---|---|---|---|
Jiawei Han | 168 | 1233 | 143427 |
Bernhard Schölkopf | 148 | 1092 | 149492 |
Christos Faloutsos | 127 | 789 | 77746 |
Alexander J. Smola | 122 | 434 | 110222 |
Rama Chellappa | 120 | 1031 | 62865 |
William F. Laurance | 118 | 470 | 56464 |
Andrew McCallum | 113 | 472 | 78240 |
Michael J. Black | 112 | 429 | 51810 |
David Heckerman | 109 | 483 | 62668 |
Larry S. Davis | 107 | 693 | 49714 |
Chris M. Wood | 102 | 795 | 43076 |
Pietro Perona | 102 | 414 | 94870 |
Guido W. Imbens | 97 | 352 | 64430 |
W. Bruce Croft | 97 | 426 | 39918 |
Chunhua Shen | 93 | 681 | 37468 |