Author

David Rogers

Bio: David Rogers is an academic researcher from Symyx Technologies. The author has an h-index of 1 and has co-authored 1 publication receiving 2865 citations.

Papers

Journal Article•DOI•

Extended-Connectivity Fingerprints

David Rogers¹, Mathew Hahn¹

Symyx Technologies¹

28 Apr 2010-Journal of Chemical Information and Modeling

TL;DR: This paper gives the first published description of the ECFP implementation; ECFPs can be calculated very rapidly and can represent an essentially infinite number of different molecular features.

Abstract: Extended-connectivity fingerprints (ECFPs) are a novel class of topological fingerprints for molecular characterization. Historically, topological fingerprints were developed for substructure and similarity searching. ECFPs were developed specifically for structure−activity modeling. ECFPs are circular fingerprints with a number of useful qualities: they can be very rapidly calculated; they are not predefined and can represent an essentially infinite number of different molecular features (including stereochemical information); their features represent the presence of particular substructures, allowing easier interpretation of analysis results; and the ECFP algorithm can be tailored to generate different types of circular fingerprints, optimized for different uses. While the use of ECFPs has been widely adopted and validated, a description of their implementation has not previously been presented in the literature.
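
As a rough illustration of what "circular" means here, the sketch below repeatedly combines each atom's identifier with those of its neighbors and collects every identifier produced along the way. It is a simplified toy, not the published ECFP algorithm: the function names (`stable_hash`, `circular_fingerprint`), the use of atomic numbers as initial labels, and the SHA-256-based hash are all assumptions made for this example, and bond orders, stereochemistry, and duplicate-substructure removal are ignored.

```python
import hashlib


def stable_hash(values):
    """Deterministic 32-bit hash of a sequence of ints (illustrative choice)."""
    data = ",".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def circular_fingerprint(atom_labels, bonds, radius=2):
    """Toy circular fingerprint.

    atom_labels: list of ints, one per atom (here: atomic numbers).
    bonds: list of (i, j) index pairs; bond order is ignored.
    Returns the set of all identifiers seen over `radius` iterations.
    """
    neighbors = {i: [] for i in range(len(atom_labels))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Iteration 0: each atom's identifier depends only on its own label.
    ids = [stable_hash((0, label)) for label in atom_labels]
    features = set(ids)

    # Each later iteration folds in the neighbors' current identifiers,
    # so an identifier at iteration r reflects atoms up to r bonds away.
    for r in range(1, radius + 1):
        new_ids = []
        for i in range(len(ids)):
            # Sorting makes the result independent of atom ordering.
            env = sorted(ids[j] for j in neighbors[i])
            new_ids.append(stable_hash((r, ids[i], *env)))
        ids = new_ids
        features.update(ids)
    return features
```

Calling `circular_fingerprint([6, 6, 8], [(0, 1), (1, 2)])` for a C-C-O chain returns the same set of identifiers whichever way the atoms are numbered, because each update depends only on the sorted neighbor identifiers.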

4,173 citations

Cited by

Journal Article•DOI•

Geometric Deep Learning: Going beyond Euclidean data

Michael M. Bronstein¹, Joan Bruna, Yann LeCun², Arthur Szlam³, Pierre Vandergheynst⁴

University of Lugano¹, New York University², Facebook³, École Polytechnique Fédérale de Lausanne⁴

11 Jul 2017-IEEE Signal Processing Magazine

TL;DR: Non-Euclidean (geometric) data such as social networks are often large and complex (on the scale of billions in the case of social networks) and are natural targets for machine-learning techniques.

Abstract: Many scientific fields study data with an underlying structure that is non-Euclidean. Some examples include social networks in computational social sciences, sensor networks in communications, functional networks in brain imaging, regulatory networks in genetics, and meshed surfaces in computer graphics. In many applications, such geometric data are large and complex (in the case of social networks, on the scale of billions) and are natural targets for machine-learning techniques. In particular, we would like to use deep neural networks, which have recently proven to be powerful tools for a broad range of problems from computer vision, natural-language processing, and audio analysis. However, these tools have been most successful on data with an underlying Euclidean or grid-like structure and in cases where the invariances of these structures are built into networks used to model them.

2,565 citations

Posted Content•

Neural Message Passing for Quantum Chemistry

Justin Gilmer¹, Samuel S. Schoenholz², Patrick Riley¹, Oriol Vinyals¹, George E. Dahl¹

Google¹, University of Pennsylvania²

04 Apr 2017-arXiv: Learning

TL;DR: Using MPNNs, the authors demonstrate state-of-the-art results on an important molecular property prediction benchmark and suggest that future work should focus on datasets with larger molecules or more accurate ground truth labels.

Abstract: Supervised learning on molecules has incredible potential to be useful in chemistry, drug discovery, and materials science. Luckily, several promising and closely related neural network models invariant to molecular symmetries have already been described in the literature. These models learn a message passing algorithm and aggregation procedure to compute a function of their entire input graph. At this point, the next step is to find a particularly effective variant of this general approach and apply it to chemical prediction benchmarks until we either solve them or reach the limits of the approach. In this paper, we reformulate existing models into a single common framework we call Message Passing Neural Networks (MPNNs) and explore additional novel variations within this framework. Using MPNNs we demonstrate state of the art results on an important molecular property prediction benchmark; these results are strong enough that we believe future work should focus on datasets with larger molecules or more accurate ground truth labels.
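
The "message passing algorithm and aggregation procedure" mentioned in the abstract can be pictured with a minimal sketch. In the version below, the learned message, update, and readout functions are replaced by fixed arithmetic (sum of neighbor states, a 50/50 blend, and a plain sum) purely for illustration; the function names and the scalar node states are assumptions for this example and are not taken from the paper.

```python
def message_passing_round(node_states, edges):
    """One round of message passing on an undirected graph.

    node_states: list of floats, one scalar state per node.
    edges: list of (i, j) index pairs.
    A real MPNN would use learned message and update functions;
    here the message is the neighbor's state and the update is a fixed blend.
    """
    messages = [0.0] * len(node_states)
    for i, j in edges:
        # Each node aggregates (sums) the states of its neighbors.
        messages[i] += node_states[j]
        messages[j] += node_states[i]
    # Update: blend each node's old state with its aggregated message.
    return [0.5 * h + 0.5 * m for h, m in zip(node_states, messages)]


def readout(node_states):
    """Order-independent summary of the whole graph (a plain sum here)."""
    return sum(node_states)
```

Because both the neighbor aggregation and the readout are sums, renumbering the nodes of the graph leaves the final readout unchanged, which is the kind of invariance to molecular symmetries the abstract refers to.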

2,184 citations

Journal Article•DOI•

ZINC 15 – Ligand Discovery for Everyone

Teague Sterling¹, John J. Irwin¹

University of California, San Francisco¹

09 Nov 2015-Journal of Chemical Information and Modeling

TL;DR: ZINC 15 adds a suite of ligand annotation, purchasability, target, and biology association tools meant for investigators who are not computer specialists, with analysis tools that are easy for nonspecialists yet have few limitations for experts.

Abstract: Many questions about the biological activity and availability of small molecules remain inaccessible to investigators who could most benefit from their answers. To narrow the gap between chemoinformatics and biology, we have developed a suite of ligand annotation, purchasability, target, and biology association tools, incorporated into ZINC and meant for investigators who are not computer specialists. The new version contains over 120 million purchasable “drug-like” compounds – effectively all organic molecules that are for sale – a quarter of which are available for immediate delivery. ZINC connects purchasable compounds to high-value ones such as metabolites, drugs, natural products, and annotated compounds from the literature. Compounds may be accessed by the genes for which they are annotated as well as the major and minor target classes to which those genes belong. It offers new analysis tools that are easy for nonspecialists yet with few limitations for experts. ZINC retains its original 3D roots – ...

2,115 citations

Journal Article•DOI•

Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules

Rafael Gómez-Bombarelli, Jennifer N. Wei¹, David Duvenaud², José Miguel Hernández-Lobato³, Benjamin Sanchez-Lengeling¹, Dennis Sheberla¹, Jorge Aguilera-Iparraguirre, Timothy D. Hirzel, Ryan P. Adams⁴, Alán Aspuru-Guzik¹,⁵

Harvard University¹, University of Toronto², University of Cambridge³, Google⁴, Canadian Institute for Advanced Research⁵

12 Jan 2018-ACS central science

TL;DR: In this article, a deep neural network was trained on hundreds of thousands of existing chemical structures to construct three coupled functions: an encoder, a decoder, and a predictor, which can generate new molecules for efficient exploration and optimization through open-ended spaces of chemical compounds.

Abstract: We report a method to convert discrete representations of molecules to and from a multidimensional continuous representation. This model allows us to generate new molecules for efficient exploration and optimization through open-ended spaces of chemical compounds. A deep neural network was trained on hundreds of thousands of existing chemical structures to construct three coupled functions: an encoder, a decoder, and a predictor. The encoder converts the discrete representation of a molecule into a real-valued continuous vector, and the decoder converts these continuous vectors back to discrete molecular representations. The predictor estimates chemical properties from the latent continuous vector representation of the molecule. Continuous representations of molecules allow us to automatically generate novel chemical structures by performing simple operations in the latent space, such as decoding random vectors, perturbing known chemical structures, or interpolating between molecules. Continuous represent...

1,884 citations

Proceedings Article•

Convolutional networks on graphs for learning molecular fingerprints

David Duvenaud¹, Dougal Maclaurin¹, Jorge Aguilera-Iparraguirre¹, Rafael Gómez-Bombarelli¹, Timothy D. Hirzel¹, Alán Aspuru-Guzik¹, Ryan P. Adams¹

Harvard University¹

07 Dec 2015

TL;DR: This paper introduces a convolutional neural network that operates directly on graphs, allowing end-to-end learning of prediction pipelines whose inputs are graphs of arbitrary size and shape.

Abstract: We introduce a convolutional neural network that operates directly on graphs. These networks allow end-to-end learning of prediction pipelines whose inputs are graphs of arbitrary size and shape. The architecture we present generalizes standard molecular feature extraction methods based on circular fingerprints. We show that these data-driven features are more interpretable, and have better predictive performance on a variety of tasks.

1,857 citations
