Showing papers by Thomas Unterthiner, published in 2015


Posted Content
TL;DR: The exponential linear unit (ELU) is proposed to alleviate the vanishing gradient problem via the identity for positive values, and it offers improved learning characteristics compared to units with other activation functions.
Abstract: We introduce the "exponential linear unit" (ELU), which speeds up learning in deep neural networks and leads to higher classification accuracies. Like rectified linear units (ReLUs), leaky ReLUs (LReLUs) and parametrized ReLUs (PReLUs), ELUs alleviate the vanishing gradient problem via the identity for positive values. However, ELUs have improved learning characteristics compared to units with other activation functions. In contrast to ReLUs, ELUs have negative values, which allows them to push mean unit activations closer to zero, like batch normalization but with lower computational complexity. Mean shifts toward zero speed up learning by bringing the normal gradient closer to the unit natural gradient, because of a reduced bias shift effect. While LReLUs and PReLUs have negative values too, they do not ensure a noise-robust deactivation state. ELUs saturate to a negative value for smaller inputs and thereby decrease the forward-propagated variation and information. Therefore, ELUs code the degree of presence of particular phenomena in the input, while they do not quantitatively model the degree of their absence. In experiments, ELUs lead not only to faster learning, but also to significantly better generalization performance than ReLUs and LReLUs on networks with more than 5 layers. On CIFAR-100, ELU networks significantly outperform ReLU networks with batch normalization, while batch normalization does not improve ELU networks. ELU networks are among the top 10 reported CIFAR-10 results and yield the best published result on CIFAR-100, without resorting to multi-view evaluation or model averaging. On ImageNet, ELU networks considerably speed up learning compared to a ReLU network with the same architecture, obtaining less than 10% classification error for a single-crop, single-model network.
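The activation itself has a simple closed form: the identity for positive inputs and a saturating exponential, α(exp(x) − 1), for negative ones. A minimal NumPy sketch with the commonly used default α = 1 (the helper name is hypothetical):

```python
import numpy as np

def elu(x, alpha=1.0):
    # Identity for x > 0; alpha * (exp(x) - 1) otherwise, saturating
    # to -alpha for large negative inputs. Clamping the exponential
    # branch avoids overflow warnings for large positive x, where
    # that branch is discarded by np.where anyway.
    return np.where(x > 0, x, alpha * np.expm1(np.minimum(x, 0.0)))

print(elu(np.array([-5.0, -1.0, 0.0, 2.0])))
# -> [-0.9933 -0.6321  0.      2.    ]  (negative side saturates near -alpha)
```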

3,309 citations


Posted Content
TL;DR: The Deep Learning approach won both panel challenges (nuclear receptors and stress response) as well as the overall Grand Challenge, thereby setting a new standard in toxicity prediction.
Abstract: Every day we are exposed to various chemicals via food additives, cleaning and cosmetic products, and medicines, some of which might be toxic. However, testing the toxicity of all existing compounds by biological experiments is neither financially nor logistically feasible. Therefore the government agencies NIH, EPA and FDA launched the Tox21 Data Challenge within the “Toxicology in the 21st Century” (Tox21) initiative. The goal of this challenge was to assess the performance of computational methods in predicting the toxicity of chemical compounds. State-of-the-art toxicity prediction methods build upon specifically designed chemical descriptors developed over decades. Though Deep Learning is new to the field and had never been applied to toxicity prediction before, it clearly outperformed all other participating methods. In this application paper we show that deep nets automatically learn features resembling well-established toxicophores. In total, our Deep Learning approach won both panel challenges (nuclear receptors and stress response) as well as the overall Grand Challenge, thereby setting a new standard in toxicity prediction.
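The abstract gives no architectural detail, so the following is only a hypothetical sketch of the kind of multi-task feedforward network it alludes to: shared hidden layers that can learn toxicophore-like features, with one output per assay (the Tox21 challenge covered 12 assays). The input width, layer sizes, and dropout rate below are assumptions, not the actual DeepTox configuration:

```python
import torch.nn as nn

N_FEATURES, N_TASKS = 2048, 12  # assumed fingerprint width; 12 Tox21 assays

class MultiTaskToxNet(nn.Module):
    """Shared body with one logit per assay, so features learned for
    one toxicity endpoint can be reused by the others."""
    def __init__(self, n_in=N_FEATURES, n_hidden=1024, n_tasks=N_TASKS):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(n_in, n_hidden), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(n_hidden, n_hidden), nn.ReLU(), nn.Dropout(0.5),
        )
        self.head = nn.Linear(n_hidden, n_tasks)

    def forward(self, x):
        return self.head(self.body(x))  # raw logits, one per assay

# BCEWithLogitsLoss applies the per-task sigmoid; compounds not
# measured in a given assay can simply be masked out of the loss.
loss_fn = nn.BCEWithLogitsLoss()
```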

120 citations


Journal ArticleDOI
TL;DR: Rchemcpp is a web service that identifies structurally similar compounds (structural analogs) in large-scale molecule databases; it was used in the DeepTox pipeline that won the Tox21 Data Challenge and is frequently used by researchers at pharmaceutical companies.
Abstract: We have developed Rchemcpp, a web service that identifies structurally similar compounds (structural analogs) in large-scale molecule databases. The service allows compounds to be queried in the widely used ChEMBL, DrugBank and Connectivity Map databases. Rchemcpp utilizes the best-performing similarity functions, i.e. molecule kernels, as measures for structural similarity. Molecule kernels have shown superior performance over other similarity measures and are currently excelling at machine learning challenges. To considerably reduce computational time, and thereby make the method feasible as a web service, a novel efficient prefiltering strategy has been developed that maintains the sensitivity of the method. By exploiting information contained in public databases, the web service facilitates many applications crucial for the drug development process, such as prioritizing compounds after screening or reducing adverse side effects during late phases. Rchemcpp was used in the DeepTox pipeline that won the Tox21 Data Challenge and is frequently used by researchers at pharmaceutical companies.

Availability and implementation: The web service and the R package are freely available via http://shiny.bioinf.jku.at/Analoging/ and via Bioconductor.

Contact: hochreit@bioinf.jku.at

Supplementary information: Supplementary data are available at Bioinformatics online.
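Rchemcpp's molecule kernels operate on molecular graphs, but the idea can be illustrated with the classic Tanimoto similarity over binary substructure fingerprints, together with a cheap size-bound prefilter in the spirit of the strategy described above. Both the `analogs` helper and the filter are hypothetical illustrations, not the Rchemcpp API:

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity of two substructure-feature sets:
    1.0 for identical sets, 0.0 for disjoint ones."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def analogs(query: set, database: dict, threshold: float = 0.7):
    """Yield (name, score) for database entries similar to the query."""
    for name, feats in database.items():
        lo, hi = sorted((len(query), len(feats)))
        # Prefilter: Tanimoto is at most min(|a|,|b|) / max(|a|,|b|),
        # so size-mismatched candidates can be skipped without ever
        # computing the kernel (the bound preserves sensitivity).
        if hi and lo / hi < threshold:
            continue
        score = tanimoto(query, feats)
        if score >= threshold:
            yield name, score
```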

16 citations


Proceedings Article
07 Dec 2015
TL;DR: In this paper, rectified factor networks (RFNs) are proposed to efficiently construct very sparse, non-linear, high-dimensional representations of the input, which have a low reconstruction error and explain the data covariance structure.
Abstract: We propose rectified factor networks (RFNs) to efficiently construct very sparse, non-linear, high-dimensional representations of the input. RFN models identify rare and small events in the input, have a low interference between code units, have a small reconstruction error, and explain the data covariance structure. RFN learning is a generalized alternating minimization algorithm derived from the posterior regularization method, which enforces non-negative and normalized posterior means. We prove convergence and correctness of the RFN learning algorithm. On benchmarks, RFNs are compared to other unsupervised methods like autoencoders, RBMs, factor analysis, ICA, and PCA. In contrast to previous sparse coding methods, RFNs yield sparser codes, capture the data's covariance structure more precisely, and have a significantly smaller reconstruction error. We test RFNs as a pretraining technique for deep networks on different vision datasets, where RFNs were superior to RBMs and autoencoders. On gene expression data from two pharmaceutical drug discovery studies, RFNs detected small and rare gene modules that revealed highly relevant new biological insights that had so far been missed by other unsupervised methods. The RFN package for GPU/CPU is available at http://www.bioinf.jku.at/software/rfn.
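As a toy illustration of the alternating minimization described above, the sketch below runs a factor-analysis posterior step, forces the posterior means to be non-negative and normalized, and then updates the loading matrix by least squares. It assumes unit noise variance and an identity prior, and is a simplified stand-in rather than the authors' full algorithm (which is in the linked RFN package):

```python
import numpy as np

def rfn_sketch(V, n_hidden, n_iter=100, eps=1e-8, seed=0):
    """Toy RFN-style learning for data V of shape (n_features, n_samples).
    Simplifying assumptions: unit noise variance, identity prior."""
    rng = np.random.default_rng(seed)
    n_feat, _ = V.shape
    W = 0.01 * rng.standard_normal((n_feat, n_hidden))
    for _ in range(n_iter):
        # E-step: factor-analysis posterior means with Psi = I,
        # mu = (I + W^T W)^{-1} W^T v for each sample v.
        P = np.linalg.inv(np.eye(n_hidden) + W.T @ W)
        H = P @ W.T @ V
        # Posterior regularization: rectify (non-negative codes) ...
        H = np.maximum(H, 0.0)
        # ... and normalize each code unit to unit second moment.
        H /= np.sqrt((H ** 2).mean(axis=1, keepdims=True) + eps)
        # M-step: least-squares update of the loading matrix.
        W = V @ H.T @ np.linalg.inv(H @ H.T + eps * np.eye(n_hidden))
    return W, H
```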

16 citations


Posted Content
TL;DR: On gene expression data from two pharmaceutical drug discovery studies, RFNs detected small and rare gene modules that revealed highly relevant new biological insights that had so far been missed by other unsupervised methods.
Abstract: We propose rectified factor networks (RFNs) to efficiently construct very sparse, non-linear, high-dimensional representations of the input. RFN models identify rare and small events in the input, have a low interference between code units, have a small reconstruction error, and explain the data covariance structure. RFN learning is a generalized alternating minimization algorithm derived from the posterior regularization method, which enforces non-negative and normalized posterior means. We prove convergence and correctness of the RFN learning algorithm. On benchmarks, RFNs are compared to other unsupervised methods like autoencoders, RBMs, factor analysis, ICA, and PCA. In contrast to previous sparse coding methods, RFNs yield sparser codes, capture the data's covariance structure more precisely, and have a significantly smaller reconstruction error. We test RFNs as a pretraining technique for deep networks on different vision datasets, where RFNs were superior to RBMs and autoencoders. On gene expression data from two pharmaceutical drug discovery studies, RFNs detected small and rare gene modules that revealed highly relevant new biological insights that had so far been missed by other unsupervised methods.

4 citations