Home
/
Authors
/
Jeremy Howard

Author

Jeremy Howard

Other affiliations: North Carolina State University, University of Nebraska–Lincoln, Salk Institute for Biological Studies ...read more

Bio: Jeremy Howard is an academic researcher from University of San Francisco. The author has contributed to research in topics: Population & Language model. The author has an hindex of 17, co-authored 54 publications receiving 3580 citations. Previous affiliations of Jeremy Howard include North Carolina State University & University of Nebraska–Lincoln.

Topics: Population, Language model, Runs of Homozygosity, Inbreeding, Deep learning ...read more

Papers published on a yearly basis

2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

Universal Language Model Fine-tuning for Text Classification

[...]

Jeremy Howard¹, Sebastian Ruder²•Institutions (2)

North Carolina State University¹, National University of Ireland, Galway²

18 Jan 2018

TL;DR: Universal Language Model Fine-tuning (ULMFiT) as mentioned in this paper is an effective transfer learning method that can be applied to any task in NLP, and introduces techniques that are key for finetuning a language model.

...read moreread less

Abstract: Inductive transfer learning has greatly impacted computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch. We propose Universal Language Model Fine-tuning (ULMFiT), an effective transfer learning method that can be applied to any task in NLP, and introduce techniques that are key for fine-tuning a language model. Our method significantly outperforms the state-of-the-art on six text classification tasks, reducing the error by 18-24% on the majority of datasets. Furthermore, with only 100 labeled examples, it matches the performance of training from scratch on 100 times more data. We open-source our pretrained models and code.

...read moreread less

2,128 citations

Journal Article•DOI•

An evidence review of face masks against COVID-19.

[...]

Jeremy Howard¹, Austin Huang², Zhiyuan Li³, Zeynep Tufekci⁴, Vladimir Zdimal⁵, Helene-Mari van der Westhuizen⁶, Arne von Delft⁷, Amy Price⁸, Lex Fridman⁹, Lei-Han Tang¹⁰, Viola Tang¹¹, Gregory L. Watson¹², Christina E. Bax¹³, Reshama Shaikh, Frederik Questier¹⁴, Danny Hernandez¹⁵, Larry F. Chu⁸, Christina M. Ramirez¹², Anne W. Rimoin¹² - Show less +15 more•Institutions (15)

University of San Francisco¹, Brown University², Peking University³, University of North Carolina at Chapel Hill⁴, Academy of Sciences of the Czech Republic⁵, University of Oxford⁶, University of Cape Town⁷, Stanford University⁸, Massachusetts Institute of Technology⁹, Hong Kong Baptist University¹⁰, Hong Kong University of Science and Technology¹¹, University of California, Los Angeles¹², University of Pennsylvania¹³, Vrije Universiteit Brussel¹⁴, OpenAI¹⁵

26 Jan 2021-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: In this article, the authors developed an analytical framework to examine mask usage, synthesizing the relevant literature to inform multiple areas: population impact, transmission characteristics, source control, wearer protection, sociological considerations, and implementation considerations.

...read moreread less

Abstract: The science around the use of masks by the public to impede COVID-19 transmission is advancing rapidly. In this narrative review, we develop an analytical framework to examine mask usage, synthesizing the relevant literature to inform multiple areas: population impact, transmission characteristics, source control, wearer protection, sociological considerations, and implementation considerations. A primary route of transmission of COVID-19 is via respiratory particles, and it is known to be transmissible from presymptomatic, paucisymptomatic, and asymptomatic individuals. Reducing disease spread requires two things: limiting contacts of infected individuals via physical distancing and other measures and reducing the transmission probability per contact. The preponderance of evidence indicates that mask wearing reduces transmissibility per contact by reducing transmission of infected respiratory particles in both laboratory and clinical contexts. Public mask wearing is most effective at reducing spread of the virus when compliance is high. Given the current shortages of medical masks, we recommend the adoption of public cloth mask wearing, as an effective form of source control, in conjunction with existing hygiene, distancing, and contact tracing strategies. Because many respiratory particles become smaller due to evaporation, we recommend increasing focus on a previously overlooked aspect of mask usage: mask wearing by infectious people ("source control") with benefits at the population level, rather than only mask wearing by susceptible people, such as health care workers, with focus on individual outcomes. We recommend that public officials and governments strongly encourage the use of widespread face masks in public, including the use of appropriate regulation.

...read moreread less

679 citations

Journal Article•DOI•

Fastai: A Layered API for Deep Learning

[...]

Jeremy Howard, Sylvain Gugger

11 Feb 2020-Information-an International Interdisciplinary Journal

TL;DR: This paper has used this library to successfully create a complete deep learning course, which was able to write more quickly than using previous approaches, and the code was more clear.

...read moreread less

Abstract: fastai is a deep learning library which provides practitioners with high-level components that can quickly and easily provide state-of-the-art results in standard deep learning domains, and provides researchers with low-level components that can be mixed and matched to build new approaches. It aims to do both things without substantial compromises in ease of use, flexibility, or performance. This is possible thanks to a carefully layered architecture, which expresses common underlying patterns of many deep learning and data processing techniques in terms of decoupled abstractions. These abstractions can be expressed concisely and clearly by leveraging the dynamism of the underlying Python language and the flexibility of the PyTorch library. fastai includes: a new type dispatch system for Python along with a semantic type hierarchy for tensors; a GPU-optimized computer vision library which can be extended in pure Python; an optimizer which refactors out the common functionality of modern optimizers into two basic pieces, allowing optimization algorithms to be implemented in 4–5 lines of code; a novel 2-way callback system that can access any part of the data, model, or optimizer and change it at any point during training; a new data block API; and much more. We used this library to successfully create a complete deep learning course, which we were able to write more quickly than using previous approaches, and the code was more clear. The library is already in wide use in research, industry, and teaching.

...read moreread less

533 citations

Posted Content•

Fine-tuned Language Models for Text Classification.

[...]

Jeremy Howard, Sebastian Ruder

18 Jan 2018-arXiv: Computation and Language

TL;DR: Fine-tuned Language Models (FitLaM) is proposed, an effective transfer learning method that can be applied to any task in NLP, and techniques that are key for fine-tuning a state-of-the-art language model are introduced.

...read moreread less

Abstract: Transfer learning has revolutionized computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch. We propose Fine-tuned Language Models (FitLaM), an effective transfer learning method that can be applied to any task in NLP, and introduce techniques that are key for fine-tuning a state-of-the-art language model. Our method significantly outperforms the state-of-the-art on five text classification tasks, reducing the error by 18-24% on the majority of datasets. We open-source our pretrained models and code to enable adoption by the community.

...read moreread less

405 citations

Posted Content•DOI•

Face Masks Against COVID-19: An Evidence Review

[...]

Jeremy Howard, Austin Huang, Zhiyuan Li, Zeynep Tufekci, Vladimir Zdimal, van der Westhuizen H-M., A von Delft, Amy Price, Lex Fridman, Tang L-H., Viola Tang, Gregory L. Watson, Christina E. Bax, Reshama Shaikh, Frederik Questier, Danny Hernandez, Larry F. Chu, Christina M. Ramirez, Anne W. Rimoin - Show less +15 more

12 Apr 2020

TL;DR: The preponderance of evidence indicates that mask wearing reduces the transmissibility per contact by reducing transmission of infected droplets in both laboratory and clinical contexts, and recommends the adoption of public cloth mask wearing, as an effective form of source control.

...read moreread less

Abstract: The science around the use of masks by the general public to impede COVID-19 transmission is advancing rapidly. Policymakers need guidance on how masks should be used by the general population to combat the COVID-19 pandemic. In this narrative review, we develop an analytical framework to examine mask usage, considering and synthesizing the relevant literature to inform multiple areas: population impact; transmission characteristics; source control; PPE; sociological considerations; and implementation considerations. A primary route of transmission of COVID-19 is via respiratory droplets, and is known to be transmissible from presymptomatic and asymptomatic individuals. Reducing disease spread requires two things: first, limit contacts of infected individuals via physical distancing and other measures, and second, reduce the transmission probability per contact. The preponderance of evidence indicates that mask wearing reduces the transmissibility per contact by reducing transmission of infected droplets in both laboratory and clinical contexts. Public mask wearing is most effective at reducing spread of the virus when compliance is high. The decreased transmissibility could substantially reduce the death toll and economic impact while the cost of the intervention is low. Given the current shortages of medical masks we recommend the adoption of public cloth mask wearing, as an effective form of source control, in conjunction with existing hygiene, distancing, and contact tracing strategies. Because many respiratory droplets become smaller due to evaporation, we recommend increasing focus on a previously overlooked aspect of mask usage: mask-wearing by infectious people ("source control") with benefits at the population-level, rather than mask-wearing by susceptible people, such as health-care workers, with focus on individual outcomes. We recommend that public officials and governments strongly encourage the use of widespread face masks in public, including the use of appropriate regulation.

...read moreread less

251 citations

1
2
3
4
…
5
6
7
8
9
10
11
12

Collapse

Cited by

PDF

Open Access

More filters

Posted Content•

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

[...]

Jacob Devlin¹, Ming-Wei Chang¹, Kenton Lee¹, Kristina Toutanova¹•Institutions (1)

Google¹

11 Oct 2018-arXiv: Computation and Language

TL;DR: A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

...read moreread less

Abstract: We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE score to 80.5% (7.7% point absolute improvement), MultiNLI accuracy to 86.7% (4.6% absolute improvement), SQuAD v1.1 question answering Test F1 to 93.2 (1.5 point absolute improvement) and SQuAD v2.0 Test F1 to 83.1 (5.1 point absolute improvement).

...read moreread less

29,480 citations

Proceedings Article•DOI•

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

[...]

Jacob Devlin¹, Ming-Wei Chang¹, Kenton Lee¹, Kristina Toutanova¹•Institutions (1)

Google¹

11 Oct 2018

TL;DR: BERT as mentioned in this paper pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

...read moreread less

Abstract: We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models (Peters et al., 2018a; Radford et al., 2018), BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE score to 80.5 (7.7 point absolute improvement), MultiNLI accuracy to 86.7% (4.6% absolute improvement), SQuAD v1.1 question answering Test F1 to 93.2 (1.5 point absolute improvement) and SQuAD v2.0 Test F1 to 83.1 (5.1 point absolute improvement).

...read moreread less

24,672 citations

Posted Content•

RoBERTa: A Robustly Optimized BERT Pretraining Approach

[...]

Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Michael Lewis, Luke Zettlemoyer, Veselin Stoyanov - Show less +6 more

26 Jul 2019-arXiv: Computation and Language

TL;DR: It is found that BERT was significantly undertrained, and can match or exceed the performance of every model published after it, and the best model achieves state-of-the-art results on GLUE, RACE and SQuAD.

...read moreread less

Abstract: Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging. Training is computationally expensive, often done on private datasets of different sizes, and, as we will show, hyperparameter choices have significant impact on the final results. We present a replication study of BERT pretraining (Devlin et al., 2019) that carefully measures the impact of many key hyperparameters and training data size. We find that BERT was significantly undertrained, and can match or exceed the performance of every model published after it. Our best model achieves state-of-the-art results on GLUE, RACE and SQuAD. These results highlight the importance of previously overlooked design choices, and raise questions about the source of recently reported improvements. We release our models and code.

...read moreread less

13,994 citations

Journal Article•

Human biochemical genetics

[...]

Grüneberg H

01 Jul 1960-The Eugenics Review

TL;DR: For the next few weeks the course is going to be exploring a field that’s actually older than classical population genetics, although the approach it’ll be taking to it involves the use of population genetic machinery.

...read moreread less

Abstract: So far in this course we have dealt entirely with the evolution of characters that are controlled by simple Mendelian inheritance at a single locus. There are notes on the course website about gametic disequilibrium and how allele frequencies change at two loci simultaneously, but we didn’t discuss them. In every example we’ve considered we’ve imagined that we could understand something about evolution by examining the evolution of a single gene. That’s the domain of classical population genetics. For the next few weeks we’re going to be exploring a field that’s actually older than classical population genetics, although the approach we’ll be taking to it involves the use of population genetic machinery. If you know a little about the history of evolutionary biology, you may know that after the rediscovery of Mendel’s work in 1900 there was a heated debate between the “biometricians” (e.g., Galton and Pearson) and the “Mendelians” (e.g., de Vries, Correns, Bateson, and Morgan). Biometricians asserted that the really important variation in evolution didn’t follow Mendelian rules. Height, weight, skin color, and similar traits seemed to

...read moreread less

9,847 citations

Posted Content•

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

[...]

Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu - Show less +5 more

23 Oct 2019-arXiv: Learning

TL;DR: This systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks and achieves state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more.

...read moreread less

Abstract: Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new ``Colossal Clean Crawled Corpus'', we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.

...read moreread less

6,953 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse