Author

Miles Brundage

Other affiliations: OpenAI, University of Oxford
Bio: Miles Brundage is an academic researcher from Arizona State University. The author has contributed to research in the topics of Computer science and Deep learning, has an h-index of 12, and has co-authored 22 publications receiving 2,666 citations. Previous affiliations of Miles Brundage include OpenAI and the University of Oxford.

Papers
Journal ArticleDOI
TL;DR: Deep reinforcement learning (DRL) is poised to revolutionize the field of artificial intelligence (AI) and represents a step toward building autonomous systems with a higher-level understanding of the visual world as discussed by the authors.
Abstract: Deep reinforcement learning (DRL) is poised to revolutionize the field of artificial intelligence (AI) and represents a step toward building autonomous systems with a higher-level understanding of the visual world. Currently, deep learning is enabling reinforcement learning (RL) to scale to problems that were previously intractable, such as learning to play video games directly from pixels. DRL algorithms are also applied to robotics, allowing control policies for robots to be learned directly from camera inputs in the real world. In this survey, we begin with an introduction to the general field of RL, then progress to the main streams of value-based and policy-based methods. Our survey will cover central algorithms in deep RL, including the deep Q-network (DQN), trust region policy optimization (TRPO), and asynchronous advantage actor-critic. In parallel, we highlight the unique advantages of deep neural networks, focusing on visual understanding via RL. To conclude, we describe several current areas of research within the field.

1,743 citations
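For readers unfamiliar with the value-based methods this survey covers, the sketch below illustrates the core idea behind a DQN-style update: a Q-network is regressed toward a bootstrapped temporal-difference target computed with a periodically synced target network. The network sizes, optimizer settings, and dummy transition batch are illustrative assumptions and are not taken from the paper.

```python
# Minimal sketch of a DQN-style temporal-difference update (illustrative only;
# network sizes, environment, and hyperparameters are assumptions, not from the paper).
import torch
import torch.nn as nn

n_actions, obs_dim, gamma = 4, 8, 0.99

q_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net.load_state_dict(q_net.state_dict())  # periodically synced copy of the Q-network

optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def dqn_loss(obs, actions, rewards, next_obs, dones):
    """Squared TD error: (r + gamma * max_a' Q_target(s', a') - Q(s, a))^2."""
    q_sa = q_net(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = rewards + gamma * (1 - dones) * target_net(next_obs).max(dim=1).values
    return nn.functional.mse_loss(q_sa, target)

# Dummy minibatch of transitions, standing in for samples from a replay buffer.
batch = 32
obs = torch.randn(batch, obs_dim)
actions = torch.randint(0, n_actions, (batch,))
rewards = torch.randn(batch)
next_obs = torch.randn(batch, obs_dim)
dones = torch.zeros(batch)

loss = dqn_loss(obs, actions, rewards, next_obs, dones)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```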

Journal ArticleDOI
TL;DR: This survey will cover central algorithms in deep RL, including the deep Q-network (DQN), trust region policy optimization (TRPO), and asynchronous advantage actor critic, and highlight the unique advantages of deep neural networks, focusing on visual understanding via RL.
Abstract: Deep reinforcement learning is poised to revolutionise the field of AI and represents a step towards building autonomous systems with a higher level understanding of the visual world. Currently, deep learning is enabling reinforcement learning to scale to problems that were previously intractable, such as learning to play video games directly from pixels. Deep reinforcement learning algorithms are also applied to robotics, allowing control policies for robots to be learned directly from camera inputs in the real world. In this survey, we begin with an introduction to the general field of reinforcement learning, then progress to the main streams of value-based and policy-based methods. Our survey will cover central algorithms in deep reinforcement learning, including the deep Q-network, trust region policy optimisation, and asynchronous advantage actor-critic. In parallel, we highlight the unique advantages of deep neural networks, focusing on visual understanding via reinforcement learning. To conclude, we describe several current areas of research within the field.

1,707 citations

Posted ContentDOI
TL;DR: The following organisations are named in the report: Future of Humanity Institute, University of Oxford, Centre for the Study of Existential Risk, University of Cambridge, Center for a New American Security, Electronic Frontier Foundation, OpenAI.
Abstract: Artificial intelligence and machine learning capabilities are growing at an unprecedented rate. These technologies have many widely beneficial applications, ranging from machine translation to medical image analysis. Countless more such applications are being developed and can be expected over the long term. Less attention has historically been paid to the ways in which artificial intelligence can be used maliciously. This report surveys the landscape of potential security threats from malicious uses of artificial intelligence technologies, and proposes ways to better forecast, prevent, and mitigate these threats. We analyze, but do not conclusively resolve, the question of what the long-term equilibrium between attackers and defenders will be. We focus instead on what sorts of attacks we are likely to see soon if adequate defenses are not developed.

445 citations

Posted Content
TL;DR: This report discusses OpenAI's work related to the release of its GPT-2 language model and discusses staged release, which allows time between model releases to conduct risk and benefit analyses as model sizes increased.
Abstract: Large language models have a range of beneficial uses: they can assist in prose, poetry, and programming; analyze dataset biases; and more. However, their flexibility and generative capabilities also raise misuse concerns. This report discusses OpenAI's work related to the release of its GPT-2 language model. It discusses staged release, which allows time between model releases to conduct risk and benefit analyses as model sizes increased. It also discusses ongoing partnership-based research and provides recommendations for better coordination and responsible publication in AI.

221 citations

Posted Content
TL;DR: This report suggests various steps that different stakeholders can take to improve the verifiability of claims made about AI systems and their associated development processes, with a focus on providing evidence about the safety, security, fairness, and privacy protection of AI systems.
Abstract: With the recent wave of progress in artificial intelligence (AI) has come a growing awareness of the large-scale impacts of AI systems, and recognition that existing regulations and norms in industry and academia are insufficient to ensure responsible AI development. In order for AI developers to earn trust from system users, customers, civil society, governments, and other stakeholders that they are building AI responsibly, they will need to make verifiable claims to which they can be held accountable. Those outside of a given organization also need effective means of scrutinizing such claims. This report suggests various steps that different stakeholders can take to improve the verifiability of claims made about AI systems and their associated development processes, with a focus on providing evidence about the safety, security, fairness, and privacy protection of AI systems. We analyze ten mechanisms for this purpose--spanning institutions, software, and hardware--and make recommendations aimed at implementing, exploring, or improving those mechanisms.

191 citations


Cited by
Proceedings Article
28 May 2020
TL;DR: GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic.
Abstract: Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.

10,132 citations
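Because GPT-3 is applied without any gradient updates, the few-shot setting described above reduces to placing demonstrations directly in the prompt text. The snippet below is a minimal sketch of that prompt construction; the task, example pairs, and Q/A formatting are assumptions for illustration and do not reproduce the paper's evaluation setup.

```python
# Minimal sketch of few-shot prompting: demonstrations are placed in the prompt
# text itself and no gradient updates are performed. The task, examples, and
# formatting below are illustrative assumptions, not the paper's actual setup.

def build_few_shot_prompt(task_description, demonstrations, query):
    """Concatenate a task description, K solved examples, and the new query."""
    lines = [task_description, ""]
    for source, target in demonstrations:
        lines.append(f"Q: {source}")
        lines.append(f"A: {target}")
        lines.append("")
    lines.append(f"Q: {query}")
    lines.append("A:")  # the model is expected to continue from here
    return "\n".join(lines)

demos = [
    ("unscramble the letters: pplea", "apple"),
    ("unscramble the letters: rgaep", "grape"),
]
prompt = build_few_shot_prompt(
    "Unscramble the letters to form an English word.",
    demos,
    "unscramble the letters: nanaab",
)
print(prompt)  # this string would be sent to the language model as-is
```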

Journal ArticleDOI
Eric J. Topol
TL;DR: Over time, marked improvements in accuracy, productivity, and workflow will likely be actualized, but whether that will be used to improve the patient–doctor relationship or facilitate its erosion remains to be seen.
Abstract: The use of artificial intelligence, and the deep-learning subtype in particular, has been enabled by the use of labeled big data, along with markedly enhanced computing power and cloud storage, across all sectors. In medicine, this is beginning to have an impact at three levels: for clinicians, predominantly via rapid, accurate image interpretation; for health systems, by improving workflow and the potential for reducing medical errors; and for patients, by enabling them to process their own data to promote health. The current limitations, including bias, privacy and security, and lack of transparency, along with the future directions of these applications will be discussed in this article. Over time, marked improvements in accuracy, productivity, and workflow will likely be actualized, but whether that will be used to improve the patient-doctor relationship or facilitate its erosion remains to be seen.

2,574 citations

Posted Content
TL;DR: This article showed that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches.
Abstract: Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.

1,886 citations

Proceedings ArticleDOI
04 Mar 2022
TL;DR: The results show that fine-tuning with human feedback is a promising direction for aligning language models with human intent, with improvements in truthfulness and reductions in toxic output generation while having minimal performance regressions on public NLP datasets.
Abstract: Making language models bigger does not inherently make them better at following a user's intent. For example, large language models can generate outputs that are untruthful, toxic, or simply not helpful to the user. In other words, these models are not aligned with their users. In this paper, we show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning with human feedback. Starting with a set of labeler-written prompts and prompts submitted through the OpenAI API, we collect a dataset of labeler demonstrations of the desired model behavior, which we use to fine-tune GPT-3 using supervised learning. We then collect a dataset of rankings of model outputs, which we use to further fine-tune this supervised model using reinforcement learning from human feedback. We call the resulting models InstructGPT. In human evaluations on our prompt distribution, outputs from the 1.3B parameter InstructGPT model are preferred to outputs from the 175B GPT-3, despite having 100x fewer parameters. Moreover, InstructGPT models show improvements in truthfulness and reductions in toxic output generation while having minimal performance regressions on public NLP datasets. Even though InstructGPT still makes simple mistakes, our results show that fine-tuning with human feedback is a promising direction for aligning language models with human intent.

1,704 citations
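The middle stage of the pipeline described above trains a reward model from human rankings of model outputs. A common way to do this is a pairwise preference loss that pushes the reward of the preferred completion above that of the rejected one; the sketch below illustrates that loss with a toy linear reward head and assumed feature shapes, and is not InstructGPT's actual implementation.

```python
# Sketch of the pairwise ranking loss commonly used to train a reward model from
# human preference data (the second stage of the pipeline described above).
# Shapes and the toy reward model are assumptions for illustration only.
import torch
import torch.nn as nn

embed_dim = 16
reward_model = nn.Linear(embed_dim, 1)  # stand-in for a scalar reward head on a language model

def preference_loss(chosen_features, rejected_features):
    """-log sigmoid(r(chosen) - r(rejected)), averaged over the batch."""
    r_chosen = reward_model(chosen_features).squeeze(-1)
    r_rejected = reward_model(rejected_features).squeeze(-1)
    return -nn.functional.logsigmoid(r_chosen - r_rejected).mean()

# Dummy features standing in for pooled representations of two completions
# of the same prompt, where human labelers preferred the first one.
batch = 8
chosen = torch.randn(batch, embed_dim)
rejected = torch.randn(batch, embed_dim)

optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-4)
loss = preference_loss(chosen, rejected)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```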

Journal Article
TL;DR: Haidt, as mentioned in this paper, argues that the visceral reaction to competing ideologies is a subconscious, rather than learned, reaction that evolved over the course of human evolution in response to innate senses of suffering, fairness, cheating and disease, and that moral foundations facilitated intra-group cooperation which in turn conferred survival advantages over other groups.
Abstract: The Righteous Mind: Why Good People Are Divided by Politics and Religion. Jonathan Haidt. Pantheon Books, 2012. One has likely heard that, for the sake of decorum, religion and politics should never be topics of conversation with strangers. Even amongst friends, or even when it is known that others hold opposing political or religious views, why is it that discussion of religion and politics leads to visceral-level acrimony and to the conviction that one's views are right and the other's views are wrong? Professor Jonathan Haidt of the University of Virginia examines the psychological basis of our "righteous minds" without resorting to any of the pejorative labeling that is usually found in a book on politics and religion, and eschews a purely comparative approach. Haidt proposes the intriguing hypothesis that our visceral reaction to competing ideologies is a subconscious, rather than learned, reaction that evolved over the course of human evolution in response to innate senses of suffering, fairness, cheating and disease, and that moral foundations facilitated intra-group cooperation which in turn conferred survival advantages over other groups. These psychological mechanisms are genetic in origin and not necessarily amenable to rational and voluntary control - this is in part the reason debating one's ideological opposite more often leads to frustration rather than understanding. Haidt also suggests that morality is based on six "psychological systems" or foundations (Moral Foundations Theory), similar to the hypothesized adaptive mental modules which evolved to solve specific problems of survival in the human ancestral environment.
While decorum pleads for more civility, it would be better, as Haidt suggests, to drag the issue of partisan politics out into the open in order to understand it and work around our righteous minds. Haidt suggests a few methods by which the level of rhetoric in American politics can be reduced, such that the political parties can at least be cordial, as they have been in the past, and work together to solve truly pressing social problems.
There are a number of fascinating points raised in the current book, but most intriguing is the one that morality, ideology and religion are products of group selection, as adaptations that increased individual cooperation and suppressed selfishness, thereby increasing individual loyalty to the group. That morality, political ideology and religion buttress group survival is probably highly intuitive. However, given the contemporary focus on the individual as the source of adaptations, to the exclusion of all else, to suggest that adaptations such as religion and political ideology arose to enhance the survival of groups is heresy or, as Haidt recounts, "foolishness". While previous rejection of group selection itself was due in part to conceptual issues, one could also point to the prevailing individualist social sentiment, "selfish gene" mentality and unrelenting hostility against those who supported the view that group selection did indeed apply to humans and not just to insects. Haidt gives a lengthy and convincing defense of group selection, his main point being that humans can pursue self-interest at the same time as they promote self-interest within a group setting - humans are "90 percent chimp, 10 percent bees". One can readily observe in the news and entertainment media that religion is a frequent target of derision, even within the scientific community - Haidt points to the strident contempt that the "New Atheists" hold for religion.
They claim that religion is purely a by-product of an adaptive psychological trait, and as a mere by-product religion serves no useful purpose. However, the religious "sense" has somehow managed to persist in the human psyche. One explanation by the New Atheists of how religion propagated itself is that it is a "parasite" or "virus" which latches onto a susceptible host and induces the host to "infect" others. As a "virus" or "parasite" that is merely interested in its own survival, religion causes people to perform behaviors that do not increase their own reproductive fitness and may even be detrimental to survival, but religion spreads nonetheless. …

1,388 citations