Author

Joon Sung Park

Bio: Joon Sung Park is an academic researcher from Stanford University. The author has contributed to research in topics: Computer science & Social media. The author has an h-index of 5 and has co-authored 10 publications receiving 104 citations. Previous affiliations of Joon Sung Park include University of Illinois at Urbana–Champaign.

Papers
Posted Content
Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ B. Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri S. Chatterji, Annie Chen, Kathleen Creel, Jared Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah D. Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Ahmad Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf H. Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Yang Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang
TL;DR: The authors provide a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications.
Abstract: AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.
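
The core pattern the report analyzes, a single model trained once on broad data and then adapted to many downstream tasks, can be illustrated with a minimal sketch. The frozen placeholder encoder, task heads, and sizes below are hypothetical and not taken from the report:

```python
# Minimal sketch of the "one foundation model, many adapted heads" pattern.
# The encoder stands in for a large pretrained model; in practice you would
# load real pretrained weights. All names, sizes, and tasks are assumptions.
import torch
import torch.nn as nn

class FoundationEncoder(nn.Module):
    """Placeholder for a large pretrained encoder, kept frozen downstream."""
    def __init__(self, vocab_size=30000, hidden_dim=768):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_dim)
        self.backbone = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=8, batch_first=True),
            num_layers=2,
        )

    def forward(self, token_ids):
        # Mean-pool token representations into one vector per example.
        return self.backbone(self.embed(token_ids)).mean(dim=1)

encoder = FoundationEncoder()
for p in encoder.parameters():      # the shared model is frozen;
    p.requires_grad = False         # only small heads are adapted per task

# Many downstream tasks reuse the same frozen encoder (homogenization):
heads = {
    "sentiment": nn.Linear(768, 2),   # e.g., positive / negative
    "topic":     nn.Linear(768, 10),  # e.g., 10 topic classes
}

def predict(task, token_ids):
    with torch.no_grad():
        features = encoder(token_ids)
    # Any defect in `encoder` is inherited by every adapted head downstream.
    return heads[task](features)

logits = predict("sentiment", torch.randint(0, 30000, (4, 16)))
print(logits.shape)  # torch.Size([4, 2])
```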

76 citations

Journal ArticleDOI
TL;DR: In this paper, the authors describe an architecture that extends a large language model to store a complete record of the agent's experiences using natural language, synthesize those memories over time into higher-level reflections, and retrieve them dynamically to plan behavior.
Abstract: Believable proxies of human behavior can empower interactive applications ranging from immersive environments to rehearsal spaces for interpersonal communication to prototyping tools. In this paper, we introduce generative agents--computational software agents that simulate believable human behavior. Generative agents wake up, cook breakfast, and head to work; artists paint, while authors write; they form opinions, notice each other, and initiate conversations; they remember and reflect on days past as they plan the next day. To enable generative agents, we describe an architecture that extends a large language model to store a complete record of the agent's experiences using natural language, synthesize those memories over time into higher-level reflections, and retrieve them dynamically to plan behavior. We instantiate generative agents to populate an interactive sandbox environment inspired by The Sims, where end users can interact with a small town of twenty five agents using natural language. In an evaluation, these generative agents produce believable individual and emergent social behaviors: for example, starting with only a single user-specified notion that one agent wants to throw a Valentine's Day party, the agents autonomously spread invitations to the party over the next two days, make new acquaintances, ask each other out on dates to the party, and coordinate to show up for the party together at the right time. We demonstrate through ablation that the components of our agent architecture--observation, planning, and reflection--each contribute critically to the believability of agent behavior. By fusing large language models with computational, interactive agents, this work introduces architectural and interaction patterns for enabling believable simulations of human behavior.
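
The retrieval component of the architecture described above can be sketched as a scoring function over a natural-language memory stream, ranking each stored observation by recency, importance, and relevance to the current situation. The decay rate, equal weighting, and toy embeddings below are illustrative assumptions, not the paper's exact parameters:

```python
# Sketch of memory retrieval for a generative agent: score each stored
# observation by recency, importance, and relevance, then return the top-k.
import math
from dataclasses import dataclass, field

@dataclass
class Memory:
    text: str
    created_at: float   # timestamp in hours
    importance: float   # e.g., 1-10, rated once when the memory is stored
    embedding: list = field(default_factory=list)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(memories, query_embedding, now, k=3, decay=0.995):
    """Rank memories by a sum of recency, importance, and relevance (equal weights assumed)."""
    def score(m):
        recency = decay ** (now - m.created_at)          # exponential time decay
        relevance = cosine(m.embedding, query_embedding)
        return recency + m.importance / 10 + relevance
    return sorted(memories, key=score, reverse=True)[:k]

# Hypothetical usage: toy 3-dimensional "embeddings" stand in for a real model.
stream = [
    Memory("Isabella is planning a Valentine's Day party", 1.0, 8, [0.9, 0.1, 0.0]),
    Memory("Ate breakfast at the cafe", 5.0, 2, [0.1, 0.8, 0.1]),
    Memory("Maria mentioned she likes Klaus", 4.0, 6, [0.7, 0.2, 0.1]),
]
top = retrieve(stream, query_embedding=[1.0, 0.0, 0.0], now=6.0)
print([m.text for m in top])
```

In the full architecture, the retrieved memories would then be passed back to the language model to produce reflections and plans.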

69 citations

Proceedings ArticleDOI
07 Feb 2022
TL;DR: The authors contribute a deep learning architecture that models every annotator in a dataset, samples from the annotators’ models to populate the jury, and then runs inference to classify; this architecture enables juries that dynamically adapt their composition, explore counterfactuals, and visualize dissent.
Abstract: Whose labels should a machine learning (ML) algorithm learn to emulate? For ML tasks ranging from online comment toxicity to misinformation detection to medical diagnosis, different groups in society may have irreconcilable disagreements about ground truth labels. Supervised ML today resolves these label disagreements implicitly using majority vote, which overrides minority groups’ labels. We introduce jury learning, a supervised ML approach that resolves these disagreements explicitly through the metaphor of a jury: defining which people or groups, in what proportion, determine the classifier’s prediction. For example, a jury learning model for online toxicity might centrally feature women and Black jurors, who are commonly targets of online harassment. To enable jury learning, we contribute a deep learning architecture that models every annotator in a dataset, samples from annotators’ models to populate the jury, then runs inference to classify. Our architecture enables juries that dynamically adapt their composition, explore counterfactuals, and visualize dissent. A field evaluation finds that practitioners construct diverse juries that alter 14% of classification outcomes.
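
The jury metaphor above can be pictured with a minimal inference sketch: a per-annotator model is queried for a sampled jury whose composition the practitioner specifies, and the jurors' predicted labels are aggregated. The annotator stub, group labels, and majority-vote aggregation below are simplifying assumptions rather than the paper's actual deep learning architecture:

```python
# Sketch of jury learning inference: sample jurors according to a specified
# composition, predict each juror's label with a per-annotator model, aggregate.
import random
from collections import Counter

# Hypothetical annotator pool: id -> demographic group (for illustration only).
ANNOTATORS = {
    "a1": "women", "a2": "women", "a3": "Black men",
    "a4": "Black men", "a5": "other", "a6": "other",
}

def predict_annotator_label(annotator_id, text):
    """Stub for the learned per-annotator classifier (0 = non-toxic, 1 = toxic)."""
    # A real model would condition on both the annotator and the content.
    return int("idiot" in text.lower() and annotator_id != "a5")

def sample_jury(composition, seed=0):
    """composition: group -> number of seats, e.g. {"women": 2, "Black men": 1}."""
    rng = random.Random(seed)
    jury = []
    for group, seats in composition.items():
        pool = [a for a, g in ANNOTATORS.items() if g == group]
        jury.extend(rng.choices(pool, k=seats))   # sample with replacement
    return jury

def classify(text, composition, seed=0):
    jury = sample_jury(composition, seed)
    votes = Counter(predict_annotator_label(a, text) for a in jury)
    return votes.most_common(1)[0][0], dict(votes)  # majority label + vote breakdown

label, dissent = classify("You are an idiot", {"women": 2, "Black men": 2, "other": 1})
print(label, dissent)
```

Because the jury's composition is explicit, re-running classify with a different composition explores counterfactuals, and the vote breakdown surfaces dissent.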

61 citations

Proceedings ArticleDOI
06 May 2021
TL;DR: In this paper, the authors investigate the prevalence and impact of pilot-passenger user relationships through a Mechanical Turk survey and a series of one-hour interviews, and conclude with design recommendations to improve the passenger and pilot user experience.
Abstract: With a plethora of off-the-shelf smart home devices available commercially, people are increasingly taking a do-it-yourself approach to configuring their smart homes. While this allows for customization, users responsible for smart home configuration often end up with more control over the devices than other household members. This separates those who introduce new functionality to the smart home (pilot users) from those who do not (passenger users). To investigate the prevalence and impact of pilot-passenger user relationships, we conducted a Mechanical Turk survey and a series of one-hour interviews. Our results suggest that pilot-passenger relationships are common in multi-user households and shape how people form habits around devices. We find from interview data that smart homes reflect the values of their pilot users, making it harder for passenger users to incorporate their devices into daily life. We conclude the paper with design recommendations to improve passenger and pilot user experience.

31 citations

Proceedings ArticleDOI
21 Apr 2020
TL;DR: Through interviews with finsta users and content analysis of video bloggers exposing their finstas on YouTube, a qualitative study found that one way young people deal with mounting social pressures is by reconfiguring online platforms, changing their purposes, norms, expectations, and currencies.
Abstract: Among many young people, the creation of a finsta-a portmanteau of "fake" and "Instagram" which describes secondary Instagram accounts-provides an outlet to share emotional, low-quality, or indecorous content with their close friends. To study why people create and maintain finstas, we conducted a qualitative study through interviews with finsta users and content analysis of video bloggers exposing their finsta on YouTube. We found that one way that young people deal with mounting social pressures is by reconfiguring online platforms and changing their purposes, norms, expectations, and currencies. Carving out smaller spaces accessible only to close friends allows users the opportunity for a more unguarded, vulnerable, and unserious performance. Drawing on feminist theory, we term this process intimate reconfiguration. Through this reconfiguration finsta users repurpose an existing and widely-used social platform to create opportunities for more meaningful and reciprocal forms of social support.

28 citations


Cited by
Posted Content
TL;DR: This work conducts mixed-method user studies on three datasets, where an AI with accuracy comparable to humans helps participants solve a task (explaining itself in some conditions), and observes complementary improvements from AI augmentation that were not increased by explanations.
Abstract: Increasingly, organizations are pairing humans with AI systems to improve decision-making and reduce costs. Proponents of human-centered AI argue that team performance can even further improve when the AI model explains its recommendations. However, a careful analysis of existing literature reveals that prior studies observed improvements due to explanations only when the AI, alone, outperformed both the human and the best human-AI team. This raises an important question: can explanations lead to complementary performance, i.e., with accuracy higher than both the human and the AI working alone? We address this question by devising comprehensive studies on human-AI teaming, where participants solve a task with help from an AI system without explanations and from one with varying types of AI explanation support. We carefully controlled to ensure comparable human and AI accuracy across experiments on three NLP datasets (two for sentiment analysis and one for question answering). While we found complementary improvements from AI augmentation, they were not increased by state-of-the-art explanations compared to simpler strategies, such as displaying the AI's confidence. We show that explanations increase the chance that humans will accept the AI's recommendation regardless of whether the AI is correct. While this clarifies the gains in team performance from explanations in prior work, it poses new challenges for human-centered AI: how can we best design systems to produce complementary performance? Can we develop explanatory approaches that help humans decide whether and when to trust AI input?
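
The notion of complementary performance invoked above reduces to a simple check: the human-AI team's accuracy must exceed both the human's and the AI's individual accuracy. A tiny sketch with made-up numbers:

```python
# Sketch of the complementary-performance check: the team must beat both the
# human alone and the AI alone. Accuracy values are made up for illustration.
def is_complementary(human_acc, ai_acc, team_acc):
    return team_acc > max(human_acc, ai_acc)

print(is_complementary(human_acc=0.82, ai_acc=0.84, team_acc=0.87))  # True
print(is_complementary(human_acc=0.82, ai_acc=0.84, team_acc=0.83))  # False:
# the team trails the AI alone, so the gain is not complementary.
```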

207 citations

Proceedings ArticleDOI
06 May 2021
TL;DR: In this paper, the authors conduct mixed-method user studies on three datasets, where an AI with accuracy comparable to humans helps participants solve a task (explaining itself in some conditions).
Abstract: Many researchers motivate explainable AI with studies showing that human-AI team performance on decision-making tasks improves when the AI explains its recommendations. However, prior studies observed improvements from explanations only when the AI, alone, outperformed both the human and the best team. Can explanations help lead to complementary performance, where team accuracy is higher than either the human or the AI working solo? We conduct mixed-method user studies on three datasets, where an AI with accuracy comparable to humans helps participants solve a task (explaining itself in some conditions). While we observed complementary improvements from AI augmentation, they were not increased by explanations. Rather, explanations increased the chance that humans will accept the AI’s recommendation, regardless of its correctness. Our result poses new challenges for human-centered AI: Can we develop explanatory approaches that encourage appropriate trust in AI, and therefore help generate (or improve) complementary performance?

118 citations

Journal ArticleDOI
28 May 2020
TL;DR: The audit reveals a filter bubble effect in both the Top 5 and Up-Next recommendations for all topics except vaccine controversies; for these topics, watching videos that promote misinformation leads to more misinformative video recommendations, and the authors conclude that YouTube still has a long way to go to mitigate misinformation on its platform.
Abstract: Search engines are the primary gateways of information. Yet, they do not take into account the credibility of search results. There is a growing concern that YouTube, the second largest search engine and the most popular video-sharing platform, has been promoting and recommending misinformative content for certain search topics. In this study, we audit YouTube to verify those claims. Our audit experiments investigate whether personalization (based on age, gender, geolocation, or watch history) contributes to amplifying misinformation. After shortlisting five popular topics known to contain misinformative content and compiling associated search queries representing them, we conduct two sets of audits: Search-misinformative and Watch-misinformative audits. Our audits resulted in a dataset of more than 56K videos compiled to link stance (whether promoting misinformation or not) with the personalization attribute audited. Our videos correspond to three major YouTube components: search results, Up-Next, and Top 5 recommendations. We find that demographics such as gender, age, and geolocation do not have a significant effect on amplifying misinformation in returned search results for users with brand new accounts. On the other hand, once a user develops a watch history, these attributes do affect the extent of misinformation recommended to them. Further analyses reveal a filter bubble effect, both in the Top 5 and Up-Next recommendations for all topics, except vaccine controversies; for these topics, watching videos that promote misinformation leads to more misinformative video recommendations. In conclusion, YouTube still has a long way to go to mitigate misinformation on its platform.
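
One way to picture the audit's core measurement is as an aggregation of per-video stance labels into a misinformation score per audit condition, which can then be compared across demographics or watch histories. The three-level label scheme and scores below are illustrative assumptions, not the paper's annotation protocol:

```python
# Sketch of aggregating stance labels for audit results: each returned or
# recommended video is labeled -1 (debunking), 0 (neutral), or 1 (promoting
# misinformation); the mean stance per condition approximates how much
# misinformation that condition surfaces. All labels below are made up.
from statistics import mean

def misinfo_score(stances):
    """Mean stance of a list of videos; higher means more misinformative."""
    return mean(stances) if stances else 0.0

# Hypothetical audit conditions: a fresh account vs. an account with a
# misinformation-heavy watch history, for one search topic.
conditions = {
    "fresh_account_search":    [0, -1, 0, 0, 1, -1],
    "misinfo_history_up_next": [1, 1, 0, 1, -1, 1],
}
for name, stances in conditions.items():
    print(f"{name}: {misinfo_score(stances):+.2f}")
# A higher score after building a watch history would be consistent with the
# filter bubble effect the audit reports.
```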

109 citations

Journal ArticleDOI
22 Apr 2021
TL;DR: In this article, cognitive forcing interventions are designed to make people engage more thoughtfully with the AI-generated explanations, and the results demonstrate that cognitive forcing significantly reduced overreliance compared to the simple explainable AI approaches.
Abstract: People supported by AI-powered decision support tools frequently overrely on the AI: they accept an AI's suggestion even when that suggestion is wrong. Adding explanations to the AI decisions does not appear to reduce the overreliance and some studies suggest that it might even increase it. Informed by the dual-process theory of cognition, we posit that people rarely engage analytically with each individual AI recommendation and explanation, and instead develop general heuristics about whether and when to follow the AI suggestions. Building on prior research on medical decision-making, we designed three cognitive forcing interventions to compel people to engage more thoughtfully with the AI-generated explanations. We conducted an experiment (N=199), in which we compared our three cognitive forcing designs to two simple explainable AI approaches and to a no-AI baseline. The results demonstrate that cognitive forcing significantly reduced overreliance compared to the simple explainable AI approaches. However, there was a trade-off: people assigned the least favorable subjective ratings to the designs that reduced the overreliance the most. To audit our work for intervention-generated inequalities, we investigated whether our interventions benefited equally people with different levels of Need for Cognition (i.e., motivation to engage in effortful mental activities). Our results show that, on average, cognitive forcing interventions benefited participants higher in Need for Cognition more. Our research suggests that human cognitive motivation moderates the effectiveness of explainable AI solutions.
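
Cognitive forcing is an interaction-design pattern rather than a model change; one common form is to require an independent decision before the AI's suggestion and explanation are revealed. The flow below is a generic, hypothetical illustration of that pattern, not a reconstruction of the paper's three specific designs:

```python
# Sketch of a cognitive forcing flow: the user must commit to their own answer
# before the AI suggestion and explanation are shown, and may then revise it.
# The case data and the decide/predict/revise callbacks are placeholders.
def cognitive_forcing_trial(case, user_decide, ai_predict, user_revise):
    initial = user_decide(case)                  # forced independent judgment
    suggestion, explanation = ai_predict(case)   # only revealed afterwards
    final = user_revise(case, initial, suggestion, explanation)
    return {"initial": initial, "suggestion": suggestion, "final": final}

# Hypothetical usage with stub callbacks:
result = cognitive_forcing_trial(
    case={"text": "borderline comment"},
    user_decide=lambda c: "non-toxic",
    ai_predict=lambda c: ("toxic", "contains insult-like phrasing"),
    user_revise=lambda c, first, sugg, expl: sugg if expl else first,
)
print(result)
```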

82 citations