Showing papers by "Michael S. Bernstein" published in 2012
TL;DR: In this paper, the authors outline a framework that will enable crowd work that is complex, collaborative, and sustainable, and lay out research challenges in twelve major areas: workflow, task assignment, hierarchy, real-time response, synchronous collaboration, quality control, crowds guiding AIs, AIs guiding crowds, platforms, job design, reputation, and motivation.
Abstract: Paid crowd work offers remarkable opportunities for improving productivity, social mobility, and the global economy by engaging a geographically distributed workforce to complete complex tasks on demand and at scale. But it is also possible that crowd work will fail to achieve its potential, focusing on assembly-line piecework. Can we foresee a future crowd workplace in which we would want our children to participate? This paper frames the major challenges that stand in the way of this goal. Drawing on theory from organizational behavior and distributed computing, as well as direct feedback from workers, we outline a framework that will enable crowd work that is complex, collaborative, and sustainable. The framework lays out research challenges in twelve major areas: workflow, task assignment, hierarchy, real-time response, synchronous collaboration, quality control, crowds guiding AIs, AIs guiding crowds, platforms, job design, reputation, and motivation.
803 citations
11 Feb 2012
TL;DR: A website that collected the first large corpus of follower ratings on Twitter updates finds that users value information sharing and random thoughts above me-oriented or presence updates, and offers insight into evolving social norms.
Abstract: While microblog readers have a wide variety of reactions to the content they see, studies have tended to focus on extremes such as retweeting and unfollowing. To understand the broad continuum of reactions in-between, which are typically not shared publicly, we designed a website that collected the first large corpus of follower ratings on Twitter updates. Using our dataset of over 43,000 voluntary ratings, we find that nearly 36% of the rated tweets are worth reading, 25% are not, and 39% are middling. These results suggest that users tolerate a large amount of less-desired content in their feeds. We find that users value information sharing and random thoughts above me-oriented or presence updates. We also offer insight into evolving social norms, such as lack of context and misuse of @mentions and hashtags. We discuss implications for emerging practice and tool design.
120 citations
05 May 2012
TL;DR: Tail Answers is introduced: a large collection of direct answers that are unpopular individually, but together address a large proportion of search traffic and suggest that search engines can be extended to directly respond to a large new class of queries.
Abstract: Web search engines now offer more than ranked results. Queries on topics like weather, definitions, and movies may return inline results called answers that can resolve a searcher's information need without any additional interaction. Despite the usefulness of answers, they are limited to popular needs because each answer type is manually authored. To extend the reach of answers to thousands of new information needs, we introduce Tail Answers: a large collection of direct answers that are unpopular individually, but together address a large proportion of search traffic. These answers cover long-tail needs such as the average body temperature for a dog, substitutes for molasses, and the keyboard shortcut for a right-click. We introduce a combination of search log mining and paid crowdsourcing techniques to create Tail Answers. A user study with 361 participants suggests that Tail Answers significantly improved users' subjective ratings of search quality and their ability to solve needs without clicking through to a result. Our findings suggest that search engines can be extended to directly respond to a large new class of queries.
110 citations
TL;DR: In this paper, the authors use queueing theory to analyze the retainer model for real-time crowdsourcing, in particular its expected wait time and cost to requesters, and propose and analyze three techniques to improve performance: push notifications, shared retainer pools, and precruitment.
Abstract: Realtime crowdsourcing research has demonstrated that it is possible to recruit paid crowds within seconds by managing a small, fast-reacting worker pool. Realtime crowds enable crowd-powered systems that respond at interactive speeds: for example, cameras, robots and instant opinion polls. So far, these techniques have mainly been proof-of-concept prototypes: research has not yet attempted to understand how they might work at large scale or optimize their cost/performance trade-offs. In this paper, we use queueing theory to analyze the retainer model for realtime crowdsourcing, in particular its expected wait time and cost to requesters. We provide an algorithm that allows requesters to minimize their cost subject to performance requirements. We then propose and analyze three techniques to improve performance: push notifications, shared retainer pools, and precruitment, which involves recalling retainer workers before a task actually arrives. An experimental validation finds that precruited workers begin a task 500 milliseconds after it is posted, delivering results below the one-second cognitive threshold for an end-user to stay in flow.
91 citations
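The abstract does not reproduce the paper's exact queueing model, but a retainer pool can be read as an M/M/c queue, where the Erlang-C formula gives the probability that an arriving task finds all retained workers busy. A minimal sketch under that assumption (the pool size, arrival rate, and service rate below are illustrative parameters, not values from the paper):

```python
from math import factorial

def erlang_c(c, rho):
    """Probability an arriving task must wait (all c workers busy)
    in an M/M/c queue with offered load rho = lam / mu erlangs."""
    busy = (rho ** c / factorial(c)) * (c / (c - rho))
    return busy / (sum(rho ** k / factorial(k) for k in range(c)) + busy)

def expected_wait(c, lam, mu):
    """Expected time a task waits for a free retainer worker,
    given c retained workers, arrival rate lam, service rate mu."""
    assert lam < c * mu, "pool must be large enough to keep up"
    return erlang_c(c, lam / mu) / (c * mu - lam)

# Illustrative numbers: 5 retained workers, 2 tasks/min arriving,
# each worker completing 1 task/min on average.
w = expected_wait(5, 2.0, 1.0)
```

A requester can sweep `c` upward until `expected_wait` drops below a latency target, which is the flavor of cost-versus-performance trade-off the paper analyzes; larger pools wait less but cost more to retain.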
11 Jan 2012
TL;DR: The first, TweeQL, provides a streaming SQL-like interface to the Twitter API, making common tweet processing tasks simpler, and the second, TwitInfo, shows how end-users can interact with and understand aggregated data from the tweet stream, in addition to showcasing the power of the TweeQL language.
Abstract: Microblogs such as Twitter provide a valuable stream of diverse user-generated data. While the data extracted from Twitter is generally timely and accurate, the process by which developers extract structured data from the tweet stream is ad-hoc and requires reimplementation of common data manipulation primitives. In this paper, we present two systems for querying and extracting structure from Twitter-embedded data. The first, TweeQL, provides a streaming SQL-like interface to the Twitter API, making common tweet processing tasks simpler. The second, TwitInfo, shows how end-users can interact with and understand aggregated data from the tweet stream, in addition to showcasing the power of the TweeQL language. Together these systems show the richness of content that can be extracted from Twitter.
38 citations
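TweeQL's actual syntax is not given in the abstract; the following Python sketch only illustrates the kind of streaming SELECT/WHERE/projection operation such a language wraps (the tweet fields and the query are hypothetical, not TweeQL's API):

```python
def select(stream, predicate, projection):
    """Lazily filter and project a stream of tweets, roughly the
    streaming analogue of SELECT <projection> WHERE <predicate>."""
    for tweet in stream:
        if predicate(tweet):
            yield projection(tweet)

# Hypothetical tweet records standing in for a live stream.
tweets = [
    {"user": "a", "text": "earthquake in SF", "retweets": 12},
    {"user": "b", "text": "lunch time", "retweets": 0},
    {"user": "c", "text": "earthquake felt downtown", "retweets": 7},
]

# Roughly: SELECT user FROM tweets WHERE text CONTAINS 'earthquake'
users = list(select(tweets,
                    lambda t: "earthquake" in t["text"],
                    lambda t: t["user"]))
# users == ["a", "c"]
```

The point of a declarative layer like TweeQL is that this filter/project/aggregate boilerplate, which every developer otherwise reimplements ad hoc, becomes a one-line query over the stream.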
01 Apr 2012
TL;DR: In this paper, the authors use queueing theory to analyze the retainer model for real-time crowdsourcing, in particular its expected wait time and cost to requesters, and propose and analyze three techniques to improve performance: push notifications, shared retainer pools, and precruitment, which involves recalling retainer workers before a task actually arrives.
Abstract: Realtime crowdsourcing research has demonstrated that it is possible to recruit paid crowds within seconds by managing a small, fast-reacting worker pool. Realtime crowds enable crowd-powered systems that respond at interactive speeds: for example, cameras, robots and instant opinion polls. So far, these techniques have mainly been proof-of-concept prototypes: research has not yet attempted to understand how they might work at large scale or optimize their cost/performance trade-offs. In this paper, we use queueing theory to analyze the retainer model for realtime crowdsourcing, in particular its expected wait time and cost to requesters. We provide an algorithm that allows requesters to minimize their cost subject to performance requirements. We then propose and analyze three techniques to improve performance: push notifications, shared retainer pools, and precruitment, which involves recalling retainer workers before a task actually arrives. An experimental validation finds that precruited workers begin a task 500 milliseconds after it is posted, delivering results below the one-second cognitive threshold for an end-user to stay in flow.
38 citations
05 May 2012
TL;DR: A new venue is planned for CHI2013, where replicated studies can be submitted, presented, and discussed, and those who have begun using replication as a teaching method since RepliCHI at CHI2011 are invited to participate.
Abstract: At CHI2011 we ran a panel on how the CHI community handles the replicability of research and the reproducibility of findings. Careful scientific scholarship should build on firm foundations, which includes re-examining old evidences in the face of new findings. Yet, as a community that strives for novelty, we have very little motivation to look back and reconsider the validity of previous work. Thus, for CHI2013 we are planning a new venue, where replicated studies can be submitted, presented, and discussed. For CHI2012, we propose a SIG to discuss the preparations for how RepliCHI will work in its first year. We invite participation from those interested in setting an agenda for facilitating replication in HCI, including those who have begun using replication as a teaching method since RepliCHI at CHI2011.
26 citations
TL;DR: Introduces computational techniques that decompose complex tasks into simpler, verifiable steps to improve quality, and that optimize work to return results in seconds.
Abstract: Crowd-powered systems combine computation with human intelligence, drawn from large groups of people connecting and coordinating online. These hybrid systems enable applications and experiences that neither crowds nor computation could support alone.
Unfortunately, crowd work is error-prone and slow, making it difficult to incorporate crowds as first-order building blocks in software systems. I introduce computational techniques that decompose complex tasks into simpler, verifiable steps to improve quality, and optimize work to return results in seconds. These techniques develop crowdsourcing as a platform so that it is reliable and responsive enough to be used in interactive systems.
This thesis develops these ideas through a series of crowd-powered systems. The first, Soylent, is a word processor that uses paid micro-contributions to aid writing tasks such as text shortening and proofreading. Using Soylent is like having access to an entire editorial staff as you write. The second system, Adrenaline, is a camera that uses crowds to help amateur photographers capture the exact right moment for a photo. It finds the best smile and catches subjects in mid-air jumps, all in realtime. Moving beyond generic knowledge and paid crowds, I introduce techniques to motivate a social network that has specific expertise, and techniques to data mine crowd activity traces in support of a large number of uncommon user goals.
These systems point to a future where social and crowd intelligence are central elements of interaction, software, and computation. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
25 citations
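The decomposition Soylent uses is the Find-Fix-Verify pattern: independent workers find problem spans, others fix the agreed-upon spans, and a final crowd verifies the proposed rewrites. A minimal sketch with plain Python functions standing in for crowd platform calls (the stub workers and the agreement threshold are illustrative):

```python
from collections import Counter

def find_fix_verify(text, finders, fixers, verifiers, agree=2):
    """Sketch of the Find-Fix-Verify crowd pattern."""
    # FIND: keep only spans flagged by at least `agree` workers,
    # filtering out idiosyncratic or lazy responses.
    votes = Counter(span for worker in finders for span in worker(text))
    spans = [s for s, n in votes.items() if n >= agree]

    result = text
    for span in spans:
        # FIX: each fixer independently proposes a rewrite of the span.
        candidates = [worker(span) for worker in fixers]
        # VERIFY: verifiers vote among the candidates; apply the winner.
        tally = Counter(worker(candidates) for worker in verifiers)
        best, _ = tally.most_common(1)[0]
        result = result.replace(span, best)
    return result

# Stub "workers": two of three finders flag the error, fixers propose
# rewrites, and verifiers all pick the first candidate.
text = "The cat are happy."
finders = [lambda t: ["cat are"], lambda t: ["cat are"], lambda t: []]
fixers = [lambda s: "cat is", lambda s: "cat is", lambda s: "cats are"]
verifiers = [lambda cs: cs[0]] * 3
edited = find_fix_verify(text, finders, fixers, verifiers)
# edited == "The cat is happy."
```

Splitting the work this way is what makes error-prone crowd contributions usable as a building block: no single worker's mistake survives both the agreement filter and the verification vote.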
TL;DR: A snapshot of the most recent crowdsourcing research is provided, which shows how platforms' general availability has enabled researchers to recruit large numbers of participants for user studies, generate third-party content and assessments, or even build novel user experiences.
Abstract: Crowdsourcing involves outsourcing some job to a distributed group of people online, typically by breaking the job down into microtasks. Online markets offer human users payment for completing small tasks, or users can participate in nonpaid platforms such as games and volunteer sites. These platforms' general availability has enabled researchers to recruit large numbers of participants for user studies, generate third-party content and assessments, or even build novel user experiences. This special issue provides a snapshot of the most recent crowdsourcing research.
12 citations
05 May 2012
TL;DR: This paper proposes a hands-on event that takes the main benefits of a workshop and allows time to focus on developing ideas into actual outputs: experiment designs, in-depth thoughts on wicked problems, paper or coded prototypes.
Abstract: The field of collective intelligence - encompassing aspects of crowdsourcing, human computation, and social computing - is having tremendous impact on our lives, and these fields are rapidly growing. We propose a hands-on event that takes the main benefits of a workshop - provocative discussion and community building - and allows time to focus on developing ideas into actual outputs: experiment designs, in-depth thoughts on wicked problems, and paper or coded prototypes. We will bring together researchers to discuss future visions, make tangible headway on those visions, and seed collaboration. The outputs from brainstorming, discussion, and building will persist after the workshop for attendees and the community to view, and will be written up.
3 citations
05 May 2012
TL;DR: This SIG will explore reviewing through a critical and constructive lens, discussing current successes and future opportunities in the CHI review process, and actionable conclusions about ways to improve the system will be drawn.
Abstract: The HCI research community grows bigger each year, refining and expanding its boundaries in new ways. The ability to effectively review submissions is critical to the growth of CHI and related conferences. The review process is designed to produce a consistent supply of fair, high-quality reviews without overloading individual reviewers; yet, after each cycle, concerns are raised about limitations of the process. Every year, participants are left wondering why their papers were not accepted (or why they were). This SIG will explore reviewing through a critical and constructive lens, discussing current successes and future opportunities in the CHI review process. Goals will include actionable conclusions about ways to improve the system, potential alternative peer models, and the creation of materials to educate newcomer reviewers.
05 May 2012
TL;DR: This panel brings together scholars who study deviance and failure in diverse social computing systems to examine four design-related themes that contribute to and support these problematic uses: theft, anonymity, deviance, and polarization.
Abstract: Social computing technologies are pervasive in our work, relationships, and culture. Despite their promise for transforming the structure of communication and human interaction, the complex social dimensions of these technological systems often reproduce offline social ills or create entirely novel forms of conflict and deviance. This panel brings together scholars who study deviance and failure in diverse social computing systems to examine four design-related themes that contribute to and support these problematic uses: theft, anonymity, deviance, and polarization.
22 Mar 2012
TL;DR: Taking first steps in the Twitterverse can be a nerve-wracking experience, with new users unsure what thoughts to tweet to the world, so three experts attempt to fill the void and give some insights into what makes interesting and valuable microblog content.
Abstract: Taking first steps in the Twitterverse can be a nerve-wracking experience, with new users unsure what thoughts to tweet to the world. Here, Paul Andre, Michael Bernstein and Kurt Luther attempt to fill the void and give some insights into what makes interesting and valuable microblog content.