
Showing papers by "Eric Gilbert published in 2015"


Proceedings Article
21 Apr 2015
TL;DR: CREDBANK is a corpus of tweets, topics, events and associated human credibility judgements, built by systematically combining machine and human computation and intended to enable new research on online information credibility in fields such as social science, data mining and health.
Abstract: Social media has quickly risen to prominence as a news source, yet lingering doubts remain about its tendency to spread rumor and misinformation. Systematically studying this phenomenon, however, has been difficult due to the need to collect large-scale, unbiased data along with in-situ judgements of its accuracy. In this paper we present CREDBANK, a corpus designed to bridge this gap by systematically combining machine and human computation. Specifically, CREDBANK is a corpus of tweets, topics, events and associated human credibility judgements. It is based on the real-time tracking of more than 1 billion streaming tweets over a period of more than three months, computational summarizations of those tweets, and intelligent routings of the tweet streams to human annotators, within a few hours of those events unfolding on Twitter. In total CREDBANK comprises more than 60 million tweets grouped into 1049 real-world events, each annotated by 30 human annotators. As an example, with CREDBANK one can quickly calculate that roughly 24% of the events in the global tweet stream are not perceived as credible. We have made CREDBANK publicly available, and hope it will enable new research questions related to online information credibility in fields such as social science, data mining and health.
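The 24% figure is a simple aggregation over the per-event annotations. A minimal sketch of how such a share could be computed is below; the majority-vote rule and the 5-point rating scale are assumptions for illustration, not the paper's exact definition:

```python
# Toy sketch: each event has 30 credibility ratings (assumed here to be
# on a 1-5 scale). Flag an event as "not perceived credible" when a
# majority of its annotators rate it below the top rating. This
# aggregation rule is an illustrative assumption, not CREDBANK's own.
def share_not_credible(events, top_rating=5, majority=0.5):
    flagged = sum(
        1 for ratings in events
        if sum(r < top_rating for r in ratings) / len(ratings) > majority
    )
    return flagged / len(events)
```

With four hypothetical events rated by 30 annotators each, two of which are mostly rated below the top score, the function returns 0.5.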

227 citations


Proceedings ArticleDOI
18 Apr 2015
TL;DR: Screening workers for requisite cognitive aptitudes and providing training in qualitative coding techniques is found to be quite effective, significantly outperforming control and baseline conditions, and can improve coder annotation accuracy above and beyond common benchmark strategies such as Bayesian Truth Serum (BTS).
Abstract: In the past half-decade, Amazon Mechanical Turk has radically changed the way many scholars do research. The availability of a massive, distributed, anonymous crowd of individuals willing to perform general human-intelligence micro-tasks for micro-payments is a valuable resource for researchers and practitioners. This paper addresses the challenges of obtaining quality annotations for subjective judgment oriented tasks of varying difficulty. We design and conduct a large, controlled experiment (N=68,000) to measure the efficacy of selected strategies for obtaining high quality data annotations from non-experts. Our results point to the advantages of person-oriented strategies over process-oriented strategies. Specifically, we find that screening workers for requisite cognitive aptitudes and providing training in qualitative coding techniques is quite effective, significantly outperforming control and baseline conditions. Interestingly, such strategies can improve coder annotation accuracy above and beyond common benchmark strategies such as Bayesian Truth Serum (BTS).

105 citations


Proceedings Article
21 Apr 2015
TL;DR: A combination of large-scale data analysis and small-scale in-depth interviews is presented to understand filter-work, suggesting several practical implications such as designing filters for both serious and casual photographers, or designing methods to prioritize and rank content in order to maximize engagement.
Abstract: A variety of simple graphical filters are available to camera phone users to enhance their photos on the fly; these filters often stylize, saturate or age a photo. In this paper, we present a combination of large-scale data analysis and small-scale in-depth interviews to understand filter-work. We look at producers' practices of photo filtering and gain insights into the roles filters play in engaging photo consumers by driving their social interactions. We first interviewed 15 Flickr mobile app users (photo producers) to understand their use and perception of filters. Next, we analyzed how filters affect a photo's engagement (consumers' perspective) using a corpus of 7.6 million Flickr photos. We find two groups of serious and casual photographers among filter users. The serious see filters as correction tools and prefer milder effects. Casual photographers, by contrast, use filters to significantly transform their photos with bolder effects. We also find that filtered photos are 21% more likely to be viewed and 45% more likely to be commented on by consumers of photographs. Specifically, filters that increase warmth, exposure and contrast boost engagement the most. Toward the ongoing research in social engagement and photo-work, these findings suggest several practical implications such as designing filters for both serious and casual photographers, or designing methods to prioritize and rank content in order to maximize engagement.

84 citations


Proceedings ArticleDOI
18 Apr 2015
TL;DR: A 6-stage process for prototyping new social computing systems using existing online systems, such as Twitter or Facebook, which allows researchers to focus on what people do on their system rather than how to attract people to it.
Abstract: We propose a technique we call piggyback prototyping, a prototyping mechanism for designing new social computing systems on top of existing ones. Traditional HCI prototyping techniques do not translate well to large social computing systems. To address this gap, we describe a 6-stage process for prototyping new social computing systems using existing online systems, such as Twitter or Facebook. This allows researchers to focus on what people do on their system rather than how to attract people to it. We illustrate this technique with an instantiation on Twitter to pair people who are different from each other in airports. Even though there were many missed meetings, 53% of survey respondents would be interested in being matched again, and eight people even met in person. Through piggyback prototyping, we gained insight into the future design of this system. We conclude the paper with considerations for privacy, consent, volume of users, and evaluation metrics.

40 citations


Proceedings ArticleDOI
18 Apr 2015
TL;DR: The value of supplementary out-group support from crowdsourced responders, added to in-group support from a community of members, is explored; out-group sources are found to provide relatively rapid, concise responses with direct and structured information and socially appropriate coping strategies, without compromising emotional value.
Abstract: Difficulty in navigating daily life can lead to frustration and decreased independence for people with autism. While they turn to online autism communities for information and advice on coping with everyday challenges, these communities may present only a limited perspective because of their in-group nature. Obtaining support from out-group sources beyond the in-group community may prove valuable in dealing with challenging situations such as public anxiety and workplace conflicts. In this paper, we explore the value of supplementary out-group support from crowdsourced responders added to the in-group support of a community of members. We find that out-group sources provide relatively rapid, concise responses with direct and structured information and socially appropriate coping strategies, without compromising emotional value. Using an autism community as a motivating example, we conclude by providing design implications for combining in-group and out-group resources that may enhance the question-and-answer experience.

32 citations


Journal ArticleDOI
06 Feb 2015-PLOS ONE
TL;DR: Drawing on a corpus of one million images crawled from Pinterest, it is found that color significantly impacts the diffusion of images and the adoption of content on image-sharing communities such as Pinterest, even after partially controlling for network structure and activity.
Abstract: Many lab studies have shown that colors can evoke powerful emotions and impact human behavior. Might these phenomena drive how we act online? A key research challenge for image-sharing communities is uncovering the mechanisms by which content spreads through the community. In this paper, we investigate whether there is a link between color and diffusion. Drawing on a corpus of one million images crawled from Pinterest, we find that color significantly impacts the diffusion of images and the adoption of content on image-sharing communities such as Pinterest, even after partially controlling for network structure and activity. Specifically, red, purple and pink seem to promote diffusion, while green, blue, black and yellow suppress it. To our knowledge, our study is the first to investigate how colors relate to online user behavior. In addition to contributing to the research conversation surrounding diffusion, these findings suggest future work using sophisticated computer vision techniques. We conclude with a discussion of the theoretical, practical and design implications suggested by this work, e.g. the design of engaging image filters.

29 citations


Proceedings Article
21 Apr 2015
TL;DR: A non-deterministic algorithm for generating homophones of censored keywords that creates large numbers of false positives for censors, making banned conversations difficult to locate, and that opens opportunities to build interactive, client-side tools that promote free speech.
Abstract: Like traditional media, social media in China is subject to censorship. However, in limited cases, activists have employed homophones of censored keywords to avoid detection by keyword matching algorithms. In this paper, we show that it is possible to scale this idea up in ways that make it difficult to defend against. Specifically, we present a non-deterministic algorithm for generating homophones that create large numbers of false positives for censors, making it difficult to locate banned conversations. In two experiments, we show that 1) homophone-transformed weibos posted to Sina Weibo remain on-site three times longer than their previously censored counterparts, and 2) native Chinese speakers can recover the original intent behind the homophone-transformed messages, with 99% of our posts understood by the majority of our participants. Finally, we find that coping with homophone transformations is likely to cost the Sina Weibo censorship apparatus an additional 15 hours of human labor per day, per censored keyword. To conclude, we reflect briefly on the opportunities presented by this algorithm to build interactive, client-side tools that promote free speech.
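The core substitution step can be sketched in a few lines. The homophone table below is a hypothetical placeholder; the paper's algorithm derives sound-alike candidates from a pinyin dictionary and chooses among them non-deterministically, rather than using a hand-written list:

```python
import random

# Hypothetical homophone table: each censored keyword maps to characters
# that sound alike in Mandarin but are written differently. Real entries
# would be generated from a pinyin dictionary, not hand-listed.
HOMOPHONES = {
    "河蟹": ["和谐", "荷械"],
}

def transform(message, keyword, rng=random):
    """Replace a censored keyword with a randomly chosen homophone, so
    repeated posts do not share one fixed, easily blacklisted string."""
    substitutes = HOMOPHONES.get(keyword)
    if not substitutes or keyword not in message:
        return message
    return message.replace(keyword, rng.choice(substitutes))
```

Because the substitute is drawn at random per post, a keyword-matching censor must expand its blacklist to cover every sound-alike variant, each of which also appears in innocuous messages and thus generates false positives.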

29 citations


Proceedings ArticleDOI
24 Aug 2015
TL;DR: Preliminary results indicate that providing dog quantimetric data to adopters through a smartphone application could reduce rates of re-relinquishment; respondents also reported that the application helped them better meet their dog's activity needs and strengthened the bond with their newly adopted dog.
Abstract: We present the results of an 8-week pilot study with 55 dogs investigating whether using quantimetric monitors and a companion smartphone application can reduce returns and increase the perceived strength of bonds between newly adopted dogs from the Humane Society of Silicon Valley and their adopters. Through this pilot study, we developed guidelines for future research and discovered promising results indicating that providing dog quantimetric data to adopters through the use of a smartphone application could yield reduced rates of re-relinquishment. Additionally, respondents indicated that they felt using the smartphone application helped them to better meet the activity needs of their dog and increased the bond between themselves and their newly adopted dog.

21 citations


Proceedings ArticleDOI
18 Apr 2015
TL;DR: A novel technique called Open Book is introduced; inspired by how people deal with eavesdroppers offline, it uses data mining and natural language processing to transform CMC messages into ones that are vaguer than the original.
Abstract: Both governments and corporations routinely surveil computer-mediated communication (CMC). Technologists often suggest widespread encryption as a defense mechanism, but CMC encryption schemes have historically faced significant usability and adoption problems. Here, we introduce a novel technique called Open Book designed to address these two problems. Inspired by how people deal with eavesdroppers offline, Open Book uses data mining and natural language processing to transform CMC messages into ones that are vaguer than the original. Specifically, we present: 1) a greedy Open Book algorithm that cloaks messages by transforming them to resemble the average Internet message; 2) an open-source, browser-based instantiation of it called Read Me, designed for Gmail; and, 3) a set of experiments showing that intended recipients can decode Open Book messages, but that unintended human- and machine-recipients cannot. Finally, we reflect on some open questions raised by this approach, such as recognizability and future side-channel attacks.
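At its simplest, the cloaking step swaps distinctive words for vaguer, higher-frequency alternatives. The substitution table below is invented for illustration; the actual system mines such mappings from large corpora with NLP rather than hand-coding them:

```python
# Illustrative table mapping specific terms to vaguer, more common ones.
# Intended recipients who share context can still recover the meaning;
# eavesdroppers and classifiers see only an average-looking message.
VAGUER = {
    "protest": "event",
    "medication": "stuff",
    "salary": "number",
}

def cloak(message, table=VAGUER):
    """Greedily replace each word that has a vaguer substitute."""
    return " ".join(table.get(word.lower(), word) for word in message.split())
```

For example, `cloak("see you at the protest")` yields "see you at the event": still decodable by someone who knows the context, but closer to the average Internet message for everyone else.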

6 citations