Matthew Lease

Other affiliations: University of Washington, Brown University, Intel ...

Bio: Matthew Lease is an academic researcher at the University of Texas at Austin. The author has contributed to research on crowdsourcing and relevance (information retrieval). The author has an h-index of 36 and has co-authored 149 publications that have received 5,098 citations. Previous affiliations of Matthew Lease include the University of Washington and Brown University.

Papers published on a yearly basis (chart; years shown: 1999 and 2002 to 2021)

Papers

Proceedings Article

The future of crowd work

Aniket Kittur¹, Jeffrey V. Nickerson², Michael S. Bernstein³, Elizabeth M. Gerber⁴, Aaron Shaw⁴, John Zimmerman¹, Matthew Lease⁵, John Horton⁶

Carnegie Mellon University¹, Stevens Institute of Technology², Stanford University³, Northwestern University⁴, University of Texas at Austin⁵, Harvard University⁶

23 Feb 2013

TL;DR: This paper outlines a framework that will enable crowd work that is complex, collaborative, and sustainable, and lays out research challenges in twelve major areas: workflow, task assignment, hierarchy, real-time response, synchronous collaboration, quality control, crowds guiding AIs, AIs guiding crowds, platforms, job design, reputation, and motivation.

Abstract: Paid crowd work offers remarkable opportunities for improving productivity, social mobility, and the global economy by engaging a geographically distributed workforce to complete complex tasks on demand and at scale. But it is also possible that crowd work will fail to achieve its potential, focusing on assembly-line piecework. Can we foresee a future crowd workplace in which we would want our children to participate? This paper frames the major challenges that stand in the way of this goal. Drawing on theory from organizational behavior and distributed computing, as well as direct feedback from workers, we outline a framework that will enable crowd work that is complex, collaborative, and sustainable. The framework lays out research challenges in twelve major areas: workflow, task assignment, hierarchy, real-time response, synchronous collaboration, quality control, crowds guiding AIs, AIs guiding crowds, platforms, job design, reputation, and motivation.

836 citations

Posted Content

The Future of Crowd Work

Aniket Kittur¹, Jeffrey V. Nickerson², Michael S. Bernstein³, Elizabeth M. Gerber⁴, Aaron Shaw⁵, Aaron Shaw⁶, John Zimmerman¹, Matthew Lease⁷, John Horton⁸

Carnegie Mellon University¹, Stevens Institute of Technology², Stanford University³, Northwestern University⁴, University of California, Berkeley⁵, Harvard University⁶, University of Texas at Austin⁷, New York University⁸

18 Dec 2012, Social Science Research Network

TL;DR: In this paper, the authors outline a framework that will enable crowd work that is complex, collaborative, and sustainable, and lay out research challenges in twelve major areas: workflow, task assignment, hierarchy, real-time response, synchronous collaboration, quality control, crowds guiding AIs, AIs guiding crowds, platforms, job design, reputation, and motivation.

803 citations

Proceedings Article

Improving bug localization using structured information retrieval

Ripon K. Saha¹, Matthew Lease¹, Sarfraz Khurshid¹, Dewayne E. Perry¹

University of Texas at Austin¹

11 Nov 2013

TL;DR: This work provides a thorough grounding of IR-based bug localization research in fundamental IR theoretical and empirical knowledge and practice and presents BLUiR, which embodies this insight, requires only the source code and bug reports, and takes advantage of bug similarity data if available.

Abstract: Locating bugs is important, difficult, and expensive, particularly for large-scale systems. To address this, natural language information retrieval techniques are increasingly being used to suggest potential faulty source files given bug reports. While these techniques are very scalable, in practice their effectiveness remains low in accurately localizing bugs to a small number of files. Our key insight is that structured information retrieval based on code constructs, such as class and method names, enables more accurate bug localization. We present BLUiR, which embodies this insight, requires only the source code and bug reports, and takes advantage of bug similarity data if available. We build BLUiR on a proven, open source IR toolkit that anyone can use. Our work provides a thorough grounding of IR-based bug localization research in fundamental IR theoretical and empirical knowledge and practice. We evaluate BLUiR on four open source projects with approximately 3,400 bugs. Results show that BLUiR matches or outperforms a current state-of-the-art tool across applications considered, even when BLUiR does not use bug similarity data used by the other tool.

356 citations

Proceedings Article

SQUARE: A Benchmark for Research on Computing Crowd Consensus

Aashish Sheshadri¹, Matthew Lease¹

University of Texas at Austin¹

03 Nov 2013

TL;DR: SQUARE, an open source shared task framework including benchmark datasets, defined tasks, standard metrics, and reference implementations with empirical results for several popular methods, is presented.

Abstract: While many statistical consensus methods now exist, relatively little comparative benchmarking and integration of techniques has made it increasingly difficult to determine the current state-of-the-art, to evaluate the relative benefit of new methods, to understand where specific problems merit greater attention, and to measure field progress over time. To make such comparative evaluation easier for everyone, we present SQUARE, an open source shared task framework including benchmark datasets, defined tasks, standard metrics, and reference implementations with empirical results for several popular methods. In addition to measuring performance on a variety of public, real crowd datasets, the benchmark also varies supervision and noise by manipulating training size and labeling error. We envision SQUARE as dynamic and continually evolving, with new datasets and reference implementations being added according to community needs and interest. We invite community contributions and participation.

200 citations

Proceedings Article

Crowdsourcing Document Relevance Assessment with Mechanical Turk

Catherine Grady¹, Matthew Lease¹

University of Texas at Austin¹

06 Jun 2010

TL;DR: While results are largely inconclusive, they identify important obstacles encountered, lessons learned, related work, and interesting ideas for future investigation.

Abstract: We investigate human factors involved in designing effective Human Intelligence Tasks (HITs) for Amazon's Mechanical Turk. In particular, we assess document relevance to search queries via MTurk in order to evaluate search engine accuracy. Our study varies four human factors and measures resulting experimental outcomes of cost, time, and accuracy of the assessments. While results are largely inconclusive, we identify important obstacles encountered, lessons learned, related work, and interesting ideas for future investigation. Experimental data is also made publicly available for further study by the community.

153 citations

Cited by

Using Multivariate Statistics

Diana Adler

01 Jan 2016

14,604 citations

Journal Article

PhD by thesis

Richard Lathe¹

French Institute of Health and Medical Research¹

01 Apr 1988, Nature

TL;DR: In this paper, a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) is presented.

Abstract: Deposits of clastic carbonate-dominated (calciclastic) sedimentary slope systems in the rock record have been identified mostly as linearly-consistent carbonate apron deposits, even though most ancient clastic carbonate slope deposits fit the submarine fan systems better. Calciclastic submarine fans are consequently rarely described and are poorly understood. As a result, very little is known, especially about mud-dominated calciclastic submarine fan systems. Presented in this study is a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) that reveals a >250 m thick calciturbidite complex deposited in a calciclastic submarine fan setting. Seven facies are recognised from core and thin section characterisation and are grouped into three carbonate turbidite sequences. They include: 1) calciturbidites, comprising mostly high- to low-density, wavy-laminated bioclast-rich facies; 2) low-density densite mudstones, which are characterised by planar laminated and unlaminated mud-dominated facies; and 3) calcidebrites, which are muddy or hyper-concentrated debris-flow deposits occurring as poorly-sorted, chaotic, mud-supported floatstones. These ...

9,929 citations

Matrix Factorization Techniques for Recommender Systems

Patrick Seemann

01 Jan 2014

2,080 citations

Proceedings Article

Domain Adaptation with Structural Correspondence Learning

John Blitzer¹, Ryan McDonald¹, Fernando Pereira¹

University of Pennsylvania¹

22 Jul 2006

TL;DR: This work introduces structural correspondence learning to automatically induce correspondences among features from different domains in order to adapt existing models from a resource-rich source domain to a resource-poor target domain.

Abstract: Discriminative learning methods are widely used in natural language processing. These methods work best when their training and test data are drawn from the same distribution. For many NLP tasks, however, we are confronted with new domains in which labeled data is scarce or non-existent. In such cases, we seek to adapt existing models from a resource-rich source domain to a resource-poor target domain. We introduce structural correspondence learning to automatically induce correspondences among features from different domains. We test our technique on part of speech tagging and show performance gains for varying amounts of source and target training data, as well as improvements in target domain parsing accuracy using our improved tagger.

1,672 citations

Journal Article

Towards an integrated crowdsourcing definition

Enrique Estellés-Arolas¹, Fernando González-Ladrón-de-Guevara¹

Polytechnic University of Valencia¹

01 Apr 2012, Journal of Information Science

TL;DR: In this article, existing definitions of crowdsourcing are analysed to extract common elements and to establish the basic characteristics of any crowdsourcing initiative.

Abstract: 'Crowdsourcing' is a relatively recent concept that encompasses many practices. This diversity leads to the blurring of the limits of crowdsourcing that may be identified virtually with any type of internet-based collaborative activity, such as co-creation or user innovation. Varying definitions of crowdsourcing exist, and therefore some authors present certain specific examples of crowdsourcing as paradigmatic, while others present the same examples as the opposite. In this article, existing definitions of crowdsourcing are analysed to extract common elements and to establish the basic characteristics of any crowdsourcing initiative. Based on these existing definitions, an exhaustive and consistent definition for crowdsourcing is presented and contrasted in 11 cases.

1,616 citations
