Open Access Book

Methods for Evaluating Interactive Information Retrieval Systems with Users

TL;DR
In this paper, the authors provide an overview and instruction regarding the evaluation of interactive information retrieval systems with users and present core instruments and data collection techniques and measures, as well as a discussion of outstanding challenges and future research directions.
Abstract
This article provides an overview of and instruction regarding the evaluation of interactive information retrieval systems with users. The primary goal of this article is to catalog and compile material related to this topic into a single source. This article (1) provides historical background on the development of user-centered approaches to the evaluation of interactive information retrieval systems; (2) describes the major components of interactive information retrieval system evaluation; (3) describes different experimental designs and sampling strategies; (4) presents core instruments and data collection techniques and measures; (5) explains basic data analysis techniques; and (6) reviews and discusses previous studies. This article also discusses validity and reliability issues with respect to both measures and methods, presents background information on research ethics, and discusses some ethical issues specific to studies of interactive information retrieval (IIR). Finally, this article concludes with a discussion of outstanding challenges and future research directions.


Citations
Journal Article (DOI)

Crowdsourcing for relevance evaluation

TL;DR: A new approach to evaluation called TERC is described, based on the crowdsourcing paradigm, in which many online users, drawn from a large community, each performs a small evaluation task.
Posted Content

Pretrained Transformers for Text Ranking: BERT and Beyond

TL;DR: This tutorial provides an overview of text ranking with neural network architectures known as transformers, of which BERT (Bidirectional Encoder Representations from Transformers) is the best-known example, and covers a wide range of techniques.
Journal Article (DOI)

Comparing the similarity of responses received from studies in Amazon's Mechanical Turk to studies conducted online and with direct recruitment.

TL;DR: No statistically significant difference was found between results obtained from participants recruited through AMT and results from participants recruited on campus or through online forums, which suggests that AMT is a viable and economical option for recruiting participants and for conducting studies.
Proceedings Article (DOI)

Towards a Fair Marketplace: Counterfactual Evaluation of the trade-off between Relevance, Fairness & Satisfaction in Recommendation Systems

TL;DR: This work proposes a number of recommendation policies which jointly optimize relevance and fairness, thereby achieving substantial improvement in supplier fairness without noticeable decline in user satisfaction, and considers user disposition towards fair content.
Proceedings Article (DOI)

Time-based calibration of effectiveness measures

TL;DR: This paper introduces a time-biased gain measure, which allows us to evaluate system performance in human terms, while maintaining the simplicity and repeatability of system-oriented tests, and examines properties of the measure, contrasting it to traditional effectiveness measures, and exploring its extension to other aspects and environments.
References
Book

Statistical Power Analysis for the Behavioral Sciences

TL;DR: This book presents the concepts of statistical power analysis, covering the t test for means, chi-square tests for goodness of fit and contingency tables, and the sign test.
Book

Discovery of Grounded Theory: Strategies for Qualitative Research

TL;DR: The Discovery of Grounded Theory addresses the discovery of grounded theory from data, both substantive and formal, a major task confronting sociologists, in a manner understandable to both experts and laymen.
Journal Article (DOI)

Common method biases in behavioral research: a critical review of the literature and recommended remedies.

TL;DR: The extent to which method biases influence behavioral research results is examined, potential sources of method biases are identified, the cognitive processes through which method bias influence responses to measures are discussed, the many different procedural and statistical techniques that can be used to control method biases is evaluated, and recommendations for how to select appropriate procedural and Statistical remedies are provided.

Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology

TL;DR: Regression analyses suggest that perceived ease of use may actually be a causal antecedent to perceived usefulness, as opposed to a parallel, direct determinant of system usage.