Journal ArticleDOI

Machine Advice with a Warning about Machine Limitations: Experimentally Testing the Solution Mandated by the Wisconsin Supreme Court

23 Mar 2021-Journal of Legal Analysis (Oxford Academic)-Vol. 13, Iss: 1, pp 284-340
TL;DR: This article found that the warnings mandated by the Wisconsin Supreme Court change participants' estimates of recidivism risk and their confidence, but not their decision whether to grant bail; only when participants receive feedback, are incentivized for finding ground truth, and see strong graphical warnings do they follow the machine advice less, and this effect is counterproductive: they follow the advice less precisely when it is closer to ground truth.
Abstract: The Wisconsin Supreme Court allows machine advice in the courtroom only if accompanied by a series of warnings. We test 878 US lay participants with jury experience on fifty past cases where we know ground truth. The warnings affect their estimates of the likelihood of recidivism and their confidence, but not their decision whether to grant bail. Participants do not get better at identifying defendants who recidivated during the next two years. Results are essentially the same if participants are warned in easily accessible language, and if they are additionally informed about the low accuracy of machine predictions. The decision to grant bail is also unaffected by the warnings mandated by the Supreme Court if participants do not first decide without knowing the machine prediction. Oversampling cases where defendants committed violent crime does not change results either, whether coupled with machine predictions for general or for violent crime. Giving participants feedback and incentivizing them for finding ground truth has a small, weakly significant effect. The effect becomes significant at conventional levels when additionally using strong graphical warnings. Then participants are less likely to follow the advice. But the effect is counterproductive: they follow the advice less if it actually is closer to ground truth.
Citations
Journal ArticleDOI
TL;DR: It is argued that certain characteristics of machine learning generate tensions with rule-of-law principles and that machine-learning predictions can be valuable instruments in some decision-making contexts but constitute a threat to fundamental values in others.
Abstract: The technological tool du jour is known as “machine learning,” a powerful form of data mining that uses mathematical algorithms to construct computer models that provide hidden insights by extracting patterns from enormous historical data sets, often for the purpose of making predictions about the future. Machine learning is all around us — it is used for spam filters, facial recognition, the detection of bank fraud and much more — and it is immensely powerful. It can analyze enormous amounts of information and extract relationships in the data that no human would ever discover. Despite its promise, there are reasons to remain skeptical of using machine learning predictions. Existing critiques of machine learning usually focus on one of two types of concerns — one identifies and aims to address the many potential pitfalls that might result in inaccurate models and the other assesses machine learning’s consistency with norms such as transparency, accountability, and due process. This paper takes a step back from the nuts-and-bolts questions surrounding the implementation of predictive analytics to consider whether and when it is appropriate to use machine learning to make government decisions in the contexts of national security and law enforcement. It argues that certain characteristics of machine learning generate tensions with rule-of-law principles and that, as a result, machine-learning predictions can be valuable instruments in some decision-making contexts but constitute a threat to fundamental values in others. The paper concludes that government actors should exploit the benefits of machine learning when they enjoy broad discretion in making decisions, while eschewing it when government discretion is highly constrained.

11 citations

Proceedings ArticleDOI
08 Sep 2022
TL;DR: It is demonstrated that, in practice, decision aids that are not complementary but instead make errors similar to human ones may have benefits of their own, and that people perceive more similar decision aids as more useful, accurate, and predictable.
Abstract: Machine learning algorithms are increasingly used to assist human decision-making. When the goal of machine assistance is to improve the accuracy of human decisions, it might seem appealing to design ML algorithms that complement human knowledge. While neither the algorithm nor the human are perfectly accurate, one could expect that their complementary expertise might lead to improved outcomes. In this study, we demonstrate that in practice decision aids that are not complementary, but make errors similar to human ones, may have their own benefits. In a series of human-subject experiments with a total of 901 participants, we study how the similarity of human and machine errors influences human perceptions of and interactions with algorithmic decision aids. We find that (i) people perceive more similar decision aids as more useful, accurate, and predictable, and that (ii) people are more likely to take opposing advice from more similar decision aids, while (iii) decision aids that are less similar to humans have more opportunities to provide opposing advice, resulting in a higher influence on people’s decisions overall.

3 citations

Journal ArticleDOI
14 Sep 2022-Laws
TL;DR: In this paper, a questionnaire was developed and the results were analyzed using partial least squares structural equation modeling (PLS-SEM) to understand people's attitudes towards legal technologies in courts and to identify potential differences in the attitudes of people with court experience vs. those without it, those in the legal profession vs. others, men vs. women, and younger vs. older respondents.
Abstract: Courts are high-stakes environments; thus, the impact of implementing legal technologies is not limited to the people directly using the technologies. However, the existing empirical data is insufficient to navigate and anticipate the acceptance of legal technologies in courts. This study aims to provide evidence for a technology acceptance model in order to understand people’s attitudes towards legal technologies in courts and to specify the potential differences in the attitudes of people with court experience vs. those without it, in the legal profession vs. other, male vs. female, and younger vs. older. A questionnaire was developed, and the results were analyzed using partial least squares structural equation modeling (PLS-SEM). Multigroup analyses have confirmed the usefulness of the technology acceptance model (TAM) across age, gender, profession (legal vs. other), and court experience (yes vs. no) groups. Therefore, as in other areas, technology acceptance in courts is primarily related to perceptions of usefulness. Trust emerged as an essential construct, which, in turn, was affected by the perceived risk and knowledge. In addition, the study’s findings prompt us to give more thought to who decides about technologies in courts, as the legal profession, court experience, age, and gender modify different aspects of legal technology acceptance.
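For readers unfamiliar with the method named in the abstract, the following minimal sketch illustrates the path-model idea behind a TAM-style analysis. It is a deliberately simplified stand-in, not PLS-SEM proper: constructs are formed as plain means of synthetic Likert items and paths are estimated with ordinary least squares, whereas genuine PLS-SEM iteratively re-weights indicators and would be run with a dedicated package on the study's actual questionnaire data. All construct names, path values, and data below are illustrative assumptions, not results from the paper.

```python
# Simplified composite-score path analysis in the spirit of a TAM model.
# NOT full PLS-SEM: real PLS-SEM iteratively re-weights indicators; here each
# construct is just the mean of its synthetic indicator items.
import numpy as np

rng = np.random.default_rng(0)
n = 300  # hypothetical number of respondents

def likert_items(latent, n_items, noise=0.8):
    """Generate synthetic 1-5 Likert indicators scattered around a latent score."""
    items = latent[:, None] + rng.normal(0, noise, (len(latent), n_items))
    return np.clip(np.round(items), 1, 5)

# Illustrative latent constructs loosely following a TAM-style structure:
# trust -> perceived usefulness -> intention to accept legal technology.
trust_latent = rng.normal(3, 1, n)
usefulness_latent = 1.2 + 0.6 * trust_latent + rng.normal(0, 1, n)
intention_latent = 0.7 * usefulness_latent + 0.2 * trust_latent + rng.normal(0, 1, n)

# Composite scores: simple means of each construct's indicator items.
trust = likert_items(trust_latent, 4).mean(axis=1)
usefulness = likert_items(usefulness_latent, 4).mean(axis=1)
intention = likert_items(intention_latent, 3).mean(axis=1)

def std_paths(y, X):
    """Standardized path coefficients from an ordinary least squares fit."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    design = np.column_stack([np.ones(len(ys)), Xs])
    coef, *_ = np.linalg.lstsq(design, ys, rcond=None)
    return coef[1:]  # drop the intercept, keep the path estimates

print("trust -> usefulness:", std_paths(usefulness, trust[:, None]))
print("usefulness, trust -> intention:",
      std_paths(intention, np.column_stack([usefulness, trust])))
```

In a real PLS-SEM workflow, the same structural paths would additionally come with indicator weights, reliability checks, and the multigroup comparisons the abstract describes.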

2 citations

Journal ArticleDOI
TL;DR: The authors provide a systematic overview of the rich experimental evidence on judicial decision-making, point out gaps that still exist, and discuss methodological challenges.
Abstract: Judges are human beings. Is their behavior therefore subject to the same effects that psychology and behavioral economics have documented for convenience samples, like university students? Does the fact that they decide on behalf of third parties moderate their behavior? In which ways does the need to find a solution matter when the evidence is inconclusive and contested? How do the multiple institutional safeguards resulting from procedural law, and the ways in which the parties use them, affect judicial decision-making? Many of these questions have been put to the experimental test. The paper provides a systematic overview of the rich evidence, points out gaps that still exist, and discusses methodological challenges.

2 citations

References
Journal ArticleDOI
TL;DR: The concept of false memories is not new; psychologists have been studying false memories in several laboratory paradigms for years, and Schacter (in press) provides a historical overview of the study of memory distortions.
Abstract: False memories—either remembering events that never happened, or remembering them quite differently from the way they happened—have recently captured the attention of both psychologists and the public at large. The primary impetus for this recent surge of interest is the increase in the number of cases in which memories of previously unrecognized abuse are reported during the course of therapy. Some researchers have argued that certain therapeutic practices can cause the creation of false memories, and therefore, the apparent "recovery" of memories during the course of therapy may actually represent the creation of memories (Lindsay & Read, 1994; Loftus, 1993). Although the concept of false memories is currently enjoying an increase in publicity, it is not new; psychologists have been studying false memories in several laboratory paradigms for years. Schacter (in press) provides an historical overview of the study of memory distortions. Bartlett (1932) is usually credited with conducting the first experimental investigation of false memories; he had subjects read an Indian folktale, "The War of the Ghosts," and recall it repeatedly. Although he reported no aggregate data, but only sample protocols, his results seemed to show distortions in subjects' memories over repeated attempts to recall the story. Interestingly, Bartlett's repeated reproduction results never have been successfully replicated by later researchers (see Gauld & Stephenson, 1967; Roediger, Wheeler, & Rajaram, 1993); indeed, Wheeler and Roediger (1992) showed that recall of prose passages (including "The War of the Ghosts") …

3,277 citations

Journal ArticleDOI
01 Jun 2017
TL;DR: It is demonstrated that the criteria cannot all be simultaneously satisfied when recidivism prevalence differs across groups, and it is shown how disparate impact can arise when an RPI fails to satisfy the criterion of error rate balance.
Abstract: Recidivism prediction instruments (RPIs) provide decision-makers with an assessment of the likelihood that a criminal defendant will reoffend at a future point in time. Although such instruments are gaining increasing popularity across the country, their use is attracting tremendous controversy. Much of the controversy concerns potential discriminatory bias in the risk assessments that are produced. This article discusses several fairness criteria that have recently been applied to assess the fairness of RPIs. We demonstrate that the criteria cannot all be simultaneously satisfied when recidivism prevalence differs across groups. We then show how disparate impact can arise when an RPI fails to satisfy the criterion of error rate balance.
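The incompatibility the abstract refers to can be made concrete with a small calculation. The sketch below uses the identity FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR), which follows directly from the definitions of prevalence p, positive predictive value, and the false negative and false positive rates: holding PPV and FNR fixed across two groups with different prevalence then forces their false positive rates apart. The numbers are hypothetical illustrations, not figures from the paper.

```python
# Numerical illustration of the incompatibility: with different recidivism
# prevalence p, an instrument with equal positive predictive value (PPV) and
# equal false negative rate (FNR) in both groups must differ in its false
# positive rate (FPR). Identity used: FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR).

def implied_fpr(p, ppv, fnr):
    """False positive rate implied by prevalence p, PPV, and FNR."""
    return (p / (1 - p)) * ((1 - ppv) / ppv) * (1 - fnr)

ppv, fnr = 0.6, 0.35  # held equal across both groups (hypothetical values)
for name, prevalence in [("group A", 0.5), ("group B", 0.3)]:
    print(f"{name}: prevalence={prevalence:.2f} -> "
          f"implied FPR={implied_fpr(prevalence, ppv, fnr):.3f}")
# The implied FPRs differ (about 0.433 vs. 0.186), so error rate balance on
# false positives fails whenever prevalence differs and PPV and FNR are equalized.
```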

1,452 citations

Journal ArticleDOI
TL;DR: This article presents www.prolific.ac and lays out its suitability for recruiting subjects for social and economic science experiments; it traces the platform's historical development, presents its features, and contrasts them with the requirements of different types of social and economic experiments.

1,357 citations

Journal ArticleDOI
TL;DR: This paper showed that people are especially averse to algorithmic forecasters after seeing them perform, even when they see them outperform a human forecaster, because people more quickly lose confidence in algorithmic than in human forecasters after seeing them make the same mistake.
Abstract: Research shows that evidence-based algorithms more accurately predict the future than do human forecasters. Yet when forecasters are deciding whether to use a human forecaster or a statistical algorithm, they often choose the human forecaster. This phenomenon, which we call algorithm aversion, is costly, and it is important to understand its causes. We show that people are especially averse to algorithmic forecasters after seeing them perform, even when they see them outperform a human forecaster. This is because people more quickly lose confidence in algorithmic than human forecasters after seeing them make the same mistake. In 5 studies, participants either saw an algorithm make forecasts, a human make forecasts, both, or neither. They then decided whether to tie their incentives to the future predictions of the algorithm or the human. Participants who saw the algorithm perform were less confident in it, and less likely to choose it over an inferior human forecaster. This was true even among those who saw the algorithm outperform the human.

741 citations