Carlos Scheidegger

Other affiliations: University of South Florida, University of Utah, AT&T Labs ...

Bio: Carlos Scheidegger is an academic researcher at the University of Arizona. The author has contributed to research on visualization and data visualization, has an h-index of 37, and has co-authored 118 publications receiving 6800 citations. Previous affiliations of Carlos Scheidegger include the University of South Florida and the University of Utah.

[Chart: papers published per year, 2004 to 2022]

Papers

Proceedings Article

Certifying and Removing Disparate Impact

Michael Feldman¹, Sorelle A. Friedler¹, John Moeller², Carlos Scheidegger³, Suresh Venkatasubramanian²

Haverford College¹, University of Utah², University of Arizona³

10 Aug 2015

TL;DR: This work links disparate impact to a measure of classification accuracy that, while known, has received relatively little attention, and proposes a test for disparate impact based on how well the protected class can be predicted from the other attributes.

Abstract: What does it mean for an algorithm to be biased? In U.S. law, unintentional bias is encoded via disparate impact, which occurs when a selection process has widely different outcomes for different groups, even as it appears to be neutral. This legal determination hinges on a definition of a protected class (ethnicity, gender) and an explicit description of the process. When computers are involved, determining disparate impact (and hence bias) is harder. It might not be possible to disclose the process. In addition, even if the process is open, it might be hard to elucidate in a legal setting how the algorithm makes its decisions. Instead of requiring access to the process, we propose making inferences based on the data it uses. We present four contributions. First, we link disparate impact to a measure of classification accuracy that, while known, has received relatively little attention. Second, we propose a test for disparate impact based on how well the protected class can be predicted from the other attributes. Third, we describe methods by which data might be made unbiased. Finally, we present empirical evidence supporting the effectiveness of our test for disparate impact and our approach for both masking bias and preserving relevant information in the data. Interestingly, our approach resembles some actual selection practices that have recently received legal scrutiny.

1,434 citations

Posted Content

Certifying and removing disparate impact

Michael Feldman¹, Sorelle A. Friedler¹, John Moeller², Carlos Scheidegger³, Suresh Venkatasubramanian²

Haverford College¹, University of Utah², University of Arizona³

11 Dec 2014-arXiv: Machine Learning

TL;DR: In this paper, the authors propose a test for disparate impact based on analyzing the information leakage of the protected class from the other data attributes, and present empirical evidence supporting the effectiveness of their test and their approach for masking bias and preserving relevant information in the data.

Abstract: What does it mean for an algorithm to be biased? In U.S. law, unintentional bias is encoded via disparate impact, which occurs when a selection process has widely different outcomes for different groups, even as it appears to be neutral. This legal determination hinges on a definition of a protected class (ethnicity, gender, religious practice) and an explicit description of the process. When the process is implemented using computers, determining disparate impact (and hence bias) is harder. It might not be possible to disclose the process. In addition, even if the process is open, it might be hard to elucidate in a legal setting how the algorithm makes its decisions. Instead of requiring access to the algorithm, we propose making inferences based on the data the algorithm uses. We make four contributions to this problem. First, we link the legal notion of disparate impact to a measure of classification accuracy that, while known, has received relatively little attention. Second, we propose a test for disparate impact based on analyzing the information leakage of the protected class from the other data attributes. Third, we describe methods by which data might be made unbiased. Finally, we present empirical evidence supporting the effectiveness of our test for disparate impact and our approach for both masking bias and preserving relevant information in the data. Interestingly, our approach resembles some actual selection practices that have recently received legal scrutiny.

679 citations

Proceedings Article

VisTrails: visualization meets data management

Steven P. Callahan¹, Juliana Freire¹, Emanuele Santos¹, Carlos Scheidegger¹, Cláudio T. Silva¹, Huy T. Vo¹

University of Utah¹

27 Jun 2006

TL;DR: The VisTrails system is an initial attempt to improve the scientific discovery process and reduce the time to insight. The demonstration presents actual scenarios in which scientific visualization is used and shows how the system improves usability, enables reproducibility, and greatly reduces the time required to create scientific visualizations.

Abstract: Scientists are now faced with an incredible volume of data to analyze. To successfully analyze and validate various hypotheses, it is necessary to pose several queries, correlate disparate data, and create insightful visualizations of both the simulated processes and observed phenomena. Often, insight comes from comparing the results of multiple visualizations. Unfortunately, today this process is far from interactive and contains many error-prone and time-consuming tasks. As a result, the generation and maintenance of visualizations is a major bottleneck in the scientific process, hindering both the ability to mine scientific data and the actual use of the data. The VisTrails system represents our initial attempt to improve the scientific discovery process and reduce the time to insight. In VisTrails, we address the problem of visualization from a data management perspective: VisTrails manages the data and metadata of a visualization product. In this demonstration, we show the power and flexibility of our system by presenting actual scenarios in which scientific visualization is used and showing how our system improves usability, enables reproducibility, and greatly reduces the time required to create scientific visualizations.

541 citations

Proceedings Article

A comparative study of fairness-enhancing interventions in machine learning

Sorelle A. Friedler¹, Carlos Scheidegger², Suresh Venkatasubramanian³, Sonam Choudhary³, Evan P. Hamilton¹, Derek Roth¹

Haverford College¹, University of Arizona², University of Utah³

29 Jan 2019

TL;DR: It is found that fairness-preserving algorithms tend to be sensitive to fluctuations in dataset composition and to different forms of preprocessing, indicating that fairness interventions might be more brittle than previously thought.

Abstract: Computers are increasingly used to make decisions that have significant impact on people's lives. Often, these predictions can affect different population subgroups disproportionately. As a result, the issue of fairness has received much recent interest, and a number of fairness-enhanced classifiers have appeared in the literature. This paper seeks to study the following questions: how do these different techniques fundamentally compare to one another, and what accounts for the differences? Specifically, we seek to bring attention to many under-appreciated aspects of such fairness-enhancing interventions that require investigation for these algorithms to receive broad adoption. We present the results of an open benchmark we have developed that lets us compare a number of different algorithms under a variety of fairness measures and existing datasets. We find that although different algorithms tend to prefer specific formulations of fairness preservations, many of these measures strongly correlate with one another. In addition, we find that fairness-preserving algorithms tend to be sensitive to fluctuations in dataset composition (simulated in our benchmark by varying training-test splits) and to different forms of preprocessing, indicating that fairness interventions might be more brittle than previously thought.

476 citations

Posted Content

On the (im)possibility of fairness

Sorelle A. Friedler, Carlos Scheidegger, Suresh Venkatasubramanian

23 Sep 2016-arXiv: Computers and Society

TL;DR: In this article, the authors show that in order to prove desirable properties of the entire decision-making process, different mechanisms for fairness require different assumptions about the nature of the mapping from construct space to decision space.

Abstract: What does it mean for an algorithm to be fair? Different papers use different notions of algorithmic fairness, and although these appear internally consistent, they also seem mutually incompatible. We present a mathematical setting in which the distinctions in previous papers can be made formal. In addition to characterizing the spaces of inputs (the "observed" space) and outputs (the "decision" space), we introduce the notion of a construct space: a space that captures unobservable, but meaningful variables for the prediction. We show that in order to prove desirable properties of the entire decision-making process, different mechanisms for fairness require different assumptions about the nature of the mapping from construct space to decision space. The results in this paper imply that future treatments of algorithmic fairness should more explicitly state assumptions about the relationship between constructs and observations.

388 citations

Cited by

Journal Article

Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI

Alejandro Barredo Arrieta, Natalia Díaz-Rodríguez¹, Javier Del Ser²,³, Adrien Bennetot¹,⁴, Siham Tabik⁵, Alberto Barbado⁶, Salvador García⁵, Sergio Gil-Lopez, Daniel Molina⁵, Richard Benjamins⁶, Raja Chatila⁴, Francisco Herrera⁵

French Institute for Research in Computer Science and Automation¹, University of the Basque Country², Basque Center for Applied Mathematics³, University of Paris⁴, University of Granada⁵, Telefónica⁶

01 Jun 2020-Information Fusion

TL;DR: In this paper, a taxonomy of recent contributions related to explainability of different machine learning models, including those aimed at explaining Deep Learning methods, is presented, and a second dedicated taxonomy is built and examined in detail.

2,827 citations

Journal Article

A Survey of Methods for Explaining Black Box Models

Riccardo Guidotti¹, Anna Monreale¹, Salvatore Ruggieri¹, Franco Turini¹, Fosca Giannotti², Dino Pedreschi¹

University of Pisa¹, Istituto di Scienza e Tecnologie dell'Informazione²

22 Aug 2018-ACM Computing Surveys

TL;DR: In this paper, the authors provide a classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box decision support system. Given a problem definition, a black box type, and a desired explanation, the survey should help researchers find the proposals most useful for their own work.

Abstract: In recent years, many accurate decision support systems have been constructed as black boxes, that is as systems that hide their internal logic to the user. This lack of explanation constitutes both a practical and an ethical issue. The literature reports many approaches aimed at overcoming this crucial weakness, sometimes at the cost of sacrificing accuracy for interpretability. The applications in which black box decision systems can be used are various, and each approach is typically developed to provide a solution for a specific problem and, as a consequence, it explicitly or implicitly delineates its own definition of interpretability and explanation. The aim of this article is to provide a classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box system. Given a problem definition, a black box type, and a desired explanation, this survey should help the researcher to find the proposals more useful for his own work. The proposed classification of approaches to open black box models should also be useful for putting the many research open questions in perspective.

2,805 citations

Proceedings Article

Equality of opportunity in supervised learning

Moritz Hardt¹, Eric Price², Nathan Srebro

Google¹, University of Texas at Austin²

05 Dec 2016

TL;DR: This work proposes a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features, and shows how to optimally adjust any learned predictor so as to remove discrimination according to this definition.

Abstract: We propose a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features. Assuming data about the predictor, target, and membership in the protected group are available, we show how to optimally adjust any learned predictor so as to remove discrimination according to our definition. Our framework also improves incentives by shifting the cost of poor classification from disadvantaged groups to the decision maker, who can respond by improving the classification accuracy. We encourage readers to consult the more complete manuscript on the arXiv.

2,690 citations

Journal Article

Semantics derived automatically from language corpora contain human-like biases

Aylin Caliskan¹, Joanna J. Bryson¹,², Arvind Narayanan¹

Center for Information Technology¹, University of Bath²

14 Apr 2017-Science

TL;DR: This article showed that applying machine learning to ordinary human language results in human-like semantic biases and replicated a spectrum of known biases, as measured by the Implicit Association Test, using a widely used, purely statistical machine-learning model trained on a standard corpus of text from the World Wide Web.

Abstract: Machine learning is a means to derive artificial intelligence by discovering patterns in existing data. Here, we show that applying machine learning to ordinary human language results in human-like semantic biases. We replicated a spectrum of known biases, as measured by the Implicit Association Test, using a widely used, purely statistical machine-learning model trained on a standard corpus of text from the World Wide Web. Our results indicate that text corpora contain recoverable and accurate imprints of our historic biases, whether morally neutral as toward insects or flowers, problematic as toward race or gender, or even simply veridical, reflecting the status quo distribution of gender with respect to careers or first names. Our methods hold promise for identifying and addressing sources of bias in culture, including technology.

1,874 citations

Journal Article

The Visual Display of Quantitative Information

B. Marx

01 Mar 1985-Journal of Modern Optics

1,778 citations