Author

Stephanie Houde

Bio: Stephanie Houde is an academic researcher from IBM. The author has contributed to research in topics including Service provider and Declaration. The author has an h-index of 4 and has co-authored 5 publications receiving 247 citations.

Papers
Journal ArticleDOI
TL;DR: AI Fairness 360 (AIF360), a new open-source Python toolkit for algorithmic fairness released under an Apache v2.0 license, is introduced to help facilitate the transition of fairness research algorithms for use in an industrial setting and to provide a common framework for fairness researchers to share and evaluate algorithms.
Abstract: Fairness is an increasingly important concern as machine learning models are used to support decision making in high-stakes applications such as mortgage lending, hiring, and prison sentencing. This article introduces a new open-source Python toolkit for algorithmic fairness, AI Fairness 360 (AIF360), released under an Apache v2.0 license (https://github.com/ibm/aif360). The main objectives of this toolkit are to help facilitate the transition of fairness research algorithms for use in an industrial setting and to provide a common framework for fairness researchers to share and evaluate algorithms. The package includes a comprehensive set of fairness metrics for datasets and models, explanations for these metrics, and algorithms to mitigate bias in datasets and models. It also includes an interactive Web experience that provides a gentle introduction to the concepts and capabilities for line-of-business users, and its architecture enables researchers and developers to extend the toolkit with their new algorithms and improvements and to use it for performance benchmarking. A built-in testing infrastructure maintains code quality.
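For orientation, the snippet below is a minimal, self-contained sketch of the workflow the abstract describes with AIF360: constructing a dataset, computing a dataset-level fairness metric, applying a pre-processing bias-mitigation algorithm, and re-checking the metric. The toy data, the protected attribute 'sex', and the group definitions are illustrative assumptions, not choices made by the article.

```python
# Minimal AIF360 sketch: measure dataset bias, mitigate with Reweighing,
# then re-measure. The tiny synthetic data below is a placeholder.
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

# Toy data: 'sex' is the protected attribute (1 = privileged group),
# 'label' is the favorable (1) / unfavorable (0) outcome.
df = pd.DataFrame({
    "sex":   [1, 1, 1, 1, 0, 0, 0, 0],
    "score": [0.9, 0.8, 0.7, 0.4, 0.6, 0.5, 0.3, 0.2],
    "label": [1, 1, 1, 0, 1, 0, 0, 0],
})
dataset = BinaryLabelDataset(df=df, label_names=["label"],
                             protected_attribute_names=["sex"],
                             favorable_label=1, unfavorable_label=0)

privileged = [{"sex": 1}]
unprivileged = [{"sex": 0}]

# Dataset-level fairness metric before mitigation.
metric = BinaryLabelDatasetMetric(dataset, privileged_groups=privileged,
                                  unprivileged_groups=unprivileged)
print("Statistical parity difference:", metric.statistical_parity_difference())

# Pre-processing mitigation: reweigh examples to balance the groups.
rw = Reweighing(unprivileged_groups=unprivileged, privileged_groups=privileged)
transformed = rw.fit_transform(dataset)
metric_after = BinaryLabelDatasetMetric(transformed, privileged_groups=privileged,
                                        unprivileged_groups=unprivileged)
print("After reweighing:", metric_after.statistical_parity_difference())
```

Reweighing is only one of the toolkit's mitigation options; in-processing and post-processing algorithms live in sibling subpackages of aif360.algorithms.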

356 citations

Journal ArticleDOI
TL;DR: This paper envisions an SDoC for AI services containing purpose, performance, safety, security, and provenance information, to be completed and voluntarily released by AI service providers for examination by consumers.
Abstract: Accuracy is an important concern for suppliers of artificial intelligence (AI) services, but considerations beyond accuracy, such as safety (which includes fairness and explainability), security, and provenance, are also critical elements to engender consumers’ trust in a service. Many industries use transparent, standardized, but often not legally required documents called supplier's declarations of conformity (SDoCs) to describe the lineage of a product along with the safety and performance testing it has undergone. SDoCs may be considered multidimensional fact sheets that capture and quantify various aspects of the product and its development to make it worthy of consumers’ trust. In this article, inspired by this practice, we propose FactSheets to help increase trust in AI services. We envision such documents to contain purpose, performance, safety, security, and provenance information to be completed by AI service providers for examination by consumers. We suggest a comprehensive set of declaration items tailored to AI in the Appendix of this article.
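As a purely illustrative sketch, and not the declaration items from the article's Appendix, the categories named above (purpose, performance, safety, security, and provenance) could be captured in a simple structure like the following; every field name and example entry here is a hypothetical placeholder.

```python
# Hypothetical sketch of a FactSheet-style supplier's declaration for an AI
# service. The fields mirror the categories named in the abstract; the
# concrete entries are invented placeholders, not the article's items.
from dataclasses import dataclass, field, asdict
import json

@dataclass
class FactSheet:
    purpose: str                                        # what the service is intended to do
    performance: dict = field(default_factory=dict)     # accuracy and test results
    safety: dict = field(default_factory=dict)          # fairness, explainability checks
    security: dict = field(default_factory=dict)        # robustness, access controls
    provenance: dict = field(default_factory=dict)      # data and model lineage

sheet = FactSheet(
    purpose="Example: score loan applications for credit risk",
    performance={"test_accuracy": 0.87},
    safety={"bias_tested": True, "explainability_method": "rule lists"},
    security={"adversarial_testing": "basic perturbation suite"},
    provenance={"training_data": "internal loan records, 2015-2019"},
)

# A FactSheet is meant to be released for examination by consumers,
# so serialize it to a shareable format.
print(json.dumps(asdict(sheet), indent=2))
```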

243 citations

Journal ArticleDOI
TL;DR: While fair model-assisted decision making involves more than the application of unbiased models (consideration of application context, specifics of the decisions being made, resolution of conflicting stakeholder viewpoints, and so forth), mitigating bias from machine-learning software is important and possible, but difficult and too often ignored.
Abstract: Today, machine-learning software is used to help make decisions that affect people's lives. Some people believe that the application of such software results in fairer decisions because, unlike humans, machine-learning software generates models that are not biased. Think again. Machine-learning software is also biased, sometimes in similar ways to humans, often in different ways. While fair model-assisted decision making involves more than the application of unbiased models (consideration of application context, specifics of the decisions being made, resolution of conflicting stakeholder viewpoints, and so forth), mitigating bias from machine-learning software is important and possible, but difficult and too often ignored.

25 citations

Proceedings ArticleDOI
27 Jan 2020
TL;DR: This tutorial will teach participants to use and contribute to a new open-source Python package named AI Explainability 360 (AIX360) (https://aix360.mybluemix.net), a comprehensive and extensible toolkit that supports interpretability and explainability of data and machine learning models.
Abstract: This tutorial will teach participants to use and contribute to a new open-source Python package named AI Explainability 360 (AIX360) (https://aix360.mybluemix.net), a comprehensive and extensible toolkit that supports interpretability and explainability of data and machine learning models.

Motivation for the toolkit. The AIX360 toolkit illustrates that there is no single approach to explainability that works best for all situations. There are many ways to explain: data vs. model, direct vs. post-hoc explanation, local vs. global, etc. The toolkit includes ten state-of-the-art algorithms that cover different dimensions of explanations along with proxy explainability metrics. Moreover, one of our prime objectives is for AIX360 to serve as an educational tool even for non-machine-learning experts (e.g., social scientists, healthcare experts). To this end, the toolkit has an interactive demonstration, highly descriptive Jupyter notebooks covering diverse real-world use cases, and guidance materials, all helping one navigate the complex explainability space. Compared to existing open-source efforts on AI explainability, AIX360 takes a step forward in focusing on a greater diversity of ways of explaining, usability in industry, and software engineering. By integrating these three aspects, we hope that AIX360 will attract researchers in AI explainability and help translate our collective research results for practicing data scientists and developers deploying solutions in a variety of industries. Regarding the first aspect of diversity, Table 1 in [1] compares AIX360 to existing toolkits in terms of the types of explainability methods offered. The table shows that AIX360 not only covers more types of methods but also has metrics which can act as proxies for judging the quality of explanations. Regarding the second aspect of industry usage, AIX360 illustrates how these explainability algorithms can be applied in specific contexts (please see Audience, goals, and outcomes below). In just a few months since its initial release, the AIX360 toolkit already has a vibrant Slack community with over 120 members and has been forked almost 80 times, accumulating over 400 stars. This response leads us to believe that there is significant interest in the community in learning more about the toolkit and explainability in general.

Audience, goals, and outcomes. The presentations in the tutorial will be aimed at an audience with different backgrounds and computer science expertise levels. For all audience members, and especially those unfamiliar with Python programming, AIX360 provides an interactive experience (http://aix360.mybluemix.net/data) centered around a credit approval scenario as a gentle and grounded introduction to the concepts and capabilities of the toolkit. We will also teach all participants which type of explainability algorithm is most appropriate for a given use case, not only for those in the toolkit but also from the broader explainability literature. Knowing which explainability algorithms apply to which contexts and understanding when to use them can benefit most people, regardless of their technical background. The second part of the tutorial will consist of three use cases featuring different industry domains and explanation methods. Data scientists and developers can gain hands-on experience with the toolkit by running and modifying Jupyter notebooks, while others will be able to follow along by viewing rendered versions of the notebooks.
Here is a rough agenda of the tutorial:
1) Overture: Provide a brief introduction to the area of explainability and introduce common terms.
2) Interactive Web Experience: The AIX360 interactive web experience (http://aix360.mybluemix.net/data) is intended to show a non-computer-science audience how different explainability methods may suit different stakeholders in a credit approval scenario (data scientists, loan officers, and bank customers).
3) Taxonomy: We will next present a taxonomy that we have created for organizing the space of explanations and guiding practitioners toward an appropriate choice for their applications.
4) Installation: We will transition into a Python environment and ask participants to install the AIX360 package on their machines using provided instructions (see the sketch after this agenda).
5) Example Use Cases in Finance, Government, and Healthcare: We will take participants through three use cases in various application domains in the form of Jupyter notebooks.
6) Metrics: We will briefly showcase the two explainability metrics currently available through the toolkit.
7) Future Directions: The final segment will discuss future directions and how participants can contribute to the toolkit.
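For reference, here is a minimal sketch of what the installation step (agenda item 4) might look like, assuming a standard Python environment; the pip package name and the metrics import reflect the publicly distributed AIX360 package, and the version check is only a sanity test.

```python
# Minimal installation sketch for AIX360. Install from PyPI in a shell first:
#
#   pip install aix360
#
# Explainers live under aix360.algorithms and the proxy metrics under
# aix360.metrics; the import below pulls in the two metrics shown in
# agenda item 6 as a quick availability check.
import aix360
from aix360.metrics import faithfulness_metric, monotonicity_metric

print("AIX360 version:", getattr(aix360, "__version__", "unknown"))
```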

18 citations

Proceedings ArticleDOI
02 Jan 2021
TL;DR: An open-source software toolkit is introduced, featuring eight diverse state-of-the-art explainability methods, two evaluation metrics, and an extensible software architecture that organizes these methods according to their use in the AI modeling pipeline; it aims to improve the transparency of machine learning models and provides a platform to integrate new explainability techniques as they are developed.
Abstract: As machine learning algorithms make inroads into our lives and society, calls are increasing from multiple stakeholders for these algorithms to explain their outputs. Moreover, these stakeholders, whether they be government regulators, affected citizens, domain experts, or developers, present different requirements for explanations. To address these needs, we introduce AI Explainability 360, an open-source software toolkit featuring eight diverse state-of-the-art explainability methods, two evaluation metrics, and an extensible software architecture that organizes these methods according to their use in the AI modeling pipeline. Additionally, we have implemented enhancements to bring research innovations closer to consumers of explanations, ranging from simplified, accessible versions of algorithms to guidance material to help users navigate the space of explanations, along with tutorials and an interactive web demo to introduce AI explainability to practitioners. Together, our toolkit can help improve transparency of machine learning models and provides a platform to integrate new explainability techniques as they are developed.
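As a rough, hedged sketch of how the two evaluation metrics mentioned above might be invoked: the example below trains a stand-in scikit-learn model and feeds its global feature importances to AIX360's faithfulness and monotonicity metrics. The model, data, and importance scores are placeholders, and the exact metric signatures should be checked against the toolkit's documentation.

```python
# Rough sketch of computing AIX360's two proxy explainability metrics for a
# feature-attribution explanation. Everything other than the aix360.metrics
# import is a placeholder chosen for illustration.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from aix360.metrics import faithfulness_metric, monotonicity_metric

X, y = load_breast_cancer(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

x = X[0]                              # single instance to explain
coefs = model.feature_importances_    # stand-in feature attribution scores
base = np.zeros_like(x)               # baseline values each feature is reset to

# Faithfulness correlates the attribution scores with the drop in predicted
# probability when each feature is replaced by its baseline value;
# monotonicity checks that adding features in order of importance
# monotonically increases the prediction.
print("faithfulness:", faithfulness_metric(model, x, coefs, base))
print("monotonicity:", monotonicity_metric(model, x, coefs, base))
```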

5 citations


Cited by
Posted Content
TL;DR: This report suggests various steps that different stakeholders can take to improve the verifiability of claims made about AI systems and their associated development processes, with a focus on providing evidence about the safety, security, fairness, and privacy protection of AI systems.
Abstract: With the recent wave of progress in artificial intelligence (AI) has come a growing awareness of the large-scale impacts of AI systems, and recognition that existing regulations and norms in industry and academia are insufficient to ensure responsible AI development. In order for AI developers to earn trust from system users, customers, civil society, governments, and other stakeholders that they are building AI responsibly, they will need to make verifiable claims to which they can be held accountable. Those outside of a given organization also need effective means of scrutinizing such claims. This report suggests various steps that different stakeholders can take to improve the verifiability of claims made about AI systems and their associated development processes, with a focus on providing evidence about the safety, security, fairness, and privacy protection of AI systems. We analyze ten mechanisms for this purpose--spanning institutions, software, and hardware--and make recommendations aimed at implementing, exploring, or improving those mechanisms.

191 citations

Proceedings ArticleDOI
03 Mar 2021
TL;DR: A rigorous framework for dataset development transparency that supports decision-making and accountability is introduced, which uses the cyclical, infrastructural and engineering nature of dataset development to draw on best practices from the software development lifecycle.
Abstract: Datasets that power machine learning are often used, shared, and reused with little visibility into the processes of deliberation that led to their creation. As artificial intelligence systems are increasingly used in high-stakes tasks, system development and deployment practices must be adapted to address the very real consequences of how model development data is constructed and used in practice. This includes greater transparency about data, and accountability for decisions made when developing it. In this paper, we introduce a rigorous framework for dataset development transparency that supports decision-making and accountability. The framework uses the cyclical, infrastructural and engineering nature of dataset development to draw on best practices from the software development lifecycle. Each stage of the data development lifecycle yields documents that facilitate improved communication and decision-making, and draws attention to the value and necessity of careful data work. The proposed framework makes visible the often-overlooked work and decisions that go into dataset creation, a critical step in closing the accountability gap in artificial intelligence and a necessary resource aligned with recent work on auditing processes.

169 citations

Journal ArticleDOI
16 Oct 2020
TL;DR: 15 recommendations are intended to increase the reliability, safety, and trustworthiness of HCAI systems: reliable systems based on sound software engineering practices, safety culture through business management strategies, and trustworthy certification by independent oversight.
Abstract: This article attempts to bridge the gap between widely discussed ethical principles of Human-centered AI (HCAI) and practical steps for effective governance. Since HCAI systems are developed and implemented in multiple organizational structures, I propose 15 recommendations at three levels of governance: team, organization, and industry. The recommendations are intended to increase the reliability, safety, and trustworthiness of HCAI systems: (1) reliable systems based on sound software engineering practices, (2) safety culture through business management strategies, and (3) trustworthy certification by independent oversight. Software engineering practices within teams include audit trails to enable analysis of failures, software engineering workflows, verification and validation testing, bias testing to enhance fairness, and explainable user interfaces. The safety culture within organizations comes from management strategies that include leadership commitment to safety, hiring and training oriented to safety, extensive reporting of failures and near misses, internal review boards for problems and future plans, and alignment with industry standard practices. The trustworthiness certification comes from industry-wide efforts that include government interventions and regulation, accounting firms conducting external audits, insurance companies compensating for failures, non-governmental and civil society organizations advancing design principles, and professional organizations and research institutes developing standards, policies, and novel ideas. The larger goal of effective governance is to limit the dangers and increase the benefits of HCAI to individuals, organizations, and society.

166 citations

Journal ArticleDOI
28 May 2020
TL;DR: Based on an online survey of 183 participants who work in various aspects of data science, the authors find that data science teams are extremely collaborative and work with a variety of stakeholders and tools during the six common steps of a data science workflow (e.g., clean data and train model).
Abstract: Today, the prominence of data science within organizations has given rise to teams of data science workers collaborating on extracting insights from data, as opposed to individual data scientists working alone. However, we still lack a deep understanding of how data science workers collaborate in practice. In this work, we conducted an online survey with 183 participants who work in various aspects of data science. We focused on their reported interactions with each other (e.g., managers with engineers) and with different tools (e.g., Jupyter Notebook). We found that data science teams are extremely collaborative and work with a variety of stakeholders and tools during the six common steps of a data science workflow (e.g., clean data and train model). We also found that the collaborative practices workers employ, such as documentation, vary according to the kinds of tools they use. Based on these findings, we discuss design implications for supporting data science team collaborations and future research directions.

156 citations

Proceedings ArticleDOI
TL;DR: This paper introduces a taxonomy and a set of descriptors that can be used to characterise and systematically assess explainable systems along five key dimensions: functional, operational, usability, safety, and validation.
Abstract: Explanations in Machine Learning come in many forms, but a consensus regarding their desired properties is yet to emerge. In this paper we introduce a taxonomy and a set of descriptors that can be used to characterise and systematically assess explainable systems along five key dimensions: functional, operational, usability, safety and validation. In order to design a comprehensive and representative taxonomy and associated descriptors, we surveyed the eXplainable Artificial Intelligence literature, extracting the criteria and desiderata that other authors have proposed or implicitly used in their research. The survey includes papers introducing new explainability algorithms to see what criteria are used to guide their development and how these algorithms are evaluated, as well as papers proposing such criteria from both computer science and social science perspectives. This novel framework allows one to systematically compare and contrast explainability approaches, not just to better understand their capabilities but also to identify discrepancies between their theoretical qualities and the properties of their implementations. We developed an operationalisation of the framework in the form of Explainability Fact Sheets, which enable researchers and practitioners alike to quickly grasp the capabilities and limitations of a particular explainable method. When used as a Work Sheet, our taxonomy can guide the development of new explainability approaches by aiding in their critical evaluation along the five proposed dimensions.
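As an illustrative sketch only, the five dimensions could be recorded in a simple work-sheet structure like the one below; the example entries and the helper function are hypothetical placeholders, not the descriptors defined in the paper.

```python
# Hypothetical representation of an Explainability Fact Sheet as a mapping
# from the five dimensions named in the abstract to descriptor entries.
# The entries listed here are invented placeholders for illustration.
FACT_SHEET_DIMENSIONS = ("functional", "operational", "usability", "safety", "validation")

example_fact_sheet = {
    "method": "ExampleRuleListExplainer",   # hypothetical explainability method
    "functional": {"problem_type": "supervised classification"},
    "operational": {"interaction": "static explanation, no user dialogue"},
    "usability": {"target_audience": "domain experts"},
    "safety": {"information_leakage": "explanations reveal training statistics"},
    "validation": {"user_study": "none reported"},
}

def missing_dimensions(sheet: dict) -> list:
    """Return the dimensions a fact sheet leaves unaddressed."""
    return [d for d in FACT_SHEET_DIMENSIONS if not sheet.get(d)]

print("Unaddressed dimensions:", missing_dimensions(example_fact_sheet))
```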

142 citations