Author

Stephanie Houde

Bio: Stephanie Houde is an academic researcher from IBM. The author has contributed to research in topics including Service provider and Declaration. The author has an h-index of 4 and has co-authored 5 publications receiving 247 citations.

Papers
Journal ArticleDOI
TL;DR: AI Fairness 360 (AIF360), a new open-source Python toolkit for algorithmic fairness released under an Apache v2.0 license, is introduced to help facilitate the transition of fairness research algorithms for use in an industrial setting and to provide a common framework for fairness researchers to share and evaluate algorithms.
Abstract: Fairness is an increasingly important concern as machine learning models are used to support decision making in high-stakes applications such as mortgage lending, hiring, and prison sentencing. This article introduces a new open-source Python toolkit for algorithmic fairness, AI Fairness 360 (AIF360), released under an Apache v2.0 license (https://github.com/ibm/aif360). The main objectives of this toolkit are to help facilitate the transition of fairness research algorithms for use in an industrial setting and to provide a common framework for fairness researchers to share and evaluate algorithms. The package includes a comprehensive set of fairness metrics for datasets and models, explanations for these metrics, and algorithms to mitigate bias in datasets and models. It also includes an interactive Web experience that provides a gentle introduction to the concepts and capabilities for line-of-business users, and its architecture enables researchers and developers to extend the toolkit with their new algorithms and improvements and to use it for performance benchmarking. A built-in testing infrastructure maintains code quality.
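For orientation, the snippet below is a minimal, self-contained sketch of the workflow the abstract describes with AIF360: constructing a dataset, computing a dataset-level fairness metric, applying a pre-processing bias-mitigation algorithm, and re-checking the metric. The toy data, the protected attribute 'sex', and the group definitions are illustrative assumptions, not choices made by the article.

```python
# Minimal AIF360 sketch: measure dataset bias, mitigate with Reweighing,
# then re-measure. The tiny synthetic data below is a placeholder.
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

# Toy data: 'sex' is the protected attribute (1 = privileged group),
# 'label' is the favorable (1) / unfavorable (0) outcome.
df = pd.DataFrame({
    "sex":   [1, 1, 1, 1, 0, 0, 0, 0],
    "score": [0.9, 0.8, 0.7, 0.4, 0.6, 0.5, 0.3, 0.2],
    "label": [1, 1, 1, 0, 1, 0, 0, 0],
})
dataset = BinaryLabelDataset(df=df, label_names=["label"],
                             protected_attribute_names=["sex"],
                             favorable_label=1, unfavorable_label=0)

privileged = [{"sex": 1}]
unprivileged = [{"sex": 0}]

# Dataset-level fairness metric before mitigation.
metric = BinaryLabelDatasetMetric(dataset, privileged_groups=privileged,
                                  unprivileged_groups=unprivileged)
print("Statistical parity difference:", metric.statistical_parity_difference())

# Pre-processing mitigation: reweigh examples to balance the groups.
rw = Reweighing(unprivileged_groups=unprivileged, privileged_groups=privileged)
transformed = rw.fit_transform(dataset)
metric_after = BinaryLabelDatasetMetric(transformed, privileged_groups=privileged,
                                        unprivileged_groups=unprivileged)
print("After reweighing:", metric_after.statistical_parity_difference())
```

Reweighing is only one of the toolkit's mitigation options; in-processing and post-processing algorithms live in sibling subpackages of aif360.algorithms.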

356 citations

Journal ArticleDOI
TL;DR: This paper envisions an SDoC for AI services containing purpose, performance, safety, security, and provenance information, to be completed and voluntarily released by AI service providers for examination by consumers.
Abstract: Accuracy is an important concern for suppliers of artificial intelligence (AI) services, but considerations beyond accuracy, such as safety (which includes fairness and explainability), security, and provenance, are also critical elements to engender consumers’ trust in a service. Many industries use transparent, standardized, but often not legally required documents called supplier's declarations of conformity (SDoCs) to describe the lineage of a product along with the safety and performance testing it has undergone. SDoCs may be considered multidimensional fact sheets that capture and quantify various aspects of the product and its development to make it worthy of consumers’ trust. In this article, inspired by this practice, we propose FactSheets to help increase trust in AI services. We envision such documents to contain purpose, performance, safety, security, and provenance information to be completed by AI service providers for examination by consumers. We suggest a comprehensive set of declaration items tailored to AI in the Appendix of this article.
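As a purely illustrative sketch, and not the declaration items from the article's Appendix, the categories named above (purpose, performance, safety, security, and provenance) could be captured in a simple structure like the following; every field name and example entry here is a hypothetical placeholder.

```python
# Hypothetical sketch of a FactSheet-style supplier's declaration for an AI
# service. The fields mirror the categories named in the abstract; the
# concrete entries are invented placeholders, not the article's items.
from dataclasses import dataclass, field, asdict
import json

@dataclass
class FactSheet:
    purpose: str                                        # what the service is intended to do
    performance: dict = field(default_factory=dict)     # accuracy and test results
    safety: dict = field(default_factory=dict)          # fairness, explainability checks
    security: dict = field(default_factory=dict)        # robustness, access controls
    provenance: dict = field(default_factory=dict)      # data and model lineage

sheet = FactSheet(
    purpose="Example: score loan applications for credit risk",
    performance={"test_accuracy": 0.87},
    safety={"bias_tested": True, "explainability_method": "rule lists"},
    security={"adversarial_testing": "basic perturbation suite"},
    provenance={"training_data": "internal loan records, 2015-2019"},
)

# A FactSheet is meant to be released for examination by consumers,
# so serialize it to a shareable format.
print(json.dumps(asdict(sheet), indent=2))
```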

243 citations

Journal ArticleDOI
TL;DR: While fair model-assisted decision making involves more than the application of unbiased models (consideration of application context, specifics of the decisions being made, resolution of conflicting stakeholder viewpoints, and so forth), mitigating bias from machine-learning software is important and possible, but difficult and too often ignored.
Abstract: Today, machine-learning software is used to help make decisions that affect people's lives. Some people believe that the application of such software results in fairer decisions because, unlike humans, machine-learning software generates models that are not biased. Think again. Machine-learning software is also biased, sometimes in similar ways to humans, often in different ways. While fair model-assisted decision making involves more than the application of unbiased models (consideration of application context, specifics of the decisions being made, resolution of conflicting stakeholder viewpoints, and so forth), mitigating bias from machine-learning software is important and possible, but difficult and too often ignored.

25 citations

Proceedings ArticleDOI
27 Jan 2020
TL;DR: This tutorial will teach participants to use and contribute to a new open-source Python package named AI Explainability 360 (AIX360) (https://aix360.mybluemix.net), a comprehensive and extensible toolkit that supports interpretability and explainability of data and machine learning models.
Abstract: This tutorial will teach participants to use and contribute to a new open-source Python package named AI Explainability 360 (AIX360) (https://aix360.mybluemix.net), a comprehensive and extensible toolkit that supports interpretability and explainability of data and machine learning models.

Motivation for the toolkit. The AIX360 toolkit illustrates that there is no single approach to explainability that works best for all situations. There are many ways to explain: data vs. model, direct vs. post-hoc explanation, local vs. global, etc. The toolkit includes ten state-of-the-art algorithms that cover different dimensions of explanations along with proxy explainability metrics. Moreover, one of our prime objectives is for AIX360 to serve as an educational tool even for non-machine-learning experts (e.g., social scientists, healthcare experts). To this end, the toolkit has an interactive demonstration, highly descriptive Jupyter notebooks covering diverse real-world use cases, and guidance materials, all helping one navigate the complex explainability space. Compared to existing open-source efforts on AI explainability, AIX360 takes a step forward in focusing on a greater diversity of ways of explaining, usability in industry, and software engineering. By integrating these three aspects, we hope that AIX360 will attract researchers in AI explainability and help translate our collective research results for practicing data scientists and developers deploying solutions in a variety of industries. Regarding the first aspect of diversity, Table 1 in [1] compares AIX360 to existing toolkits in terms of the types of explainability methods offered. The table shows that AIX360 not only covers more types of methods but also has metrics which can act as proxies for judging the quality of explanations. Regarding the second aspect of industry usage, AIX360 illustrates how these explainability algorithms can be applied in specific contexts (please see Audience, goals, and outcomes below). In just a few months since its initial release, the AIX360 toolkit already has a vibrant Slack community with over 120 members and has been forked almost 80 times, accumulating over 400 stars. This response leads us to believe that there is significant interest in the community in learning more about the toolkit and explainability in general.

Audience, goals, and outcomes. The presentations in the tutorial will be aimed at an audience with different backgrounds and computer science expertise levels. For all audience members, and especially those unfamiliar with Python programming, AIX360 provides an interactive experience (http://aix360.mybluemix.net/data) centered around a credit approval scenario as a gentle and grounded introduction to the concepts and capabilities of the toolkit. We will also teach all participants which type of explainability algorithm is most appropriate for a given use case, not only for those in the toolkit but also from the broader explainability literature. Knowing which explainability algorithms apply to which contexts and understanding when to use them can benefit most people, regardless of their technical background. The second part of the tutorial will consist of three use cases featuring different industry domains and explanation methods. Data scientists and developers can gain hands-on experience with the toolkit by running and modifying Jupyter notebooks, while others will be able to follow along by viewing rendered versions of the notebooks.
Here is a rough agenda of the tutorial:
1) Overture: Provide a brief introduction to the area of explainability and introduce common terms.
2) Interactive Web Experience: The AIX360 interactive web experience (http://aix360.mybluemix.net/data) is intended to show a non-computer-science audience how different explainability methods may suit different stakeholders in a credit approval scenario (data scientists, loan officers, and bank customers).
3) Taxonomy: We will next present a taxonomy that we have created for organizing the space of explanations and guiding practitioners toward an appropriate choice for their applications.
4) Installation: We will transition into a Python environment and ask participants to install the AIX360 package on their machines using provided instructions (see the sketch after this agenda).
5) Example Use Cases in Finance, Government, and Healthcare: We will take participants through three use cases in various application domains in the form of Jupyter notebooks.
6) Metrics: We will briefly showcase the two explainability metrics currently available through the toolkit.
7) Future Directions: The final segment will discuss future directions and how participants can contribute to the toolkit.
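For reference, here is a minimal sketch of what the installation step (agenda item 4) might look like, assuming a standard Python environment; the pip package name and the metrics import reflect the publicly distributed AIX360 package, and the version check is only a sanity test.

```python
# Minimal installation sketch for AIX360. Install from PyPI in a shell first:
#
#   pip install aix360
#
# Explainers live under aix360.algorithms and the proxy metrics under
# aix360.metrics; the import below pulls in the two metrics shown in
# agenda item 6 as a quick availability check.
import aix360
from aix360.metrics import faithfulness_metric, monotonicity_metric

print("AIX360 version:", getattr(aix360, "__version__", "unknown"))
```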

18 citations

Proceedings ArticleDOI
02 Jan 2021
TL;DR: An open-source software toolkit is introduced, featuring eight diverse state-of-the-art explainability methods, two evaluation metrics, and an extensible software architecture that organizes these methods according to their use in the AI modeling pipeline; it aims to improve the transparency of machine learning models and provides a platform to integrate new explainability techniques as they are developed.
Abstract: As machine learning algorithms make inroads into our lives and society, calls are increasing from multiple stakeholders for these algorithms to explain their outputs. Moreover, these stakeholders, whether they be government regulators, affected citizens, domain experts, or developers, present different requirements for explanations. To address these needs, we introduce AI Explainability 360, an open-source software toolkit featuring eight diverse state-of-the-art explainability methods, two evaluation metrics, and an extensible software architecture that organizes these methods according to their use in the AI modeling pipeline. Additionally, we have implemented enhancements to bring research innovations closer to consumers of explanations, ranging from simplified, accessible versions of algorithms to guidance material to help users navigate the space of explanations, along with tutorials and an interactive web demo to introduce AI explainability to practitioners. Together, our toolkit can help improve transparency of machine learning models and provides a platform to integrate new explainability techniques as they are developed.
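As a rough, hedged sketch of how the two evaluation metrics mentioned above might be invoked: the example below trains a stand-in scikit-learn model and feeds its global feature importances to AIX360's faithfulness and monotonicity metrics. The model, data, and importance scores are placeholders, and the exact metric signatures should be checked against the toolkit's documentation.

```python
# Rough sketch of computing AIX360's two proxy explainability metrics for a
# feature-attribution explanation. Everything other than the aix360.metrics
# import is a placeholder chosen for illustration.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from aix360.metrics import faithfulness_metric, monotonicity_metric

X, y = load_breast_cancer(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

x = X[0]                              # single instance to explain
coefs = model.feature_importances_    # stand-in feature attribution scores
base = np.zeros_like(x)               # baseline values each feature is reset to

# Faithfulness correlates the attribution scores with the drop in predicted
# probability when each feature is replaced by its baseline value;
# monotonicity checks that adding features in order of importance
# monotonically increases the prediction.
print("faithfulness:", faithfulness_metric(model, x, coefs, base))
print("monotonicity:", monotonicity_metric(model, x, coefs, base))
```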

5 citations


Cited by
Posted Content
TL;DR: This report suggests various steps that different stakeholders can take to improve the verifiability of claims made about AI systems and their associated development processes, with a focus on providing evidence about the safety, security, fairness, and privacy protection of AI systems.
Abstract: With the recent wave of progress in artificial intelligence (AI) has come a growing awareness of the large-scale impacts of AI systems, and recognition that existing regulations and norms in industry and academia are insufficient to ensure responsible AI development. In order for AI developers to earn trust from system users, customers, civil society, governments, and other stakeholders that they are building AI responsibly, they will need to make verifiable claims to which they can be held accountable. Those outside of a given organization also need effective means of scrutinizing such claims. This report suggests various steps that different stakeholders can take to improve the verifiability of claims made about AI systems and their associated development processes, with a focus on providing evidence about the safety, security, fairness, and privacy protection of AI systems. We analyze ten mechanisms for this purpose--spanning institutions, software, and hardware--and make recommendations aimed at implementing, exploring, or improving those mechanisms.

191 citations

Proceedings ArticleDOI
03 Mar 2021
TL;DR: A rigorous framework for dataset development transparency that supports decision-making and accountability is introduced, which uses the cyclical, infrastructural and engineering nature of dataset development to draw on best practices from the software development lifecycle.
Abstract: Datasets that power machine learning are often used, shared, and reused with little visibility into the processes of deliberation that led to their creation. As artificial intelligence systems are increasingly used in high-stakes tasks, system development and deployment practices must be adapted to address the very real consequences of how model development data is constructed and used in practice. This includes greater transparency about data, and accountability for decisions made when developing it. In this paper, we introduce a rigorous framework for dataset development transparency that supports decision-making and accountability. The framework uses the cyclical, infrastructural and engineering nature of dataset development to draw on best practices from the software development lifecycle. Each stage of the data development lifecycle yields documents that facilitate improved communication and decision-making, and draws attention to the value and necessity of careful data work. The proposed framework makes visible the often-overlooked work and decisions that go into dataset creation, a critical step in closing the accountability gap in artificial intelligence and a necessary resource aligned with recent work on auditing processes.

169 citations

Journal ArticleDOI
16 Oct 2020
TL;DR: 15 recommendations are intended to increase the reliability, safety, and trustworthiness of HCAI systems: reliable systems based on sound software engineering practices, safety culture through business management strategies, and trustworthy certification by independent oversight.
Abstract: This article attempts to bridge the gap between widely discussed ethical principles of Human-centered AI (HCAI) and practical steps for effective governance. Since HCAI systems are developed and implemented in multiple organizational structures, I propose 15 recommendations at three levels of governance: team, organization, and industry. The recommendations are intended to increase the reliability, safety, and trustworthiness of HCAI systems: (1) reliable systems based on sound software engineering practices, (2) safety culture through business management strategies, and (3) trustworthy certification by independent oversight. Software engineering practices within teams include audit trails to enable analysis of failures, software engineering workflows, verification and validation testing, bias testing to enhance fairness, and explainable user interfaces. The safety culture within organizations comes from management strategies that include leadership commitment to safety, hiring and training oriented to safety, extensive reporting of failures and near misses, internal review boards for problems and future plans, and alignment with industry standard practices. The trustworthiness certification comes from industry-wide efforts that include government interventions and regulation, accounting firms conducting external audits, insurance companies compensating for failures, non-governmental and civil society organizations advancing design principles, and professional organizations and research institutes developing standards, policies, and novel ideas. The larger goal of effective governance is to limit the dangers and increase the benefits of HCAI to individuals, organizations, and society.

166 citations

Journal ArticleDOI
28 May 2020
TL;DR: Based on an online survey of 183 participants who work in various aspects of data science, the authors find that data science teams are extremely collaborative and work with a variety of stakeholders and tools during the six common steps of a data science workflow (e.g., clean data and train model).
Abstract: Today, the prominence of data science within organizations has given rise to teams of data science workers collaborating on extracting insights from data, as opposed to individual data scientists working alone. However, we still lack a deep understanding of how data science workers collaborate in practice. In this work, we conducted an online survey with 183 participants who work in various aspects of data science. We focused on their reported interactions with each other (e.g., managers with engineers) and with different tools (e.g., Jupyter Notebook). We found that data science teams are extremely collaborative and work with a variety of stakeholders and tools during the six common steps of a data science workflow (e.g., clean data and train model). We also found that the collaborative practices workers employ, such as documentation, vary according to the kinds of tools they use. Based on these findings, we discuss design implications for supporting data science team collaborations and future research directions.

156 citations

Proceedings ArticleDOI
TL;DR: This paper introduces a taxonomy and a set of descriptors that can be used to characterise and systematically assess explainable systems along five key dimensions: functional, operational, usability, safety, and validation.
Abstract: Explanations in Machine Learning come in many forms, but a consensus regarding their desired properties is yet to emerge. In this paper we introduce a taxonomy and a set of descriptors that can be used to characterise and systematically assess explainable systems along five key dimensions: functional, operational, usability, safety and validation. In order to design a comprehensive and representative taxonomy and associated descriptors, we surveyed the eXplainable Artificial Intelligence literature, extracting the criteria and desiderata that other authors have proposed or implicitly used in their research. The survey includes papers introducing new explainability algorithms to see what criteria are used to guide their development and how these algorithms are evaluated, as well as papers proposing such criteria from both computer science and social science perspectives. This novel framework allows one to systematically compare and contrast explainability approaches, not just to better understand their capabilities but also to identify discrepancies between their theoretical qualities and the properties of their implementations. We developed an operationalisation of the framework in the form of Explainability Fact Sheets, which enable researchers and practitioners alike to quickly grasp the capabilities and limitations of a particular explainable method. When used as a Work Sheet, our taxonomy can guide the development of new explainability approaches by aiding in their critical evaluation along the five proposed dimensions.
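As an illustrative sketch only, the five dimensions could be recorded in a simple work-sheet structure like the one below; the example entries and the helper function are hypothetical placeholders, not the descriptors defined in the paper.

```python
# Hypothetical representation of an Explainability Fact Sheet as a mapping
# from the five dimensions named in the abstract to descriptor entries.
# The entries listed here are invented placeholders for illustration.
FACT_SHEET_DIMENSIONS = ("functional", "operational", "usability", "safety", "validation")

example_fact_sheet = {
    "method": "ExampleRuleListExplainer",   # hypothetical explainability method
    "functional": {"problem_type": "supervised classification"},
    "operational": {"interaction": "static explanation, no user dialogue"},
    "usability": {"target_audience": "domain experts"},
    "safety": {"information_leakage": "explanations reveal training statistics"},
    "validation": {"user_study": "none reported"},
}

def missing_dimensions(sheet: dict) -> list:
    """Return the dimensions a fact sheet leaves unaddressed."""
    return [d for d in FACT_SHEET_DIMENSIONS if not sheet.get(d)]

print("Unaddressed dimensions:", missing_dimensions(example_fact_sheet))
```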

142 citations