scispace - formally typeset
Search or ask a question
Author

Ismael Garrido-Muñoz

Bio: Ismael Garrido-Muñoz is an academic researcher. The author has contributed to research in topics: Deep learning & Transfer of learning. The author has an hindex of 1, co-authored 1 publications receiving 15 citations.

Papers
More filters
Posted ContentDOI
TL;DR: Bias is introduced in a formal way and how it has been treated in several networks, in terms of detection and correction, and a strategy to deal with bias in deep NLP is proposed.
Abstract: Deep neural networks are hegemonic approaches to many machine learning areas, including natural language processing (NLP). Thanks to the availability of large corpora collections and the capability of deep architectures to shape internal language mechanisms in self-supervised learning processes (also known as “pre-training”), versatile and performing models are released continuously for every new network design. These networks, somehow, learn a probability distribution of words and relations across the training collection used, inheriting the potential flaws, inconsistencies and biases contained in such a collection. As pre-trained models have been found to be very useful approaches to transfer learning, dealing with bias has become a relevant issue in this new scenario. We introduce bias in a formal way and explore how it has been treated in several networks, in terms of detection and correction. In addition, available resources are identified and a strategy to deal with bias in deep NLP is proposed.

78 citations

TL;DR: In this paper , the authors focus on natural language bias in neural networks and show that the model will have vastly different results depending on attributes such as the subject's gender, race or religion.
Abstract: Recent advances in artificial intelligence have made it possible to make our everyday lives better, from apps that translate with great accuracy, to search engines that understand your query, to virtual assistants that answer your questions. However, these models capture the biases present in society and incorporate them into their knowledge. These biases and prejudices appear in a multitude of applications and systems. Given the same context, the model will have vastly different results depending on attributes such as the subject’s gender, race or religion. In this thesis we focus on natural language bias in neural networks. However, bias is present in many areas of artificial intelligence. This behaviour is pervasive, first the models capture these associations and then replicate them as a result of their application. Bias in AI is encompassed in study areas such as Fairness or Explainability
Proceedings Article
TL;DR: In this paper , a data visualization tool developed during the investigation of the bias present in deep learning language models in Spanish is presented, which allows us to explore in detail the outcome of the response of the models we present with a set of template sentences, allowing us to compare the behavior of the model when the templates are presented with a context that alludes to a man or a woman.
Abstract: This paper presents a data visualization tool developed during the investigation of the bias present in deep learning language models in Spanish. The tool allows us to explore in detail the outcome of the response of the models we present with a set of template sentences, allowing us to compare the behavior of the models when the templates are presented with a context that alludes to a man or a woman. The exploration of the data in the tool is performed at various levels of detail, from visualizing the model output itself with its weights to visualizing the aggregation of the results by categories. It will be this last visualization that will provide some interesting conclusions about how the models perceive mainly women by their bodies and men by their behavior.

Cited by
More filters
Journal ArticleDOI
TL;DR: PromptInject as discussed by the authors proposes a prosaic alignment framework for mask-based iterative adversarial prompt composition, examining how GPT-3 can be easily misaligned by simple handcrafted inputs.
Abstract: Transformer-based large language models (LLMs) provide a powerful foundation for natural language tasks in large-scale customer-facing applications. However, studies that explore their vulnerabilities emerging from malicious user interaction are scarce. By proposing PromptInject, a prosaic alignment framework for mask-based iterative adversarial prompt composition, we examine how GPT-3, the most widely deployed language model in production, can be easily misaligned by simple handcrafted inputs. In particular, we investigate two types of attacks -- goal hijacking and prompt leaking -- and demonstrate that even low-aptitude, but sufficiently ill-intentioned agents, can easily exploit GPT-3's stochastic nature, creating long-tail risks. The code for PromptInject is available at https://github.com/agencyenterprise/PromptInject.

31 citations

Proceedings ArticleDOI
01 Jan 2022
TL;DR: It is found that many metrics to quantify ‘bias’ and ‘fairness’ in language models are not compatible with each other and highly depend on the choice of embeddings, which indicates that fairness or bias evaluation remains challenging for contextualized language models.
Abstract: An increasing awareness of biased patterns in natural language processing resources such as BERT has motivated many metrics to quantify ‘bias’ and ‘fairness’ in these resources. However, comparing the results of different metrics and the works that evaluate with such metrics remains difficult, if not outright impossible. We survey the literature on fairness metrics for pre-trained language models and experimentally evaluate compatibility, including both biases in language models and in their downstream tasks. We do this by combining traditional literature survey, correlation analysis and empirical evaluations. We find that many metrics are not compatible with each other and highly depend on (i) templates, (ii) attribute and target seeds and (iii) the choice of embeddings. We also see no tangible evidence of intrinsic bias relating to extrinsic bias. These results indicate that fairness or bias evaluation remains challenging for contextualized language models, among other reasons because these choices remain subjective. To improve future comparisons and fairness evaluations, we recommend to avoid embedding-based metrics and focus on fairness evaluations in downstream tasks.

29 citations

Journal ArticleDOI
TL;DR: In this article, a systematic study of the limitations and challenges of existing methods for mitigating bias in toxicity detection is presented, along with a case study to introduce the concept of bias shift due to knowledge-based bias mitigation.
Abstract: Detecting online toxicity has always been a challenge due to its inherent subjectivity. Factors such as the context, geography, socio-political climate, and background of the producers and consumers of the posts play a crucial role in determining if the content can be flagged as toxic. Adoption of automated toxicity detection models in production can thus lead to a sidelining of the various groups they aim to help in the first place. It has piqued researchers’ interest in examining unintended biases and their mitigation. Due to the nascent and multi-faceted nature of the work, complete literature is chaotic in its terminologies, techniques, and findings. In this paper, we put together a systematic study of the limitations and challenges of existing methods for mitigating bias in toxicity detection. We look closely at proposed methods for evaluating and mitigating bias in toxic speech detection. To examine the limitations of existing methods, we also conduct a case study to introduce the concept of bias shift due to knowledge-based bias mitigation. The survey concludes with an overview of the critical challenges, research gaps, and future directions. While reducing toxicity on online platforms continues to be an active area of research, a systematic study of various biases and their mitigation strategies will help the research community produce robust and fair models.

18 citations

Journal ArticleDOI
TL;DR: Wang et al. as discussed by the authors provide a comprehensive survey of multi-modal pre-training models, including bidirectional encoder representations (BERT), vision transformer (ViT), generative pre-trained transformers (GPT), etc.
Abstract: With the urgent demand for generalized deep models, many pre-trained big models are proposed, such as bidirectional encoder representations (BERT), vision transformer (ViT), generative pre-trained transformers (GPT), etc. Inspired by the success of these models in single domains (like computer vision and natural language processing), the multi-modal pre-trained big models have also drawn more and more attention in recent years. In this work, we give a comprehensive survey of these models and hope this paper could provide new insights and helps fresh researchers to track the most cutting-edge works. Specifically, we firstly introduce the background of multi-modal pre-training by reviewing the conventional deep learning, pre-training works in natural language process, computer vision, and speech. Then, we introduce the task definition, key challenges, and advantages of multi-modal pre-training models (MM-PTMs), and discuss the MM-PTMs with a focus on data, objectives, network architectures, and knowledge enhanced pre-training. After that, we introduce the downstream tasks used for the validation of large-scale MM-PTMs, including generative, classification, and regression tasks. We also give visualization and analysis of the model parameters and results on representative downstream tasks. Finally, we point out possible research directions for this topic that may benefit future works. In addition, we maintain a continuously updated paper list for large-scale pre-trained multi-modal big models: https://github.com/wangxiao5791509/MultiModal_BigModels_Survey.

16 citations

Proceedings ArticleDOI
20 Oct 2022
TL;DR: In this article , the authors highlight the importance of extrinsic bias metrics that measure how a model's performance on some task is affected by gender, as opposed to intrinsic evaluations of model representations.
Abstract: Considerable efforts to measure and mitigate gender bias in recent years have led to the introduction of an abundance of tasks, datasets, and metrics used in this vein. In this position paper, we assess the current paradigm of gender bias evaluation and identify several flaws in it. First, we highlight the importance of extrinsic bias metrics that measure how a model’s performance on some task is affected by gender, as opposed to intrinsic evaluations of model representations, which are less strongly connected to specific harms to people interacting with systems. We find that only a few extrinsic metrics are measured in most studies, although more can be measured. Second, we find that datasets and metrics are often coupled, and discuss how their coupling hinders the ability to obtain reliable conclusions, and how one may decouple them. We then investigate how the choice of the dataset and its composition, as well as the choice of the metric, affect bias measurement, finding significant variations across each of them. Finally, we propose several guidelines for more reliable gender bias evaluation.

15 citations