scispace - formally typeset
Search or ask a question
Author

Ingmar Weber

Bio: Ingmar Weber is an academic researcher from Qatar Computing Research Institute. The author has contributed to research in topics: Social media & Population. The author has an hindex of 49, co-authored 257 publications receiving 9175 citations. Previous affiliations of Ingmar Weber include École Polytechnique Fédérale de Lausanne & Yahoo!.


Papers
More filters
Proceedings Article
03 May 2017
TL;DR: This work used a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords and labels a sample of these tweets into three categories: those containinghate speech, only offensive language, and those with neither.
Abstract: A key challenge for automatic hate-speech detection on social media is the separation of hate speech from other instances of offensive language. Lexical detection methods tend to have low precision because they classify all messages containing particular terms as hate speech and previous work using supervised learning has failed to distinguish between the two categories. We used a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords. We use crowd-sourcing to label a sample of these tweets into three categories: those containing hate speech, only offensive language, and those with neither. We train a multi-class classifier to distinguish between these different categories. Close analysis of the predictions and the errors shows when we can reliably separate hate speech from other offensive language and when this differentiation is more difficult. We find that racist and homophobic tweets are more likely to be classified as hate speech but that sexist tweets are generally classified as offensive. Tweets without explicit hate keywords are also more difficult to classify.

1,425 citations

Posted Content
TL;DR: This article used a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords and trained a multi-class classifier to distinguish hate speech from other offensive language, finding that racist and homophobic tweets are more likely to be classified as hate speech but that sexist tweets are generally classified as offensive.
Abstract: A key challenge for automatic hate-speech detection on social media is the separation of hate speech from other instances of offensive language. Lexical detection methods tend to have low precision because they classify all messages containing particular terms as hate speech and previous work using supervised learning has failed to distinguish between the two categories. We used a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords. We use crowd-sourcing to label a sample of these tweets into three categories: those containing hate speech, only offensive language, and those with neither. We train a multi-class classifier to distinguish between these different categories. Close analysis of the predictions and the errors shows when we can reliably separate hate speech from other offensive language and when this differentiation is more difficult. We find that racist and homophobic tweets are more likely to be classified as hate speech but that sexist tweets are generally classified as offensive. Tweets without explicit hate keywords are also more difficult to classify.

871 citations

Proceedings ArticleDOI
21 Jul 2017
TL;DR: This paper introduces Recipe1M, a new large-scale, structured corpus of over 1m cooking recipes and 800k food images, and demonstrates that regularization via the addition of a high-level classification objective both improves retrieval performance to rival that of humans and enables semantic vector arithmetic.
Abstract: In this paper, we introduce Recipe1M, a new large-scale, structured corpus of over 1m cooking recipes and 800k food images. As the largest publicly available collection of recipe data, Recipe1M affords the ability to train high-capacity models on aligned, multi-modal data. Using these data, we train a neural network to find a joint embedding of recipes and images that yields impressive results on an image-recipe retrieval task. Additionally, we demonstrate that regularization via the addition of a high-level classification objective both improves retrieval performance to rival that of humans and enables semantic vector arithmetic. We postulate that these embeddings will provide a basis for further exploration of the Recipe1M dataset and food and cooking in general. Code, data and models are publicly available

346 citations

Proceedings ArticleDOI
01 May 2019
TL;DR: This article examined racial bias in five different sets of Twitter data annotated for hate speech and abusive language and found that abusive language detection systems may discriminate against the groups who are often the targets of the abuse we are trying to detect.
Abstract: Technologies for abusive language detection are being developed and applied with little consideration of their potential biases. We examine racial bias in five different sets of Twitter data annotated for hate speech and abusive language. We train classifiers on these datasets and compare the predictions of these classifiers on tweets written in African-American English with those written in Standard American English. The results show evidence of systematic racial bias in all datasets, as classifiers trained on them tend to predict that tweets written in African-American English are abusive at substantially higher rates. If these abusive language detection systems are used in the field they will therefore have a disproportionate negative impact on African-American social media users. Consequently, these systems may discriminate against the groups who are often the targets of the abuse we are trying to detect.

328 citations

Proceedings ArticleDOI
01 Aug 2017
TL;DR: This paper proposed a typology that captures central similarities and differences between subtasks and discuss the implications of this for data annotation and feature construction, emphasizing the practical actions that can be taken by researchers to best approach their abusive language detection subtask of interest.
Abstract: As the body of research on abusive language detection and analysis grows, there is a need for critical consideration of the relationships between different subtasks that have been grouped under this label. Based on work on hate speech, cyberbullying, and online abuse we propose a typology that captures central similarities and differences between subtasks and discuss the implications of this for data annotation and feature construction. We emphasize the practical actions that can be taken by researchers to best approach their abusive language detection subtask of interest.

315 citations


Cited by
More filters
01 Jan 2016
TL;DR: The using multivariate statistics is universally compatible with any devices to read, allowing you to get the most less latency time to download any of the authors' books like this one.
Abstract: Thank you for downloading using multivariate statistics. As you may know, people have look hundreds times for their favorite novels like this using multivariate statistics, but end up in infectious downloads. Rather than reading a good book with a cup of tea in the afternoon, instead they juggled with some harmful bugs inside their laptop. using multivariate statistics is available in our digital library an online access to it is set as public so you can download it instantly. Our books collection saves in multiple locations, allowing you to get the most less latency time to download any of our books like this one. Merely said, the using multivariate statistics is universally compatible with any devices to read.

14,604 citations

Christopher M. Bishop1
01 Jan 2006
TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Book
01 Jan 2001
TL;DR: This chapter discusses Decision-Theoretic Foundations, Game Theory, Rationality, and Intelligence, and the Decision-Analytic Approach to Games, which aims to clarify the role of rationality in decision-making.
Abstract: Preface 1. Decision-Theoretic Foundations 1.1 Game Theory, Rationality, and Intelligence 1.2 Basic Concepts of Decision Theory 1.3 Axioms 1.4 The Expected-Utility Maximization Theorem 1.5 Equivalent Representations 1.6 Bayesian Conditional-Probability Systems 1.7 Limitations of the Bayesian Model 1.8 Domination 1.9 Proofs of the Domination Theorems Exercises 2. Basic Models 2.1 Games in Extensive Form 2.2 Strategic Form and the Normal Representation 2.3 Equivalence of Strategic-Form Games 2.4 Reduced Normal Representations 2.5 Elimination of Dominated Strategies 2.6 Multiagent Representations 2.7 Common Knowledge 2.8 Bayesian Games 2.9 Modeling Games with Incomplete Information Exercises 3. Equilibria of Strategic-Form Games 3.1 Domination and Ratonalizability 3.2 Nash Equilibrium 3.3 Computing Nash Equilibria 3.4 Significance of Nash Equilibria 3.5 The Focal-Point Effect 3.6 The Decision-Analytic Approach to Games 3.7 Evolution. Resistance. and Risk Dominance 3.8 Two-Person Zero-Sum Games 3.9 Bayesian Equilibria 3.10 Purification of Randomized Strategies in Equilibria 3.11 Auctions 3.12 Proof of Existence of Equilibrium 3.13 Infinite Strategy Sets Exercises 4. Sequential Equilibria of Extensive-Form Games 4.1 Mixed Strategies and Behavioral Strategies 4.2 Equilibria in Behavioral Strategies 4.3 Sequential Rationality at Information States with Positive Probability 4.4 Consistent Beliefs and Sequential Rationality at All Information States 4.5 Computing Sequential Equilibria 4.6 Subgame-Perfect Equilibria 4.7 Games with Perfect Information 4.8 Adding Chance Events with Small Probability 4.9 Forward Induction 4.10 Voting and Binary Agendas 4.11 Technical Proofs Exercises 5. Refinements of Equilibrium in Strategic Form 5.1 Introduction 5.2 Perfect Equilibria 5.3 Existence of Perfect and Sequential Equilibria 5.4 Proper Equilibria 5.5 Persistent Equilibria 5.6 Stable Sets 01 Equilibria 5.7 Generic Properties 5.8 Conclusions Exercises 6. Games with Communication 6.1 Contracts and Correlated Strategies 6.2 Correlated Equilibria 6.3 Bayesian Games with Communication 6.4 Bayesian Collective-Choice Problems and Bayesian Bargaining Problems 6.5 Trading Problems with Linear Utility 6.6 General Participation Constraints for Bayesian Games with Contracts 6.7 Sender-Receiver Games 6.8 Acceptable and Predominant Correlated Equilibria 6.9 Communication in Extensive-Form and Multistage Games Exercises Bibliographic Note 7. Repeated Games 7.1 The Repeated Prisoners Dilemma 7.2 A General Model of Repeated Garnet 7.3 Stationary Equilibria of Repeated Games with Complete State Information and Discounting 7.4 Repeated Games with Standard Information: Examples 7.5 General Feasibility Theorems for Standard Repeated Games 7.6 Finitely Repeated Games and the Role of Initial Doubt 7.7 Imperfect Observability of Moves 7.8 Repeated Wines in Large Decentralized Groups 7.9 Repeated Games with Incomplete Information 7.10 Continuous Time 7.11 Evolutionary Simulation of Repeated Games Exercises 8. Bargaining and Cooperation in Two-Person Games 8.1 Noncooperative Foundations of Cooperative Game Theory 8.2 Two-Person Bargaining Problems and the Nash Bargaining Solution 8.3 Interpersonal Comparisons of Weighted Utility 8.4 Transferable Utility 8.5 Rational Threats 8.6 Other Bargaining Solutions 8.7 An Alternating-Offer Bargaining Game 8.8 An Alternating-Offer Game with Incomplete Information 8.9 A Discrete Alternating-Offer Game 8.10 Renegotiation Exercises 9. Coalitions in Cooperative Games 9.1 Introduction to Coalitional Analysis 9.2 Characteristic Functions with Transferable Utility 9.3 The Core 9.4 The Shapkey Value 9.5 Values with Cooperation Structures 9.6 Other Solution Concepts 9.7 Colational Games with Nontransferable Utility 9.8 Cores without Transferable Utility 9.9 Values without Transferable Utility Exercises Bibliographic Note 10. Cooperation under Uncertainty 10.1 Introduction 10.2 Concepts of Efficiency 10.3 An Example 10.4 Ex Post Inefficiency and Subsequent Oilers 10.5 Computing Incentive-Efficient Mechanisms 10.6 Inscrutability and Durability 10.7 Mechanism Selection by an Informed Principal 10.8 Neutral Bargaining Solutions 10.9 Dynamic Matching Processes with Incomplete Information Exercises Bibliography Index

3,569 citations