scispace - formally typeset
Search or ask a question
Author

Harith Alani

Bio: Harith Alani is an academic researcher from Open University. The author has contributed to research in topics: Ontology (information science) & Semantic Web. The author has an hindex of 47, co-authored 191 publications receiving 7718 citations. Previous affiliations of Harith Alani include University of South Wales & University of Southampton.


Papers
More filters
Book ChapterDOI
11 Nov 2012
TL;DR: This paper introduces a novel approach of adding semantics as additional features into the training set for sentiment analysis by adding its semantic concept from tweets as an additional feature, and measures the correlation of the representative concept with negative/positive sentiment.
Abstract: Sentiment analysis over Twitter offer organisations a fast and effective way to monitor the publics' feelings towards their brand, business, directors, etc. A wide range of features and methods for training sentiment classifiers for Twitter datasets have been researched in recent years with varying results. In this paper, we introduce a novel approach of adding semantics as additional features into the training set for sentiment analysis. For each extracted entity (e.g. iPhone) from tweets, we add its semantic concept (e.g. "Apple product") as an additional feature, and measure the correlation of the representative concept with negative/positive sentiment. We apply this approach to predict sentiment for three different Twitter datasets. Our results show an average increase of F harmonic accuracy score for identifying both negative and positive sentiment of around 6.5% and 4.8% over the baselines of unigrams and part-of-speech features respectively. We also compare against an approach based on sentiment-bearing topic analysis, and find that semantic features produce better Recall and F score when classifying negative sentiment, and better Precision with lower Recall and F score in positive sentiment classification.

501 citations

Journal ArticleDOI
TL;DR: The Artequakt project is considered, which links a knowledge extraction tool with an ontology to achieve continuous knowledge support and guide information extraction and is further enhanced using a lexicon-based term expansion mechanism that provides extended ontology terminology.
Abstract: To bring the Semantic Web to life and provide advanced knowledge services, we need efficient ways to access and extract knowledge from Web documents. Although Web page annotations could facilitate such knowledge gathering, annotations are rare and will probably never be rich or detailed enough to cover all the knowledge these documents contain. Manual annotation is impractical and unscalable, and automatic annotation tools remain largely undeveloped. Specialized knowledge services therefore require tools that can search and extract specific knowledge directly from unstructured text on the Web, guided by an ontology that details what type of knowledge to harvest. An ontology uses concepts and relations to classify domain knowledge. Other researchers have used ontologies to support knowledge extraction, but few have explored their full potential in this domain. The paper considers the Artequakt project which links a knowledge extraction tool with an ontology to achieve continuous knowledge support and guide information extraction. The extraction tool searches online documents and extracts knowledge that matches the given classification structure. It provides this knowledge in a machine-readable format that will be automatically maintained in a knowledge base (KB). Knowledge extraction is further enhanced using a lexicon-based term expansion mechanism that provides extended ontology terminology.

490 citations

Proceedings Article
01 May 2004
TL;DR: It is proposed in this paper that one approach to ontology evaluation should be corpus or data driven, because a corpus is the most accessible form of knowledge and its use allows a measure to be derived of the ‘fit’ between an ontology and a domain of knowledge.
Abstract: The evaluation of ontologies is vital for the growth of the Semantic Web. We consider a number of problems in evaluating a knowledge artifact like an ontology. We propose in this paper that one approach to ontology evaluation should be corpus or data driven. A corpus is the most accessible form of knowledge and its use allows a measure to be derived of the ‘fit’ between an ontology and a domain of knowledge. We consider a number of methods for measuring this ‘fit’ and propose a measure to evaluate structural fit, and a probabilistic approach to identifying the best ontology.

407 citations

Journal ArticleDOI
TL;DR: Different from typical lexicon-based approaches, SentiCircles takes into account the co-occurrence patterns of words in different contexts in tweets to capture their semantics and update their pre-assigned strength and polarity in sentiment lexicons accordingly.
Abstract: We propose a semantic sentiment representation of words called SentiCircle.SentiCircle captures the contextual semantic of words from their co-occurrences.SentiCircle updates the sentiment of words based on their contextual semantics.SentiCircle can be used to perform entity- and tweet-level level sentiment analysis. Sentiment analysis on Twitter has attracted much attention recently due to its wide applications in both, commercial and public sectors. In this paper we present SentiCircles, a lexicon-based approach for sentiment analysis on Twitter. Different from typical lexicon-based approaches, which offer a fixed and static prior sentiment polarities of words regardless of their context, SentiCircles takes into account the co-occurrence patterns of words in different contexts in tweets to capture their semantics and update their pre-assigned strength and polarity in sentiment lexicons accordingly. Our approach allows for the detection of sentiment at both entity-level and tweet-level. We evaluate our proposed approach on three Twitter datasets using three different sentiment lexicons to derive word prior sentiments. Results show that our approach significantly outperforms the baselines in accuracy and F-measure for entity-level subjectivity (neutral vs. polar) and polarity (positive vs. negative) detections. For tweet-level sentiment detection, our approach performs better than the state-of-the-art SentiStrength by 4-5% in accuracy in two datasets, but falls marginally behind by 1% in F-measure in the third dataset.

375 citations

Journal ArticleDOI
TL;DR: A broad multidisciplinary overview of the area of bias in AI systems is provided, focusing on technical challenges and solutions as well as to suggest new research directions towards approaches well‐grounded in a legal frame.
Abstract: Artificial Intelligence (AI)‐based systems are widely employed nowadays to make decisions that have far‐reaching impact on individuals and society. Their decisions might affect everyone, everywhere, and anytime, entailing concerns about potential human rights issues. Therefore, it is necessary to move beyond traditional AI algorithms optimized for predictive performance and embed ethical and legal principles in their design, training, and deployment to ensure social good while still benefiting from the huge potential of the AI technology. The goal of this survey is to provide a broad multidisciplinary overview of the area of bias in AI systems, focusing on technical challenges and solutions as well as to suggest new research directions towards approaches well‐grounded in a legal frame. In this survey, we focus on data‐driven AI, as a large part of AI is powered nowadays by (big) data and powerful machine learning algorithms. If otherwise not specified, we use the general term bias to describe problems related to the gathering or processing of data that might result in prejudiced decisions on the bases of demographic features such as race, sex, and so forth.

271 citations


Cited by
More filters
Journal ArticleDOI

[...]

08 Dec 2001-BMJ
TL;DR: There is, I think, something ethereal about i —the square root of minus one, which seems an odd beast at that time—an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

Book
01 May 2012
TL;DR: Sentiment analysis and opinion mining is the field of study that analyzes people's opinions, sentiments, evaluations, attitudes, and emotions from written language as discussed by the authors and is one of the most active research areas in natural language processing and is also widely studied in data mining, Web mining, and text mining.
Abstract: Sentiment analysis and opinion mining is the field of study that analyzes people's opinions, sentiments, evaluations, attitudes, and emotions from written language. It is one of the most active research areas in natural language processing and is also widely studied in data mining, Web mining, and text mining. In fact, this research has spread outside of computer science to the management sciences and social sciences due to its importance to business and society as a whole. The growing importance of sentiment analysis coincides with the growth of social media such as reviews, forum discussions, blogs, micro-blogs, Twitter, and social networks. For the first time in human history, we now have a huge volume of opinionated data recorded in digital form for analysis. Sentiment analysis systems are being applied in almost every business and social domain because opinions are central to almost all human activities and are key influencers of our behaviors. Our beliefs and perceptions of reality, and the choices we make, are largely conditioned on how others see and evaluate the world. For this reason, when we need to make a decision we often seek out the opinions of others. This is true not only for individuals but also for organizations. This book is a comprehensive introductory and survey text. It covers all important topics and the latest developments in the field with over 400 references. It is suitable for students, researchers and practitioners who are interested in social media analysis in general and sentiment analysis in particular. Lecturers can readily use it in class for courses on natural language processing, social media analysis, text mining, and data mining. Lecture slides are also available online.

4,515 citations

Journal Article
TL;DR: Thaler and Sunstein this paper described a general explanation of and advocacy for libertarian paternalism, a term coined by the authors in earlier publications, as a general approach to how leaders, systems, organizations, and governments can nudge people to do the things the nudgers want and need done for the betterment of the nudgees, or of society.
Abstract: NUDGE: IMPROVING DECISIONS ABOUT HEALTH, WEALTH, AND HAPPINESS by Richard H. Thaler and Cass R. Sunstein Penguin Books, 2009, 312 pp, ISBN 978-0-14-311526-7This book is best described formally as a general explanation of and advocacy for libertarian paternalism, a term coined by the authors in earlier publications. Informally, it is about how leaders, systems, organizations, and governments can nudge people to do the things the nudgers want and need done for the betterment of the nudgees, or of society. It is paternalism in the sense that "it is legitimate for choice architects to try to influence people's behavior in order to make their lives longer, healthier, and better", (p. 5) It is libertarian in that "people should be free to do what they like - and to opt out of undesirable arrangements if they want to do so", (p. 5) The built-in possibility of opting out or making a different choice preserves freedom of choice even though people's behavior has been influenced by the nature of the presentation of the information or by the structure of the decisionmaking system. I had never heard of libertarian paternalism before reading this book, and I now find it fascinating.Written for a general audience, this book contains mostly social and behavioral science theory and models, but there is considerable discussion of structure and process that has roots in mathematical and quantitative modeling. One of the main applications of this social system is economic choice in investing, selecting and purchasing products and services, systems of taxes, banking (mortgages, borrowing, savings), and retirement systems. Other quantitative social choice systems discussed include environmental effects, health care plans, gambling, and organ donations. Softer issues that are also subject to a nudge-based approach are marriage, education, eating, drinking, smoking, influence, spread of information, and politics. There is something in this book for everyone.The basis for this libertarian paternalism concept is in the social theory called "science of choice", the study of the design and implementation of influence systems on various kinds of people. The terms Econs and Humans, are used to refer to people with either considerable or little rational decision-making talent, respectively. The various libertarian paternalism concepts and systems presented are tested and compared in light of these two types of people. Two foundational issues that this book has in common with another book, Network of Echoes: Imitation, Innovation and Invisible Leaders, that was also reviewed for this issue of the Journal are that 1 ) there are two modes of thinking (or components of the brain) - an automatic (intuitive) process and a reflective (rational) process and 2) the need for conformity and the desire for imitation are powerful forces in human behavior. …

3,435 citations