Author

James O'Shea

Bio: James O'Shea is an academic researcher from Manchester Metropolitan University. The author has contributed to research on topics including semantic similarity and fuzzy classification. The author has an h-index of 17 and has co-authored 90 publications receiving 1702 citations. Previous affiliations of James O'Shea include University College Dublin and the University of Edinburgh.


Papers
Journal Article
TL;DR: Experiments demonstrate that the proposed method provides a similarity measure that shows a significant correlation to human intuition and can be used in a variety of applications that involve text knowledge representation and discovery.
Abstract: Sentence similarity measures play an increasingly important role in text-related research and applications in areas such as text mining, Web page retrieval, and dialogue systems. Existing methods for computing sentence similarity have been adopted from approaches used for long text documents. These methods process sentences in a very high-dimensional space and are consequently inefficient, require human input, and are not adaptable to some application domains. This paper focuses directly on computing the similarity between very short texts of sentence length. It presents an algorithm that takes account of semantic information and word order information implied in the sentences. The semantic similarity of two sentences is calculated using information from a structured lexical database and from corpus statistics. The use of a lexical database enables our method to model human common sense knowledge and the incorporation of corpus statistics allows our method to be adaptable to different domains. The proposed method can be used in a variety of applications that involve text knowledge representation and discovery. Experiments on two sets of selected sentence pairs demonstrate that the proposed method provides a similarity measure that shows a significant correlation to human intuition.

850 citations

Book Chapter
26 Mar 2008
TL;DR: A comparative study of STASIS and LSA is described, which shows measures of semantic similarity can be applied to short texts for use in Conversational Agents (CAs), and a benchmark data set of 65 sentence pairs with human-derived similarity ratings is presented.
Abstract: This paper describes a comparative study of STASIS and LSA. These measures of semantic similarity can be applied to short texts for use in Conversational Agents (CAs). CAs are computer programs that interact with humans through natural language dialogue. Business organizations have spent large sums of money in recent years developing them for online customer self-service, but achievements have been limited to simple FAQ systems. We believe this is due to the labour-intensive process of scripting, which could be reduced radically by the use of short-text semantic similarity measures. "Short texts" are typically 10-20 words long but are not required to be grammatically correct sentences, for example spoken utterances and text messages. We also present a benchmark data set of 65 sentence pairs with human-derived similarity ratings. This data set is the first of its kind, specifically developed to evaluate such measures, and we believe it will be valuable to future researchers.

86 citations

Book
01 Jun 2016

58 citations

Book
16 Apr 2007
TL;DR: In this article, the authors discuss the philosophical quest and the clash of the images and the status of the sensible qualities in science, and the ontological primacy of the scientific image.
Abstract: Dedication. Preface. Introduction. Chapter One: The Philosophical Quest and the Clash of the Images. The quest for a stereoscopic fusion of the manifest and scientific images. The clash of the images and the status of the sensible qualities. Sensing, thinking, and willing: persons as complex physical systems? Chapter Two: Scientific Realism and the Scientific Image. Empiricist approaches to the interpretation of scientific theories. Sellars' critique of empiricism and his defense of scientific realism. The ontological primacy of the scientific image. Chapter Three: Meaning and Abstract Entities. Approaching thought through language: is meaning a relation? Sellars' alternative functional role conception of meaning. The problem of abstract entities: introducing Sellars' nominalism. Abstract entities: problems and prospects for the metalinguistic account. Chapter Four: Thought, Language, and the Myth of Genius Jones. Meaning and pattern-governed linguistic behavior. Bedrock uniformity and rule-following normativity in the space of meanings. Our Rylean ancestors and genius Jones's theory of inner thoughts. Privileged access and other issues in Sellars' account of thinking. Chapter Five: Knowledge, Immediate Experience, and the Myth of the Given. The idea of the given and the case of sense-datum theories. Toward Sellars' account of perception and appearance. Epistemic principles and the holistic structure of our knowledge. Genius Jones act two: the intrinsic character of our sensory experiences. Chapter Six: Truth, Picturing, and Ultimate Ontology. Truth as semantic assertibility and truth as correspondence. Picturing, linguistic representation, and reference. Truth, conceptual change, and the ideal scientific image. The ontology of sensory consciousness and absolute processes. Chapter Seven: A Synoptic Vision: Sellars' Naturalism with a Normative Turn. The structure of Sellars' normative 'Copernican revolution'. Intentions, volitions, and the moral point of view. Persons in the synoptic vision.

55 citations

Journal Article
TL;DR: A novel fuzzy inference algorithm to generate fuzzy decision trees from induced crisp decision trees is proposed, suggesting that the later fuzzy tree is significantly more robust and produces a more balanced classification.

50 citations


Cited by
01 Jan 2009

7,241 citations

Journal Article
TL;DR: This article proposes a framework for representing the meaning of word combinations in vector space in terms of additive and multiplicative functions, and introduces a wide range of composition models that are evaluated empirically on a phrase similarity task.

981 citations

Journal Article
TL;DR: This first of its kind, comprehensive literature review of the diverse field of affective computing focuses mainly on the use of audio, visual and text information for multimodal affect analysis, and outlines existing methods for fusing information from different modalities.

969 citations

Book
01 Jan 1975
TL;DR: The major change in the second edition of this book is the addition of a new chapter on probabilistic retrieval, which I think is one of the most interesting and active areas of research in information retrieval.
Abstract: The major change in the second edition of this book is the addition of a new chapter on probabilistic retrieval. This chapter has been included because I think this is one of the most interesting and active areas of research in information retrieval. There are still many problems to be solved so I hope that this particular chapter will be of some help to those who want to advance the state of knowledge in this area. All the other chapters have been updated by including some of the more recent work on the topics covered. In preparing this new edition I have benefited from discussions with Bruce Croft, The material of this book is aimed at advanced undergraduate information (or computer) science students, postgraduate library science students, and research workers in the field of IR. Some of the chapters, particularly Chapter 6, make simple use of a little advanced mathematics. However, the necessary mathematical tools can be easily mastered from numerous mathematical texts that now exist and, in any case, references have been given where the mathematics occur. I had to face the problem of balancing clarity of exposition with density of references. I was tempted to give large numbers of references but was afraid they would have destroyed the continuity of the text. I have tried to steer a middle course and not compete with the Annual Review of Information Science and Technology. Normally one is encouraged to cite only works that have been published in some readily accessible form, such as a book or periodical. Unfortunately, much of the interesting work in IR is contained in technical reports and Ph.D. theses. For example, most of the work done on the SMART system at Cornell is available only in reports. Luckily many of these are now available through the National Technical Information Service (U.S.) and University Microfilms (U.K.). I have not avoided using these sources although if the same material is accessible more readily in some other form I have given it preference. I should like to acknowledge my considerable debt to many people and institutions that have helped me. Let me say first that they are responsible for many of the ideas in this book but that only I wish to be held responsible. My greatest debt is to Karen Sparck Jones who taught me to research information retrieval as an experimental science. Nick Jardine and Robin …

822 citations