
Showing papers on "Similarity (psychology)" published in 2006


Proceedings Article
16 Jul 2006
TL;DR: This paper shows that the semantic similarity method out-performs methods based on simple lexical matching, resulting in up to 13% error rate reduction with respect to the traditional vector-based similarity metric.
Abstract: This paper presents a method for measuring the semantic similarity of texts, using corpus-based and knowledge-based measures of similarity. Previous work on this problem has focused mainly on either large documents (e.g. text classification, information retrieval) or individual words (e.g. synonymy tests). Given that a large fraction of the information available today, on the Web and elsewhere, consists of short text snippets (e.g. abstracts of scientific documents, image captions, product descriptions), in this paper we focus on measuring the semantic similarity of short texts. Through experiments performed on a paraphrase data set, we show that the semantic similarity method out-performs methods based on simple lexical matching, resulting in up to 13% error rate reduction with respect to the traditional vector-based similarity metric.

1,308 citations
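
The "traditional vector-based similarity metric" that the abstract above uses as its baseline is commonly a cosine over term-frequency vectors. The following is a minimal sketch of such a lexical baseline, not the paper's own implementation; the tokenizer and the function names `tokenize` and `cosine_similarity` are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two raw term-frequency vectors."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


# Two paraphrases that share only function words ("the", "is").
lexical_score = cosine_similarity("The car is fast", "The automobile is quick")
```

Because "car"/"automobile" and "fast"/"quick" never match as strings, the score here rests entirely on the shared function words, which is the kind of gap a semantic measure is meant to close.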


Journal ArticleDOI
TL;DR: This work argues for a ‘hybrid’ usage-based view in which acquisition depends on exemplar learning and retention, out of which permanent abstract schemas gradually emerge and are immanent across the summed similarity of exemplar collections.
Abstract: The early phases of syntactic acquisition are characterized by many input frequency and item effects, which argue against theories assuming innate access to classical syntactic categories. In formulating an alternative view, we consider both prototype and exemplar-learning models of categorization. We argue for a ‘hybrid’ usage-based view in which acquisition depends on exemplar learning and retention, out of which permanent abstract schemas gradually emerge and are immanent across the summed similarity of exemplar collections. These schemas are graded in strength depending on the number of exemplars and the degree to which semantic similarity is reinforced by phonological, lexical, and distributional similarity.

234 citations


Patent
03 Feb 2006
TL;DR: In this paper, a knowledge base consisting of a collection of mediasets is used for identifying a new set of media items in response to an input set (or query set) of media items and knowledge base metrics.
Abstract: Systems and methods are disclosed for identifying a new set of media items in response to an input set (or “query set”) of media items and knowledge base metrics. The system uses a knowledge base consisting of a collection of mediasets. Various metrics among media items are considered by analyzing how the media items are grouped to form the mediasets in the knowledge base. Such association or “similarity” metrics are preferably stored in a matrix form that allows the system to efficiently identify a new set of media items that complements the input set of media items.

185 citations
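
The patent abstract above describes association metrics derived from how media items are grouped into mediasets and stored in matrix form. One simple reading of that idea is a co-occurrence count matrix; the sketch below follows that assumption and is not the patented method. The names `cooccurrence_matrix` and `complement`, the scoring rule, and the toy data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_matrix(mediasets):
    """Count, for each pair of items, how many mediasets contain both."""
    matrix = defaultdict(lambda: defaultdict(int))
    for mediaset in mediasets:
        for a, b in combinations(sorted(set(mediaset)), 2):
            matrix[a][b] += 1
            matrix[b][a] += 1
    return matrix


def complement(query_set, matrix, k=2):
    """Rank items outside the query set by summed co-occurrence with it."""
    scores = defaultdict(int)
    for item in query_set:
        for other, count in matrix.get(item, {}).items():
            if other not in query_set:
                scores[other] += count
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:k]]


# Hypothetical knowledge base of four mediasets (e.g. playlists).
KNOWLEDGE_BASE = [["a", "b", "c"], ["a", "b"], ["b", "d"], ["a", "c"]]
MATRIX = cooccurrence_matrix(KNOWLEDGE_BASE)
suggested = complement({"a", "b"}, MATRIX)
```

Precomputing the matrix once means each query only needs row lookups and sums, which is one plausible reason for the matrix storage the abstract mentions.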


Proceedings ArticleDOI
08 Oct 2006
TL;DR: A number of counterarguments that emphasize the importance of continuing research in automatic genre classification are presented and specific strategies for overcoming current performance limitations are discussed.
Abstract: Research in automatic genre classification has been producing increasingly small performance gains in recent years, with the result that some have suggested that such research should be abandoned in favor of more general similarity research. It has been further argued that genre classification is of limited utility as a goal in itself because of the ambiguities and subjectivity inherent to genre. This paper presents a number of counterarguments that emphasize the importance of continuing research in automatic genre classification. Specific strategies for overcoming current performance limitations are discussed, and a brief review of background research in musicology and psychology relating to genre is presented. Insights from these highly relevant fields are generally absent from discourse within the MIR community, and it is hoped that this will help to encourage a more multi-disciplinary approach to automatic genre classification in the future.

156 citations


Journal ArticleDOI
TL;DR: Results show a consistent focus on relational matches as the main determinant of analogical acceptance, and suggest analogy does not require strict overall identity of relational concepts.

151 citations


01 Jan 2006
TL;DR: A method for measuring the similarity of FCA concepts is presented, which is a refinement of a previous proposal of the author that allows a higher correlation with human judgement than other proposals for evaluating concept similarity in a taxonomy defined in the literature.
Abstract: Formal Concept Analysis (FCA) is proving to be of interest in supporting difficult activities that are becoming fundamental in the development of the Semantic Web. Assessing concept similarity is one such activity, since it allows the identification of different concepts that are semantically close. In this paper, a method for measuring the similarity of FCA concepts is presented, which is a refinement of a previous proposal of the author. The refinement consists in determining the similarity of concept descriptors (attributes) by using the information content approach, rather than relying on human domain expertise. The information content approach which has been adopted allows a higher correlation with human judgement than other proposals for evaluating concept similarity in a taxonomy defined in the literature.

124 citations
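
The abstract above relies on an information content approach but does not give a formula. As a rough illustration of the general idea, the sketch below uses a Lin-style score, one well-known information content measure, over a toy taxonomy; the paper's own measure may differ. The taxonomy, the counts, and every function name here are hypothetical.

```python
import math

# Hypothetical toy taxonomy: child -> parent (the root has parent None).
PARENT = {"entity": None, "animal": "entity", "plant": "entity",
          "dog": "animal", "cat": "animal", "oak": "plant"}
# Hypothetical corpus counts of each concept's own occurrences.
COUNTS = {"entity": 0, "animal": 2, "plant": 2, "dog": 10, "cat": 10, "oak": 8}


def ancestors(concept):
    """Return the concept followed by its ancestors up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def information_content(concept):
    """IC(c) = -log p(c), where p(c) covers c and all of its descendants."""
    total = sum(COUNTS.values())
    subsumed = sum(n for c, n in COUNTS.items() if concept in ancestors(c))
    return -math.log(subsumed / total)


def lin_similarity(a, b):
    """Lin-style score: 2 * IC(lowest common ancestor) / (IC(a) + IC(b))."""
    if a == b:
        return 1.0
    common = next(c for c in ancestors(a) if c in ancestors(b))
    denom = information_content(a) + information_content(b)
    return 2 * information_content(common) / denom if denom else 0.0


dog_cat = lin_similarity("dog", "cat")
dog_oak = lin_similarity("dog", "oak")
```

Concepts whose only shared ancestor is the root get a score of zero, because the root subsumes everything and therefore carries no information.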


Journal ArticleDOI
TL;DR: This paper examined the relationship between demographic similarity in the supervisor-subordinate dyad and family-supportive supervision and found that supervisors provided more family support to subordinates who were similar in either gender or race than to those subordinates who were dissimilar.
Abstract: This study examines the relationship between demographic similarity in the supervisor-subordinate dyad and family-supportive supervision. The authors found that supervisors provided more family support to subordinates who were similar in either gender or race than to those subordinates who were dissimilar. In addition, family-supportive supervision was highest when subordinates were similar to supervisors in both gender and race. A family-supportive organizational culture was positively related to family-supportive supervision, although contrary to what was predicted, it did not attenuate the effects of gender similarity and racial similarity on family-supportive supervision. Implications of the findings and directions for future research are discussed.

122 citations


Book ChapterDOI
TL;DR: In this paper, the authors reexamine prototype theory and the evidence with which it is associated, and the determination of extension is achieved by specifying a measure of the match between the representation of an object or class and the prototype representing the category.
Abstract: This chapter reexamines prototype theory and the evidence with which it is associated. In general, the way in which prototype structure was demonstrated for a domain was to establish one or more of four key phenomena about categories in that domain. The four key phenomena are vagueness, typicality, genericity, and opacity. These are well documented and they constituted the basis on which domains as different as syntactical word classes, phonetic categories, speech perception, speech acts, psychiatric diagnosis, and personality perception were given a prototype treatment. For Prototype Theory the determination of extension is achieved by specifying a measure of the match between the representation of an object or class and the prototype representing the category. Category vagueness provides support for prototype representations given an additional assumption that the representation itself or the processes that utilize that representation are subject to random or contextual noise. Variation in the typicality of category members is often cited as one of the core tenets of Prototype Theory. However, it is questionable whether the simple fact of typicality variation itself is particularly discriminating between Prototype Theory and other accounts of concepts. The problem is that when instructed to judge typicality or goodness-of-example it may be unclear just what aspect of the category members people may be attending to. Typicality effects can be identified that are not simply to do with the familiarity or availability of category members. Theories of concepts that do not base categorization on similarity tend to be dismissive of typicality effects. The fourth phenomenon considered to support a prototype view of concepts is the difficulty that has been encountered in generating good accurate definitions of the meanings of content words (particularly nouns and verbs) in any language.

109 citations


Book ChapterDOI
29 Oct 2006
TL;DR: This paper presents ongoing work to develop a context-aware similarity theory for concepts specified in expressive description logics such as $\mathcal{ALCNR}$.
Abstract: Similarity measurement theories play an increasing role in GIScience and especially in information retrieval and integration. Existing feature and geometric models have proven useful in detecting close but not identical concepts and entities. However, until now none of these theories are able to handle the expressivity of description logics for various reasons and therefore are not applicable to the kind of ontologies usually developed for geographic information systems or the upcoming geospatial semantic web. To close the resulting gap between available similarity theories on the one side and existing ontologies on the other, this paper presents ongoing work to develop a context-aware similarity theory for concepts specified in expressive description logics such as $\mathcal{ALCNR}$.

87 citations


Journal ArticleDOI
TL;DR: In this article, the authors employed the "bogus stranger" paradigm and focused on similarity/dissimilarity of interests in the context of attraction to a same-gender other.
Abstract: This study tested the hypothesis from the self-expansion model that the usual effect of greater attraction to a similar (vs. dissimilar) stranger will be reduced or reversed when a person is given information that a relationship would be likely to develop (i.e., that they would be very likely to get along) with the other person. The study employed the “bogus stranger” paradigm and focused on similarity/dissimilarity of interests in the context of attraction to a same-gender other. The effect for similarity under conditions in which no information is given about relationship likelihood replicated the usual pattern of greater attraction to similars. However, as predicted, a significant similarity by information interaction demonstrated that this effect was significantly reduced (and slightly reversed) when participants had been given information that the partner will like self. In analyses for each gender separately, both of these effects were significant only for men, suggesting that the focus on interest similarity may have been less relevant for women.

86 citations


Journal ArticleDOI
TL;DR: In this paper, three British datasets on genetically modified food were used to test the plausibility of a causal model that integrates three different approaches to trust: dimensional, salient value similarity, and associationist.
Abstract: Although it is widely recognized that trust plays an important role in people's responses to various risks, there is still considerable conceptual disagreement about the different aspects of trust. There are at least 3 different approaches to trust: (a) dimensional, (b) salient value similarity, and (c) associationist. Three British datasets on genetically modified food were used to test the plausibility of a causal model that integrates these approaches. It appears that value similarity can be predicted by a combination of prior attitudes and perceived attitudes of the other, and that value similarity precedes other important trust judgments. The study suggests that various risk-relevant judgments are expressions of a more general attitude toward genetically modified food, and raises questions about the usefulness of detailed modelling.

Journal ArticleDOI
TL;DR: It is shown that it is possible to match the social attributes of a technological artifact with those of the user, and the specific ways in which technology design can manifest social attributes are described.
Abstract: This research proposes that technological artifacts are perceived as social actors, and that users can attribute personality and behavioral traits to them. These formed perceptions interact with the user’s own characteristics to construct an evaluation of the similarity between the user and the technological artifact. Such perceptions of similarity are important because individuals tend to more positively evaluate others, in this case technological artifacts, to whom they are more similar. Using an automated shopping assistant as one type of technological artifact, we investigate two types of perceived similarity between the customer and the artifact: perceived personality similarity and perceived behavioral similarity. We then investigate how design characteristics drive a customer’s perceptions of these similarities and, importantly, the bases for those design characteristics. Decisional guidance and speech act theory provide the basis for personality manifestation, while normative versus heuristic-based decision rules provide the basis for behavioral manifestation. We apply these design bases in an experiment. The results demonstrate that IT design characteristics can be used to manifest desired personalities and behaviors in a technological artifact. Moreover, these manifestations of personality and behavior interact with the customer’s own personality and behaviors to create matching perceptions of personality and behavioral similarity between the customer and the artifact. This study emphasizes the need to consider technological artifacts as social actors and describes the specific ways in which technology design can manifest social attributes. In doing so, we show that it is possible to match the social attributes of a technological artifact with those of the user.
Published in: Journal of the Association for Information Systems, Vol. 7 No. 12, pp. 821-861, December 2006 (The Role of Design Characteristics in Shaping Perceptions/Al-Natour et al.). Dennis Galetta was the accepting senior editor. This paper was submitted on March 1 2006 and went through 2 rounds of revision.

Journal ArticleDOI
TL;DR: The author presents some suggested research directions for furthering the understanding of similarity-driven reasoning, analogy, learning, and explanation in AI.
Abstract: As AI moves into the second half of its first century, we certainly have much to cheer about. For AI to become truly robust, we must further our understanding of similarity-driven reasoning, analogy, learning, and explanation. In this article, the author presents some suggested research directions

Proceedings ArticleDOI
18 Dec 2006
TL;DR: A new concept-based mining model is introduced that relies on the analysis of both the sentence and the document, rather than the traditional analysis of the document dataset only, and it enhances the clustering quality of sets of documents substantially.
Abstract: Most text mining techniques are based on word and/or phrase analysis of the text. The statistical analysis of a term (word or phrase) frequency captures the importance of the term within a document. However, to achieve a more accurate analysis, the underlying mining technique should indicate terms that capture the semantics of the text from which the importance of a term in a sentence and in the document can be derived. A new concept-based mining model is introduced that relies on the analysis of both the sentence and the document, rather than the traditional analysis of the document dataset only. The proposed mining model consists of a concept-based analysis of terms and a concept-based similarity measure. The term which contributes to the sentence semantics is analyzed with respect to its importance at the sentence and document levels. The model can efficiently find significant matching terms, either words or phrases, of the documents according to the semantics of the text. The similarity between documents relies on a new concept-based similarity measure which is applied to the matching terms between documents. Experiments using the proposed concept-based term analysis and similarity measure in text clustering are conducted. Experimental results demonstrate that the newly developed concept-based mining model enhances the clustering quality of sets of documents substantially.

Journal ArticleDOI
TL;DR: Relation priming is a phenomenon in which comprehension of a word pair (e.g. COPPER HORSE) is facilitated by the prior presentation of another word pair that instantiates the same conceptual relation (i.e. composed of).

Proceedings ArticleDOI
18 Dec 2006
TL;DR: A novel research problem of mining the latent friends of bloggers based on the contents of their blog entries is put forward, and a detailed analysis of the advantages and disadvantages of different approaches is given.
Abstract: The rapid growth of blog (also known as "weblog") data provides a rich resource for social community mining. In this paper, we put forward a novel research problem of mining the latent friends of bloggers based on the contents of their blog entries. Latent friends are defined in this paper as people who share a similar topic distribution in their blogs. These people may not actually know each other, but they have the interest and potential to find each other out. Three approaches are designed for latent friend detection. The first one, called the cosine similarity-based method, determines the similarity between bloggers by calculating the cosine similarity between the contents of the blogs. The second approach, known as the topic-based method, is based on the discovery of latent topics using a latent topic model and then calculating the similarity at the topic level. The third one is two-level similarity-based, which is conducted in two stages. In the first stage, an existing topic hierarchy is exploited to build a topic distribution for a blogger. Then, in the second stage, a detailed similarity comparison is conducted for bloggers who were found in the first stage to be close in interest to each other. Our experimental results show that both the topic-based and two-level similarity-based methods work well, and the last approach performs much better than the first two. In this paper, we give a detailed analysis of the advantages and disadvantages of different approaches.
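
The two-level approach described in the abstract above can be sketched as a coarse filter on topic distributions followed by a detailed content comparison. This is only an illustration under stated assumptions: the abstract does not say how either stage measures similarity, so cosine is used for both here, and the threshold, the function names, and the toy data are hypothetical.

```python
import math
from collections import Counter


def cosine(vec_a, vec_b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * vec_b[k] for k, w in vec_a.items() if k in vec_b)
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def latent_friends(target, topics, contents, topic_threshold=0.8):
    """Stage 1 filters by topic distribution; stage 2 ranks by content."""
    candidates = [b for b in topics
                  if b != target
                  and cosine(topics[target], topics[b]) >= topic_threshold]
    target_words = Counter(contents[target].lower().split())
    scored = [(cosine(target_words, Counter(contents[b].lower().split())), b)
              for b in candidates]
    return [b for _, b in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Hypothetical topic distributions and blog contents for four bloggers.
TOPICS = {"ann": {"music": 0.9, "sports": 0.1},
          "bob": {"music": 0.8, "sports": 0.2},
          "cy": {"sports": 1.0},
          "dee": {"music": 1.0}}
CONTENTS = {"ann": "jazz piano concert tonight",
            "bob": "rock concert tickets",
            "cy": "football match tonight",
            "dee": "jazz piano lessons"}
friends_of_ann = latent_friends("ann", TOPICS, CONTENTS)
```

In this toy data "cy" shares the word "tonight" with "ann" but is dropped in the first stage because the topic distributions differ, so the more expensive content comparison only runs on plausible candidates.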

Journal ArticleDOI
TL;DR: The semantic similarity between protocols and the current sentence and prior causal sentences was computed using LSA, and the magnitude of the similarity, as expressed by LSA-generated cosines, predicted performance on comprehension questions and the Nelson-Denny test of comprehension.
Abstract: The present study used latent semantic analysis (LSA) to analyze verbal protocols that were collected while participants read expository passages. In the study, participants were asked to type their thoughts after reading each sentence of 2 scientific texts. The semantic similarity between the protocols and the current sentence and prior causal sentences was computed using LSA. The magnitude of the similarity, as expressed by LSA-generated cosines, predicted performance on comprehension questions and the Nelson-Denny test of comprehension. The results were discussed in the context of previous research and test development.

Proceedings Article
01 Jan 2006
TL;DR: The proposed mid-level melody-based representation is an attempt to bridge the gap between audio and symbolic domains by providing an integrated melodic, rhythmic and structural representation of music signals.
Abstract: We propose a mid-level melody-based representation that incorporates melodic, rhythmic and structural aspects of a music signal and is useful for calculating audio similarity measures. Most current approaches to music similarity use either low-level signal features, such as MFCCs that mostly capture timbral characteristics of music and contain little semantic information, or require symbolic representations, which are difficult to obtain from audio signals. The proposed mid-level representation is our attempt to bridge the gap between audio and symbolic domains by providing an integrated melodic, rhythmic and structural representation of music signals. The representation is based on a set of melodic fragments extracted from prominent melodic lines, it is beat-synchronous, which makes it independent of tempo variations and contains information on repetitions of short melodic phrases within the analyzed piece. We show how it can be calculated automatically from polyphonic audio signals and demonstrate its use for discovering melodic similarities between songs. We present results obtained by using the representation for finding different interpretations of songs in a music collection.

Journal ArticleDOI
TL;DR: Farrell et al. as mentioned in this paper showed that placing dissimilar items on lists of phonologically similar items enhances the accuracy of ordered recall of the dissimilar item, and the long-term similarity structure of the items leads to dissimilar objects being more distinct on mixed lists.

Book ChapterDOI
29 Oct 2006
TL;DR: A sensitive measurement of semantic similarity is defined, which takes into account different hints hidden in the ontology definition and explicitly considers the application context.
Abstract: The paper proposes a framework to assess the semantic similarity among instances within an ontology. It aims to define a sensitive measurement of semantic similarity, which takes into account different hints hidden in the ontology definition and explicitly considers the application context. The similarity measurement is computed by combining and extending existing similarity measures and tailoring them according to the criteria induced by the context. Experiments and evaluation of the similarity assessment are provided.

Journal ArticleDOI
TL;DR: In two experiments, the authors investigate judgments of document similarity as a function of interpoint distance and region membership in region-display spatializations.
Abstract: Region-display spatializations represent documents metaphorically as points within regions. Semantic interrelatedness is expressed by some combination of interpoint distance and region membership. In two experiments, the authors investigate judgments of document similarity as a function of these variables. Distance matters, but region membership largely determines judged similarity; hue further modifies it. Spatial metaphors are a popular approach to visualizing information in such forms as information worlds, information spaces, and cyberspaces

Patent
24 Apr 2006
TL;DR: In this paper, a conceptual representation space is generated based on source language documents and target language documents, wherein respective terms from the source-language and target-language documents have a representation in the conceptual space.
Abstract: An embodiment of the present invention provides a method for automatically translating text. First, a conceptual representation space is generated based on source-language documents and target-language documents, wherein respective terms from the source-language and target-language documents have a representation in the conceptual representation space. Second, a new source-language document is represented in the conceptual representation space, wherein a subset of terms in the new source-language document is represented in the conceptual representation space, such that each term in the subset has a representation in the conceptual representation space. Then, a term in the new source-language document is automatically translated into a corresponding target-language term based on a similarity between the representation of the term and the representation of the corresponding target-language term.

Journal ArticleDOI
TL;DR: In this paper, the authors examined whether recruiter-applicant demographic similarity affects selection decisions and the mediators proposed by the similarity-attraction paradigm were tested, however, they found that the similarity was not correlated with selection decisions.
Abstract: This study examines whether recruiter-applicant demographic similarity affects selection decisions. In addition, the mediators proposed by the similarity-attraction paradigm were tested. However, c...

Journal ArticleDOI
TL;DR: This paper examined gender disparity in union leadership by studying the effects of gender similarity between union members and their stewards and found that gender similarity augmented the effect of verbal persuasion on self-efficacy to serve as a steward.
Abstract: We examined gender disparity in union leadership by studying the effects of gender similarity between union members and their stewards. We theorized and found that gender similarity augmented the effect of verbal persuasion on self-efficacy to serve as a steward.

Journal ArticleDOI
TL;DR: This article reviewed the current lines of thought about Neanderthals and explored the validity of the conclusions and constructed a social synthesis, a solid foundation upon which the validity of inferences regarding Neanderthal cognitive ability and behavioural complexity may be examined.
Abstract: The Neanderthals have long fascinated archaeologists and anthropologists alike. Similarity to us coupled with clear differences has produced endless theorizing. This article reviews the background to such ideas. It examines the current lines of thought about Neanderthals and explores the validity of the conclusions. The ultimate aim is the construction of a social synthesis, a solid foundation upon which the validity of inferences regarding Neanderthal cognitive ability and behavioural complexity may be examined.

Proceedings ArticleDOI
08 Jun 2006
TL;DR: Examining language similarity in messages over time in an online community of adolescents from around the world using three computational measures: Spearman's Correlation Coefficient, Zipping and Latent Semantic Analysis suggests that participants' language diverges over a six-week period.
Abstract: This paper examines language similarity in messages over time in an online community of adolescents from around the world using three computational measures: Spearman's Correlation Coefficient, Zipping and Latent Semantic Analysis. Results suggest that the participants' language diverges over a six-week period, and that divergence is not mediated by demographic variables such as leadership status or gender. This divergence may represent the introduction of more unique words over time, and is influenced by a continual change in subtopics over time, as well as community-wide historical events that introduce new vocabulary at later time periods. Our results highlight both the possibilities and shortcomings of using document similarity measures to assess convergence in language use.
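
The abstract above lists "Zipping" among its measures without describing the procedure. A common compression-based measure is the normalized compression distance, sketched below with `zlib`; this is an assumption about the general family of techniques rather than a reproduction of the paper's method, and the function names and sample texts are hypothetical.

```python
import zlib


def compressed_size(text):
    """Number of bytes after zlib compression at the highest level."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def ncd(text_a, text_b):
    """Normalized compression distance: lower means more shared structure."""
    size_a = compressed_size(text_a)
    size_b = compressed_size(text_b)
    size_ab = compressed_size(text_a + text_b)
    return (size_ab - min(size_a, size_b)) / max(size_a, size_b)


# Hypothetical message samples; repeated to give the compressor material.
WEEK_ONE = "the quick brown fox jumps over the lazy dog " * 20
WEEK_SIX = "colorless green ideas sleep furiously tonight " * 20
same_distance = ncd(WEEK_ONE, WEEK_ONE)
cross_distance = ncd(WEEK_ONE, WEEK_SIX)
```

The intuition is that concatenating two texts with shared vocabulary compresses almost as well as one of them alone, so tracking this distance between message samples from different weeks would show divergence as a rising value.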

Patent
Julia E. Rice, Peter Schwarz, Yu Deng
12 Jul 2006
TL;DR: In this article, a generalized axiomatic definition of information-theoretic similarity is provided for taxonomies that are structured as directed acyclic graphs from which multiple terms may be used to describe an object.
Abstract: A generalized axiomatic definition of information-theoretic similarity is provided for taxonomies that are structured as directed acyclic graphs from which multiple terms may be used to describe an object. The definition is adaptable in the presence of ambiguity, as introduced by an evolving taxonomy or classifiers with imperfect knowledge, and two new similarity measures are introduced based on the definitions. A pragmatic implementation is also provided for similarity measures that are tightly integrated with an object-relational database and scales to large taxonomies and large datasets.

23 Jul 2006
TL;DR: In many theoretical and applied areas of computational linguistics researchers operate with a notion of linguistic distance or, conversely, linguistic similarity, which is the focus of the present workshop.
Abstract: In many theoretical and applied areas of computational linguistics researchers operate with a notion of linguistic distance or, conversely, linguistic similarity, which is the focus of the present workshop. While many CL areas make frequent use of such notions, they have received little focused attention, an honorable exception being Lebart & Rajman (2000). This workshop brings a number of these strands together, highlighting a number of common issues.

Proceedings Article
01 Jan 2006
TL;DR: Multidimensional scaling reveals a proximity of songs belonging to the same genre, congruent with the idea of genre being a perceptual dimension in subjects’ similarity ranking.
Abstract: This paper presents an empirical method for assessing music similarity on a set of stimuli using triadic comparisons in a balanced incomplete block design. We first evaluated the consistency of subjects in their rankings and then the concordance across subjects. The concordance was also evaluated for different subject populations to assess the influence of experience of the subject with the musical material. We finally analysed subjects’ rankings by means of multidimensional scaling. Similarity judgments were found to be rather concordant across subjects. Significant differences between musicians and non-musicians and between subjects being familiar or non-familiar with the music were found for a small number of cases. Multidimensional scaling reveals a proximity of songs belonging to the same genre, congruent with the idea of genre being a perceptual dimension in subjects’ similarity ranking.

Journal ArticleDOI
01 May 2006 - Synthese
TL;DR: The purpose of this paper is to reevaluate Lewis’s response to one of the oldest and most familiar objections to this proposal, the future similarity objection.
Abstract: David Lewis has long defended an analysis of counterfactuals in terms of comparative similarity of possible worlds. The purpose of this paper is to reevaluate Lewis's response to one of the oldest and most familiar objections to this proposal, the future similarity objection.