Topic

Semantic similarity

About: Semantic similarity is a research topic. Over its lifetime, 14,605 publications have been published within this topic, receiving 364,659 citations. The topic is also known as semantic relatedness.

Papers

Proceedings Article•DOI•

Blanket execution: dynamic similarity testing for program binaries and components

Manuel Egele¹, Maverick Woo¹, Peter Chapman¹, David Brumley¹•Institutions (1)

Carnegie Mellon University¹

20 Aug 2014

TL;DR: This work proposes blanket execution, a novel dynamic equivalence testing primitive that achieves complete coverage by overriding the intended program logic under a controlled randomized environment, and builds a binary search engine that identifies similar functions across optimization boundaries.

Abstract: Matching function binaries--the process of identifying similar functions among binary executables--is a challenge that underlies many security applications such as malware analysis and patch-based exploit generation. Recent work tries to establish semantic similarity based on static analysis methods. Unfortunately, these methods do not perform well if the compared binaries are produced by different compiler toolchains or optimization levels. In this work, we propose blanket execution, a novel dynamic equivalence testing primitive that achieves complete coverage by overriding the intended program logic. Blanket execution collects the side effects of functions during execution under a controlled randomized environment. Two functions are deemed similar, if their corresponding side effects, as observed under the same environment, are similar too. We implement our blanket execution technique in a system called BLEX. We evaluate BLEX rigorously against the state of the art binary comparison tool BinDiff. When comparing optimized and un-optimized executables from the popular GNU coreutils package, BLEX outperforms BinDiff by up to 3.5 times in correctly identifying similar functions. BLEX also outperforms BinDiff if the binaries have been compiled by different compilers. Using the functionality in BLEX, we have also built a binary search engine that identifies similar functions across optimization boundaries. Averaged over all indexed functions, our search engine ranks the correct matches among the top ten results 77% of the time.

173 citations
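
The similarity criterion in the abstract above (two functions are similar if their side effects under the same environment are similar) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: side effects are modeled as sets of hypothetical `(kind, value)` observations, and Jaccard overlap averaged over environments stands in for whatever measure BLEX actually uses.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical representation: one set of (kind, value) observations
# per execution environment. This is not BLEX's actual data model.
SideEffects = FrozenSet[Tuple[str, int]]


def jaccard(a: SideEffects, b: SideEffects) -> float:
    """Jaccard overlap of two observation sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(effects_f: List[SideEffects], effects_g: List[SideEffects]) -> float:
    """Average per-environment overlap; both lists are aligned by environment."""
    if len(effects_f) != len(effects_g):
        raise ValueError("both functions must be observed under the same environments")
    if not effects_f:
        return 0.0
    total = sum(jaccard(a, b) for a, b in zip(effects_f, effects_g))
    return total / len(effects_f)


# Two illustrative functions observed under two environments (made-up data).
env_f = [frozenset({("write", 1), ("syscall", 4)}), frozenset({("write", 2)})]
env_g = [frozenset({("write", 1), ("syscall", 4)}), frozenset({("write", 3)})]

score = similarity(env_f, env_g)
```

In this made-up example the two functions agree completely in the first environment and not at all in the second, so the averaged score is 0.5.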

Proceedings Article•

Weakly Supervised Training of Semantic Parsers

Jayant Krishnamurthy¹, Tom M. Mitchell¹•Institutions (1)

Carnegie Mellon University¹

12 Jul 2012

TL;DR: This work presents a method for training a semantic parser using only a knowledge base and an unlabeled text corpus, without any individually annotated sentences, and demonstrates recovery of this richer structure by extracting logical forms from natural language queries against Freebase.

Abstract: We present a method for training a semantic parser using only a knowledge base and an unlabeled text corpus, without any individually annotated sentences. Our key observation is that multiple forms of weak supervision can be combined to train an accurate semantic parser: semantic supervision from a knowledge base, and syntactic supervision from dependency-parsed sentences. We apply our approach to train a semantic parser that uses 77 relations from Freebase in its knowledge representation. This semantic parser extracts instances of binary relations with state-of-the-art accuracy, while simultaneously recovering much richer semantic structures, such as conjunctions of multiple relations with partially shared arguments. We demonstrate recovery of this richer structure by extracting logical forms from natural language queries against Freebase. On this task, the trained semantic parser achieves 80% precision and 56% recall, despite never having seen an annotated logical form.

172 citations

Journal Article•DOI•

Structural and semantic matching for assessing web-service similarity

Eleni Stroulia¹, Yiqiao Wang¹•Institutions (1)

University of Alberta¹

01 Dec 2005-International Journal of Cooperative Information Systems

TL;DR: A suite of methods is developed that assesses the similarity between two WSDL (Web Service Description Language) specifications based on the structure of their data types and operations and the semantics of their natural language descriptions and identifiers.

Abstract: The web-services stack of standards is designed to support the reuse and interoperation of software components on the web. A critical step in the process of developing applications based on web services is service discovery, i.e. the identification of existing web services that can potentially be used in the context of a new web application. Discovery through catalog-style browsing (such as supported currently by web-service registries) is clearly insufficient. To support programmatic service discovery, we have developed a suite of methods that assess the similarity between two WSDL (Web Service Description Language) specifications based on the structure of their data types and operations and the semantics of their natural language descriptions and identifiers. Given only a textual description of the desired service, a semantic information-retrieval method can be used to identify and order the most relevant WSDL specifications based on the similarity of the element descriptions of the available specifications with the query. If a (potentially partial) specification of the desired service behavior is also available, this set of likely candidates can be further refined by a semantic structure-matching step, assessing the structural similarity of the desired vs the retrieved services and the semantic similarity of their identifiers. In this paper, we describe and experimentally evaluate our suite of service-similarity assessment methods.

172 citations

Journal Article•DOI•

Effects of organization and semantic similarity on recall and recognition

George Mandler¹, Zena Pearlstone¹, Henry S. Koopmans¹•Institutions (1)

University of California, San Diego¹

01 Jun 1969-Journal of Verbal Learning and Verbal Behavior

TL;DR: Three experiments extend the previous finding that the number of categories in organized, categorized lists determines the number of words recalled, and introduce the notion of a postrecognition retrieval check.

172 citations

Proceedings Article•DOI•

Social ranking: uncovering relevant content using tag-based recommender systems

Valentina Zanardi¹, Licia Capra¹•Institutions (1)

University College London¹

23 Oct 2008

TL;DR: This paper proposes Social Ranking, a method that exploits recommender system techniques to increase the efficiency of searches within Web 2.0, and proposes a mechanism to answer a user's query that ranks content based on the inferred semantic distance of the query to the tags associated to such content, weighted by the similarity of the querying user to the users who created those tags.

Abstract: Social (or folksonomic) tagging has become a very popular way to describe, categorise, search, discover and navigate content within Web 2.0 websites. Unlike taxonomies, which overimpose a hierarchical categorisation of content, folksonomies empower end users by enabling them to freely create and choose the categories (in this case, tags) that best describe some content. However, as tags are informally defined, continually changing, and ungoverned, social tagging has often been criticised for lowering, rather than increasing, the efficiency of searching, due to the number of synonyms, homonyms, polysemy, as well as the heterogeneity of users and the noise they introduce. In this paper, we propose Social Ranking, a method that exploits recommender system techniques to increase the efficiency of searches within Web 2.0. We measure users' similarity based on their past tag activity. We infer tags' relationships based on their association to content. We then propose a mechanism to answer a user's query that ranks (recommends) content based on the inferred semantic distance of the query to the tags associated to such content, weighted by the similarity of the querying user to the users who created those tags. A thorough evaluation conducted on the CiteULike dataset demonstrates that Social Ranking neatly improves coverage, while not compromising on accuracy.

171 citations
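
The ranking mechanism described in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: user similarity is taken as cosine similarity over tag-usage counts, tag similarity as cosine similarity over the items each tag is attached to, and the function name `rank`, the annotation triples, and the sample data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(annotations, query_user, query_tags):
    """Score items by query-to-tag similarity weighted by user similarity.

    annotations: list of (user, tag, item) triples (hypothetical format).
    Note: this sketch also counts the querying user's own annotations.
    """
    user_profiles = defaultdict(Counter)  # user -> how often they used each tag
    tag_profiles = defaultdict(Counter)   # tag -> how often it marks each item
    for user, tag, item in annotations:
        user_profiles[user][tag] += 1
        tag_profiles[tag][item] += 1

    scores = defaultdict(float)
    for user, tag, item in annotations:
        # Closeness of this tag to the best-matching query tag.
        tag_sim = max(
            (cosine(tag_profiles[q], tag_profiles[tag]) for q in query_tags),
            default=0.0,
        )
        # Weight by how similar the tagging user is to the querying user.
        user_sim = cosine(user_profiles[query_user], user_profiles[user])
        scores[item] += tag_sim * user_sim
    # Highest score first; ties broken by item name for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up annotations for illustration only.
annotations = [
    ("alice", "python", "doc1"),
    ("alice", "nlp", "doc2"),
    ("bob", "python", "doc1"),
    ("bob", "nlp", "doc3"),
    ("carol", "cooking", "doc4"),
]
ranked = rank(annotations, "alice", ["python"])
```

With this sample data, `doc1` ranks first because it carries the queried tag from two users whose tagging history matches the querying user's, while items tagged only with unrelated tags score zero.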

Metrics

Papers: 15,319

Citations: 407,958

No. of papers in the topic in previous years
Year	Papers
2023	202
2022	522
2021	641
2020	837
2019	866
2018	787
