Topic

Edit distance

About: Edit distance is a research topic. Over its lifetime, 2,887 publications have been published within this topic, receiving 71,491 citations.

Papers

Journal Article

The Edit Distance as a Measure of Perceived Rhythmic Similarity

Olaf Post, Godfried T. Toussaint

01 Jul 2011-Empirical Musicology Review

TL;DR: In this paper, the effectiveness of the edit distance as a predictor of perceived rhythmic dissimilarity under simple rhythmic alterations was investigated, where rhythms were approached as a set of pulses that are either onsets or silences.

Abstract: The 'edit distance' (or 'Levenshtein distance') measure of distance between two data sets is defined as the minimum number of editing operations - insertions, deletions, and substitutions - that are required to transform one data set to the other (Orpen and Huron, 1992). This measure of distance has been applied frequently and successfully in music information retrieval, but rarely in predicting human perception of distance. In this study, we investigate the effectiveness of the edit distance as a predictor of perceived rhythmic dissimilarity under simple rhythmic alterations. Approaching rhythms as a set of pulses that are either onsets or silences, we study two types of alterations. The first experiment is designed to test the model's accuracy for rhythms that are relatively similar; whether rhythmic variations with the same edit distance to a source rhythm are also perceived as relatively similar by human subjects. In addition, we observe whether the salience of an edit operation is affected by its metric placement in the rhythm. Instead of using a rhythm that regularly subdivides a 4/4 meter, our source rhythm is a syncopated 16-pulse rhythm, the son. Results show a high correlation between the predictions by the edit distance model and human similarity judgments (r = 0.87); a higher correlation than for the well-known generative theory of tonal music (r = 0.64). In the second experiment, we seek to assess the accuracy of the edit distance model in predicting relatively dissimilar rhythms. The stimuli used are random permutations of the son's inter-onset intervals: 3-3-4-2-4. The results again indicate that the edit distance correlates well with the perceived rhythmic dissimilarity judgments of the subjects (r = 0.76). To gain insight in the relationships between the individual rhythms, the results are also presented by means of graphic phylogenetic trees.

22 citations
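
The definition in the abstract above (minimum number of insertions, deletions, and substitutions) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the pulse encoding ('x' for an onset, '.' for a silence) and the helper name `intervals_to_pulses` are assumptions made here, and the permuted rhythm is just one example ordering of the son's inter-onset intervals.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def intervals_to_pulses(intervals):
    """Hypothetical encoding: each interval is an onset 'x' followed by silences '.'."""
    return "".join("x" + "." * (n - 1) for n in intervals)


son = intervals_to_pulses([3, 3, 4, 2, 4])       # 16-pulse son rhythm
permuted = intervals_to_pulses([3, 4, 3, 2, 4])  # one permutation of the intervals

print(son)
print(permuted)
print(edit_distance(son, permuted))
```

Under this encoding, moving a single onset by one pulse costs two edits, which is one reason the choice of representation matters when comparing model predictions with listener judgments.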

Book Chapter

Indexing finite language representation of population genotypes

Jouni Sirén¹, Niko Välimäki¹, Veli Mäkinen¹

Helsinki Institute for Information Technology¹

05 Sep 2011

TL;DR: A way to index population genotype information together with the complete genome sequence, so that one can use the index to efficiently align a given sequence to the genome with all plausible genotype recombinations taken into account.

Abstract: We propose a way to index population genotype information together with the complete genome sequence, so that one can use the index to efficiently align a given sequence to the genome with all plausible genotype recombinations taken into account. This is achieved through converting a multiple alignment of individual genomes into a finite automaton recognizing all strings that can be read from the alignment by switching the sequence at any time. The finite automaton is indexed with an extension of Burrows-Wheeler transform to allow pattern search inside the plausible recombinant sequences. The size of the index stays limited, because of the high similarity of individual genomes. The index finds applications in variation calling and in primer design.

22 citations

Proceedings Article

Named entity transliteration for cross-language information retrieval using compressed word format mapping algorithm

Srinivasan Janarthanam¹, Sethuramalingam Subramaniam², Udhyakumar Nallasamy³

University of Edinburgh¹, International Institute of Information Technology², Carnegie Mellon University³

30 Oct 2008

TL;DR: This paper presents a transliteration algorithm for mapping English named entities to their proper Tamil equivalents using a grapheme-based model, in which transliteration equivalents are identified by mapping the source language names to their equivalents in a target language database, instead of generating them.

Abstract: Transliteration of named entities in user queries is a vital step in any Cross-Language Information Retrieval (CLIR) system. Several methods for transliteration have been proposed to date based on the nature of the languages considered. In this paper, we present a transliteration algorithm for mapping English named entities to their proper Tamil equivalents. Our algorithm employs a grapheme-based model, in which transliteration equivalents are identified by mapping the source language names to their equivalents in a target language database, instead of generating them. The basic principle is to compress the source word into its minimal form and align it across an indexed list of target language words to arrive at the top n-equivalents based on the edit distance. We compare the performance of our approach with a statistical generation approach using Microsoft Research India (MSRI) transliteration corpus. Our approach has proved very effective in terms of accuracy and time.

22 citations
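
The "compress, then rank by edit distance" principle from the abstract above can be sketched as follows. The abstract does not say how the minimal form is computed, so the `compress` rule below (lowercase, drop vowels, collapse repeated letters) is purely an assumption for illustration, as are the function names and the romanized target list; the paper's indexed Tamil word list is not modeled.

```python
import re


def edit_distance(a, b):
    """Levenshtein distance via a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def compress(word):
    """Hypothetical 'minimal form': lowercase, drop vowels, collapse repeats."""
    consonants = re.sub(r"[aeiou]", "", word.lower())
    return re.sub(r"(.)\1+", r"\1", consonants)


def top_n(source, targets, n=3):
    """Rank target words by edit distance between compressed forms."""
    key = compress(source)
    # Ties are broken alphabetically so the ordering is deterministic.
    ranked = sorted(targets, key=lambda t: (edit_distance(key, compress(t)), t))
    return ranked[:n]


# Placeholder romanized entries standing in for a target-language database.
targets = ["chennai", "chinnai", "mumbai", "madurai"]
print(top_n("Chenai", targets, n=2))
```

Because candidates are looked up rather than generated, the output is always a word that already exists in the target list, which is the design choice the abstract contrasts with statistical generation.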

Patent

Efficient computation of document similarity

Rikin Gandhi¹, Yasuhiro Matsuda¹, Mohammad Faisal¹

Business International Corporation¹

29 Nov 2006

TL;DR: A system embodiment includes logic to produce a gram from a string and logic to identify candidate documents by matching query grams against document grams stored in an inverted index that relates grams to documents.

Abstract: Systems, methodologies, media, and other embodiments associated with efficiently computing document similarity are described. One exemplary system embodiment includes logic to produce a gram from a string and logic to identify candidate documents based on identifying matches between query grams and document grams stored in an inverted index that relates grams to documents. The example system may also include logic to selectively partially reconstruct a candidate document from entries in the inverted index and logic to compute an edit distance between a string associated with a query and a string associated with the partially reconstructed candidate document. The example system may also include a signal logic configured to provide a signal corresponding to the edit distance.

22 citations
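
The two-stage idea in the abstract above (filter candidates through a gram-based inverted index, then compute edit distance only on the survivors) can be sketched roughly as below. This is not the patented system: the function names, the trigram size, and the `min_shared` threshold are assumptions, and the sketch compares against the full stored strings rather than partially reconstructing documents from index entries.

```python
from collections import defaultdict


def edit_distance(a, b):
    """Levenshtein distance via a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def grams(s, n=3):
    """Set of character n-grams of s."""
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def build_index(docs, n=3):
    """Inverted index mapping each gram to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for g in grams(text, n):
            index[g].add(doc_id)
    return index


def candidates(query, index, n=3, min_shared=1):
    """Ids of documents sharing at least min_shared grams with the query."""
    counts = defaultdict(int)
    for g in grams(query, n):
        for doc_id in index.get(g, ()):
            counts[doc_id] += 1
    return {d for d, c in counts.items() if c >= min_shared}


def rank(query, docs, index, n=3):
    """Edit distance is computed only for candidates that passed the gram filter."""
    return sorted((edit_distance(query, docs[d]), d)
                  for d in candidates(query, index, n))


docs = {"a": "edit distance", "b": "edit distanse", "c": "suffix array"}
index = build_index(docs)
print(rank("edit distance", docs, index))
```

The filter keeps the quadratic-time distance computation away from documents that share no grams with the query, such as document "c" here.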

Journal Article

Fast index for approximate string matching

Dekel Tsur¹

Ben-Gurion University of the Negev¹

01 Dec 2010-Journal of Discrete Algorithms

TL;DR: An index that stores a text of length n such that, given a pattern of length m, all the substrings of the text that are within Hamming distance (or edit distance) at most k from the pattern are reported in O(m + log log n + #matches) time.

22 citations
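
To make the query in the summary above concrete, here is a brute-force baseline for the Hamming-distance case: it reports every start position where a length-m window of the text differs from the pattern in at most k positions. It is only a sketch of the problem being solved, not the paper's index; it scans the whole text in O(nm) time, and the function name is made up for illustration.

```python
def hamming_matches(text, pattern, k):
    """Start positions of windows within Hamming distance k of pattern."""
    m = len(pattern)
    hits = []
    for start in range(len(text) - m + 1):
        mismatches = 0
        for a, b in zip(text[start:start + m], pattern):
            if a != b:
                mismatches += 1
                if mismatches > k:
                    break  # too many mismatches, abandon this window
        else:
            # Loop finished without exceeding k mismatches.
            hits.append(start)
    return hits


print(hamming_matches("abcabd", "abc", 1))
print(hamming_matches("abcabd", "abc", 0))
```

An index of the kind described trades preprocessing and space for query time that no longer depends on scanning all n positions.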

Metrics

Papers: 3,030
Citations: 78,281

No. of papers in the topic in previous years
Year	Papers
2023	39
2022	96
2021	111
2020	149
2019	145
2018	139