Topic
String (computer science)
About: String (computer science) is a research topic. Over its lifetime, 19,430 publications have been published within this topic, receiving 333,247 citations. The topic is also known as: str & s.
Papers
••
01 Aug 2012
TL;DR: This paper designs efficient trie-join algorithms and pruning techniques to achieve high performance and shows that these algorithms outperform state-of-the-art methods by an order of magnitude on data sets with short strings.
Abstract: A string similarity join finds similar pairs between two collections of strings. Many applications, e.g., data integration and cleaning, can significantly benefit from an efficient string-similarity-join algorithm. In this paper, we study string similarity joins with edit-distance constraints. Existing methods usually employ a filter-and-refine framework and suffer from the following limitations: (1) They are inefficient for the data sets with short strings (the average string length is not larger than 30); (2) They involve large indexes; (3) They are expensive to support dynamic update of data sets. To address these problems, we propose a novel method called trie-join, which can generate results efficiently with small indexes. We use a trie structure to index the strings and utilize the trie structure to efficiently find similar string pairs based on subtrie pruning. We devise efficient trie-join algorithms and pruning techniques to achieve high performance. Our method can be easily extended to support dynamic update of data sets efficiently. We conducted extensive experiments on four real data sets. Experimental results show that our algorithms outperform state-of-the-art methods by an order of magnitude on the data sets with short strings.
63 citations
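The subtrie-pruning idea in the abstract above can be illustrated with a much simpler sketch than the paper's trie-join algorithms. The code below is an assumption-laden illustration, not the authors' method: it indexes the strings in a trie, then probes the trie once per string while carrying one row of the edit-distance table per trie node. A subtrie is skipped as soon as every entry in its row exceeds the threshold, because row minima never decrease further down the trie. The names `TrieNode`, `build_trie`, `similar_ids`, and `similarity_join` are hypothetical.

```python
from typing import Dict, List, Set, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.word_ids: List[int] = []  # ids of the strings that end at this node


def build_trie(strings: List[str]) -> TrieNode:
    root = TrieNode()
    for i, s in enumerate(strings):
        node = root
        for ch in s:
            node = node.children.setdefault(ch, TrieNode())
        node.word_ids.append(i)
    return root


def similar_ids(root: TrieNode, query: str, tau: int) -> Set[int]:
    """Ids of indexed strings whose edit distance to `query` is at most `tau`."""
    found: Set[int] = set()

    def visit(node: TrieNode, row: List[int]) -> None:
        # row[j] = edit distance between this node's prefix and query[:j]
        if row[-1] <= tau:
            found.update(node.word_ids)
        for ch, child in node.children.items():
            new_row = [row[0] + 1]
            for j in range(1, len(query) + 1):
                cost = 0 if query[j - 1] == ch else 1
                new_row.append(min(new_row[j - 1] + 1,   # insertion
                                   row[j] + 1,           # deletion
                                   row[j - 1] + cost))   # match or substitution
            # Subtrie pruning: no string below `child` can be within tau.
            if min(new_row) <= tau:
                visit(child, new_row)

    visit(root, list(range(len(query) + 1)))
    return found


def similarity_join(strings: List[str], tau: int) -> Set[Tuple[int, int]]:
    """Self-join: all index pairs (i, j), i < j, within edit distance tau."""
    root = build_trie(strings)
    pairs: Set[Tuple[int, int]] = set()
    for i, s in enumerate(strings):
        for j in similar_ids(root, s, tau):
            if i < j:
                pairs.add((i, j))
    return pairs


words = ["kitten", "sitten", "sitting", "mitten", "apple"]
pairs_tau1 = similarity_join(words, 1)
pairs_tau2 = similarity_join(words, 2)
```

Here `pairs_tau1` contains the three pairs among "kitten", "sitten", and "mitten", and raising the threshold to 2 additionally pairs "sitten" with "sitting". Strings that share a prefix share the work done for that prefix, which is where a trie helps on short strings.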
•
09 Aug 1967
TL;DR: A general-purpose computer program and special-purpose apparatus for matching strings of alphanumeric characters are disclosed. The algorithm uses a current-character search list (augmented for all alternative characters) and a next-character search list (augmented for all successful character matches); these characters are portions of the test text to which the string to be matched is compared.
Abstract: A general-purpose computer program and special-purpose apparatus for matching strings of alphanumeric characters are disclosed. The algorithm involved makes use of a current-character search list (augmented for all alternative characters) and a next-character search list (augmented for all successful character matches). These characters are portions of the test text to which the string to be matched is compared. Each character of the string to be matched is tested by the current-character list, during which time the next-character list is compiled. Then a new character is obtained, the next-character list is substituted for the current-character list, and the process continues. The process terminates successfully when test text characters are exhausted, and terminates unsuccessfully when the searched text to be matched is exhausted.
63 citations
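The two-list scheme in the abstract above can be sketched loosely as follows. This is a minimal illustration under stated assumptions, not the disclosed program or apparatus: the pattern is represented (hypothetically) as a sequence of sets, one set of acceptable alternative characters per position, and the function name `find_match` is invented for the example. The current list holds the pattern positions still alive for the character being read; every successful character match adds the following position to the next list, which then replaces the current list.

```python
from typing import Sequence, Set


def find_match(pattern: Sequence[Set[str]], text: str) -> int:
    """Return the end index (exclusive) of the first match in `text`, or -1.

    pattern[k] is the set of alternative characters accepted at position k.
    """
    if not pattern:
        return 0
    current: Set[int] = set()        # pattern positions to test against this character
    for i, ch in enumerate(text):
        current.add(0)               # a match may begin at any text position
        next_list: Set[int] = set()  # compiled while the current list is tested
        for pos in current:
            if ch in pattern[pos]:
                if pos + 1 == len(pattern):
                    return i + 1     # every pattern position has been matched
                next_list.add(pos + 1)
        current = next_list          # the next list becomes the current list
    return -1


cat_or_cot = [{"c"}, {"a", "o"}, {"t"}]
end_index = find_match(cat_or_cot, "a cot and a cat")
```

Because all live pattern positions advance together, each text character is read once and no backtracking over the text is needed.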
••
TL;DR: The problem of finding a consensus string based on consensus error is NP-complete when the penalty matrix is a metric.
63 citations
••
19 Dec 2001
TL;DR: It is shown how to solve CLOSEST STRING in linear time for constant d (the exponential growth is O(d^d)), and this result is extended to the closely related problems d-MISMATCH and DISTINGUISHING STRING SELECTION.
Abstract: CLOSEST STRING is one of the core problems in the field of consensus word analysis, with particular importance for computational biology. Given k strings of the same length and a positive integer d, find a "closest string" s such that none of the given strings has Hamming distance greater than d from s. CLOSEST STRING is NP-complete. We show how to solve CLOSEST STRING in linear time for constant d (the exponential growth is O(d^d)). We extend this result to the closely related problems d-MISMATCH and DISTINGUISHING STRING SELECTION. Moreover, we discuss fixed-parameter tractability for parameter k and give an efficient linear-time algorithm for CLOSEST STRING when k = 3. Finally, the practical usefulness of our findings is substantiated by some experimental results.
63 citations
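One way to obtain a running time whose exponential part depends only on d is a bounded search tree. The sketch below is an illustration under assumptions, not necessarily the exact procedure of the paper: it starts from the first input string as the candidate, and whenever some input string is more than d away, it branches on d + 1 of the differing positions, copying that string's character into the candidate. At least one of those positions must agree with any valid solution, so the recursion needs depth at most d, which gives at most (d + 1)^d branches. The names `hamming` and `closest_string` are hypothetical, and the input list is assumed to be non-empty with strings of equal length.

```python
from typing import List, Optional


def hamming(a, b) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def closest_string(strings: List[str], d: int) -> Optional[str]:
    """Return a string within Hamming distance d of every input, or None."""

    def search(candidate: List[str], budget: int) -> Optional[str]:
        # Find an input string that the candidate does not yet cover.
        far = next((s for s in strings if hamming(candidate, s) > d), None)
        if far is None:
            return "".join(candidate)
        if budget == 0:
            return None  # no modifications left on this branch
        differing = [p for p in range(len(far)) if candidate[p] != far[p]]
        # Any valid solution agrees with `far` on at least one of these d + 1 positions.
        for p in differing[: d + 1]:
            old = candidate[p]
            candidate[p] = far[p]
            result = search(candidate, budget - 1)
            candidate[p] = old  # undo the change before trying the next branch
            if result is not None:
                return result
        return None

    return search(list(strings[0]), d)


center = closest_string(["AAB", "ABA", "BAA"], 1)
no_center = closest_string(["AAA", "BBB"], 1)
```

For the first call the search finds "AAA", which is at distance 1 from each input; for the second call no string within distance 1 of both inputs exists, so the result is `None`.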
••
TL;DR: Offering more extensive and up-to-date coverage than other texts, Strategies for Teaching Strings is an essential all-purpose guide for those planning to enter the string teaching profession.
Abstract: Ideal for use in undergraduate string methods, string techniques, and instrumental string pedagogy courses, Strategies for Teaching Strings provides readers with all the information and skills necessary to teach string instruments in schools and to develop a successful school-based orchestral program. Based on national standards for successful string and orchestra teaching, the text begins by introducing the string instrument family and providing an overview of the development of the school orchestra program. Subsequent chapters, divided into three levels of string competency corresponding to elementary, middle, and secondary school skills, cover performance goals and objectives, strategies for teaching technical and performance skills, and solutions to common problems for each ability level. Rehearsal planning and preparation, rehearsal techniques, strategies for teaching improvisation, student recruitment and retention, and choosing literature for the school orchestra are also covered. The text is enhanced by line drawings and photographs that demonstrate correct playing techniques and fingering positions. An appendix includes special pedagogical approaches, lists of resources for string teachers, a listing of professional string associations, and a full listing of the Standards for Successful School String/Orchestra Teaching (published by the American String Teachers Association with the National School Orchestra Association). Offering more extensive and up-to-date coverage than other texts, Strategies for Teaching Strings is an essential all-purpose guide for those planning to enter the string teaching profession.
63 citations