scispace - formally typeset
Topic

String (computer science)

About: String (computer science) is a research topic. Over its lifetime, 19,430 publications have been published on this topic, receiving 333,247 citations. The topic is also known as: str & s.


Papers
Journal ArticleDOI
01 Aug 2012
TL;DR: This paper designs efficient trie-join algorithms and pruning techniques to achieve high performance and shows that these algorithms outperform state-of-the-art methods by an order of magnitude on the data sets with short strings.
Abstract: A string similarity join finds similar pairs between two collections of strings. Many applications, e.g., data integration and cleaning, can significantly benefit from an efficient string-similarity-join algorithm. In this paper, we study string similarity joins with edit-distance constraints. Existing methods usually employ a filter-and-refine framework and suffer from the following limitations: (1) They are inefficient for the data sets with short strings (the average string length is not larger than 30); (2) They involve large indexes; (3) They are expensive to support dynamic update of data sets. To address these problems, we propose a novel method called trie-join, which can generate results efficiently with small indexes. We use a trie structure to index the strings and utilize the trie structure to efficiently find similar string pairs based on subtrie pruning. We devise efficient trie-join algorithms and pruning techniques to achieve high performance. Our method can be easily extended to support dynamic update of data sets efficiently. We conducted extensive experiments on four real data sets. Experimental results show that our algorithms outperform state-of-the-art methods by an order of magnitude on the data sets with short strings.

63 citations
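
The abstract above describes indexing strings in a trie and pruning whole subtries during an edit-distance join. The sketch below is not the paper's trie-join algorithm; it illustrates the simpler, well-known idea of walking a trie while carrying one row of the edit-distance table, and abandoning a subtrie as soon as every entry in that row exceeds the threshold. The names `TrieNode`, `build_trie`, `search`, `similarity_self_join`, and `tau` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, List, Set, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.ids: List[int] = []  # indices of the strings that end at this node


def build_trie(strings: List[str]) -> TrieNode:
    root = TrieNode()
    for idx, s in enumerate(strings):
        node = root
        for ch in s:
            node = node.children.setdefault(ch, TrieNode())
        node.ids.append(idx)
    return root


def search(root: TrieNode, query: str, tau: int) -> List[int]:
    """Return ids of indexed strings within edit distance tau of query."""
    hits: List[int] = []
    # Row for the empty prefix: distance to query[:j] is j insertions.
    stack = [(root, list(range(len(query) + 1)))]
    while stack:
        node, row = stack.pop()
        if row[-1] <= tau:
            hits.extend(node.ids)
        for ch, child in node.children.items():
            new_row = [row[0] + 1]
            for j in range(1, len(query) + 1):
                cost = 0 if query[j - 1] == ch else 1
                new_row.append(
                    min(new_row[j - 1] + 1, row[j] + 1, row[j - 1] + cost)
                )
            # Subtrie pruning: the row minimum never decreases as the prefix
            # grows, so no descendant of this child can come within tau.
            if min(new_row) <= tau:
                stack.append((child, new_row))
    return hits


def similarity_self_join(strings: List[str], tau: int) -> Set[Tuple[int, int]]:
    """Return index pairs (i, j), i < j, whose edit distance is at most tau."""
    root = build_trie(strings)
    pairs: Set[Tuple[int, int]] = set()
    for i, s in enumerate(strings):
        for j in search(root, s, tau):
            if i < j:
                pairs.add((i, j))
    return pairs
```

Calling `similarity_self_join(["kitten", "sitten", "sitting", "mitten", "apple"], 1)` pairs up the three strings that differ only in their first letter. Strings sharing a prefix share the corresponding rows of the table, which is where a trie index saves work on short strings.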

Patent
Ken Thompson
09 Aug 1967
TL;DR: A general-purpose computer program and special-purpose apparatus for matching strings of alphanumeric characters are disclosed. The algorithm uses a current-character search list (augmented for all alternative characters) and a next-character search list (augmented for all successful character matches); these characters are portions of the test text to which the string to be matched is compared.
Abstract: A general purpose computer program and special purpose apparatus for matching strings of alphanumeric characters are disclosed. The algorithm involved makes use of a current-character search list (augmented for all alternative characters) and a next-character search list (augmented for all successful character matches). These characters are portions of the test text to which the string to be matched is compared. Each character of the string to be matched is tested by the current character list, during which time the next character list is compiled. Then a new character is obtained, the next character list substituted for the current character list, and the process continues. The process terminates successfully when test text characters are exhausted, and terminates unsuccessfully when the searched text to be matched is exhausted.

63 citations
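
The two-list procedure in the abstract above (test each character against a current list while compiling a next list, then swap) can be sketched as a set-based simulation. The pattern language here is an assumption made for illustration only: literal characters, `.` for any character, and a postfix `*` for repetition. The function names `tokenize`, `add_state`, and `full_match` are hypothetical and do not come from the patent.

```python
from typing import List, Set, Tuple

Token = Tuple[str, bool]  # (symbol, is_starred)


def tokenize(pattern: str) -> List[Token]:
    """Split a pattern into (symbol, starred) tokens; assumes a well-formed pattern."""
    tokens: List[Token] = []
    for ch in pattern:
        if ch == "*" and tokens and not tokens[-1][1]:
            tokens[-1] = (tokens[-1][0], True)
        else:
            tokens.append((ch, False))
    return tokens


def add_state(states: Set[int], tokens: List[Token], i: int) -> None:
    """Add position i and every position reachable by skipping starred tokens."""
    while True:
        states.add(i)
        if i < len(tokens) and tokens[i][1]:
            i += 1  # a starred token may match zero characters
        else:
            break


def full_match(pattern: str, text: str) -> bool:
    tokens = tokenize(pattern)
    current: Set[int] = set()
    add_state(current, tokens, 0)
    for ch in text:
        nxt: Set[int] = set()  # the "next" list, compiled while scanning "current"
        for i in current:
            if i == len(tokens):
                continue  # pattern already consumed; nothing left to match ch
            sym, starred = tokens[i]
            if sym == "." or sym == ch:
                # A starred token can repeat, so stay at i; otherwise advance.
                add_state(nxt, tokens, i if starred else i + 1)
        current = nxt  # substitute the next list for the current list
        if not current:
            return False
    return len(tokens) in current
```

Because every live alternative is tracked at once, each text character is examined once per active pattern position, so a pattern such as `a*a` on a long run of `a` characters does not trigger backtracking.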

Journal ArticleDOI
TL;DR: The problem of finding a consensus string based on consensus error is NP-complete when the penalty matrix is a metric.

63 citations

Book ChapterDOI
19 Dec 2001
TL;DR: It is shown how to solve CLOSEST STRING in linear time for constant d (the exponential growth is O(d^d)), and this result is extended to the closely related problems d-MISMATCH and DISTINGUISHING STRING SELECTION.
Abstract: CLOSEST STRING is one of the core problems in the field of consensus word analysis, with particular importance for computational biology. Given k strings of the same length and a positive integer d, find a "closest string" s such that none of the given strings has Hamming distance greater than d from s. CLOSEST STRING is NP-complete. We show how to solve CLOSEST STRING in linear time for constant d (the exponential growth is O(d^d)). We extend this result to the closely related problems d-MISMATCH and DISTINGUISHING STRING SELECTION. Moreover, we discuss fixed-parameter tractability for parameter k and give an efficient linear-time algorithm for CLOSEST STRING when k = 3. Finally, the practical usefulness of our findings is substantiated by some experimental results.

63 citations
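
The abstract above does not spell out its algorithm, so the following is only a sketch of one standard way to get a running time that is linear in the input for constant d: a bounded search tree. It starts from the first input string and, whenever some input string is more than d away, tries moving the candidate towards it at one of d + 1 differing positions, with at most d moves in total. Whether this matches the paper's exact procedure is an assumption; the names `hamming`, `closest_string`, and `solve` are hypothetical.

```python
from typing import List, Optional


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def closest_string(strings: List[str], d: int) -> Optional[str]:
    """Return a string within Hamming distance d of every input, or None.

    Assumes a non-empty list of strings that all have the same length.
    """

    def solve(candidate: str, budget: int) -> Optional[str]:
        for s in strings:
            dist = hamming(candidate, s)
            if dist > d:
                # Each remaining move reduces the distance to s by at most 1.
                if budget == 0 or dist > d + budget:
                    return None
                diff = [p for p in range(len(s)) if candidate[p] != s[p]]
                # Any solution agrees with s on at least one of d + 1
                # differing positions, so branching on d + 1 of them suffices.
                for p in diff[: d + 1]:
                    changed = candidate[:p] + s[p] + candidate[p + 1:]
                    result = solve(changed, budget - 1)
                    if result is not None:
                        return result
                return None
        return candidate  # every input string is within distance d

    return solve(strings[0], d)
```

For example, `closest_string(["abcd", "abce", "abde"], 1)` finds a centre string, while `closest_string(["aaaa", "bbbb"], 1)` returns `None`. The search tree has depth at most d and branching factor at most d + 1, which is where the dependence on d alone comes from.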

Journal ArticleDOI
TL;DR: Offering more extensive and up-to-date coverage than other texts, Strategies for Teaching Strings is an essential all-purpose guide for those planning to enter the string teaching profession.
Abstract: Ideal for use in undergraduate string methods, string techniques, and instrumental string pedagogy courses, Strategies for Teaching Strings provides readers with all the information and skills necessary to teach string instruments in schools and to develop a successful school-based orchestral program. Based on national standards for successful string and orchestra teaching, the text begins by introducing the string instrument family and providing an overview of the development of the school orchestra program. Subsequent chapters, divided into three levels of string competency corresponding to elementary, middle, and secondary school skills, cover performance goals and objectives, strategies for teaching technical and performance skills, and solutions to common problems for each ability level. Rehearsal planning and preparation, rehearsal techniques, strategies for teaching improvisation, student recruitment and retention, and choosing literature for the school orchestra are also covered. The text is enhanced by line drawings and photographs that demonstrate correct playing techniques and fingering positions. An appendix includes special pedagogical approaches, lists of resources for string teachers, a listing of professional string associations, and a full listing of the Standards for Successful School String/Orchestra Teaching (published by the American String Teachers Association with the National School Orchestra Association). Offering more extensive and up-to-date coverage than other texts, Strategies for Teaching Strings is an essential all-purpose guide for those planning to enter the string teaching profession.

63 citations


Network Information
Related Topics (5)
Time complexity
36K papers, 879.5K citations
88% related
Tree (data structure)
44.9K papers, 749.6K citations
86% related
Graph (abstract data type)
69.9K papers, 1.2M citations
85% related
Computational complexity theory
30.8K papers, 711.2K citations
82% related
Supervised learning
20.8K papers, 710.5K citations
80% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2022    2
2021    491
2020    704
2019    759
2018    816
2017    806