Topic

String (computer science)

About: String (computer science) is a research topic. Over its lifetime, 19,430 publications have been published within this topic, receiving 333,247 citations. The topic is also known as: str & s.


Papers
Proceedings Article
03 Aug 2003
TL;DR: The approach is to create a visual challenge that is easy for humans but difficult for a computer: recognizing a string of random distorted characters. The challenge presents hard segmentation problems that humans are particularly apt at solving.
Abstract: How do you tell a computer from a human? The situation arises often on the Internet, when online polls are conducted, accounts are requested, undesired email is received, and chat-rooms are spammed. The approach we use is to create a visual challenge that is easy for humans but difficult for a computer. More specifically, our challenge is to recognize a string of random distorted characters. To pass the challenge, the subject must type in the correct corresponding ASCII string. From an OCR point of view, this problem is interesting because our goal is to use the vast amount of accumulated knowledge to defeat the state-of-the-art OCR algorithms. This is a role reversal from traditional OCR research. Unlike many other systems, our algorithm is based on the assumption that segmentation is much more difficult than recognition. Our image challenges present hard segmentation problems that humans are particularly apt at solving. The technology is currently being used in MSN's Hotmail registration system, where it has significantly reduced the daily registration rate with minimal Consumer Support impact.

77 citations

Patent
10 Nov 1994
TL;DR: In a handwriting recognition process, a list of candidate recognized words is identified both by comparing dictionary entries to various combinations of recognized characters and through a most likely character string analysis developed without reference to the dictionary.
Abstract: In a handwriting recognition process, a list of candidate recognized words is identified (202) as a function of both comparison of dictionary entries to various combinations of recognized character combinations, and through a most likely character string analysis as developed without reference to the dictionary. The process selects (301) a word from the list and presents (302) this word to the user. The user then has the option of displaying (303) this list. When displaying the list, candidate words developed with reference to the dictionary are displayed in a segregated manner from the most likely character string word and the most likely string of digits. The user can change the selected word by choosing from the list, or edit the selected word. When the user selects the most likely character string as the correct representation of the handwritten input to be recognized, the process automatically updates (310) the dictionary to include the most likely character string. The same process can occur when the user selects the most likely string of digits.

77 citations
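The selection step described in the patent abstract above can be sketched roughly as follows. This is a minimal illustration only: the function names `build_candidates` and `confirm`, the dictionary-as-set representation, and the sample words are all assumptions made for the example, not details taken from the patent.

```python
def build_candidates(dictionary_matches, likely_string, likely_digits):
    """Keep dictionary-derived words separate from the two
    dictionary-free guesses, mirroring the segregated display."""
    return {
        "dictionary": list(dictionary_matches),
        "most_likely_string": likely_string,
        "most_likely_digits": likely_digits,
    }


def confirm(choice, candidates, dictionary):
    """Record the user's choice; a dictionary-free guess that the user
    accepts is added to the dictionary (hypothetical update rule)."""
    dictionary_free = (
        candidates["most_likely_string"],
        candidates["most_likely_digits"],
    )
    if choice in dictionary_free:
        dictionary.add(choice)
    return choice


dictionary = {"hello", "help"}
candidates = build_candidates(["hello", "help"], "helo", "4310")
confirm("helo", candidates, dictionary)  # dictionary now also holds "helo"
```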

01 Jan 2002
TL;DR: ExB, a string matching algorithm tailored to the specific characteristics of NIDS string matching, is designed and implemented in snort; experiments suggest that ExB improves overall system performance by as much as a factor of three.
Abstract: We consider the problem of efficient string-based signature matching for Network Intrusion Detection Systems (NIDSes). String matching computations dominate in the overall cost of running a NIDS, despite the use of efficient general-purpose string matching algorithms. Aiming at increasing the efficiency and capacity of NIDSes, we have designed ExB, a string matching algorithm tailored to the specific characteristics of NIDS string matching. We have implemented ExB in snort and present experiments comparing ExB with the current best alternative solution. Our preliminary experiments suggest that ExB offers improvements in overall system performance by as much as a factor of three.

77 citations
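The abstract above does not describe how ExB works internally. As a rough illustration of why tailoring can pay off when most signatures do not match most packets, the sketch below applies a cheap necessary-condition check before running a full substring search. The function name `match_signatures` and the byte-occurrence prefilter are assumptions chosen for this example and should not be read as the ExB algorithm itself.

```python
def match_signatures(payload: bytes, signatures: list[bytes]) -> list[bytes]:
    """Return the signatures that occur in payload.

    Prefilter (an assumption for illustration): if any byte of a
    signature never occurs in the payload, the signature cannot match,
    so the full search is skipped.
    """
    present = set(payload)  # byte values that occur anywhere in the payload
    hits = []
    for sig in signatures:
        if not present.issuperset(sig):
            continue  # excluded cheaply, no substring search needed
        if sig in payload:  # full search only for surviving signatures
            hits.append(sig)
    return hits


packet = b"GET /etc/passwd HTTP/1.0"
print(match_signatures(packet, [b"/etc/passwd", b"cmd.exe", b"HTTP"]))
```

The prefilter can only rule signatures out; a signature whose bytes all occur in the payload still needs the full search, so the result is the same as a plain substring scan.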

Journal Article
TL;DR: An algorithm for two-dimensional matching with an $O(n^2)$ text-scanning phase is presented; the text scan makes no special assumptions about the alphabet, while the pattern preprocessing requires an ordered alphabet and runs with the same alphabet dependency as the previously known algorithms.
Abstract: There are many solutions to the string matching problem that are strictly linear in the input size and independent of alphabet size. Furthermore, the model of computation for these algorithms is very weak: they allow only simple arithmetic and comparisons of equality between characters of the input. In contrast, algorithms for two-dimensional matching have needed stronger models of computation, most notably assuming a totally ordered alphabet. The fastest algorithms for two-dimensional matching have therefore had a logarithmic dependence on the alphabet size. In the worst case, this gives an algorithm that runs in $O(n^2 \log{m})$ with $O(m^2 \log m)$ preprocessing. The authors show an algorithm for two-dimensional matching with an $O(n^2)$ text-scanning phase. Furthermore, the text scan requires no special assumptions about the alphabet, i.e., it runs on the same model as the standard linear-time string-matching algorithm. The pattern preprocessing requires an ordered alphabet and runs with the same alphabet dependency as the previously known algorithms.

77 citations
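For orientation, the two-dimensional matching problem in the abstract above can be stated with a naive baseline. The sketch below is not the paper's algorithm: it uses only equality comparisons between characters, as in the weak model the abstract mentions, but it takes $O(n^2 m^2)$ time in the worst case rather than $O(n^2)$. The name `find_2d` and the list-of-strings representation are assumptions for the example.

```python
def find_2d(text: list[str], pattern: list[str]) -> list[tuple[int, int]]:
    """Return the top-left (row, column) of every occurrence of a
    rectangular pattern in a rectangular text, by brute force."""
    n_rows, n_cols = len(text), len(text[0])
    m_rows, m_cols = len(pattern), len(pattern[0])
    hits = []
    for r in range(n_rows - m_rows + 1):
        for c in range(n_cols - m_cols + 1):
            # Compare the pattern row by row against the text window.
            if all(text[r + i][c:c + m_cols] == pattern[i]
                   for i in range(m_rows)):
                hits.append((r, c))
    return hits


text = ["abab",
        "baba",
        "abab"]
pattern = ["ab",
           "ba"]
print(find_2d(text, pattern))  # [(0, 0), (0, 2), (1, 1)]
```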

Proceedings Article
23 Jan 2011
TL;DR: This paper presents two representations of a string of length N compressed into a context-free grammar of size n, achieving O(log N) random access time with either O(n · α_k(n)) construction time and space on the pointer machine model or O(n) construction time and space on the RAM.
Abstract: Let S be a string of length N compressed into a context-free grammar S of size n. We present two representations of S achieving O(log N) random access time, and either O(n · α_k(n)) construction time and space on the pointer machine model, or O(n) construction time and space on the RAM. Here, α_k(n) is the inverse of the kth row of Ackermann's function. Our representations also efficiently support decompression of any substring in S: we can decompress any substring of length m in the same complexity as a single random access query and additional O(m) time. Combining these results with fast algorithms for uncompressed approximate string matching leads to several efficient algorithms for approximate string matching on grammar-compressed strings without decompression. For instance, we can find all approximate occurrences of a pattern P with at most k errors in time O(n(min{|P|k, k^4 + |P|} + log N) + occ), where occ is the number of occurrences of P in S. Finally, we are able to generalize our results to navigation and other operations on grammar-compressed trees. All of the above bounds significantly improve the currently best known results. To achieve these bounds, we introduce several new techniques and data structures of independent interest, including a predecessor data structure, two "biased" weighted ancestor data structures, and a compact representation of heavy-paths in grammars.

77 citations
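To make the random access problem from the abstract above concrete, here is a minimal sketch of the straightforward baseline: store the expansion length of every grammar rule and walk down from the start rule. Its query time is proportional to the height of the grammar, which can be far larger than log N, so this is not the O(log N) structure the paper presents. The rule encoding (a rule is either a single character or a pair of earlier rule indices) and the names `expansion_lengths` and `access` are assumptions for the example.

```python
def expansion_lengths(rules):
    """Length of the string each rule expands to.

    rules[i] is either a one-character string (terminal) or a pair
    (left, right) of indices of earlier rules.
    """
    lengths = []
    for rule in rules:
        if isinstance(rule, str):
            lengths.append(1)
        else:
            left, right = rule
            lengths.append(lengths[left] + lengths[right])
    return lengths


def access(rules, lengths, root, i):
    """Return character i of the expansion of rule `root` without
    decompressing; time is proportional to the grammar height."""
    if not 0 <= i < lengths[root]:
        raise IndexError(i)
    node = root
    while not isinstance(rules[node], str):
        left, right = rules[node]
        if i < lengths[left]:
            node = left              # position falls in the left part
        else:
            i -= lengths[left]       # skip the left part entirely
            node = right
    return rules[node]


# Rule 2 expands to "ab", rule 3 to "abab", rule 4 to "ababa".
rules = ["a", "b", (0, 1), (2, 2), (3, 0)]
lengths = expansion_lengths(rules)
print("".join(access(rules, lengths, 4, i) for i in range(lengths[4])))  # ababa
```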


Network Information
Related Topics (5)
Time complexity
36K papers, 879.5K citations
88% related
Tree (data structure)
44.9K papers, 749.6K citations
86% related
Graph (abstract data type)
69.9K papers, 1.2M citations
85% related
Computational complexity theory
30.8K papers, 711.2K citations
82% related
Supervised learning
20.8K papers, 710.5K citations
80% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2022    2
2021    491
2020    704
2019    759
2018    816
2017    806