Topic

String (computer science)

About: String (computer science) is a research topic. Over its lifetime, 19,430 publications have been published on this topic, receiving 333,247 citations. The topic is also known as: str & s.


Papers
Journal ArticleDOI
01 Feb 1991
TL;DR: This paper presents an inconsistent polynomial-time algorithm that identifies every pattern language in the limit, and investigates inference of arbitrary pattern languages within the framework of learning from good examples.
Abstract: A pattern is a finite string of constants and variables (cf. [1]). The language of a pattern is the set of all strings which can be obtained by substituting non-null strings of constants for the variables of the pattern. In the present paper, we consider the problem of learning pattern languages from examples. As a main result we present an inconsistent polynomial-time algorithm which identifies every pattern language in the limit. Furthermore, we investigate inference of arbitrary pattern languages within the framework of learning from good examples. Finally, we show that every pattern language can be identified in polynomial time from polynomially many disjointness queries, only.

114 citations
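
The definition of a pattern language in the abstract above can be made concrete with a small membership test. This is a minimal sketch of the definition only, not the learning algorithm from the paper. The function name `in_pattern_language` and the encoding (variables as integers, constants as strings) are assumptions made for illustration. The backtracking search can take exponential time in the worst case.

```python
def in_pattern_language(pattern, text):
    """Return True if `text` can be obtained from `pattern` by substituting
    a non-empty string for each variable (same variable, same string).

    Assumed encoding (hypothetical): an int token is a variable,
    a str token is a constant.
    """
    def go(i, pos, binding):
        # All tokens consumed: succeed only if the whole text is consumed too.
        if i == len(pattern):
            return pos == len(text)
        tok = pattern[i]
        if isinstance(tok, str):
            # Constant: must appear literally at the current position.
            return text.startswith(tok, pos) and go(i + 1, pos + len(tok), binding)
        if tok in binding:
            # Variable seen before: must repeat its earlier substitution.
            val = binding[tok]
            return text.startswith(val, pos) and go(i + 1, pos + len(val), binding)
        # New variable: try every non-empty substring starting at pos.
        for end in range(pos + 1, len(text) + 1):
            binding[tok] = text[pos:end]
            if go(i + 1, end, binding):
                return True
        binding.pop(tok, None)
        return False

    return go(0, 0, {})


# Pattern x0 a x0, with x0 a variable and "a" a constant.
print(in_pattern_language([0, "a", 0], "abaab"))  # True  (x0 = "ab")
print(in_pattern_language([0, "a", 0], "a"))      # False (x0 may not be empty)
```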

Book ChapterDOI
11 Aug 1990
TL;DR: This paper constructs the first publicly verifiable non-interactive zero-knowledge proof for any NP statement under the general assumption that one way permutations exist.
Abstract: In this paper we construct the first publicly verifiable non-interactive zero-knowledge proof for any NP statement under the general assumption that one way permutations exist. If the prover is polynomially bounded then our scheme is based on the stronger assumption that trapdoor permutations exist. In both cases we assume that P and V have a common random string, and use it to prove a single theorem (which may be chosen as a function of the known string).

114 citations

Journal ArticleDOI
TL;DR: This paper presents a faster algorithm for dynamic string dictionary matching with bounded alphabets, and a dynamic dictionary matching algorithm for two-dimensional texts and patterns built on a novel method to efficiently manipulate failure links.
Abstract: In the dynamic dictionary matching problem, a dictionary D contains a set of patterns that can change over time by insertion and deletion of individual patterns. The user also presents text strings and asks for all occurrences of any patterns in the text. The two main contributions of this paper are: (1) a faster algorithm for dynamic string dictionary matching with bounded alphabets, and (2) a dynamic dictionary matching algorithm for two-dimensional texts and patterns. The first contribution is based on an algorithm that solves the general problem of maintaining a sequence of well-balanced parentheses under the operations insert, delete, and find nearest enclosing parenthesis pair. The main new idea behind the second contribution is a novel method to efficiently manipulate failure links for two-dimensional patterns.

114 citations
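
The dynamic dictionary matching problem stated in the abstract above has a simple interface: insert a pattern, delete a pattern, and report all occurrences of current patterns in a text. The sketch below is a naive baseline that only illustrates that interface. It is not the paper's algorithm and uses neither failure links nor the balanced-parentheses structure. The class and method names are hypothetical.

```python
class NaiveDynamicDictionary:
    """Brute-force baseline for dynamic dictionary matching (illustration only)."""

    def __init__(self):
        self.patterns = set()

    def insert(self, pattern):
        # Empty patterns would match everywhere, so reject them.
        if not pattern:
            raise ValueError("pattern must be non-empty")
        self.patterns.add(pattern)

    def delete(self, pattern):
        # Deleting a pattern that is not present is a no-op.
        self.patterns.discard(pattern)

    def occurrences(self, text):
        """Return sorted (start, pattern) pairs for every occurrence in text."""
        hits = []
        for start in range(len(text)):
            for p in self.patterns:
                if text.startswith(p, start):
                    hits.append((start, p))
        return sorted(hits)


d = NaiveDynamicDictionary()
d.insert("ab")
d.insert("b")
print(d.occurrences("abab"))  # [(0, 'ab'), (1, 'b'), (2, 'ab'), (3, 'b')]
d.delete("b")
print(d.occurrences("abab"))  # [(0, 'ab'), (2, 'ab')]
```

Each query here costs time proportional to the text length times the total pattern length, which is the cost that dedicated dynamic dictionary structures aim to avoid.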

Journal ArticleDOI
TL;DR: In this article, a memory-efficient parallel string matching scheme is proposed for low-cost hardware-based intrusion detection systems, where long target patterns are divided into sub-patterns with a fixed length.
Abstract: For the low-cost hardware-based intrusion detection systems, this paper proposes a memory-efficient parallel string matching scheme. In order to reduce the number of state transitions, the finite state machine tiles in a string matcher adopt bit-level input symbols. Long target patterns are divided into subpatterns with a fixed length; deterministic finite automata are built with the subpatterns. Using the pattern dividing, the variety of target pattern lengths can be mitigated, so that memory usage in homogeneous string matchers can be efficient. In order to identify each original long pattern being divided, a two-stage sequential matching scheme is proposed for the successive matches with subpatterns. Experimental results show that total memory requirements decrease on average by 47.8 percent and 62.8 percent for Snort and ClamAV rule sets, in comparison with several existing bit-split string matching methods.

114 citations
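
The pattern-dividing idea in the abstract above (split long patterns into fixed-length subpatterns, match the subpatterns, then confirm successive matches) can be sketched in software. This is only an analogy under stated assumptions: the paper builds bit-level finite state machine tiles in hardware, whereas this sketch uses Python's `str.find` for the first stage. The function names, the parameter `k`, and the choice to let the last subpattern be shorter than `k` are all assumptions.

```python
def split_pattern(pattern, k):
    """Divide a pattern into subpatterns of length k (last one may be shorter)."""
    return [pattern[i:i + k] for i in range(0, len(pattern), k)]


def find_all(text, sub):
    """Return the set of start positions of (possibly overlapping) matches."""
    positions = set()
    start = text.find(sub)
    while start != -1:
        positions.add(start)
        start = text.find(sub, start + 1)
    return positions


def two_stage_match(text, pattern, k):
    """Return sorted start positions of `pattern` in `text`."""
    if not pattern or k < 1:
        raise ValueError("pattern must be non-empty and k must be >= 1")
    chunks = split_pattern(pattern, k)
    # Stage 1: match every distinct subpattern independently.
    hits = {c: find_all(text, c) for c in set(chunks)}
    # Stage 2: accept a start s only if chunk j was matched at s + j*k for all j.
    return sorted(
        s for s in hits[chunks[0]]
        if all(s + j * k in hits[c] for j, c in enumerate(chunks))
    )


print(two_stage_match("xxabcdefgxx", "abcdefg", 3))  # [2]
```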

Book ChapterDOI
31 Aug 2004
TL;DR: This paper reports on the techniques and experience in dealing with flexible string matching against real AT&T databases, and identifies various performance enhancements to speed up the matching process.
Abstract: Data Cleaning is an important process that has been at the center of research interest in recent years. Poor data quality is the result of a variety of reasons, including data entry errors and multiple conventions for recording database fields, and has a significant impact on a variety of business issues. Hence, there is a pressing need for technologies that enable flexible (fuzzy) matching of string information in a database. Cosine similarity with tf-idf is a well-established metric for comparing text, and recent proposals have adapted this similarity measure for flexibly matching a query string with values in a single attribute of a relation. In deploying tf-idf based flexible string matching against real AT&T databases, we observed that this technique needed to be enhanced in many ways. First, along the functionality dimension, where there was a need to flexibly match along multiple string-valued attributes, and also take advantage of known semantic equivalences. Second, we identified various performance enhancements to speed up the matching process, potentially trading off a small degree of accuracy for substantial performance gains. In this paper, we report on our techniques and experience in dealing with flexible string matching against real AT&T databases.

114 citations
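
The abstract above builds on cosine similarity with tf-idf weights for flexible string matching. A minimal sketch of that basic measure over a single string-valued attribute follows. It does not reproduce the paper's multi-attribute extensions or performance enhancements. Whitespace tokenization, the smoothed idf formula `log(1 + N/df)`, the default weight for unseen tokens, and all function names are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(s):
    # Assumption: lowercase whitespace tokens; q-grams are another common choice.
    return s.lower().split()


def build_idf(values):
    """Compute a smoothed idf per token over the attribute values."""
    n = len(values)
    df = Counter()
    for v in values:
        df.update(set(tokenize(v)))
    idf = {t: math.log(1 + n / c) for t, c in df.items()}
    default_idf = math.log(1 + n)  # weight for tokens never seen in `values`
    return idf, default_idf


def tfidf_vector(s, idf, default_idf):
    tf = Counter(tokenize(s))
    return {t: f * idf.get(t, default_idf) for t, f in tf.items()}


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match(query, values):
    """Return the attribute value most similar to the query string."""
    idf, default_idf = build_idf(values)
    q = tfidf_vector(query, idf, default_idf)
    return max(values, key=lambda v: cosine(q, tfidf_vector(v, idf, default_idf)))


values = ["AT&T Labs Research", "AT&T Wireless Services", "Bell Labs"]
print(best_match("at&t labs research inc", values))  # AT&T Labs Research
```

Rare tokens such as "research" receive a higher weight than tokens shared by many values, which is why the query is matched to the first value even though "AT&T" and "Labs" also occur elsewhere.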


Network Information
Related Topics (5)
Time complexity: 36K papers, 879.5K citations, 88% related
Tree (data structure): 44.9K papers, 749.6K citations, 86% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 85% related
Computational complexity theory: 30.8K papers, 711.2K citations, 82% related
Supervised learning: 20.8K papers, 710.5K citations, 80% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2022    2
2021    491
2020    704
2019    759
2018    816
2017    806