Topic

String (computer science)

About: String (computer science) is a research topic. Over its lifetime, 19,430 publications have been published within this topic, receiving 333,247 citations. The topic is also known as: str & s.

Papers

Posted Content•

An Optimal Bloom Filter Replacement Based on Matrix Solving

Ely Porat¹•Institutions (1)

Bar-Ilan University¹

11 Apr 2008-arXiv: Data Structures and Algorithms

TL;DR: This work suggests a method for holding a dictionary data structure, which maps keys to values, in the spirit of Bloom Filters, and also suggests a data structure that requires only nk bits of space, has O(n^2) preprocessing time, and has O(log n) query time.

Abstract: We suggest a method for holding a dictionary data structure, which maps keys to values, in the spirit of Bloom Filters. The space requirements of the dictionary we suggest are much smaller than those of a hash table. We allow storing n keys, each mapped to a value which is a string of k bits. Our suggested method requires nk + o(n) bits of space to store the dictionary and O(n) time to produce the data structure, and allows answering a membership query in O(1) memory probes. The dictionary size does not depend on the size of the keys. However, reducing the space requirements of the data structure comes at a certain cost. Our dictionary has a small probability of a one-sided error. When attempting to obtain the value for a key that is stored in the dictionary we always get the correct answer. However, when testing for membership of an element that is not stored in the dictionary, we may get an incorrect answer, and when requesting the value of such an element we may get a certain random value. Our method is based on solving equations in GF(2^k) and using several hash functions. Another significant advantage of our suggested method is that we do not require using sophisticated hash functions. We only require pairwise independent hash functions. We also suggest a data structure that requires only nk bits of space, has O(n^2) preprocessing time, and has O(log n) query time. However, this data structure requires uniform hash functions. In order to replace a Bloom Filter of n elements with an error probability of 2^{-k}, we require nk + o(n) memory bits, O(1) query time, O(n) preprocessing time, and only pairwise independent hash functions. Even the most advanced previously known Bloom Filter would require nk + O(n) space and uniform hash functions, so our method is significantly less space consuming, especially when k is small.
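The idea of answering membership queries by solving linear equations can be illustrated with a small sketch. It is not the construction from the paper: it does not reach nk + o(n) bits or O(n) build time, and it uses ordinary Gaussian elimination over GF(2) on k-bit words. The names `build` and `contains`, the BLAKE2b-based hashing, the three-slot layout, and the table size (roughly 1.23n slots plus slack) are all assumptions made for illustration. Each key x contributes one equation T[h1(x)] XOR T[h2(x)] XOR T[h3(x)] = fingerprint(x), so stored keys always test positive, and a non-member passes only if its k-bit fingerprint happens to match.

```python
import hashlib


def _hash(key, seed, m, k):
    """Map a string key to three distinct table slots and a k-bit fingerprint (k <= 32)."""
    d = hashlib.blake2b(key.encode(), digest_size=16,
                        salt=seed.to_bytes(8, "little")).digest()
    third = m // 3  # one slot per third of the table, so the three slots differ
    slots = [j * third + int.from_bytes(d[4 * j:4 * j + 4], "little") % third
             for j in range(3)]
    fingerprint = int.from_bytes(d[12:16], "little") & ((1 << k) - 1)
    return slots, fingerprint


def _solve(equations, m):
    """Gaussian elimination over GF(2). Returns a table, or None if inconsistent."""
    pivots = {}  # lowest set bit of a reduced row -> (mask, rhs)
    for mask, rhs in equations:
        while mask:
            p = (mask & -mask).bit_length() - 1
            if p not in pivots:
                pivots[p] = (mask, rhs)
                break
            pivot_mask, pivot_rhs = pivots[p]
            mask ^= pivot_mask
            rhs ^= pivot_rhs
        if mask == 0 and rhs != 0:
            return None  # equation reduced to 0 = nonzero
    table = [0] * m  # free variables stay 0
    for p in sorted(pivots, reverse=True):  # back-substitute from the highest pivot
        mask, rhs = pivots[p]
        rest = mask ^ (1 << p)
        while rest:
            q = (rest & -rest).bit_length() - 1
            rhs ^= table[q]
            rest &= rest - 1
        table[p] = rhs
    return table


def build(keys, k=8, max_tries=20):
    """Build the table for a static key set; retries with a new seed on failure."""
    keys = sorted(set(keys))
    m = 3 * (len(keys) * 41 // 100 + 8)  # assumed sizing: about 1.23 n slots plus slack
    for seed in range(max_tries):
        equations = []
        for key in keys:
            slots, fp = _hash(key, seed, m, k)
            equations.append((sum(1 << s for s in slots), fp))
        table = _solve(equations, m)
        if table is not None:
            return table, seed
    raise RuntimeError("no solvable system found; try more slots")


def contains(table, seed, k, key):
    """True for every stored key; true for a non-member with probability about 2^-k."""
    slots, fp = _hash(key, seed, len(table), k)
    return table[slots[0]] ^ table[slots[1]] ^ table[slots[2]] == fp
```

To store a dictionary rather than a set, the right-hand side of each equation would be the key's k-bit value instead of a fingerprint; a lookup for an absent key then returns an arbitrary value, which matches the one-sided error the abstract describes.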

65 citations

Proceedings Article•DOI•

Structural patterns vs. string patterns for extracting semantic information from dictionaries

Simonetta Montemagni¹, Lucy Vanderwende²•Institutions (2)

University of Pisa¹, Microsoft²

23 Aug 1992

TL;DR: This chapter presents evidence for preferring to extract semantic information from a syntactic analysis of a dictionary definition rather than directly from the definition string itself when the information to be extracted is found in the differentiae.

Abstract: This chapter presents evidence for preferring to extract semantic information from a syntactic analysis of a dictionary definition rather than directly from the definition string itself when the information to be extracted is found in the differentiae. We present examples of how very complex information can be extracted from the differentiae of the definition using structural analysis patterns, and why string patterns would fail to do the same.

65 citations

Proceedings Article•DOI•

[...]

Jiaheng Lu¹, Chunbin Lin¹, Wei Wang², Chen Li³, Haiyong Wang¹•Institutions (3)

Renmin University of China¹, University of New South Wales², University of California, Irvine³

22 Jun 2013

TL;DR: An expansion-based framework to measure string similarities efficiently while considering synonyms is presented, and an estimator to approximate the size of candidates to enable an online selection of signature filters to further improve the efficiency.

Abstract: A string similarity measure quantifies the similarity between two text strings for approximate string matching or comparison. For example, the strings "Sam" and "Samuel" can be considered similar. Most existing work that computes the similarity of two strings only considers syntactic similarities, e.g., number of common words or q-grams. While these are indeed indicators of similarity, there are many important cases where syntactically different strings can represent the same real-world object. For example, "Bill" is a short form of "William". Given a collection of predefined synonyms, the purpose of the paper is to explore such existing knowledge to evaluate string similarity measures more effectively and efficiently, thereby boosting the quality of string matching. In particular, we first present an expansion-based framework to measure string similarities efficiently while considering synonyms. Because using synonyms in similarity measures is, while expressive, computationally expensive (NP-hard), we propose an efficient algorithm, called selective-expansion, which guarantees the optimality in many real scenarios. We then study a novel indexing structure called SI-tree, which combines both signature and length filtering strategies, for efficient string similarity joins with synonyms. We develop an estimator to approximate the size of candidates to enable an online selection of signature filters to further improve the efficiency. This estimator provides strong low-error, high-confidence guarantees while requiring only logarithmic space and time costs, thus making our method attractive both in theory and in practice. Finally, the results from an empirical study of the algorithms verify the effectiveness and efficiency of our approach.
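The effect of synonym expansion on a token-based measure can be shown with a naive sketch. It applies every applicable rule before computing Jaccard similarity, so it is neither the selective-expansion algorithm nor the SI-tree index from the paper. The function names, the lower-cased whitespace tokenization, and the restriction to single-token, one-directional rules are assumptions made for illustration.

```python
def tokens(s):
    """Lower-cased whitespace tokens of a string, as a set."""
    return set(s.lower().split())


def expand(token_set, rules):
    """Add the right-hand side of every rule whose left-hand side is present.

    `rules` maps a single token to a list of synonym tokens (hypothetical format).
    """
    expanded = set(token_set)
    for tok in token_set:
        expanded.update(rules.get(tok, []))
    return expanded


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def synonym_jaccard(s1, s2, rules):
    """Jaccard similarity after expanding both strings with all applicable rules."""
    return jaccard(expand(tokens(s1), rules), expand(tokens(s2), rules))


rules = {"bill": ["william"]}
plain = jaccard(tokens("Bill Gates"), tokens("William Gates"))   # 1 shared of 3 tokens
with_synonyms = synonym_jaccard("Bill Gates", "William Gates", rules)  # 2 shared of 3
```

Applying every rule can also lower the score, because expansion may add tokens the other string lacks; choosing which rules to apply is the part the abstract describes as computationally expensive.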

65 citations

Proceedings Article•

Searching for Jumbled Patterns in Strings

Ferdinando Cicalese¹, Gabriele Fici¹, Zsuzsanna Lipták²•Institutions (2)

University of Salerno¹, Bielefeld University²

01 Jan 2009

TL;DR: This work presents two new algorithms for the case where the text is fixed and many queries arrive over time, and iteratively constructs a linear size data structure which then allows answering queries in constant time, for many queries even during the construction phase.

Abstract: The Parikh vector of a string s over a finite ordered alphabet Σ = {a_1, ..., a_σ} is defined as the vector of multiplicities of the characters, i.e. p(s) = (p_1, ..., p_σ), where p_i = |{j | s_j = a_i}|. A Parikh vector q occurs in s if s has a substring t with p(t) = q. The problem of searching for a query q in a text s of length n can be solved simply and optimally with a sliding window approach in O(n) time. We present two new algorithms for the case where the text is fixed and many queries arrive over time. The first algorithm finds all occurrences of a given Parikh vector in a text (over a fixed alphabet of size σ ≥ 2) and appears to have a sub-linear expected time complexity. The second algorithm only decides whether a given Parikh vector appears in a binary text; it iteratively constructs a linear size data structure which then allows answering queries in constant time, for many queries even during the construction phase.
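The sliding window baseline mentioned in the abstract can be sketched as follows. This is only the simple O(n) approach for a single query, not either of the two new algorithms; the function name and the choice to pass the Parikh vector as a character-to-count mapping are assumptions. The window has length equal to the sum of the query's entries, and a running count of characters whose multiplicity differs from the query makes each shift constant time.

```python
from collections import Counter


def jumbled_occurrences(text, query):
    """Start indices of substrings of `text` whose Parikh vector equals `query`.

    `query` maps each character to its required multiplicity.
    """
    need = Counter({c: n for c, n in query.items() if n > 0})
    m = sum(need.values())  # every match has exactly this length
    if m == 0 or m > len(text):
        return []

    # diff[c] = required count of c minus its count in the current window
    diff = Counter(need)
    for c in text[:m]:
        diff[c] -= 1
    mismatched = sum(1 for v in diff.values() if v != 0)

    hits = [0] if mismatched == 0 else []
    for i in range(m, len(text)):
        # text[i] enters the window (diff goes down), text[i - m] leaves it (diff goes up)
        for c, delta in ((text[i], -1), (text[i - m], +1)):
            before = diff[c]
            diff[c] = before + delta
            # a step of +-1 either leaves zero or may arrive at zero, never both
            mismatched += (before == 0) - (diff[c] == 0)
        if mismatched == 0:
            hits.append(i - m + 1)
    return hits


print(jumbled_occurrences("abbab", {"a": 1, "b": 1}))  # prints [0, 2, 3]
```

Each query costs a full pass over the text here, which is the cost the fixed-text, many-queries setting in the abstract aims to avoid.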

65 citations

Journal Article•DOI•

Oblivious Transfers and Privacy Amplification

Gilles Brassard¹, Claude Crépeau², Stefan Wolf¹•Institutions (2)

Université de Montréal¹, McGill University²

01 Sep 2003-Journal of Cryptology

TL;DR: This work presents a new technique for reducing one-out-of-two string OT, based on so-called privacy amplification, that is more efficient in terms of the number of required realizations of bit OT, and allows for reducing string OT to (apparently) much weaker primitives.

Abstract: Oblivious transfer (OT) is an important primitive in cryptography. In chosen one-out-of-two string OT, a sender offers two strings, one of which the other party, called the receiver, can choose to read, not learning any information about the other string. The sender, on the other hand, does not obtain any information about the receiver's choice. We consider the problem of reducing this primitive to OT for single bits. Previous attempts to do this were based on self-intersecting codes. We present a new technique for the same task, based on so-called privacy amplification. It is shown that our method has two important advantages over the previous approaches. First, it is more efficient in terms of the number of required realizations of bit OT, and second, the technique even allows for reducing string OT to (apparently) much weaker primitives. An example of such a primitive is universal OT, where the receiver can adaptively choose what type of information he wants to obtain about the two bits sent by the sender, subject to the only constraint that some, possibly very small, uncertainty must remain about the pair of bits.

65 citations

Network Information

Performance

Metrics

19,430

Papers

362,272

Citations

No. of papers in the topic in previous years
Year	Papers
2022	2
2021	491
2020	704
2019	759
2018	816
2017	806
