Topic

String (computer science)

About: String (computer science) is a research topic. Over its lifetime, 19,430 publications have been published within this topic, receiving 333,247 citations. The topic is also known as: str, s.

Papers


Proceedings Article

Transformation-based Framework for Record Matching

Arvind Arasu¹, Surajit Chaudhuri¹, Raghav Kaushik¹

Microsoft¹

07 Apr 2008

TL;DR: A programmatic framework for record matching that takes user-defined string transformations as input; to the authors' knowledge, it is the first proposal of such a framework.

Abstract: Today's record matching infrastructure does not allow a flexible way to account for synonyms such as "Robert" and "Bob" which refer to the same name, and more general forms of string transformations such as abbreviations. We propose a programmatic framework of record matching that takes such user-defined string transformations as input. To the best of our knowledge, this is the first proposal for such a framework. This transformational framework, while expressive, poses significant computational challenges which we address. We empirically evaluate our techniques over real data.
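To make the idea of matching under user-defined string transformations concrete, here is a minimal Python sketch. It is not the authors' algorithm: the rule table `RULES`, the token-level rewriting, and the choice of Jaccard similarity are all assumptions made for illustration. The sketch expands each record into its rewritten variants and scores a pair by the best similarity over those variants.

```python
from itertools import product

# Hypothetical user-defined transformations: token -> equivalent tokens.
RULES = {"bob": {"robert"}, "st": {"street"}, "corp": {"corporation"}}


def variants(record, rules):
    """Return every token tuple obtained by optionally rewriting each token."""
    options = [{tok} | rules.get(tok, set()) for tok in record.lower().split()]
    return {tuple(choice) for choice in product(*options)}


def jaccard(a, b):
    """Jaccard similarity of two token sequences, treated as sets."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


def transformed_similarity(r1, r2, rules):
    """Best Jaccard similarity over all rewritten variants of both records."""
    return max(
        jaccard(a, b)
        for a in variants(r1, rules)
        for b in variants(r2, rules)
    )
```

With the rule "bob" -> "robert", `transformed_similarity("Bob Smith", "Robert Smith", RULES)` treats the two records as identical, whereas with an empty rule table they share only one of three distinct tokens. Enumerating all variants grows exponentially with the number of rewritable tokens, which hints at the computational challenges the abstract mentions.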

151 citations

Proceedings Article

Edit Probability for Scene Text Recognition

Fan Bai¹, Zhanzhan Cheng, Yi Niu, Shiliang Pu, Shuigeng Zhou¹

Fudan University¹

18 Jun 2018

TL;DR: The authors propose a novel method called edit probability (EP) for scene text recognition, which estimates the probability of generating a string from the output sequence of probability distributions conditioned on the input image, while accounting for possible missing or superfluous characters.

Abstract: We consider the scene text recognition problem under the attention-based encoder-decoder framework, which is the state of the art. The existing methods usually employ a frame-wise maximal likelihood loss to optimize the models. When we train the model, the misalignment between the ground truth strings and the attention's output sequences of probability distribution, which is caused by missing or superfluous characters, will confuse and mislead the training process, and consequently make the training costly and degrade the recognition accuracy. To handle this problem, we propose a novel method called edit probability (EP) for scene text recognition. EP tries to effectively estimate the probability of generating a string from the output sequence of probability distribution conditioned on the input image, while considering the possible occurrences of missing/superfluous characters. The advantage lies in that the training process can focus on the missing, superfluous and unrecognized characters, and thus the impact of the misalignment problem can be alleviated or even overcome. We conduct extensive experiments on standard benchmarks, including the IIIT-5K, Street View Text and ICDAR datasets. Experimental results show that the EP can substantially boost scene text recognition performance.

151 citations

Journal Article

Handwritten word recognition with character and inter-character neural networks

Paul D. Gader¹, Magdi A. Mohamed¹, Jung-Hsien Chiang

University of Missouri¹

01 Feb 1997

TL;DR: An off-line handwritten word recognition system is described in which neural networks assign confidence that pairs of segments are compatible with character confidence assignments, and this confidence is integrated into the dynamic programming.

Abstract: An off-line handwritten word recognition system is described. Images of handwritten words are matched to lexicons of candidate strings. A word image is segmented into primitives. The best match between sequences of unions of primitives and a lexicon string is found using dynamic programming. Neural networks assign match scores between characters and segments. Two particularly unique features are that neural networks assign confidence that pairs of segments are compatible with character confidence assignments and that this confidence is integrated into the dynamic programming. Experimental results are provided on data from the U.S. Postal Service.
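The dynamic programming step described above, finding the best match between sequences of unions of primitives and a lexicon string, can be sketched roughly as follows. This is an illustrative simplification rather than the paper's system: the function `best_match`, the `max_union` limit, and the toy scorer `toy_score` (which stands in for the neural network match scores) are assumptions, and the inter-character compatibility term is omitted.

```python
def best_match(n_primitives, word, score, max_union=3):
    """Best total score for splitting primitives 0..n-1 into len(word)
    consecutive groups, one group (union of primitives) per character."""
    neg_inf = float("-inf")
    # best[k][j]: best score matching the first k characters
    # to the first j primitives.
    best = [[neg_inf] * (n_primitives + 1) for _ in range(len(word) + 1)]
    best[0][0] = 0.0
    for k, ch in enumerate(word, start=1):
        for j in range(1, n_primitives + 1):
            # The k-th character covers primitives i..j-1.
            for i in range(max(0, j - max_union), j):
                if best[k - 1][i] == neg_inf:
                    continue
                candidate = best[k - 1][i] + score(i, j, ch)
                best[k][j] = max(best[k][j], candidate)
    return best[len(word)][n_primitives]


# Toy ground truth: primitives 0-1 form "a", primitive 2 forms "b".
TRUTH = {(0, 2): "a", (2, 3): "b"}


def toy_score(i, j, ch):
    """Hypothetical stand-in for a character classifier's match score."""
    return 1.0 if TRUTH.get((i, j)) == ch else 0.1
```

Running `best_match` once per lexicon entry and keeping the highest-scoring word gives a ranking of candidate strings; here "ab" scores higher than "ba" because its segmentation agrees with the toy ground truth.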

151 citations

Journal Article

The role of orthographic and phonotactic rules in perceiving letter patterns.

Kathryn T. Spoehr¹, Edward E. Smith

Douglass Residential College¹

01 Feb 1975, Journal of Experimental Psychology: Human Perception and Performance

TL;DR: Three experiments examined the role of orthographic and phonotactic rules in the tachistoscopic recognition of letter strings and demonstrated that the perceptual accuracy for a string is correlated with the number of recoding steps needed to convert that string into speech.

Abstract: Three experiments examined the role of orthographic and phonotactic rules in the tachistoscopic recognition of letter strings. Experiment 1 showed that the presence of a vowel or multiletter spelling patterns facilitates perceptual accuracy. To account for these results a model was proposed in which an input string is first parsed into syllable-like units, which are then recoded into speech. It was demonstrated that the perceptual accuracy for a string is correlated with the number of recoding steps needed to convert that string into speech. Experiment 2 further demonstrated that this recoding process can predict perceptibility differences among strings with varying numbers of phonotactic violations, and Experiment 3 assessed some of the specific assumptions of the recoding process.

150 citations

Book Chapter

A DPLL(T) Theory Solver for a Theory of Strings and Regular Expressions

Tianyi Liang¹, Andrew Reynolds¹, Cesare Tinelli¹, Clark Barrett², Morgan Deters²

University of Iowa¹, New York University²

18 Jul 2014

TL;DR: A set of algebraic techniques for solving constraints over the theory of unbounded strings natively, without reduction to other problems, is presented and implemented in the SMT solver cvc4, extending its built-in theories with a theory of strings with concatenation, length, and membership in regular languages.

Abstract: An increasing number of applications in verification and security rely on or could benefit from automatic solvers that can check the satisfiability of constraints over a rich set of data types that includes character strings. Unfortunately, most string solvers today are standalone tools that can reason only about (some fragment of) the theory of strings and regular expressions, sometimes with strong restrictions on the expressiveness of their input language. These solvers are based on reductions to satisfiability problems over other data types, such as bit vectors, or to automata decision problems. We present a set of algebraic techniques for solving constraints over the theory of unbounded strings natively, without reduction to other problems. These techniques can be used to integrate string reasoning into general, multi-theory SMT solvers based on the DPLL(T) architecture. We have implemented them in our SMT solver cvc4 to expand its already large set of built-in theories to a theory of strings with concatenation, length, and membership in regular languages. Our initial experimental results show that, in addition, over pure string problems, cvc4 is highly competitive with specialized string solvers with a comparable input language.
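To illustrate what a constraint over strings with concatenation, length, and regular-language membership looks like, the following Python sketch checks satisfiability by naive bounded enumeration. This is deliberately not the paper's technique, which reasons algebraically over unbounded strings; the function `solve`, the length bound, and the example constraints are all assumptions chosen only to show the kind of problem being solved.

```python
import re
from itertools import product


def solve(alphabet, max_len, constraint):
    """Return the first pair (x, y) of strings over `alphabet`, each of
    length <= max_len, that satisfies `constraint`, or None if no pair does."""
    words = [
        "".join(chars)
        for n in range(max_len + 1)
        for chars in product(alphabet, repeat=n)
    ]
    for x in words:
        for y in words:
            if constraint(x, y):
                return x, y
    return None


def example(x, y):
    """x . y = "abab"  and  len(x) = len(y)  and  x in (ab)+"""
    return (
        x + y == "abab"
        and len(x) == len(y)
        and re.fullmatch(r"(ab)+", x) is not None
    )


def unsat_example(x, y):
    """x . y = "aba"  and  len(x) = len(y): impossible, since "aba" has odd length."""
    return x + y == "aba" and len(x) == len(y)
```

A bounded search like this can only report unsatisfiability within the chosen bound, which is one reason a native procedure for unbounded strings is attractive.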

148 citations

Metrics

Papers: 19,430
Citations: 362,272

No. of papers in the topic in previous years
Year	Papers
2022	2
2021	491
2020	704
2019	759
2018	816
2017	806
