String (computer science)

About: String (computer science) is a research topic. Over its lifetime, 19,430 publications have been published within this topic, receiving 333,247 citations. The topic is also known as: str & s.


Papers
Journal ArticleDOI
TL;DR: This letter describes a chain-of-states method that optimizes reaction paths under the sole constraint of equally spaced structures and requires no spring forces, interpolation algorithms, or other heuristics to control structure distribution.
Abstract: This letter describes a chain-of-states method that optimizes reaction paths under the sole constraint of equally spaced structures. In contrast to nudged elastic band (NEB) and string methods, it requires no spring forces, interpolation algorithms, or other heuristics to control structure distribution. Rigorous use of a quadratic potential energy surface (PES) allows calculation of an optimization step with a predefined distribution in Cartesian space. The method is a formal extension of single-structure quasi-Newton methods. An initial guess can be evolved, as in the growing string method.

82 citations

Patent
08 Aug 2006
TL;DR: A natural language search system builds concept and string indexes of a textual database, such as a group of litigation documents, by splitting the text into sentences, words, dates, names and places, then identifying phrases, recovering word stems, and resolving the sense of potentially ambiguous words against a lexicon database.
Abstract: A natural language system searching system develops concept and string indexes of a textual database, such as a group of litigation documents, by breaking the text to be indexed into sentences, words, dates, names and places in a reader, identifying phrases in a phrase parser, recovering word stems in a morphology module and determining the sense of potentially ambiguous words in a sense selector, all in accordance with words and concepts (word senses) stored in lexicon database 9-32. A query may then be processed by the reader, phrase parser, morphology module, and sense selector to provide a text meaning output which can be compared with the concept and string indexes to identify, retrieve and display documents and/or portions of documents related to the query. A lexicon enhancer adds vocabulary semi-automatically.

82 citations

Journal ArticleDOI
TL;DR: This paper proposes an improved complete composition vector method, based on a uniform and independent model, to estimate the sequence information contributing to selection for sequence comparison; the method is more robust than existing composition-based counterparts and comparable in robustness to alignment-based methods.
Abstract: Historically, two categories of computational algorithms (alignment-based and alignment-free) have been applied to sequence comparison–one of the most fundamental issues in bioinformatics. Multiple sequence alignment, although dominantly used by biologists, possesses both fundamental as well as computational limitations. Consequently, alignment-free methods have been explored as important alternatives in estimating sequence similarity. Of the alignment-free methods, the string composition vector (CV) methods, which use the frequencies of nucleotide or amino acid strings to represent sequence information, show promising results in genome sequence comparison of prokaryotes. The existing CV-based methods, however, suffer certain statistical problems, thereby underestimating the amount of evolutionary information in genetic sequences. We show that the existing string composition based methods have two problems, one related to the Markov model assumption and the other associated with the denominator of the frequency normalization equation. We propose an improved complete composition vector method under the assumption of a uniform and independent model to estimate sequence information contributing to selection for sequence comparison. Phylogenetic analyses using both simulated and experimental data sets demonstrate that our new method is more robust compared with existing counterparts and comparable in robustness with alignment-based methods. We observed two problems existing in the currently used string composition methods and proposed a new robust method for the estimation of evolutionary information of genetic sequences. In addition, we discussed that it might not be necessary to use relatively long strings to build a complete composition vector (CCV), due to the overlapping nature of vector strings with a variable length. We suggested a practical approach for the choice of an optimal string length to construct the CCV.

81 citations
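
The abstract above describes string composition vector methods as representing a sequence by the frequencies of its nucleotide or amino acid strings. A minimal Python sketch of that basic representation follows. It is not the improved CCV method proposed in the paper: there is no background-model correction, and the function names and the cosine distance are assumptions chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def composition_vector(seq: str, k: int) -> dict[str, float]:
    """Relative frequency of every overlapping length-k string in seq.

    Illustrative only: raw frequencies, without the normalization
    against a background model that CV-based methods apply.
    """
    if k <= 0 or len(seq) < k:
        raise ValueError("k must be positive and no longer than the sequence")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def cosine_distance(u: dict[str, float], v: dict[str, float]) -> float:
    """1 - cosine similarity between two sparse frequency vectors."""
    dot = sum(w * v.get(kmer, 0.0) for kmer, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return 1.0 - dot / (norm_u * norm_v)


if __name__ == "__main__":
    a = composition_vector("ACGTACGTAC", 3)
    b = composition_vector("ACGTTCGTAC", 3)
    print(cosine_distance(a, b))
```

Because the strings overlap, a sequence of length n yields n - k + 1 strings of length k, which is the overlapping property the abstract refers to when discussing the choice of string length.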

Patent
03 Jun 2003
TL;DR: This patent describes methods for estimating language models so as to maximize the conditional likelihood of a class given a word string, a quantity that is very well correlated with classification accuracy.
Abstract: Methods are disclosed for estimating language models such that the conditional likelihood of a class given a word string, which is very well correlated with classification accuracy, is maximized. The methods comprise tuning statistical language model parameters jointly for all classes such that a classifier discriminates between the correct class and the incorrect ones for a given training sentence or utterance. Specific embodiments of the present invention pertain to implementation of the rational function growth transform in the context of a discriminative training technique for n-gram classifiers.

81 citations
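
The quantity the abstract above aims to maximize, the conditional likelihood of a class given a word string, can be illustrated with a small Python sketch. It assumes one add-one-smoothed unigram model per class and applies Bayes' rule; it only evaluates the class posterior and does not implement the discriminative training or the rational function growth transform the patent describes. All names and the toy data are hypothetical.

```python
import math
from collections import Counter


def train_unigram(sentences: list[str], vocab: list[str]) -> dict[str, float]:
    """Add-one-smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values()) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def class_posterior(words, models, priors):
    """P(class | word string) via Bayes' rule, computed in log space.

    Assumes every word is in the vocabulary used to train the models.
    """
    log_joint = {
        c: math.log(priors[c]) + sum(math.log(models[c][w]) for w in words)
        for c in models
    }
    # Log-sum-exp for a numerically stable normalizer.
    m = max(log_joint.values())
    log_z = m + math.log(sum(math.exp(v - m) for v in log_joint.values()))
    return {c: math.exp(v - log_z) for c, v in log_joint.items()}


if __name__ == "__main__":
    data = {  # hypothetical toy training sentences per class
        "weather": ["rain today", "sunny today"],
        "sport": ["goal today", "match goal"],
    }
    vocab = sorted({w for ss in data.values() for s in ss for w in s.split()})
    models = {c: train_unigram(ss, vocab) for c, ss in data.items()}
    priors = {c: 1 / len(data) for c in data}
    print(class_posterior(["rain", "today"], models, priors))
```

A discriminative trainer would adjust the model parameters jointly across classes so that the posterior of the correct class over the training sentences increases, rather than fitting each class model separately as done here.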


Network Information
Related Topics (5)
Time complexity: 36K papers, 879.5K citations, 88% related
Tree (data structure): 44.9K papers, 749.6K citations, 86% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 85% related
Computational complexity theory: 30.8K papers, 711.2K citations, 82% related
Supervised learning: 20.8K papers, 710.5K citations, 80% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2022    2
2021    491
2020    704
2019    759
2018    816
2017    806