Showing papers on "String (computer science)" published in 2007

Journal Article•DOI•

STRING 7—recent developments in the integration and prediction of protein interactions

Christian von Mering¹, Lars Juhl Jensen, Michael Kuhn, Samuel Chaffron¹, Tobias Doerks, Beate Krüger, Berend Snel², Peer Bork³•Institutions (3)

University of Zurich¹, Utrecht University², Max Delbrück Center for Molecular Medicine³

01 Jan 2007-Nucleic Acids Research

TL;DR: Although primarily developed for protein interaction analysis, the resource has also been successfully applied to comparative genomics, phylogenetics and network studies, which are all facilitated by programmatic access to the database backend and the availability of compact download files.

Abstract: Information on protein–protein interactions is still mostly limited to a small number of model organisms, and originates from a wide variety of experimental and computational techniques. The database and online resource STRING generalizes access to protein interaction data, by integrating known and predicted interactions from a variety of sources. The underlying infrastructure includes a consistent body of completely sequenced genomes and exhaustive orthology classifications, based on which interaction evidence is transferred between organisms. Although primarily developed for protein interaction analysis, the resource has also been successfully applied to comparative genomics, phylogenetics and network studies, which are all facilitated by programmatic access to the database backend and the availability of compact download files. As of release 7, STRING has almost doubled to 373 distinct organisms, and contains more than 1.5 million proteins for which associations have been pre-computed. Novel features include AJAX-based web-navigation, inclusion of additional resources such as BioGRID, and detailed protein domain annotation. STRING is available at http://string.embl.de/

669 citations

Journal Article•DOI•

Simplified and improved string method for computing the minimum energy paths in barrier-crossing events.

Weinan E¹, Weiqing Ren², Eric Vanden-Eijnden²•Institutions (2)

Princeton University¹, New York University²

23 Apr 2007-Journal of Chemical Physics

TL;DR: A simplified and improved version of the string method, originally proposed by E et al. (2002) for identifying the minimum energy paths in barrier-crossing events, that is more stable and accurate and can be combined with the climbing image technique for the accurate calculation of saddle points.

Abstract: We present a simplified and improved version of the string method, originally proposed by E et al. [Phys. Rev. B 66, 052301 (2002)] for identifying the minimum energy paths in barrier-crossing events. In this new version, the step of projecting the potential force to the direction normal to the string is eliminated and the full potential force is used in the evolution of the string. This not only simplifies the numerical procedure, but also makes the method more stable and accurate. We discuss the algorithmic details of the improved string method, analyze its stability, accuracy and efficiency, and illustrate it via numerical examples. We also show how the string method can be combined with the climbing image technique for the accurate calculation of saddle points and we present another algorithm for the accurate calculation of the unstable directions at the saddle points.

638 citations

Proceedings Article•DOI•

Sound and precise analysis of web applications for injection vulnerabilities

Gary Wassermann¹, Zhendong Su¹•Institutions (1)

University of California, Davis¹

10 Jun 2007

TL;DR: This paper proposes a precise, sound, and fully automated analysis technique for SQL injection that successfully discovered previously unknown and sometimes subtle vulnerabilities in real-world programs, has a low false positive rate, and scales to large programs.

Abstract: Web applications are popular targets of security attacks. One common type of such attacks is SQL injection, where an attacker exploits faulty application code to execute maliciously crafted database queries. Both static and dynamic approaches have been proposed to detect or prevent SQL injections; while dynamic approaches provide protection for deployed software, static approaches can detect potential vulnerabilities before software deployment. Previous static approaches are mostly based on tainted information flow tracking and have at least some of the following limitations: (1) they do not model the precise semantics of input sanitization routines; (2) they require manually written specifications, either for each query or for bug patterns; or (3) they are not fully automated and may require user intervention at various points in the analysis. In this paper, we address these limitations by proposing a precise, sound, and fully automated analysis technique for SQL injection. Our technique avoids the need for specifications by considering as attacks those queries for which user input changes the intended syntactic structure of the generated query. It checks conformance to this policy by conservatively characterizing the values a string variable may assume with a context free grammar, tracking the nonterminals that represent user-modifiable data, and modeling string operations precisely as language transducers. We have implemented the proposed technique for PHP, the most widely-used web scripting language. Our tool successfully discovered previously unknown and sometimes subtle vulnerabilities in real-world programs, has a low false positive rate, and scales to large programs (with approx. 100K loc).

416 citations

Proceedings Article•DOI•

Privacy preserving error resilient DNA searching through oblivious automata

Juan Ramón Troncoso-Pastoriza¹, Stefan Katzenbeisser², Mehmet U. Celik²•Institutions (2)

University of Vigo¹, Philips²

28 Oct 2007

TL;DR: A new error-resilient privacy-preserving string searching protocol that allows to execute any finite state machine in an oblivious manner, requiring a communication complexity which is linear both in the number of states and the length of the input string.

Abstract: Human Desoxyribo-Nucleic Acid (DNA) sequences offer a wealth of information that reveal, among others, predisposition to various diseases and paternity relations. The breadth and personalized nature of this information highlights the need for privacy-preserving protocols. In this paper, we present a new error-resilient privacy-preserving string searching protocol that is suitable for running private DNA queries. This protocol checks if a short template (e.g., a string that describes a mutation leading to a disease), known to one party, is present inside a DNA sequence owned by another party, accounting for possible errors and without disclosing to each party the other party's input. Each query is formulated as a regular expression over a finite alphabet and implemented as an automaton. As the main technical contribution, we provide a protocol that allows to execute any finite state machine in an oblivious manner, requiring a communication complexity which is linear both in the number of states and the length of the input string.

239 citations

Proceedings Article•DOI•

Dynamic test input generation for database applications

Michael Emmi¹, Rupak Majumdar¹, Koushik Sen²•Institutions (2)

University of California, Los Angeles¹, University of California, Berkeley²

09 Jul 2007

TL;DR: An algorithm is developed that tracks symbolic constraints across language boundaries and uses those constraints, in conjunction with a novel constraint solver, to generate both program inputs and database state; the proposed solver handles both linear arithmetic constraints over variables and string constraints.

Abstract: We describe an algorithm for automatic test input generation for database applications. Given a program in an imperative language that interacts with a database through API calls, our algorithm generates both input data for the program as well as suitable database records to systematically explore all paths of the program, including those paths whose execution depends on data returned by database queries. Our algorithm is based on concolic execution, where the program is run with concrete inputs and simultaneously also with symbolic inputs for both program variables as well as the database state. The symbolic constraints generated along a path enable us to derive new input values and new database records that can cause execution to hit uncovered paths. Simultaneously, the concrete execution helps to retain precision in the symbolic computations by allowing dynamic values to be used in the symbolic executor. This allows our algorithm, for example, to identify concrete SQL queries made by the program, even if these queries are built dynamically. The contributions of this paper are the following. We develop an algorithm that can track symbolic constraints across language boundaries and use those constraints in conjunction with a novel constraint solver to generate both program inputs and database state. We propose a constraint solver that can solve symbolic constraints consisting of both linear arithmetic constraints over variables as well as string constraints (string equality, disequality, as well as membership in regular languages). Finally, we provide an evaluation of the algorithm on a Java implementation of MediaWiki, a popular wiki package that interacts with a database back-end.

232 citations

Proceedings Article•DOI•

An improved algorithm to accelerate regular expression evaluation

Michela Becchi¹, Patrick Crowley¹•Institutions (1)

Washington University in St. Louis¹

03 Dec 2007

TL;DR: This paper introduces a general compression technique that results in at most 2N state traversals when processing a string of length N, and describes a novel alphabet reduction scheme for DFA-based structures that can yield further dramatic reductions in data structure size.

Abstract: Modern network intrusion detection systems need to perform regular expression matching at line rate in order to detect the occurrence of critical patterns in packet payloads. While deterministic finite automata (DFAs) allow this operation to be performed in linear time, they may exhibit prohibitive memory requirements. In [9], Kumar et al. propose Delayed Input DFAs (D2FAs), which provide a trade-off between the memory requirements of the compressed DFA and the number of states visited for each character processed, which corresponds directly to the memory bandwidth required to evaluate regular expressions. In this paper we introduce a general compression technique that results in at most 2N state traversals when processing a string of length N. In comparison to the D2FA approach, our technique achieves comparable levels of compression, with lower provable bounds on memory bandwidth (or greater compression for a given bandwidth bound). Moreover, our proposed algorithm has lower complexity, is suitable for scenarios where a compressed DFA needs to be dynamically built or updated, and fosters locality in the traversal process. Finally, we also describe a novel alphabet reduction scheme for DFA-based structures that can yield further dramatic reductions in data structure size.
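
The memory versus state-traversal trade-off described in this abstract can be illustrated with a toy example. The sketch below is a generic default-transition automaton, not the construction from this paper or from Kumar et al.; the three-state DFA, its state numbering, and the function names are all assumptions made for illustration. A compressed state stores only the transitions that differ from its default state and follows the default (without consuming input) for everything else, so it may visit more than one state per character.

```python
# Full DFA for "text contains the substring 'ab'" over the alphabet {a, b, c}.
# State 0 = start, 1 = just saw 'a', 2 = accepting (absorbing).
FULL = {
    0: {"a": 1, "b": 0, "c": 0},
    1: {"a": 1, "b": 2, "c": 0},
    2: {"a": 2, "b": 2, "c": 2},
}

# Compressed form (illustrative): state 1 keeps only the transition that
# differs from state 0 and defers to state 0 for every other character.
LABELED = {
    0: {"a": 1, "b": 0, "c": 0},
    1: {"b": 2},
    2: {"a": 2, "b": 2, "c": 2},
}
DEFAULT = {1: 0}
ACCEPTING = {2}


def run_full(text, start=0):
    """One table lookup per character: exactly len(text) state traversals."""
    state = start
    for ch in text:
        state = FULL[state][ch]
    return state in ACCEPTING, len(text)


def run_compressed(text, start=0):
    """Follow default transitions until a labeled transition for ch exists."""
    state, traversals = start, 0
    for ch in text:
        while ch not in LABELED[state]:
            state = DEFAULT[state]  # default move, consumes no input
            traversals += 1
        state = LABELED[state][ch]  # labeled move, consumes ch
        traversals += 1
    return state in ACCEPTING, traversals
```

In this toy automaton every default chain has length at most one, so `run_compressed` never exceeds two traversals per character while storing six labeled transitions instead of nine. Longer default chains save more memory but cost more traversals, which is the trade-off the abstract refers to.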

220 citations

Patent•

Three dimensional NAND memory

Nima Mokhlesi¹, Roy E. Scheuerlein¹•Institutions (1)

SanDisk¹

27 Mar 2007

TL;DR: A monolithic, three dimensional NAND string includes a first memory cell located over a second memory cell, such that a defined boundary exists between the semiconductor active region of the first memory cell and that of the second memory cell.

Abstract: A monolithic, three dimensional NAND string includes a first memory cell located over a second memory cell. A semiconductor active region of the first memory cell is formed epitaxially on a semiconductor active region of the second memory cell, such that a defined boundary exists between the semiconductor active region of the first memory cell and the semiconductor active region of the second memory cell.

215 citations

Proceedings Article•

VGRAM: improving performance of approximate queries on string collections using variable-length grams

Chen Li¹, Bin Wang², Xiaochun Yang²•Institutions (2)

University of California, Irvine¹, Northeastern University (China)²

23 Sep 2007

TL;DR: A novel technique, called VGRAM, to judiciously choose high-quality grams of variable lengths from a collection of strings to support queries on the collection, and shows the significant performance improvements on three existing algorithms.

Abstract: Many applications need to solve the following problem of approximate string matching: from a collection of strings, how to find those similar to a given string, or the strings in another (possibly the same) collection of strings? Many algorithms are developed using fixed-length grams, which are substrings of a string used as signatures to identify similar strings. In this paper we develop a novel technique, called VGRAM, to improve the performance of these algorithms. Its main idea is to judiciously choose high-quality grams of variable lengths from a collection of strings to support queries on the collection. We give a full specification of this technique, including how to select high-quality grams from the collection, how to generate variable-length grams for a string based on the preselected grams, and what is the relationship between the similarity of the gram sets of two strings and their edit distance. A primary advantage of the technique is that it can be adopted by a plethora of approximate string algorithms without the need to modify them substantially. We present our extensive experiments on real data sets to evaluate the technique, and show the significant performance improvements on three existing algorithms.
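
As background for the fixed-length-gram algorithms this abstract builds on, here is a minimal sketch of the classic count filter. It is not VGRAM itself, and the function names are illustrative. It assumes the standard bound that two strings within edit distance k share at least max(|a|, |b|) - q + 1 - k*q grams of length q (counted as multisets), so candidates sharing fewer grams can be discarded before the more expensive edit-distance check.

```python
from collections import Counter


def qgrams(s, q):
    """Multiset of all length-q substrings of s."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def shared_grams(a, b, q):
    """Number of q-grams common to a and b, counted with multiplicity."""
    return sum((qgrams(a, q) & qgrams(b, q)).values())


def edit_distance(a, b):
    """Textbook dynamic program for Levenshtein distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def similar(query, collection, q, k):
    """Strings within edit distance k of query, using the gram count as a filter."""
    hits = []
    for s in collection:
        bound = max(len(query), len(s)) - q + 1 - k * q
        if shared_grams(query, s, q) >= bound and edit_distance(query, s) <= k:
            hits.append(s)
    return hits
```

For example, with q = 2 and k = 1 the query "string" shares no 2-grams with "banana", so that candidate is rejected by the filter alone and its edit distance is never computed.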

198 citations

Book Chapter•DOI•

Computability of models for sequence assembly

Paul Medvedev¹, Konstantinos Georgiou¹, Gene Myers², Michael Brudno¹•Institutions (2)

University of Toronto¹, Howard Hughes Medical Institute²

08 Sep 2007

TL;DR: This work shows sequence assembly to be NP-hard under two different models: string graphs and de Bruijn graphs, and gives the first, to the authors' knowledge, optimal polynomial time algorithm for genome assembly that explicitly models the double-strandedness of DNA.

Abstract: Graph-theoretic models have come to the forefront as some of the most powerful and practical methods for sequence assembly. Simultaneously, the computational hardness of the underlying graph algorithms has remained open. Here we present two theoretical results about the complexity of these models for sequence assembly. In the first part, we show sequence assembly to be NP-hard under two different models: string graphs and de Bruijn graphs. Together with an earlier result on the NP-hardness of overlap graphs, this demonstrates that all of the popular graph-theoretic sequence assembly paradigms are NP-hard. In our second result, we give the first, to our knowledge, optimal polynomial time algorithm for genome assembly that explicitly models the double-strandedness of DNA. We solve the Chinese Postman Problem on bidirected graphs using bidirected flow techniques and show how to use it to find the shortest double-stranded DNA sequence which contains a given set of k-long words. This algorithm has applications to sequencing by hybridization and short read assembly.
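
To make the de Bruijn graph model concrete, the following is a much simplified, single-stranded sketch. It is not the bidirected Chinese Postman algorithm of the paper: it assumes each k-long word occurs exactly once and that the graph has an Eulerian path, and the function names are illustrative. Each word becomes an edge from its (k-1)-prefix to its (k-1)-suffix, and walking every edge once spells a sequence containing all the words.

```python
from collections import defaultdict


def de_bruijn(kmers):
    """Edge list keyed by (k-1)-prefix; each k-mer contributes one edge."""
    graph = defaultdict(list)
    for kmer in kmers:
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm; assumes an Eulerian path exists."""
    in_deg = defaultdict(int)
    for targets in graph.values():
        for v in targets:
            in_deg[v] += 1
    # Start at the node with one more outgoing than incoming edge, if any.
    start = next(
        (u for u in graph if len(graph[u]) - in_deg.get(u, 0) == 1),
        next(iter(graph)),
    )
    remaining = {u: list(vs) for u, vs in graph.items()}
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if remaining.get(u):
            stack.append(remaining[u].pop())  # follow an unused edge
        else:
            path.append(stack.pop())  # dead end: emit node
    return path[::-1]


def assemble(kmers):
    """Spell the sequence along the Eulerian path."""
    path = eulerian_path(de_bruijn(kmers))
    return path[0] + "".join(node[-1] for node in path[1:])
```

Calling `assemble(["ATG", "TGG", "GGC", "GCG", "CGT"])` walks AT, TG, GG, GC, CG, GT and spells a 7-letter sequence containing all five words. Handling both DNA strands, repeated words, and graphs without an Eulerian path is where the paper's bidirected flow techniques come in.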

196 citations

Journal Article•

On String Languages Generated by Spiking Neural P Systems

Haiming Chen¹, Rudolf Freund², Mihai Ionescu³, Gheorghe Paun⁴, Mario J. Pérez-Jiménez⁴•Institutions (4)

Chinese Academy of Sciences¹, Vienna University of Technology², Rovira i Virgili University³, University of Seville⁴

01 Jan 2007-Fundamenta Informaticae

TL;DR: In this paper, the authors consider spiking neural P systems as binary string generators, where the set of spike trains of halting computations of a given system constitutes the language generated by that system.

Abstract: We continue the study of spiking neural P systems by considering these computing devices as binary string generators: the set of spike trains of halting computations of a given system constitutes the language generated by that system. Although the "direct" generative capacity of spiking neural P systems is rather restricted (some very simple languages cannot be generated in this framework), regular languages are inverse-morphic images of languages of finite spiking neural P systems, and recursively enumerable languages are projections of inverse-morphic images of languages generated by spiking neural P systems.

170 citations

Patent•

System and method of analyzing web addresses

Dan Hubbard, Alan Tse

29 Nov 2007

TL;DR: In this paper, a system and method are provided for identifying active content in websites on a network, which includes a method of classifying web addresses and generating a score indicative of the reputation, or likelihood that a web site associated with an uncategorized URL contains active or other targeted content based on an analysis of the URL.

Abstract: A system and method are provided for identifying active content in websites on a network. One embodiment includes a method of classifying web addresses. One embodiment may include a method of generating a score indicative of the reputation, or likelihood that a web site associated with an uncategorized URL contains active or other targeted content based on an analysis of the URL. In certain embodiments, the score is determined solely from the URL string. Other embodiments include systems configured to perform such methods.

Proceedings Article•DOI•

String Stability Analysis for Heterogeneous Vehicle Strings

E. Shaw¹, J.K. Hedrick²•Institutions (2)

Northrop Grumman Corporation¹, University of California, Berkeley²

09 Jul 2007

TL;DR: It is shown that string stability can be achieved for heterogeneous vehicle strings of arbitrary length and arbitrary vehicle type ordering, and the necessary and sufficient conditions for heterogeneous string stability are given for the constant spacing leader-predecessor following control strategy.

Abstract: The spacing errors of a string stable, homogeneous vehicle string attenuate uniformly down the vehicle chain. This result is useful for implementing vehicle formation control because it provides a guideline for the proper intervehicle spacing. In the heterogeneous case, the differing dynamics of the vehicles means the spacing errors do not attenuate or amplify uniformly down the vehicle chain, regardless of whether the formation is string stable or not. Questions arise regarding how heterogeneous string stability should be defined, and what should the proper intervehicle spacing be in order to guarantee nominal safety. In this paper, heterogeneous vehicle strings under simple decentralized control laws with the constant spacing control policy are analyzed. A definition for heterogeneous string stability is proposed. The necessary and sufficient conditions for heterogeneous string stability are given for the constant spacing leader-predecessor following control strategy. The scalability of the control scheme is verified by analyzing the worst case disturbance to error gain. It is shown that string stability can be achieved for heterogeneous vehicle strings of arbitrary length and arbitrary vehicle type ordering.

Patent•

Phonetic decoding and concatentive speech synthesis

David Robert Baker¹, Mark Richard Barnard¹, Richard John Gadd¹, Eric William Janke¹•Institutions (1)

Nuance Communications¹

15 Nov 2007

TL;DR: A speech processing system includes a multiplexer that receives speech data input as part of a conversation turn in a conversation session between two or more users, where one user is a speaker and each of the other users is a listener in each conversation turn.

Abstract: A speech processing system includes a multiplexer that receives speech data input as part of a conversation turn in a conversation session between two or more users where one user is a speaker and each of the other users is a listener in each conversation turn. A speech recognizing engine converts the speech data to an input string of acoustic data while a speech modifier forms an output string based on the input string by changing an item of acoustic data according to a rule. The system also includes a phoneme speech engine for converting the first output string of acoustic data including modified and unmodified data to speech data for output via the multiplexer to listeners during the conversation turn.

Journal Article•DOI•

A Simple and Systematic Approach to Assigning Denavit–Hartenberg Parameters

Peter Corke¹•Institutions (1)

Commonwealth Scientific and Industrial Research Organisation¹

01 Jun 2007-IEEE Transactions on Robotics

TL;DR: A simple and intuitive approach to determining the kinematic parameters of a serial-link robot in Denavit-Hartenberg (DH) notation; the algebraic procedure is amenable to computer algebra manipulation, and a Java program is available as supplementary downloadable material.

Abstract: This paper presents a simple and intuitive approach to determining the kinematic parameters of a serial-link robot in Denavit-Hartenberg (DH) notation. Once a manipulator's kinematics is parameterized in this form, a large body of standard algorithms and code implementations for kinematics, dynamics, motion planning, and simulation are available. The proposed method has two parts. The first is the “walk through,” a simple procedure that creates a string of elementary translations and rotations, from the user-defined base coordinate to the end-effector. The second step is an algebraic procedure to manipulate this string into a form that can be factorized as link transforms, which can be represented in standard or modified DH notation. The method allows for an arbitrary base and end-effector coordinate system as well as an arbitrary zero joint angle pose. The algebraic procedure is amenable to computer algebra manipulation and a Java program is available as supplementary downloadable material.

Book Chapter•DOI•

Ring signatures of sub-linear size without random oracles

Nishanth Chandran¹, Jens Groth¹, Amit Sahai¹•Institutions (1)

University of California, Los Angeles¹

09 Jul 2007

TL;DR: A variation of the ring signature scheme is offered, where the signer is guaranteed anonymity even if the common reference string is maliciously generated, and an additional feature of this scheme is that it has perfect anonymity.

Abstract: Ring signatures, introduced by Rivest, Shamir and Tauman, enable a user to sign a message anonymously on behalf of a "ring". A ring is a group of users, which includes the signer. We propose a ring signature scheme that has size O(√N) where N is the number of users in the ring. An additional feature of our scheme is that it has perfect anonymity. Our ring signature like most other schemes uses the common reference string model. We offer a variation of our scheme, where the signer is guaranteed anonymity even if the common reference string is maliciously generated.

Journal Article•DOI•

Fast exact string matching algorithms

Thierry Lecroq¹•Institutions (1)

University of Rouen¹

30 May 2007-Information Processing Letters

TL;DR: A very fast new family of string matching algorithms based on hashing q-grams is proposed; the algorithms are the fastest in many cases, in particular on small alphabets.

Patent•

System and method for facilitating downhole operations

John R. Whitsitt¹, Jonas Jason K¹, Gary L. Rytlewski¹, Dinesh R. Patel¹•Institutions (1)

Schlumberger¹

10 Oct 2007

TL;DR: In this article, a technique is provided to facilitate use of a service tool at a downhole location, which has different operational configurations that can be selected and used without moving the service string.

Abstract: A technique is provided to facilitate use of a service tool at a downhole location. The service tool has different operational configurations that can be selected and used without moving the service string.

Proceedings Article•DOI•

A simple storage scheme for strings achieving entropy bounds

Paolo Ferragina¹, Rossano Venturini¹•Institutions (1)

University of Pisa¹

07 Jan 2007

TL;DR: In this article, a storage scheme is proposed for a string S[1, n] drawn from an alphabet $\Sigma$ that requires space close to the k-th order empirical entropy of S and allows retrieval of any $\ell$-long substring of S in optimal $O(1 + \ell/\log_{|\Sigma|} n)$ time.

Abstract: We propose a storage scheme for a string S[1, n], drawn from an alphabet $\Sigma$, that requires space close to the k-th order empirical entropy of S, and allows to retrieve any $\ell$-long substring of S in optimal $O(1 + \ell/\log_{|\Sigma|} n)$ time. This matches the best known bounds [14, 7], via the use of binary encodings and tables only. We also apply this storage scheme to prove new time vs space trade-offs for compressed self-indexes [5, 12] and the Burrows-Wheeler Transform [2].

Proceedings Article•DOI•

Succinct indexes for strings, binary relations and multi-labeled trees

Jérémy Barbay¹, Meng He¹, J. Ian Munro¹, S. Srinivasa Rao²•Institutions (2)

University of Waterloo¹, IT University of Copenhagen²

07 Jan 2007

TL;DR: This paper defines and designs succinct indexes for several abstract data types, namely strings, binary relations and multi-labeled trees, and designs a succinct encoding that represents a string of length n over an alphabet of size σ using $nH_k + o(n \lg \sigma)$ bits to support access/rank/select operations.

Abstract: We define and design succinct indexes for several abstract data types (ADTs). The concept is to design auxiliary data structures that occupy asymptotically less space than the information-theoretic lower bound on the space required to encode the given data, and support an extended set of operations using the basic operators defined in the ADT. As opposed to succinct (integrated data/index) encodings, the main advantage of succinct indexes is that we make assumptions only on the ADT through which the main data is accessed, rather than the way in which the data is encoded. This allows more freedom in the encoding of the main data. In this paper, we present succinct indexes for various data types, namely strings, binary relations and multi-labeled trees. Given the support for the interface of the ADTs of these data types, we can support various useful operations efficiently by constructing succinct indexes for them. When the operators in the ADTs are supported in constant time, our results are comparable to previous results, while allowing more flexibility in the encoding of the given data. Using our techniques, we design a succinct encoding that represents a string of length n over an alphabet of size σ using $nH_k + o(n \lg \sigma)$ bits to support access/rank/select operations in $o((\lg \lg \sigma)^3)$ time. We also design a succinct text index using $nH_k + o(n \lg \sigma)$ bits that supports pattern matching queries in $O(m \lg \lg \sigma + occ \lg^{1+\epsilon} n \lg \lg \sigma)$ time, for a given pattern of length m. Previous results on these two problems either have a lg σ factor instead of lg lg σ in terms of running time, or are not compressible.

Proceedings Article•DOI•

Can We Translate Letters?

David Vilar¹, Jan-Thorsten Peter¹, Hermann Ney¹•Institutions (1)

RWTH Aachen University¹

23 Jun 2007

TL;DR: This work tries to find out if a nearly unmodified state-of-the-art translation system is able to cope with the problem and whether it is capable to further generalize translation rules, for example at the level of word suffixes and translation of unseen words.

Abstract: Current statistical machine translation systems handle the translation process as the transformation of a string of symbols into another string of symbols. Normally the symbols dealt with are the words in different languages, sometimes with some additional information included, like morphological data. In this work we try to push the approach to the limit, working not on the level of words, but treating both the source and target sentences as a string of letters. We try to find out if a nearly unmodified state-of-the-art translation system is able to cope with the problem and whether it is capable to further generalize translation rules, for example at the level of word suffixes and translation of unseen words. Experiments are carried out for the translation of Catalan to Spanish.

Proceedings Article•DOI•

Efficient token based clone detection with flexible tokenization

Hamid Abdul Basit¹, Simon J. Puglisi², William F. Smyth³, Andrew Turpin⁴, Stan Jarzabek⁵•Institutions (5)

Lahore University of Management Sciences¹, Curtin University², McMaster University³, RMIT University⁴, National University of Singapore⁵

03 Sep 2007

TL;DR: String algorithms are explored to find suitable data structures and algorithms for efficient token based clone detection, and are implemented in the tool Repeated Tokens Finder (RTF), which incorporates a suffix array based linear time algorithm to detect string matches.

Abstract: Code clones are similar code fragments that occur at multiple locations in a software system. Detection of code clones provides useful information for maintenance, reengineering, program understanding and reuse. Several techniques have been proposed to detect code clones. These techniques differ in the code representation used for analysis of clones, ranging from plain text to parse trees and program dependence graphs. Clone detection based on lexical tokens involves minimal code transformation and gives good results, but is computationally expensive because of the large number of tokens that need to be compared. We explored string algorithms to find suitable data structures and algorithms for efficient token based clone detection and implemented them in our tool Repeated Tokens Finder (RTF). Instead of using suffix tree for string matching, we use more memory efficient suffix array. RTF incorporates a suffix array based linear time algorithm to detect string matches. It also provides a simple and customizable tokenization mechanism. Initial analysis and experiments show that our clone detection is simple, scalable, and performs better than the previous well-known tools.
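
The idea of finding repeated token runs with a suffix array can be sketched as follows. This is not RTF's implementation: the suffix array here is built by naive sorting rather than a linear time algorithm, the whitespace tokenizer in the usage note is a stand-in, and all names are illustrative. Suffixes that share a long common prefix end up adjacent in sorted order, so comparing neighbours is enough to surface repeated runs.

```python
def suffix_array(tokens):
    """Start positions of all suffixes in sorted order (naive construction)."""
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def common_prefix_len(tokens, i, j):
    """Number of tokens shared by the suffixes starting at i and j."""
    n = 0
    while i + n < len(tokens) and j + n < len(tokens) and tokens[i + n] == tokens[j + n]:
        n += 1
    return n


def repeated_runs(tokens, min_len):
    """(first_pos, second_pos, length) for adjacent suffixes sharing >= min_len tokens."""
    sa = suffix_array(tokens)
    hits = []
    for a, b in zip(sa, sa[1:]):
        n = common_prefix_len(tokens, a, b)
        if n >= min_len:
            hits.append((min(a, b), max(a, b), n))
    return sorted(hits)
```

With the token list from `"x = a + b ; y = a + b ;".split()` and `min_len=4`, the run `= a + b ;` is reported at positions 1 and 7. A real detector would additionally normalize identifiers during tokenization and merge overlapping reports into maximal clones.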

Journal Article•DOI•

A note on the number of squares in a word

Lucian Ilie¹•Institutions (1)

University of Western Ontario¹

30 Jul 2007-Theoretical Computer Science

TL;DR: This note improves the known upper bound of 2n on the number of distinct squares in a word of length n to 2n − Θ(log n); the conjectured bound is n.
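
For readers unfamiliar with the quantity being bounded, a square is a factor of the form uu with u non-empty. The brute-force sketch below (an illustrative helper, not taken from the paper) collects the distinct squares of a word so the bounds can be checked on small inputs.

```python
def distinct_squares(word):
    """Set of distinct factors of the form uu (u non-empty) occurring in word."""
    squares = set()
    n = len(word)
    for i in range(n):
        # half is the length of u; the square word[i:i+2*half] must fit in word.
        for half in range(1, (n - i) // 2 + 1):
            if word[i:i + half] == word[i + half:i + 2 * half]:
                squares.add(word[i:i + 2 * half])
    return squares
```

For the word "aabaab" of length 6, this finds the two squares "aa" and "aabaab", well under both 2n and the conjectured bound n.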

Patent•

Interactive image tagging

Gavin M. Gear¹, Sam J. George¹, Richard L. Spencer¹•Institutions (1)

Microsoft¹

15 Mar 2007

TL;DR: Techniques are described for automatically generating one or more tags associated with an image file, using handwriting recognition processing of ink annotations made on a displayed image.

Abstract: Techniques are described for performing automatic generation of one or more tags associated with an image file. One or more ink annotations for a displayed image are received. Handwriting recognition processing of the one or more ink annotations is performed. A string is generated and the string includes one or more recognized words used to form the one or more tags associated with the image file. The handwriting recognition processing and generating the string are performed in response to receiving the ink annotations.

Book Chapter•DOI•

Concurrently-secure blind signatures without random oracles or setup assumptions

Carmit Hazay¹, Jonathan Katz², Chiu-Yuen Koo², Yehuda Lindell¹•Institutions (2)

Bar-Ilan University¹, University of Maryland, College Park²

21 Feb 2007

TL;DR: A new protocol for blind signatures in which security is preserved even under arbitrarily-many concurrent executions is shown, which is the first to be proven secure in a concurrent setting without random oracles or a trusted setup assumption such as a common reference string.

Abstract: We show a new protocol for blind signatures in which security is preserved even under arbitrarily-many concurrent executions. The protocol can be based on standard cryptographic assumptions and is the first to be proven secure in a concurrent setting (under any assumptions) without random oracles or a trusted setup assumption such as a common reference string. Along the way, we also introduce new definitions of security for blind signature schemes.

Patent•

Solar inverter and plant for converting solar energy into electrical energy

Sergio Zanarini, Morici Riccardo

11 Oct 2007

TL;DR: A plant for converting solar energy into electrical energy is described, comprising a photovoltaic generator (2a) including at least one string (2) of photovoltaic modules (M), a pulse generator (31) able to send electrical pulses to the input of the string, a signal detector (OP) arranged at the output of the string and able to detect the presence of a signal which is a function of the electrical pulses at the input, and alarm means connected to the signal detector and able to generate an alarm in the event that there

Abstract: There is described a plant (1) for converting solar energy into electrical energy, comprising a photovoltaic generator (2a) including at least one string (2) of photovoltaic modules (M), a pulse generator (31) able to send electrical pulses to the input of the string (2), a signal detector (OP) arranged at the output of the string (2) and able to detect, at the output of the string (2), the presence of a signal which is a function of the electrical pulses at the input, and alarm means connected to the signal detector (OP) and able to generate an alarm in the event that there is no signal at the output of the string (2).

Patent•

Flash memory program inhibit scheme

Jin-Ki Kim

29 Nov 2007

TL;DR: A local boosted channel inhibit scheme is used to reduce program disturb in a NAND Flash memory cell string where no programming from the erased state is desired; the selected memory cell is decoupled from the other cells in the NAND string.

Abstract: A method for minimizing program disturb in Flash memories. To reduce program disturb in a NAND Flash memory cell string where no programming from the erased state is desired, a local boosted channel inhibit scheme is used. In the local boosted channel inhibit scheme, the selected memory cell in a NAND string where no programming is desired, is decoupled from the other cells in the NAND string. This allows the channel of the decoupled cell to be locally boosted to a voltage level sufficient for inhibiting F-N tunneling when the corresponding wordline is raised to a programming voltage. Due to the high boosting efficiency, the pass voltage applied to the gates of the remaining memory cells in the NAND string can be reduced relative to prior art schemes, thereby minimizing program disturb while allowing for random page programming.

Book Chapter•DOI•

Cryptography in the multi-string model

Jens Groth¹, Rafail Ostrovsky¹•Institutions (1)

University of California, Los Angeles¹

19 Aug 2007

TL;DR: This paper defines multi-string non-interactive zero-knowledge proofs and proves that they exist under general cryptographic assumptions, and suggests a universally composable commitment scheme in the multistring model.

Abstract: The common random string model introduced by Blum, Feldman and Micali permits the construction of cryptographic protocols that are provably impossible to realize in the standard model. We can think of this model as a trusted party generating a random string and giving it to all parties in the protocol. However, the introduction of such a third party should set alarm bells going off: Who is this trusted party? Why should we trust that the string is random? Even if the string is uniformly random, how do we know it does not leak private information to the trusted party? The very point of doing cryptography in the first place is to prevent us from trusting the wrong people with our secrets. In this paper, we propose the more realistic multi-string model. Instead of having one trusted authority, we have several authorities that generate random strings. We do not trust any single authority; we only assume a majority of them generate the random string honestly. This security model is reasonable, yet at the same time it is very easy to implement. We could for instance imagine random strings being provided on the Internet, and any set of parties that want to execute a protocol just need to agree on which authorities' strings they want to use. We demonstrate the use of the multi-string model in several fundamental cryptographic tasks. We define multi-string non-interactive zero-knowledge proofs and prove that they exist under general cryptographic assumptions. Our multistring NIZK proofs have very strong security properties such as simulation-extractability and extraction zero-knowledge, which makes it possible to compose them with arbitrary other protocols and to reuse the random strings. We also build efficient simulation-sound multi-string NIZK proofs for circuit satisfiability based on groups with a bilinear map. The sizes of these proofs match the best constructions in the single common random string model. 
We suggest a universally composable commitment scheme in the multistring model. It has been proven that UC commitment does not exist in the plain model without setup assumptions. Prior to this work, constructions were only known in the common reference string model and the registered public key model. One of the applications of the UC commitment scheme is a coin-flipping protocol in the multi-string model. Armed with the coin-flipping protocol, we can securely realize any multi-party computation protocol.

Journal Article•DOI•

The number of runs in a string

Wojciech Rytter¹•Institutions (1)

University of Warsaw¹

01 Sep 2007-Information & Computation

TL;DR: It is shown that ρ(n)≤n and there are at most 0.67n runs with periods larger than 87, which supports the conjecture that the number of all runs is smaller than n.

Abstract: A run in a string is a nonextendable (with the same minimal period) periodic segment in a string. The set of runs corresponds to the structure of internal periodicities in a string. Periodicities in strings were extensively studied and are important both in theory and practice (combinatorics of words, pattern-matching, computational biology). Let ρ(n) be the maximal number of runs in a string of length n. It has been shown that ρ(n)=O(n), but the proof was very complicated and the constant coefficient in O(n) has not been given explicitly. We demystify the proof of the linear upper bound for ρ(n) and propose a new approach to the analysis of runs based on the properties of subperiods: the periods of periodic parts of the runs. We show that ρ(n)≤n and there are at most 0.67n runs with periods larger than 87. This supports the conjecture that the number of all runs is smaller than n. We also give a completely new proof of the linear bound and discover several new interesting "periodicity lemmas".
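
The definition of a run can be illustrated with a brute-force enumeration. This is a minimal sketch under the usual convention, assumed here, that a run is a maximal segment whose length is at least twice its minimal period; the function name and the half-open `(start, end)` intervals are illustrative choices, not taken from the paper. For each candidate period p it finds maximal stretches where `word[k] == word[k + p]`; a segment found for a multiple of its minimal period yields the same interval, so collecting intervals in a set counts each run once.

```python
def runs(word):
    """Return the set of runs of `word` as half-open intervals (start, end)."""
    n = len(word)
    intervals = set()
    for p in range(1, n // 2 + 1):
        k = 0
        while k + p < n:
            if word[k] != word[k + p]:
                k += 1
                continue
            start = k
            # extend while the letter p positions ahead still matches
            while k + p < n and word[k] == word[k + p]:
                k += 1
            # word[start:k + p] has period p; keep it if it spans at least 2p letters
            if k - start >= p:
                intervals.add((start, k + p))
    return intervals

word = "aabaab"
found = runs(word)
print(sorted(found), len(found) <= len(word))
```

For "aabaab" this yields the two occurrences of "aa" and the whole word with period 3, that is 3 runs for n = 6, consistent with ρ(n)≤n.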

Proceedings Article•DOI•

Preventing injection attacks with syntax embeddings

Martin Bravenboer¹, Eelco Dolstra², Eelco Visser¹•Institutions (2)

Delft University of Technology¹, Utrecht University²

01 Oct 2007

TL;DR: This work describes a more natural style of programming that yields code that is impervious to injections by construction, and automatically generates code that maps the embedded language to constructs in the host language that reconstruct the embedded sentences, adding escaping functions where appropriate.

Abstract: Software written in one language often needs to construct sentences in another language, such as SQL queries, XML output, or shell command invocations. This is almost always done using unhygienic string manipulation, the concatenation of constants and client-supplied strings. A client can then supply specially crafted input that causes the constructed sentence to be interpreted in an unintended way, leading to an injection attack. We describe a more natural style of programming that yields code that is impervious to injections by construction. Our approach embeds the grammars of the guest languages (e.g., SQL) into that of the host language (e.g., Java) and automatically generates code that maps the embedded language to constructs in the host language that reconstruct the embedded sentences, adding escaping functions where appropriate. This approach is generic, meaning that it can be applied with relative ease to any combination of host and guest languages.
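
The difference between unhygienic concatenation and construction with escaping can be shown in a small sketch. It does not reproduce the paper's grammar embedding or code generation; it hand-writes the kind of escaping call such generated code would insert. The table, the function names and the use of an in-memory SQLite database are assumptions made for illustration, and quote doubling is only claimed to be adequate for SQL string literals in SQLite.

```python
import sqlite3

def sql_string_literal(value):
    """Render a Python string as a SQL string literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"

def unhygienic_query(name):
    """Plain concatenation: client input becomes part of the SQL syntax."""
    return "SELECT * FROM users WHERE name = '" + name + "'"

def structured_query(name):
    """Client input passes through the escaping function before it is embedded."""
    return "SELECT * FROM users WHERE name = " + sql_string_literal(name)

def count_rows(query):
    """Run `query` against a throwaway two-row table and count the result rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    rows = conn.execute(query).fetchall()
    conn.close()
    return len(rows)

payload = "x' OR '1'='1"
print(count_rows(unhygienic_query(payload)))  # the crafted input matches every row
print(count_rows(structured_query(payload)))  # the input is treated as one literal
```

In everyday code, the parameter binding offered by the database driver (as used in `executemany` above) is the more common way to get the same guarantee.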

Journal Article•DOI•

Communicating Probability Distributions

Gerhard Kramer¹, Serap A. Savari•Institutions (1)

Bell Labs¹

01 Feb 2007-IEEE Transactions on Information Theory

TL;DR: A rate distortion problem motivated by a quantum data compression problem is solved: the goal is to send information about a source string x so that a receiver can construct a second string y for which the joint empirical probability distribution of x and y is close to some desired distribution.

Abstract: A rate distortion problem is solved that is motivated by a quantum data compression problem. The goal is to send information about a source string x so that a receiver can construct a second string y for which the joint empirical probability distribution of x and y is close to some desired distribution. The problem differs from the usual rate distortion problems in that one must consider both remote sources and distortion functions that are not averages of per-letter distortion functions.
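
The object being matched, the joint empirical distribution of two strings, can be computed as in the sketch below. The abstract only says the distribution should be "close" to a target; total variation distance is used here as an assumed measure of closeness, and the function names are hypothetical.

```python
from collections import Counter

def joint_empirical(x, y):
    """Joint empirical distribution of two equal-length strings:
    the fraction of positions i at which (x[i], y[i]) equals each letter pair."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    counts = Counter(zip(x, y))
    return {pair: c / n for pair, c in counts.items()}

def total_variation(p, q):
    """Total variation distance between two distributions given as dicts."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

# Assumed target for the example: uniform over all pairs of binary letters
target = {(a, b): 0.25 for a in "01" for b in "01"}
x = "0011"
print(total_variation(joint_empirical(x, "0110"), target))
print(total_variation(joint_empirical(x, x), target))
```

The first choice of y hits the target exactly, while copying x puts all mass on the two diagonal pairs and ends up at distance 0.5 from it.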
