
Showing papers on "Formal language published in 2006"


Proceedings Article
01 Jan 2006
TL;DR: BLOG is a formal language for defining probability models with unknown objects and identity uncertainty; a BLOG model describes a generative process in which some steps add objects to the world and others determine attributes and relations on those objects.
Abstract: We introduce BLOG, a formal language for defining probability models with unknown objects and identity uncertainty. A BLOG model describes a generative process in which some steps add objects to the world, and others determine attributes and relations on these objects. Subject to certain acyclicity constraints, a BLOG model specifies a unique probability distribution over first-order model structures that can contain varying and unbounded numbers of objects. Furthermore, inference algorithms exist for a large class of BLOG models.
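For a rough illustration of what such a generative process looks like, here is a plain-Python sketch of an urn-and-balls scenario: an unknown number of objects is sampled, attributes are set, observations with identity uncertainty are generated, and a posterior over the number of objects is estimated by rejection sampling. The scenario and every name below are our assumptions for illustration, not code or syntax from the paper.

    import random
    from collections import Counter

    def sample_world():
        # A "number step" adds an unknown number of objects to the world.
        n_balls = random.randint(1, 8)
        # Attribute steps determine attributes of those objects.
        colours = [random.choice(["blue", "green"]) for _ in range(n_balls)]
        # Identity uncertainty: each draw observes *some* ball (which one
        # is never recorded) and reports its colour with noise 0.2.
        observations = []
        for _ in range(4):
            seen = colours[random.randrange(n_balls)]
            if random.random() < 0.2:
                seen = "green" if seen == "blue" else "blue"
            observations.append(seen)
        return n_balls, observations

    # Rejection sampling: P(number of balls | all four draws looked blue).
    posterior = Counter()
    for _ in range(100000):
        n, obs = sample_world()
        if obs == ["blue"] * 4:
            posterior[n] += 1
    total = sum(posterior.values())
    for n in sorted(posterior):
        print(n, round(posterior[n] / total, 3))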

427 citations


Journal ArticleDOI
TL;DR: Jinja is a compromise between the realism of the language and the tractability and clarity of its formal semantics, and provides a unified model of the source language, the virtual machine, and the compiler.
Abstract: We introduce Jinja, a Java-like programming language with a formal semantics designed to exhibit core features of the Java language architecture. Jinja is a compromise between the realism of the language and the tractability and clarity of its formal semantics. The following aspects are formalised: a big and a small step operational semantics for Jinja and a proof of their equivalence; a type system and a definite initialisation analysis; a type safety proof of the small step semantics; a virtual machine (JVM), its operational semantics and its type system; a type safety proof for the JVM; a bytecode verifier, that is, a data flow analyser for the JVM; a correctness proof of the bytecode verifier with respect to the type system; and a compiler and a proof that it preserves semantics and well-typedness. The emphasis of this work is not on particular language features but on providing a unified model of the source language, the virtual machine, and the compiler. The whole development has been carried out in the theorem prover Isabelle/HOL.
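For readers unfamiliar with the two styles of semantics being related, the judgments take the following schematic shapes (standard notation assumed here, not quoted from the paper): big-step evaluation relates an expression and state to a final value and state, while small-step reduction performs one computation step, and the equivalence proof connects the two.

    % Big-step: in program P, expression e in state s evaluates to
    % value v, leaving state s'.
    P \vdash \langle e, s \rangle \Rightarrow \langle v, s' \rangle
    % Small-step: a single reduction step; the equivalence result relates
    % its reflexive-transitive closure to the big-step judgment.
    P \vdash \langle e, s \rangle \rightarrow \langle e', s' \rangle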

269 citations


Book
01 Jan 2006
TL;DR: In this monograph, Tanya Reinhart discusses strategies enabling the interface of different cognitive systems, which she identifies as the systems of concepts, inference, context, and sound, and argues that in each of these areas there are certain aspects of meaning and use that cannot be coded in the CS formal language.
Abstract: In this monograph Tanya Reinhart discusses strategies enabling the interface of different cognitive systems, which she identifies as the systems of concepts, inference, context, and sound. Her point of departure is Noam Chomsky's hypothesis that language is optimally designed--namely, that in many cases, the bare minimum needed for constructing syntactic derivations is sufficient for the full needs of the interface. Deviations from this principle are viewed as imperfections. The book covers in depth four areas of the interface: quantifier scope, focus, anaphora resolution, and implicatures. The first question in each area is what makes the computational system (CS, syntax) legible to the other systems at the interface--how much of the information needed for the interface is coded already in the CS, and how it is coded. Next Reinhart argues that in each of these areas there are certain aspects of meaning and use that cannot be coded in the CS formal language, on both conceptual and empirical grounds. This residue is governed by interface strategies that can be viewed as repair of imperfections. They require constructing and comparing a reference set of alternative derivations to determine whether a repair operation is indeed the only way to meet the interface requirements. Evidence that reference-set computation applies in these four areas comes from language acquisition. The required computation poses a severe load on working memory. While adults can cope with this load, children, whose working memory is less developed, fail in tasks requiring this computation.

262 citations


Proceedings ArticleDOI
17 Jul 2006
TL;DR: This work presents a new approach for mapping natural language sentences to their formal meaning representations using string-kernel-based classifiers, which compares favorably to other existing systems and is particularly robust to noise.
Abstract: We present a new approach for mapping natural language sentences to their formal meaning representations using string-kernel-based classifiers. Our system learns these classifiers for every production in the formal language grammar. Meaning representations for novel natural language sentences are obtained by finding the most probable semantic parse using these string classifiers. Our experiments on two real-world data sets show that this approach compares favorably to other existing systems and is particularly robust to noise.
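As a toy illustration of the pipeline's shape (our invented stand-in, not the paper's system): one classifier per production of the meaning-representation grammar scores the input sentence, and the semantic parse is assembled from the best-scoring productions. Keyword matching stands in here for the trained string-kernel classifiers.

    # Toy grammar over meaning-representation templates; each production
    # carries the keywords its stand-in "classifier" fires on.
    GRAMMAR = {
        "QUERY": [("answer(CITY)", ["what", "which"])],
        "CITY":  [("capital(STATE)", ["capital"]),
                  ("largest_city(STATE)", ["largest", "biggest"])],
        "STATE": [("stateid('texas')", ["texas"]),
                  ("stateid('ohio')", ["ohio"])],
    }

    def score(sentence, keywords):
        # Stand-in for a string-kernel classifier's confidence.
        return sum(word in sentence for word in keywords)

    def parse(nonterminal, sentence):
        # Pick the best-scoring production, then expand its nonterminals.
        template, _ = max(GRAMMAR[nonterminal],
                          key=lambda prod: score(sentence, prod[1]))
        for nt in GRAMMAR:
            if nt in template:
                template = template.replace(nt, parse(nt, sentence))
        return template

    print(parse("QUERY", "what is the capital of texas"))
    # -> answer(capital(stateid('texas')))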

253 citations


Proceedings ArticleDOI
12 Aug 2006
TL;DR: It is shown that satisfiability for the two-variable first-order logic FO²(~, ≤, +1) is decidable over finite and over infinite data words, where ~ is a binary predicate testing data value equality and +1, ≤ are the usual successor and order predicates.
Abstract: In a data word each position carries a label from a finite alphabet and a data value from some infinite domain. These models have already been considered in the realm of semistructured data, timed automata and extended temporal logics. It is shown that satisfiability for the two-variable first-order logic FO²(~, ≤, +1) is decidable over finite and over infinite data words, where ~ is a binary predicate testing data value equality and +1, ≤ are the usual successor and order predicates. The problem is at least as hard as Petri net reachability. Several extensions of the logic are considered; some remain decidable while some are undecidable.
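As a small illustration of the logic (our example, not one from the paper), the following two-variable sentence says that some a-labelled position and some b-labelled position carry the same data value:

    \exists x \, \exists y \, \big( a(x) \wedge b(y) \wedge x \sim y \big)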

220 citations


Book
01 Jan 2006
TL;DR: This guide leads students interactively through many of the concepts in an automata theory course or the early topics in a compiler course, including descriptions of the algorithms JFLAP has implemented.
Abstract: JFLAP: An Interactive Formal Languages and Automata Package is a hands-on supplemental guide through formal languages and automata theory. JFLAP guides students interactively through many of the concepts in an automata theory course or the early topics in a compiler course, including the descriptions of algorithms JFLAP has implemented. Students can experiment with the concepts in the text and receive immediate feedback when applying these concepts with the accompanying software. The text describes each area of JFLAP and reinforces concepts with end-of-chapter exercises. In addition to JFLAP, this guide incorporates two other automata theory tools into JFLAP: JellRap and Pate.

154 citations


Book ChapterDOI
01 Jan 2006
TL;DR: This paper describes two algorithms for inferring reaction rules and kinetic parameter values from a temporal specification formalizing the biological data and illustrates how these machine learning techniques may be useful to the modeler.
Abstract: One central issue in systems biology is the definition of formal languages for describing complex biochemical systems and their behavior at different levels. The biochemical abstract machine BIOCHAM is based on two formal languages, one rule-based language used for modeling biochemical networks, at three abstraction levels corresponding to three semantics: boolean, concentration and population; and one temporal logic language used for formalizing the biological properties of the system. In this paper, we show how the temporal logic language can be turned into a specification language. We describe two algorithms for inferring reaction rules and kinetic parameter values from a temporal specification formalizing the biological data. Then, with an example of the cell cycle control, we illustrate how these machine learning techniques may be useful to the modeler.
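To make the boolean semantics concrete, here is a small Python sketch; the toy reaction network and all names are invented for illustration (real BIOCHAM rules also carry kinetics, which the boolean abstraction ignores). Rules fire over sets of present species, and a reachability check plays the role of a temporal-logic property such as EF(cyclin_deg).

    # Reaction rules as (reactants, products). Boolean semantics: a rule
    # may fire when all reactants are present; products become present.
    RULES = [
        ({"cyclin", "cdk"}, {"mpf"}),       # complex formation
        ({"mpf"}, {"mpf_active"}),          # activation
        ({"mpf_active"}, {"cyclin_deg"}),   # degradation signal
    ]

    def successors(state):
        # Reactants are kept: a common over-approximation in the
        # boolean abstraction, where quantities are ignored.
        for reactants, products in RULES:
            if reactants <= state:
                yield frozenset(state | products)

    def reachable(initial):
        # Exhaustive search of the finite boolean state space.
        seen = {frozenset(initial)}
        frontier = [frozenset(initial)]
        while frontier:
            state = frontier.pop()
            for nxt in successors(state):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
        return seen

    # "EF cyclin_deg" as a reachability query from the initial state.
    print(any("cyclin_deg" in s for s in reachable({"cyclin", "cdk"})))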

122 citations


Dissertation
01 Jan 2006
TL;DR: This thesis consists of six research papers: three contribute theory to alleviate the state space explosion problem, and three demonstrate the practical use of model checking technology by applying it to realistic case studies.
Abstract: Model checking is a technique to automatically analyse systems that have been modeled in a formal language. The timed automaton framework is such a formal language. It is suitable to model many realistic problems in which time plays a central role. Examples are distributed algorithms, protocols, embedded software and scheduling problems. The main problem with model checking is the exponential growth of the state space as models become larger (also known as the 'state space explosion' problem). This thesis consists of six research papers. Three of these contribute theory to alleviate the state space explosion problem. The other three demonstrate the practical use of model checking technology by applying it to realistic case studies.

86 citations


Book ChapterDOI
19 Feb 2006
TL;DR: In this article, the authors look at a corpus of English descriptions used as programming assignments, and develop some techniques for mapping linguistic constructs onto program structures, which they refer to as programmatic semantics.
Abstract: Natural Language Processing holds great promise for making computer interfaces that are easier to use for people, since people will (hopefully) be able to talk to the computer in their own language, rather than learn a specialized language of computer commands. For programming, however, the necessity of a formal programming language for communicating with a computer has always been taken for granted. We would like to challenge this assumption. We believe that modern Natural Language Processing techniques can make possible the use of natural language to (at least partially) express programming ideas, thus drastically increasing the accessibility of programming to non-expert users. To demonstrate the feasibility of Natural Language Programming, this paper tackles what are perceived to be some of the hardest cases: steps and loops. We look at a corpus of English descriptions used as programming assignments, and develop some techniques for mapping linguistic constructs onto program structures, which we refer to as programmatic semantics.
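For a flavor of what such a mapping might produce (an invented example of ours, not one from the paper's corpus): an assignment-style English description involving steps and a loop, paired with a program structure it could map onto.

    # English description (invented):
    #   "Go through each word in the list; if the word starts with a
    #    vowel, add it to the output."
    def vowel_words(words):
        output = []
        for word in words:                  # "go through each word"
            if word[0].lower() in "aeiou":  # "if it starts with a vowel"
                output.append(word)         # "add it to the output"
        return output

    print(vowel_words(["apple", "pear", "orange"]))  # ['apple', 'orange']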

78 citations


Proceedings ArticleDOI
24 Jul 2006
TL;DR: A technique to interactively explore the semantics of a specification by simulating its behavior for user-defined scenarios, and techniques to automatically check specifications against a set of user-provided assertions, which must be satisfied, and a set of possibilities, which must not be contradicted.
Abstract: Formal languages are increasingly used to describe the functional requirements (specifications) of circuits. These requirements are used as a means to communicate design intent and as basis for verification. In both settings it is of utmost importance that the specifications are of high quality. However, formal requirements are seldom the object of validation, even though they can be hard to understand and interactions between them can be subtle. In this paper, we present techniques and guidelines to explore and assure the quality of a formal specification. We define a technique to interactively explore the semantics of a specification by simulating its behavior for user-defined scenarios. Furthermore, we define techniques to automatically check specifications against a set of user-provided assertions, which must be satisfied, and a set of possibilities, which must not be contradicted. The proposed techniques support the user in the iterative development and refinement of high-quality specifications.

58 citations


Book
22 Jun 2006
TL;DR: An introductory text on automata theory and formal languages aimed at applications in biomolecular science and DNA computing, covering the fundamentals, a visually inspired approach to languages via combinatorics on words, and recent language theory arising from bioscience.
Abstract: Recent applications to biomolecular science and DNA computing have created a new audience for automata theory and formal languages. This is the only introductory book to cover such applications. It begins with a clear and readily understood exposition of the fundamentals that assumes only a background in discrete mathematics. The first five chapters give a gentle but rigorous coverage of basic ideas as well as topics not found in other texts at this level, including codes, retracts and semiretracts. Chapter 6 introduces combinatorics on words and uses it to describe a visually inspired approach to languages. The final chapter explains recently-developed language theory coming from developments in bioscience and DNA computing. With over 350 exercises (for which solutions are available), many examples and illustrations, this text will make an ideal contemporary introduction for students; others, new to the field, will welcome it for self-learning.

Book ChapterDOI
25 Sep 2006
TL;DR: An arc-consistency algorithm for context-free grammars, an investigation of when logic combinations of grammar constraints are tractable, and a study of where the boundaries run between regular, context-free, and context-sensitive grammar filtering.
Abstract: By introducing the Regular Membership Constraint, Gilles Pesant pioneered the idea of basing constraints on formal languages. The paper presented here is highly motivated by this work, taking the obvious next step, namely to investigate constraints based on grammars higher up in the Chomsky hierarchy. We devise an arc-consistency algorithm for context-free grammars, investigate when logic combinations of grammar constraints are tractable, show how to exploit non-constant size grammars and reorderings of languages, and study where the boundaries run between regular, context-free, and context-sensitive grammar filtering.
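Arc consistency for such a constraint rests on context-free parsing; below is the CYK-style dynamic program it builds on, as a plain membership test in Python with an invented toy grammar (the paper's propagator additionally records which letters can occur at which positions in some word of the language).

    # CYK membership test for a grammar in Chomsky normal form.
    # Toy grammar generating a^n b^n (n >= 1); invented for illustration:
    #   S -> A B | A X,  X -> S B,  A -> a,  B -> b
    UNARY  = {("A", "a"), ("B", "b")}
    BINARY = {("S", "A", "B"), ("S", "A", "X"), ("X", "S", "B")}

    def cyk(word, start="S"):
        n = len(word)
        if n == 0:
            return False  # this toy grammar has no epsilon production
        # table[i][j]: nonterminals deriving the substring word[i : i+j+1]
        table = [[set() for _ in range(n)] for _ in range(n)]
        for i, ch in enumerate(word):
            table[i][0] = {x for (x, t) in UNARY if t == ch}
        for span in range(2, n + 1):
            for i in range(n - span + 1):
                for split in range(1, span):
                    left = table[i][split - 1]
                    right = table[i + split][span - split - 1]
                    table[i][span - 1] |= {x for (x, y, z) in BINARY
                                           if y in left and z in right}
        return start in table[0][n - 1]

    print(cyk("aabb"))  # True
    print(cyk("abab"))  # False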

Journal ArticleDOI
TL;DR: A new semantics for multiset rewriting, founded on an alternative view of linear logic, is presented, and a new approach to understanding concurrent and distributed programming as a manifestation of logic is proposed, yielding a language that merges these two main paradigms of concurrency.

Posted Content
TL;DR: An exposition of the theory of M and G-automata, or finite automata augmented with a multiply-only register storing an element of a given monoid or group, with a group-theoretic interpretation and proof of a key theorem of Chomsky and Schützenberger from formal language theory.
Abstract: We present an exposition of the theory of finite automata augmented with a multiply-only register storing an element of a given monoid or group. Included are a number of new results of a foundational nature. We illustrate our techniques with a group-theoretic interpretation and proof of a key theorem of Chomsky and Schützenberger from formal language theory.
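For intuition (our example, not one from the paper), take the group (Z, +): a two-state automaton adds +1 to the register on each a and -1 on each b, and accepts when the word has the shape a*b* and the register has returned to the identity, giving exactly the non-regular language a^n b^n.

    # A G-automaton over the group (Z, +). The register starts at the
    # identity; each transition multiplies (here: adds) a group element
    # onto it; acceptance needs an accepting state and register = identity.
    STEP = {  # (state, letter) -> (next state, group element)
        ("p", "a"): ("p", +1),
        ("p", "b"): ("q", -1),
        ("q", "b"): ("q", -1),
    }
    ACCEPTING = {"p", "q"}

    def accepts(word):
        state, register = "p", 0        # 0 is the identity of (Z, +)
        for letter in word:
            if (state, letter) not in STEP:
                return False            # shape a*b* violated
            state, element = STEP[(state, letter)]
            register += element         # the register "multiplication"
        return state in ACCEPTING and register == 0

    print(accepts("aaabbb"))  # True
    print(accepts("aabbb"))   # False
    print(accepts("abab"))    # False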

Journal Article
TL;DR: This work presents a tutorial of the ITP tool, a rewriting-based theorem prover that can be used to prove inductive properties of membership equational specifications, and introduces membership equational logic as a formal language particularly adequate for specifying and verifying semantic data structures.
Abstract: We present a tutorial of the ITP tool, a rewriting-based theorem prover that can be used to prove inductive properties of membership equational specifications. We also introduce membership equational logic as a formal language particularly adequate for specifying and verifying semantic data structures, such as ordered lists, binary search trees, priority queues, and powerlists. The ITP tool is a Maude program that makes extensive use of the reflective capabilities of this system. In fact, rewriting-based proof simplification steps are directly executed by the powerful underlying Maude rewriting engine. The ITP tool is currently available as a web-based application that includes a module editor, a formula editor, and a command editor. These editors allow users to create and modify their specifications, to formalize properties about them, and to guide their proofs by filling and submitting web forms.


Book ChapterDOI
08 Sep 2006
TL;DR: A small language CDL is proposed as a formal model of the simplified WS-CDL, which includes important concepts related to participant roles and collaborations among them in a choreography.
Abstract: The Web Services Choreography Description Language (WS-CDL) is a W3C specification for the description of peer-to-peer collaborations of participants from a global viewpoint. For the rigorous development of, and tool support for, the language, the formal semantics of WS-CDL is worth investigating. This paper proposes a small language CDL as a formal model of the simplified WS-CDL, which includes important concepts related to participant roles and collaborations among them in a choreography. The formal operational semantics of CDL is given. Based on the formal model, we further discuss how to: 1) project a given choreography to orchestration views, which provides a basis for implementing the choreography by code generation; and 2) translate WS-CDL to the input language of the model checker SPIN, which allows us to automatically verify the correctness of a given choreography. An automatic translator has been implemented.


Book ChapterDOI
13 Sep 2006
TL;DR: ERA is introduced, an ECA language based on, and extending, the framework of logic programs updates, which exhibits capabilities to integrate external updates and perform self updates to its knowledge and behaviour.
Abstract: Event-Condition-Action (ECA) languages are an intuitive and powerful paradigm for programming reactive systems. Usually, important features for an ECA language are reactive and reasoning capabilities, the possibility to express complex actions and events, and a declarative semantics. In this paper, we introduce ERA, an ECA language based on, and extending, the framework of logic programs updates that, together with these features, also exhibits capabilities to integrate external updates and perform self updates to its knowledge (data and classical rules) and behaviour (reactive rules).

Journal Article
Ina Mäurer
TL;DR: This work introduces weighted 2-dimensional online tessellation automata (W2OTA) extending the common automata-theoretic model for picture languages and proves that the class of picture series defined by sentences of the weighted logics coincides with the family of picture series that are computable by W2OTA.
Abstract: The theory of two-dimensional languages, generalizing formal string languages, was motivated by problems arising from image processing and models of parallel computing. Weighted automata and series over pictures map pictures to some semiring and provide an extension to a quantitative setting. We establish a notion of a weighted MSO logics over pictures. The semantics of a weighted formula will be a picture series. We introduce weighted 2-dimensional online tessellation automata (W2OTA) extending the common automata-theoretic model for picture languages. We prove that the class of picture series defined by sentences of the weighted logics coincides with the family of picture series that are computable by W2OTA. Moreover, behaviours of W2OTA coincide precisely with the recognizable picture series characterized in [18].

Journal ArticleDOI
TL;DR: It is shown that if M is a DFA with n states over an alphabet with at least two letters and L = L(M), then the worst-case state complexity of L² is n·2^n − 2^(n−1).

Journal ArticleDOI
TL;DR: An approach to the formalization of existing criteria used in computer systems software testing is described and a new Reinforced Condition/Decision Coverage (RC/DC) criterion is proposed that is more suitable for the testing of safety-critical software where MC/DC may not provide adequate assurance.
Abstract: This paper describes an approach to the formalization of existing criteria used in computer systems software testing and proposes a new Reinforced Condition/Decision Coverage (RC/DC) criterion. This new criterion has been developed from the well-known Modified Condition/Decision Coverage (MC/DC) criterion and is more suitable for the testing of safety-critical software where MC/DC may not provide adequate assurance. As a formal language for describing the criteria, the Z notation has been selected. Formal definitions in the Z notation for RC/DC, as well as MC/DC and other criteria, are presented. Specific examples of using these criteria for specification-based testing are considered and some features are formally proved. This characterization is helpful in the understanding of different types of testing and also the correct application of a desired testing regime.
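To ground the coverage terminology (our illustration in Python, unrelated to the paper's Z formalization): MC/DC asks, for each condition in a decision, for a pair of tests showing that flipping that condition alone flips the decision outcome. The brute-force sketch below finds such independence pairs for a small invented decision.

    from itertools import product

    CONDITIONS = ["a", "b", "c"]

    def decision(a, b, c):
        # Invented decision under test.
        return a and (b or c)

    def independence_pairs(index):
        # An MC/DC independence pair for condition `index`: two test
        # vectors differing only in that condition, with different
        # decision outcomes.
        pairs = []
        for vec in product([False, True], repeat=len(CONDITIONS)):
            flipped = list(vec)
            flipped[index] = not flipped[index]
            flipped = tuple(flipped)
            if vec < flipped and decision(*vec) != decision(*flipped):
                pairs.append((vec, flipped))
        return pairs

    for i, name in enumerate(CONDITIONS):
        print(name, independence_pairs(i))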

Journal ArticleDOI
TL;DR: A set of criteria designed for use in the selection of an appropriate programming language for introductory courses is reviewed, and refinements in the process are advanced that should circumvent the problems discovered in the original proposal.
Abstract: Introduction

A cursory glance through back issues of computer-related journals makes it apparent that discussions about the introductory programming language course and the language appropriate for that course have been numerous and ongoing (Smolarski, 2003). The selection of a programming language for instructional purposes is often viewed as a tedious chore because there is no well-established approach for performing the evaluation. However, the choice of a programming language has serious educational repercussions (Schneider, 1978). Dijkstra (1972, p. 864) stated that "... the tools we are trying to use and the language or notation we are using to express or record our thoughts are the major factors determining what we can think or express at all! The analysis of the influence that programming languages have on the thinking habits of their users ... give[s] us a new collection of yardsticks for comparing the relative merits of various programming languages." The informal process may involve faculty discussion, with champions touting the advantages of their preferred language, and an eventual consensus, or at least surrender. Because the process must be repeated every three or four years, it would be preferable to develop a structured approach to make the process more systematic. The goal of this study is to develop and refine an instrument to facilitate the selection of a programming language and to make the process more uniform and easily replicated. The original paper (Parker, Ottaway, & Chao, 2006) proposed an objective selection process. A pilot study was conducted to test the viability of the process. The following steps outline the proposed approach that guided the pilot study.

1. Compile a list of language selection criteria.
2. Weight each of the criteria. Ask each evaluator to weight, specific to the department's needs, the value of importance for each criterion.
3. Determine a list of candidate languages. The list should be comprised of languages nominated by the faculty rather than a complete list of available languages.
4. Evaluate the language. Each candidate language should be assigned a rating for each criterion.
5. Calculate weighted score. For each candidate language, a weighted score can be calculated by adding together the language score multiplied by the weight assigned to each criterion. The language with the highest weighted score is the optimal choice, given the evaluators' assessments.

This paper first reviews a set of criteria designed for use in the selection of an appropriate programming language for introductory courses. It then describes the structure of the pilot study and the resulting findings. Finally, the paper advances refinements in the process that should circumvent the problems that were discovered in our original proposal.

Criteria

A previous paper proposed criteria for the selection of a programming language for introductory courses. The criteria were derived by perusing over sixty papers relevant to language selection and justified by a brief review of the supporting literature in (Parker et al., 2006). Each of the criteria in Table 1 has been used in one or more previous studies that evaluate programming languages. A complete literature review and justification for each criterion can be found in (Parker et al., 2006), but a brief explanation of each follows.

Reasonable Financial Cost

This criterion refers to the price to acquire the programming language or the development environment. This may involve individual packages or a site license for a network version. There may be an academic discount for educational institutions, there may be an alliance in which the university or department can enroll, or there may even be a free, downloadable version.

Availability of Student/Academic Version

The availability of a student version or academic version allows students to install the development environment on their personal machine, making it more convenient for them to work on their assignments when the computer lab is not accessible. …
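Step 5 above is simple enough to state as code; a minimal sketch, with criteria, weights, and ratings invented purely for illustration:

    # Weighted-score selection: each criterion weight multiplies the
    # language's rating on that criterion, and the products are summed.
    WEIGHTS = {"financial_cost": 5, "student_version": 4, "industry_demand": 3}

    RATINGS = {
        "LanguageA": {"financial_cost": 9, "student_version": 8,
                      "industry_demand": 5},
        "LanguageB": {"financial_cost": 6, "student_version": 9,
                      "industry_demand": 9},
    }

    def weighted_score(language):
        return sum(WEIGHTS[c] * RATINGS[language][c] for c in WEIGHTS)

    for language in RATINGS:
        print(language, weighted_score(language))
    # The optimal choice, given the evaluators' assessments:
    print("best:", max(RATINGS, key=weighted_score))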

Journal ArticleDOI
TL;DR: An implementable necessary and sufficient condition for predictability of occurrences of an event in systems modeled by regular languages is presented.

Proceedings ArticleDOI
12 Aug 2006
TL;DR: It follows that one can effectively decide whether a given regular language is captured by existential formulas, or by Boolean combinations of existential formulas, in first order logic enriched by modular numerical predicates.
Abstract: Two results by Schützenberger (1965) and by McNaughton and Papert (1971) lead to a precise description of the expressive power of first order logic on words interpreted as ordered colored structures. In this paper, we study the expressive power of existential formulas and of Boolean combinations of existential formulas in a logic enriched by modular numerical predicates. We first give a combinatorial description of the corresponding regular languages, and then give an algebraic characterization in terms of their syntactic morphisms. It follows that one can effectively decide whether a given regular language is captured by one of these two fragments of first order logic. The proofs rely on nontrivial techniques of semigroup theory: stamps, derived categories and wreath products.


Proceedings ArticleDOI
03 Mar 2006
TL;DR: A hands-on approach to problem solving in the formal languages and automata theory course, and a new feature in JFLAP, Turing machine building blocks, where one can now build complex Turing machines by using other Turing machines as components or building blocks.
Abstract: We present a hands-on approach to problem solving in the formal languages and automata theory course. Using the tool JFLAP, students can solve a wide range of problems that are tedious to solve using pencil and paper. In combination with the more traditional theory problems, students study a wider range of problems on a topic. Thus, students explore the formal languages and automata concepts computationally and visually with JFLAP, and theoretically without JFLAP. In addition, we present a new feature in JFLAP, Turing machine building blocks. One can now build complex Turing machines by using other Turing machines as components or building blocks.

Journal ArticleDOI
TL;DR: A useful result is obtained by showing that unrestricted iteration of the superposition operation, where the "parents" in a subsequent iteration can be any words produced during any preceding iteration step, is equivalent to restricted iteration, where at each step one parent must be a word from the initial language.
Abstract: In this paper we propose a new formal operation on words and languages, called superposition. By this operation, based on a Watson–Crick-like complementarity, we can generate a set of words, starting from a pair of words, in which the contribution of a word to the result need not be one subword only, as happens in classical bio-operations of DNA computing. Specifically, starting from two single stranded molecules x and y such that a suffix of x is complementary to a prefix of y, a prefix of x is complementary to a suffix of y, or x is complementary to a subword of y, a new word z, which is a prolongation of x to the right, to the left, or to both, respectively, is obtained by annealing. If y is complementary to a subword of x, then the result is x. This operation is considered here as an abstract operation on formal languages. We relate it to other operations in formal language theory and we settle the closure properties under this operation of classes in the Chomsky hierarchy. We obtain a useful result by showing that unrestricted iteration of the superposition operation, where the "parents" in a subsequent iteration can be any words produced during any preceding iteration step, is equivalent to restricted iteration, where at each step one parent must be a word from the initial language. This result is used for establishing the closure properties of classes in the Chomsky hierarchy under iterated superposition. Actually, since the results are formulated in terms of AFL theory, they are applicable to more classes of languages. Then we discuss "adult" languages, languages consisting of words that cannot be extended by further superposition, and show that this notion might bring us to the border of recursive languages. Finally, we consider some operations involved in classical DNA algorithms, such as Adleman's, which might be expressed through iterated superposition.
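The core overlap step is easy to picture in code. A minimal sketch of the right-prolongation case, assuming a two-letter complementary alphabet {a, t} and assuming that the prolongation appends the complement of the partner's overhang (the alphabet, this reading of annealing, and all names are ours):

    COMPLEMENT = {"a": "t", "t": "a"}

    def complementary(u, v):
        # Letterwise Watson-Crick-like complementarity of equal-length words.
        return len(u) == len(v) and all(COMPLEMENT[p] == q
                                        for p, q in zip(u, v))

    def superpose_right(x, y):
        # If a suffix of x is complementary to a prefix of y, the strands
        # anneal there, and x is prolonged to the right along template y
        # by the complement of y's overhang.
        results = set()
        for k in range(1, min(len(x), len(y)) + 1):
            if complementary(x[-k:], y[:k]):
                results.add(x + "".join(COMPLEMENT[c] for c in y[k:]))
        return results

    # Suffix "at" of x anneals to prefix "ta" of y; x gains the
    # complement of the overhang "ttt".
    print(superpose_right("aaat", "tattt"))  # {'aaataaa'}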

Journal ArticleDOI
TL;DR: Memoization and techniques for handling left recursion in top-down backtracking language processors have previously been presented independently, or else integrated in ways that compromised modularity and clarity of the code.
Abstract: Top-down backtracking language processors are highly modular, can handle ambiguity, and are easy to implement with clear and maintainable code. However, a widely-held, and incorrect, view is that top-down processors are inherently exponential for ambiguous grammars and cannot accommodate left-recursive productions. It has been known for many years that exponential complexity can be avoided by memoization, and that left-recursive productions can be accommodated through a variety of techniques. However, until now, memoization and techniques for handling left recursion have either been presented independently, or else attempts at their integration have compromised modularity and clarity of the code.
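A minimal sketch of the memoization half of the story, in Python rather than a combinator setting, with an invented toy grammar (handling left-recursive productions needs the additional techniques the abstract alludes to and is not shown): caching results per (nonterminal, position) turns exponential backtracking on ambiguous grammars into polynomial work.

    from functools import lru_cache

    # Toy ambiguous grammar, invented here:  S -> 'a' S S | epsilon
    INPUT = "aaaaaaaaaa"

    @lru_cache(maxsize=None)
    def parse_S(pos):
        # Return every input position reachable by parsing S from `pos`;
        # a set of results is how ambiguity is kept around.
        results = {pos}                    # the epsilon alternative
        if pos < len(INPUT) and INPUT[pos] == "a":
            for mid in parse_S(pos + 1):   # the 'a' S S alternative
                results |= parse_S(mid)
        return frozenset(results)

    # Recognized iff some parse of S consumes the entire input.
    print(len(INPUT) in parse_S(0))        # True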