Showing papers on "Formal language" published in 2014


Journal ArticleDOI
Tobias Kuhn
TL;DR: A comprehensive survey of existing English-based controlled natural languages (CNLs) can be found in this article, where the author provides a common terminology and a common model for CNL, contributes to the understanding of their general nature, provides a starting point for researchers interested in the area, and helps developers to make design decisions.
Abstract: What is here called controlled natural language (CNL) has traditionally been given many different names. Especially during the last four decades, a wide variety of such languages have been designed. They are applied to improve communication among humans, to improve translation, or to provide natural and intuitive representations for formal notations. Despite the apparent differences, it seems sensible to put all these languages under the same umbrella. To bring order to the variety of languages, a general classification scheme is presented here. A comprehensive survey of existing English-based CNLs is given, listing and describing 100 languages from 1930 until today. Classification of these languages reveals that they form a single scattered cloud filling the conceptual space between natural languages such as English on the one end and formal languages such as propositional logic on the other. The goal of this article is to provide a common terminology and a common model for CNL, to contribute to the understanding of their general nature, to provide a starting point for researchers interested in the area, and to help developers to make design decisions.
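To make the contrast concrete, here is an illustrative example (ours, not drawn from the survey) of the kind of mapping a CNL enables: the restricted grammar of a language such as Attempto Controlled English assigns the sentence "Every customer owns a card." the single formal reading $\forall x\,(\mathrm{customer}(x) \rightarrow \exists y\,(\mathrm{card}(y) \wedge \mathrm{owns}(x,y)))$, whereas unrestricted English would leave the relative scope of "every" and "a" ambiguous.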

308 citations


Book
23 Sep 2014
TL;DR: This book covers computability (programs and computable functions, recursive functions, Turing machines, unsolvability), grammars and automata (regular, context-free, and context-sensitive languages), logic (propositional calculus and quantification theory), complexity, and the semantics of recursion equations.
Abstract: Preliminaries. Computability: Programs and Computable Functions. Primitive Recursive Functions. A Universal Program. Calculations on Strings. Turing Machines. Processes and Grammars. Classifying Unsolvable Problems. Grammars and Automata: Regular Languages. Context-Free Languages. Context-Sensitive Languages. Logic: Propositional Calculus. Quantification Theory. Complexity: Abstract Complexity. Polynomial Time Computability. Semantics: Approximation Orderings. Denotational Semantics of Recursion Equations. Operational Semantics of Recursion Equations. Suggestions for Further Reading. Subject Index.

292 citations


Journal ArticleDOI
TL;DR: In this article, the authors study how the combinatorial behavior of a category C affects the algebraic behavior of representations of C, giving combinatorial criteria under which representations of C admit a theory of Gröbner bases, are noetherian, and have rational Hilbert series.
Abstract: Given a category C of a combinatorial nature, we study the following fundamental question: how does the combinatorial behavior of C affect the algebraic behavior of representations of C? We prove two general results. The first gives a combinatorial criterion for representations of C to admit a theory of Gröbner bases. From this, we obtain a criterion for noetherianity of representations. The second gives a combinatorial criterion for a general "rationality" result for Hilbert series of representations of C. This criterion connects to the theory of formal languages, and makes essential use of results on the generating functions of languages, such as the transfer-matrix method and the Chomsky-Schützenberger theorem. Our work is motivated by recent work in the literature on representations of various specific categories. Our general criteria recover many of the results on these categories that had been proved by ad hoc means, and often yield cleaner proofs and stronger statements. For example: we give a new, more robust, proof that FI-modules (originally introduced by Church-Ellenberg-Farb), and a family of natural generalizations, are noetherian; we give an easy proof of a generalization of the Lannes-Schwartz artinian conjecture from the study of generic representation theory of finite fields; we significantly improve the theory of $\Delta$-modules, introduced by Snowden in connection to syzygies of Segre embeddings; and we establish fundamental properties of twisted commutative algebras in positive characteristic.
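For readers without the language-theoretic background, the connection runs through generating functions: if $a_n$ counts the words of length $n$ in a regular language, the transfer-matrix method shows that $\sum_{n \ge 0} a_n t^n$ is a rational function of $t$, and the Chomsky-Schützenberger theorem shows it is algebraic when the language is unambiguous context-free. Roughly speaking, the rationality criterion for Hilbert series works by encoding the dimensions of a representation as word counts of this kind.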

188 citations


Journal ArticleDOI
TL;DR: A strict subset of commonly found privacy requirements is identified and a methodology is developed to map these requirements from natural language text to a formal language in description logic, called Eddy, so that developers can detect conflicting privacy requirements within a policy and trace data flows within these policies.
Abstract: Increasingly, companies use multi-source data to operate new information systems, such as social networking, e-commerce, and location-based services. These systems leverage complex, multi-stakeholder data supply chains in which each stakeholder (e.g., users, developers, companies, and government) must manage privacy and security requirements that cover their practices. US and European regulators expect companies to ensure consistency between their privacy policies and their data practices, including restrictions on what data may be collected, how it may be used, to whom it may be transferred, and for what purposes. To help developers check consistency, we identified a strict subset of commonly found privacy requirements and we developed a methodology to map these requirements from natural language text to a formal language in description logic, called Eddy. Using this language, developers can detect conflicting privacy requirements within a policy and enable the tracing of data flows within these policies. We derived our methodology from an exploratory case study of the Facebook platform policy and an extended case study using privacy policies from Zynga and AOL Advertising. In this paper, we report results from multiple analysts in a literal replication study, which includes a refined methodology and set of heuristics that we used to extract privacy requirements from policy texts. In addition to providing the method, we report results from performing automated conflict detection within the Facebook, Zynga, and AOL privacy specifications, and results from a computer simulation that demonstrates the scalability of our formal language toolset to specifications of reasonable size.
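A minimal sketch of the kind of conflict check such a formal language enables (the rule format below is hypothetical; Eddy itself uses description logic and also handles subsumption between data categories, e.g. that email is a kind of contact information):

```python
# Hypothetical privacy rules: each one permits or forbids an
# (action, datum, purpose) combination extracted from a policy.
RULES = [
    ("permit", "collect", "email", "advertising"),
    ("forbid", "collect", "email", "advertising"),
    ("permit", "transfer", "location", "analytics"),
]

def conflicts(rules):
    """Return combinations that are both permitted and forbidden."""
    permitted = {r[1:] for r in rules if r[0] == "permit"}
    forbidden = {r[1:] for r in rules if r[0] == "forbid"}
    return permitted & forbidden

print(conflicts(RULES))  # {('collect', 'email', 'advertising')}
```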

86 citations


Book ChapterDOI
Maria Spichkova
01 Jan 2014
TL;DR: This chapter provides an introduction to a work that aims to apply the achievements of engineering psychology to the area of formal methods, focusing on the specification phase of a system development process.
Abstract: This chapter provides an introduction to a work that aims to apply the achievements of engineering psychology to the area of formal methods, focusing on the specification phase of a system development process. Formal methods often assume that only two factors must be satisfied: the method must be sound, and it must yield a representation that is concise and elegant from the mathematical point of view, without regard to readability, usability, or tool support. As a result, most engineers treat formal methods as theoretically important but practically too hard to understand and use, even though small changes to a formal method can make it considerably more understandable and usable for an average engineer.

50 citations


Journal ArticleDOI
TL;DR: This work provides an automata-theoretic characterization of the ISL class, theorems establishing how the classes are related to each other and to Strictly Local languages, and evidence that local phonological and morphological processes belong to these classes.
Abstract: We define two proper subclasses of subsequential functions based on the concept of Strict Locality (McNaughton and Papert, 1971; Rogers and Pullum, 2011; Rogers et al., 2013) for formal languages. They are called Input and Output Strictly Local (ISL and OSL). We provide an automata-theoretic characterization of the ISL class and theorems establishing how the classes are related to each other and to Strictly Local languages. We give evidence that local phonological and morphological processes belong to these classes. Finally we provide a learning algorithm which provably identifies the class of ISL functions in the limit from positive data in polynomial time and data. We demonstrate this learning result on appropriately synthesized artificial corpora. We leave a similar learning result for OSL functions for future work and suggest future directions for addressing non-local phonological processes.
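To see what a strictly local function looks like computationally, consider a toy example of our own (not the paper's formalism): nasal place assimilation, where /n/ surfaces as [m] before a labial. The output at each position depends only on a window of two adjacent input symbols, which is the hallmark of strict locality.

```python
def nasal_assimilation(word):
    """Toy strictly local map with window size 2: n -> m before p or b.
    Each output symbol is determined by the input symbol and its
    right neighbor only, so a bounded window suffices."""
    out = []
    for i, ch in enumerate(word):
        nxt = word[i + 1] if i + 1 < len(word) else "#"  # word boundary
        out.append("m" if ch == "n" and nxt in "pb" else ch)
    return "".join(out)

print(nasal_assimilation("input"))   # imput
print(nasal_assimilation("intact"))  # intact
```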

46 citations


Journal ArticleDOI
TL;DR: A first attempt to deal with NLI and natural language reasoning in general using proof assistant technology, with Luo's Modern Type Theory with coercive subtyping as the formal language into which natural language semantics is translated.
Abstract: In this paper we propose a way to deal with natural language inference (NLI) by implementing Modern Type Theoretical Semantics in the proof assistant Coq. The paper is a first attempt to deal with NLI and natural language reasoning in general by using proof assistant technology. Valid NLIs are treated as theorems and as such the adequacy of our account is tested by trying to prove them. We use Luo's Modern Type Theory (MTT) with coercive subtyping as the formal language into which we translate natural language semantics, and we further implement these semantics in the Coq proof assistant. It is shown that the use of an MTT with an adequate subtyping mechanism can give us a number of promising results as regards NLI. Specifically, it is shown that a number of inference cases, i.e. quantifiers, adjectives, conjoined noun phrases and temporal reference among other things, can be successfully dealt with. It is then shown that, even though Coq is an interactive and not an automated theorem prover, automation of all of the test examples is possible by introducing user-defined automated tactics. Lastly, the paper offers a number of innovative approaches to NL phenomena like adjectives, collective predication, comparatives and factive verbs among other things, contributing in this respect to the theoretical study of formal semantics using MTTs.
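One signature idea of this setting, stated here for orientation (it is standard in Luo's MTT semantics generally, not a quotation from the paper): common nouns are interpreted as types rather than predicates, and an intersective adjective is handled with a $\Sigma$-type, e.g. handsome man becomes $\Sigma x{:}\mathrm{Man}.\,\mathrm{handsome}(x)$, whose first projection $\pi_1$ serves as a coercion into $\mathrm{Man}$; the inference from "John is a handsome man" to "John is a man" is then simply a typing fact that Coq can check.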

44 citations


Journal ArticleDOI
19 Mar 2014
TL;DR: In this article, a methodology for a model-theoretic study of autosegmental diagrams with monadic second-order logic is introduced, and the preliminary conclusion is that autosegmental diagrams which conform to the well-formedness constraints defined here likely describe at most regular sets of strings.
Abstract: Autosegmental Phonology is studied in the framework of Formal Language Theory, which classifies the computational complexity of patterns. In contrast to previous computational studies of Autosegmental Phonology, which were mainly concerned with finite-state implementations of the formalism, a methodology for a model-theoretic study of autosegmental diagrams with monadic second-order logic is introduced. Monadic second-order logic provides a mathematically rigorous way of studying autosegmental formalisms, and its complexity is well understood. The preliminary conclusion is that autosegmental diagrams which conform to the well-formedness constraints defined here likely describe at most regular sets of strings.
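As a concrete instance of the method (an illustrative constraint in the spirit of the paper, not quoted from it), the classical ban on crossing association lines between two ordered tiers with association relation $A$ can be written as $\forall x_1 \forall x_2 \forall y_1 \forall y_2\,\big((A(x_1,y_1) \wedge A(x_2,y_2) \wedge x_1 < x_2) \rightarrow \neg(y_2 < y_1)\big)$, which is in fact first-order and hence well within monadic second-order logic.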

44 citations


Book
04 Mar 2014
TL;DR: This is the first book to offer key theoretical topics and terminology concerning regulated grammars and automata, the most important language-defining devices that work under controls represented by additional mathematical mechanisms.
Abstract: This is the first book to offer key theoretical topics and terminology concerning regulated grammars and automata. They are the most important language-defining devices that work under controls represented by additional mathematical mechanisms. Key topics include formal language theory, grammatical regulation, grammar systems, erasing rules, parallelism, word monoids, regulated and unregulated automata and control languages. The book explores how the information utilized in computer science is most often represented by formal languages defined by appropriate formal devices. It provides both algorithms and a variety of real-world applications, allowing readers to understand both theoretical concepts and fundamentals. There is a special focus on applications to scientific fields including biology, linguistics and informatics. This book concludes with case studies and future trends for the field. Regulated Grammars and Automata is designed as a reference for researchers and professionals working in computer science and mathematics who deal with language processors. Advanced-level students in computer science and mathematics will also find this book a valuable resource as a secondary textbook or reference.
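A textbook-style illustration of grammatical regulation (ours, not an excerpt from the book): the non-context-free language $\{a^n b^n c^n : n \ge 1\}$ is generated by the context-free rules $p_0: S \to AB$, $p_1: A \to aAb$, $p_2: B \to cB$, $p_3: A \to ab$, $p_4: B \to c$ once derivations are restricted to the regular control language $p_0 (p_1 p_2)^* p_3 p_4$, which forces the number of $c$'s to grow in lockstep with the matched $a$/$b$ pairs.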

43 citations


Book ChapterDOI
TL;DR: A framework is proposed to explicitly document and manage the variation points of a variable modeling language and their corresponding variants; it enables the systematic study of various kinds of variabilities and their interdependencies.
Abstract: We present a taxonomy of the variability mechanisms offered by modeling languages. The definition of a formal language encompasses a syntax and a semantic domain as well as the mapping that relates them, thus language variabilities are classified according to which of those three pillars they address. This work furthermore proposes a framework to explicitly document and manage the variation points of a variable modeling language and their corresponding variants. The framework enables the systematic study of various kinds of variabilities and their interdependencies. Moreover, it allows a methodical customization of a language, for example, to a given application domain. The taxonomy of variability is of particular interest for the UML, as it provides a more precise understanding of its variation points.

39 citations


Proceedings ArticleDOI
26 Mar 2014
TL;DR: A study in which 15 programmers of varying expertise read short source-code snippets while their eye movements were recorded shows that most attention is directed towards understanding identifiers, operators, keywords and literals, while relatively little reading time is spent on separators.
Abstract: While knowledge about reading behavior in natural-language text is abundant, little is known about the distribution of visual attention when reading the source code of computer programs. Yet this knowledge is important for teaching programming skills as well as for designing IDEs and programming languages. We conducted a study in which 15 programmers of varying expertise read short source-code snippets while we recorded their eye movements. In order to study attention distribution on code elements, we introduced the following procedure: first, we preprocessed the eye movement data using a log-transformation; then, taking word lengths into account, we analyzed the time spent on different lexical elements. The analysis shows that most attention is directed towards understanding identifiers, operators, keywords and literals, while relatively little reading time is spent on separators. We further inspected the attention on keywords and provide a description of the gaze on these primary building blocks of any formal language. The analysis indicates that approaches from research on natural-language text reading can be applied to source code as well, though not without review.
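A minimal sketch of the preprocessing the authors describe (the data layout and field names here are hypothetical; the study's actual pipeline is more involved): log-transform fixation durations, normalize by token length, and aggregate by lexical category.

```python
import math
from collections import defaultdict

# Hypothetical fixation records: (token, lexical_category, duration_ms)
fixations = [
    ("counter", "identifier", 410),
    ("+=", "operator", 180),
    ("while", "keyword", 230),
    (";", "separator", 60),
]

per_category = defaultdict(list)
for token, category, duration in fixations:
    # Log-transform tames the right skew typical of duration data;
    # dividing by token length controls for longer tokens simply
    # taking longer to read.
    per_category[category].append(math.log(duration) / len(token))

for category, values in sorted(per_category.items()):
    print(category, sum(values) / len(values))
```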

Proceedings ArticleDOI
23 Jun 2014
TL;DR: This paper describes the experience of building highly-available databases using replication protocols that were generated with the help of correct-by-construction formal methods, and presents two replicated databases whose performance is competitive with popular databases in one of the two considered benchmarks.
Abstract: Fault-tolerant distributed systems often contain complex error handling code. Such code is hard to test or model-check because there are often too many possible failure scenarios to consider. As we will demonstrate in this paper, formal methods have evolved to a state in which it is possible to generate this code along with correctness guarantees. This paper describes our experience with building highly-available databases using replication protocols that were generated with the help of correct-by-construction formal methods. The goal of our project is to obtain databases with unsurpassed reliability while providing good performance. We report on our experience using a total order broadcast protocol based on Paxos and specified using a new formal language called EventML. We compile EventML specifications into a form that can be formally verified while simultaneously obtaining code that can be executed. We have developed two replicated databases based on this code and show that their performance is competitive with popular databases in one of the two considered benchmarks.
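The invariant such protocols provide can be sketched in a few lines (a conceptual illustration only; it elides Paxos, failure handling, and everything the generated code actually guarantees): if every replica applies the same deterministic commands in the same total order, the replicas cannot diverge.

```python
class Replica:
    """A deterministic state machine: identical command logs
    produce identical states on every replica."""
    def __init__(self):
        self.store = {}

    def apply(self, command):
        op, key, value = command
        if op == "put":
            self.store[key] = value
        elif op == "delete":
            self.store.pop(key, None)

# A total order broadcast layer (here, the Paxos-based protocol)
# delivers the same log to every replica.
log = [("put", "x", 1), ("put", "y", 2), ("delete", "x", None)]

r1, r2 = Replica(), Replica()
for cmd in log:
    r1.apply(cmd)
    r2.apply(cmd)

assert r1.store == r2.store  # the replicas agree
```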

Proceedings ArticleDOI
Thomas Place, Marc Zeitoun
14 Jul 2014
TL;DR: It is proved that, in order to decide whether a first-order definable separator exists for two given regular input languages of finite words, sufficient information can be extracted from semigroups recognizing the input languages, using a fixpoint computation.
Abstract: Given two languages, a separator is a third language that contains the first one and is disjoint from the second one. We investigate the following decision problem: given two regular input languages of finite words, decide whether there exists a first-order definable separator. We prove that in order to answer this question, sufficient information can be extracted from semigroups recognizing the input languages, using a fixpoint computation. This yields an EXPTIME algorithm for checking first-order separability. Moreover, the correctness proof of this algorithm yields a stronger result, namely a description of a possible separator. Finally, we prove that this technique can be generalized to answer the same question for regular languages of infinite words.
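In symbols: $S$ separates $L_1$ from $L_2$ when $L_1 \subseteq S$ and $S \cap L_2 = \emptyset$, and the problem asks whether some such $S$ is definable in first-order logic over words, equivalently (by McNaughton and Papert) whether some star-free separator exists.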

Proceedings Article
01 Oct 2014
TL;DR: A unified approach for learning Combinatory Categorial Grammar (CCG) semantic parsers that induces both a CCG lexicon and the parameters of a parsing model is described.
Abstract: Semantic parsers map natural language sentences to formal representations of their underlying meaning. Building accurate semantic parsers without prohibitive engineering costs is a long-standing, open research problem. The tutorial will describe general principles for building semantic parsers. The presentation will be divided into two main parts: learning and modeling. In the learning part, we will describe a unified approach for learning Combinatory Categorial Grammar (CCG) semantic parsers that induces both a CCG lexicon and the parameters of a parsing model. The approach learns from data with labeled meaning representations, as well as from more easily gathered weak supervision. It also enables grounded learning where the semantic parser is used in an interactive environment, for example to read and execute instructions. The modeling section will include best practices for grammar design and choice of semantic representation. We will motivate our use of lambda calculus as a language for building and representing meaning with examples from several domains. The ideas we will discuss are widely applicable. The semantic modeling approach, while implemented in lambda calculus, could be applied to many other formal languages. Similarly, the algorithms for inducing CCG focus on tasks that are formalism independent, learning the meaning of words and estimating parsing parameters. No prior knowledge of CCG is required. The tutorial will be backed by implementation and experiments in the University of Washington Semantic Parsing Framework (UW SPF, http://yoavartzi.com/spf).
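A typical example of the mapping, in the style common to this line of work (illustrative, not quoted from the tutorial): the sentence "Texas borders Oklahoma" receives the logical form $\mathrm{borders}(\mathrm{texas}, \mathrm{oklahoma})$, licensed by a CCG lexical entry such as $\textit{borders} \vdash (S\backslash NP)/NP : \lambda x.\lambda y.\,\mathrm{borders}(y, x)$; learning then consists of inducing such entries and estimating weights over the parses they admit.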

Posted Content
TL;DR: A new formal language for the expressive representation of probabilistic knowledge based on Answer Set Programming (ASP) that allows for the annotation of first-order formulas as well as ASP rules and facts with probabilities and for learning of such weights from data (parameter estimation).
Abstract: We propose a new formal language for the expressive representation of probabilistic knowledge based on Answer Set Programming (ASP). It allows for the annotation of first-order formulas as well as ASP rules and facts with probabilities and for learning of such weights from data (parameter estimation). Weighted formulas are given a semantics in terms of soft and hard constraints which determine a probability distribution over answer sets. In contrast to related approaches, we approach inference by optionally utilizing so-called streamlining XOR constraints, in order to reduce the number of computed answer sets. Our approach is prototypically implemented. Examples illustrate the introduced concepts and point at issues and topics for future research.

Journal ArticleDOI
15 Dec 2014
TL;DR: The goal is to show how interdisciplinary research between these three fields can contribute to a better understanding of how natural language is acquired and processed.
Abstract: This paper aims at reviewing the most relevant linguistic applications developed in the intersection between three different fields: machine learning, formal language theory and agent technologies. On the one hand, we present some of the main linguistic contributions of the intersection between machine learning and formal languages, which constitutes a well-established research area known as Grammatical Inference. On the other hand, we present an overview of the main linguistic applications of models developed in the intersection between agent technologies and formal languages, such as colonies, grammar systems and eco-grammar systems. Our goal is to show how interdisciplinary research between these three fields can contribute to a better understanding of how natural language is acquired and processed.

Journal ArticleDOI
TL;DR: An attempt to bridge the gap between logical and cognitive treatments of strategic reasoning in games by presenting a formal language to represent different strategies on a finer-grained level than was possible before.
Abstract: This paper presents an attempt to bridge the gap between logical and cognitive treatments of strategic reasoning in games. There have been extensive formal debates about the merits of the principle of backward induction among game theorists and logicians. Experimental economists and psychologists have shown that human subjects, perhaps due to their bounded resources, do not always follow the backward induction strategy, leading to unexpected outcomes. Recently, an eye-tracking study has shown that even human subjects who produce the outwardly correct "backward induction answer" use a different internal reasoning strategy to achieve it. The paper presents a formal language to represent different strategies on a finer-grained level than was possible before. The language and its semantics help to precisely distinguish different cognitive reasoning strategies, which can then be tested on the basis of computational cognitive models and experiments with human subjects. The syntactic framework of the formal system provides a generic way of constructing computational cognitive models of the participants of the Marble Drop game.
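For readers unfamiliar with the baseline that the paper measures deviations from, backward induction itself is a short recursion. The following is a generic sketch over a small game tree of our own devising, not the paper's formal language:

```python
def backward_induction(node):
    """Solve a finite perfect-information game by backward induction.
    Leaf: {"payoffs": (u0, u1)}.
    Internal node: {"player": i, "moves": {action: subtree}}."""
    if "payoffs" in node:                       # leaf reached
        return node["payoffs"], []
    player, best = node["player"], None
    for action, child in node["moves"].items():
        payoffs, path = backward_induction(child)
        if best is None or payoffs[player] > best[0][player]:
            best = (payoffs, [action] + path)
    return best

# A two-step centipede-style game: "take" ends the game, "pass"
# hands the move to the other player.
game = {"player": 0, "moves": {
    "take": {"payoffs": (2, 1)},
    "pass": {"player": 1, "moves": {
        "take": {"payoffs": (1, 3)},
        "pass": {"payoffs": (3, 2)},
    }},
}}

print(backward_induction(game))  # ((2, 1), ['take'])
```

Note that player 0 takes immediately even though both players would prefer the (3, 2) outcome; this is exactly the tension that the experimental literature on backward induction probes.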

Book ChapterDOI
26 Aug 2014
TL;DR: The equivalence of the deterministic and nondeterministic versions of scoped Mpa is proved and it is shown that scope-bounded computations of an n-stack Mvpa can be simulated, rearranging the input word, by using only one stack.
Abstract: We study the formal language theory of multistack pushdown automata (Mpa) restricted to computations where a symbol can be popped from a stack S only if it was pushed within a bounded number of contexts of S (scoped Mpa). We show that scoped Mpa are indeed a robust model of computation by focusing on the corresponding theory of multistack visibly pushdown automata (Mvpa). We prove the equivalence of the deterministic and nondeterministic versions and show that scope-bounded computations of an n-stack Mvpa can be simulated, rearranging the input word, by using only one stack. These results have several interesting consequences, such as the closure under complement, the decidability of universality, inclusion and equality, and a Parikh theorem. We also give a logical characterization and compare the expressiveness of the scope-bounded restriction with Mvpa classes from the literature.

Proceedings ArticleDOI
01 Mar 2014
TL;DR: This work focuses on the case where the enumeration is performed with a constant delay between any two consecutive solutions, after a linear time preprocessing, which cannot always be achieved.
Abstract: We survey some of the recent results about enumerating the answers to queries over a database. We focus on the case where the enumeration is performed with a constant delay between any two consecutive solutions, after a linear time preprocessing. This cannot always be achieved; it requires restricting either the class of queries or the class of databases. We describe here several scenarios where this is possible.

Posted Content
13 Mar 2014
TL;DR: A methodology for connecting semi-formal requirements with formal descriptions through an intermediate representation is proposed, and concrete empirical evidence that it is possible to bridge the gap between stylized natural language requirements and formal specifications with ARSENAL is provided.
Abstract: Natural language (supplemented with diagrams and some mathematical notations) is convenient for succinct communication of technical descriptions between the various stakeholders (e.g., customers, designers, implementers) involved in the design of software systems. However, natural language descriptions can be informal, incomplete, imprecise and ambiguous, and cannot be processed easily by design and analysis tools. Formal languages, on the other hand, formulate design requirements in a precise and unambiguous mathematical notation, but are more difficult to master and use. We propose a methodology for connecting semi-formal requirements with formal descriptions through an intermediate representation. We have implemented this methodology in a research prototype called ARSENAL with the goal of constructing a robust, scalable, and trainable framework for bridging the gap between natural language requirements and formal tools. The main novelty of ARSENAL lies in its automated generation of a fully-specified formal model from natural language requirements. ARSENAL has a modular and flexible architecture that facilitates porting it from one domain to another. ARSENAL has been tested on complex requirements from dependable systems in multiple domains (e.g., requirements from the FAA Isolette and TTEthernet systems) and evaluated for its degree of automation and robustness to requirements perturbation. The results provide concrete empirical evidence that it is possible to bridge the gap between stylized natural language requirements and formal specifications with ARSENAL, achieving a promising level of performance and domain independence.

Book ChapterDOI
08 Oct 2014
TL;DR: This work proposes a delta-oriented extension to Milner's process calculus CCS, called DeltaCCS, that allows for modular reasoning about behavioral variability, and defines variability-aware CCS congruences for modular reasoning about the preservation of behavioral properties defined by the Modal μ-Calculus after changing CCS specifications.
Abstract: Concepts for enriching formal languages with variability capabilities aim at comprehensive specifications and efficient development of families of similar software variants, as propagated, e.g., by the software product line paradigm. However, recent approaches are usually limited to purely structural variability, e.g., by adapting choice operator semantics for variant selection. Those approaches lack (1) a modular separation of common and variable parts and/or (2) a rigorous formalization of the semantical impacts of structural variations. To overcome those deficiencies, we propose a delta-oriented extension to Milner's process calculus CCS, called DeltaCCS, that allows for modular reasoning about behavioral variability. In DeltaCCS, modular change directives are applied to core processes by altering term rewriting semantics in a determined way. We define variability-aware CCS congruences for modular reasoning about the preservation of behavioral properties defined by the Modal μ-Calculus after changing CCS specifications. We implemented a DeltaCCS model checker to efficiently verify the members of a family of process variants.

Journal Article
TL;DR: A formalization of the core of OCL in HOL provides denotational definitions, a logical calculus and operational rules that allow for the execution of OCL expressions by a mixture of term rewriting and code compilation.
Abstract: The Unified Modeling Language (UML) is one of the few modeling languages that is widely used in industry. While UML is mostly known as a diagrammatic modeling language (e.g., visualizing class models), it is complemented by a textual language, called Object Constraint Language (OCL). OCL is a textual annotation language, originally based on a three-valued logic, that turns UML into a formal language. Unfortunately, the semantics of this specification language, captured in the "Annex A" of the OCL standard, leads to different interpretations of corner cases. Many of these corner cases have been subject to formal analysis for more than ten years. The situation became more complicated with the arrival of version 2.3 of the OCL standard. OCL was aligned with the latest version of UML: this led to the extension of the three-valued logic by a second exception element, called null. While the first exception element, invalid, has a strict semantics, null has a non-strict interpretation. The combination of these semantic features leads to remarkable confusion for implementors of OCL compilers and interpreters. In this paper, we provide a formalization of the core of OCL in HOL. It provides denotational definitions, a logical calculus and operational rules that allow for the execution of OCL expressions by a mixture of term rewriting and code compilation. Moreover, we describe a coding-scheme for UML class models that were annotated by code-invariants and code contracts. An implementation of this coding-scheme has been undertaken: it consists of a kind of compiler that takes a UML class model and translates it into a family of definitions and derived theorems over them capturing the properties of constructors and selectors, tests and casts resulting from the class model. However, this compiler is not included in this document. Our formalization reveals several inconsistencies and contradictions in the current version of the OCL standard. They reflect a challenge to define and implement OCL tools in a uniform manner. Overall, this document is intended to provide the basis for a machine-checked text "Annex A" of the OCL standard targeting tool implementors.

Journal ArticleDOI
TL;DR: It is argued that mapping patterns facilitate creating configuration tools and it is shown that there are typical XML to annotations mapping solutions that indicate a correspondence between embedded and external metadata formats in general.
Abstract: Currently, the most commonly created formal languages are configuration languages. So far source code annotations and XML are the leading notations for configuration languages. In this paper, we analyse the correspondence between these two formats. We show that there are typical XML to annotations mapping solutions (mapping patterns) that indicate a correspondence between embedded and external metadata formats in general. We argue that mapping patterns facilitate creating configuration tools and we use a case study to show how they can be used to devise a mapping between these two notations.
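The embedded-versus-external contrast is easy to see in miniature (a Python analogy of ours rather than the paper's Java/XML setting): a decorator plays the role of a source code annotation, an external table plays the role of the XML file, and a configuration tool must keep the two notations in correspondence.

```python
# Embedded metadata: configuration attached directly in the source.
def route(path):
    def wrap(fn):
        fn.route = path          # annotation-style metadata
        return fn
    return wrap

@route("/users")
def list_users():
    return ["alice", "bob"]

# External metadata: the same configuration kept outside the code,
# as it would be in an XML file.
EXTERNAL_CONFIG = {"list_users": "/users"}

# A mapping tool checks that the two notations agree.
assert list_users.route == EXTERNAL_CONFIG[list_users.__name__]
```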

Journal ArticleDOI
TL;DR: It is proved that, in order to decide whether a first-order definable separator exists for two given regular input languages of finite words, sufficient information can be extracted from semigroups recognizing the input languages, using a fixpoint computation.
Abstract: Given two languages, a separator is a third language that contains the first one and is disjoint from the second one. We investigate the following decision problem: given two regular input languages of finite words, decide whether there exists a first-order definable separator. We prove that in order to answer this question, sufficient information can be extracted from semigroups recognizing the input languages, using a fixpoint computation. This yields an EXPTIME algorithm for checking first-order separability. Moreover, the correctness proof of this algorithm yields a stronger result, namely a description of a possible separator. Finally, we generalize this technique to answer the same question for regular languages of infinite words.

Journal ArticleDOI
TL;DR: The visual streams that were involved in the thought experiments that led to the development of the formal theory of LG are revealed, including the choice of the type of formal languages and grammars (the so-called controlled grammars) and the construction of the grammars of shortest trajectories and the grammar of zones.
Abstract: The hierarchy of formal languages is a mathematical representation of linguistic geometry (LG). LG is a type of game theory for a class of extensive discrete games called abstract board games (ABG), scalable to the level of real-life defense systems. LG is a formal model of human reasoning about armed conflict, a mental reality “hard-wired” in the human brain. LG, an evolutionary product of millions of years of human warfare, must be a component of the primary language of the human brain (as introduced by Von Neumann). Experience from the development of LG must be instructive for solving another major puzzle, discovering the algorithm of discovery, yet another ancient component of the primary language. This paper reports results on discovering the mental processes involved in the development of the hierarchy of formal languages. Those mental processes, which manifest the execution of the algorithm of discovery, are called visual streams. This paper reveals the visual streams that were involved in the thought experiments that led to the development of the formal theory of LG. Specifically, it demonstrates the streams involved in choosing the formal-linguistic representation of LG; the type of formal languages and grammars, the so-called controlled grammars; and the construction of the grammars of shortest trajectories and the grammar of zones. This paper introduces a hypothesis of how we construct and focus visual streams.

Journal ArticleDOI
TL;DR: In this paper, the Myhill-Nerode theorem is reconstructed using only regular expressions, and from this theorem many closure properties of regular languages follow.
Abstract: There are numerous textbooks on regular languages. Many of them focus on finite automata for proving properties. Unfortunately, automata are not so straightforward to formalise in theorem provers. The reason is that natural representations for automata are graphs, matrices or functions, none of which are inductive datatypes. Regular expressions can be defined straightforwardly as a datatype and a corresponding reasoning infrastructure comes for free in theorem provers. We show in this paper that a central result from formal language theory, the Myhill-Nerode Theorem, can be recreated using only regular expressions. From this theorem many closure properties of regular languages follow.
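The theorem in question: define $x \equiv_L y$ to hold when $xz \in L \Leftrightarrow yz \in L$ for every string $z$; the Myhill-Nerode Theorem states that $L$ is regular exactly when $\equiv_L$ has finitely many equivalence classes. The paper's contribution is a proof of this result in which regular expressions replace automata throughout.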

Book
Michel Rigo
10 Sep 2014
TL;DR: In this article, the relationship between the arithmetical properties of the integers and the syntactical properties of the corresponding representations is studied; a recent extension of Cobham's theorem to complex numbers led to the famous Four Exponentials Conjecture.
Abstract: Combinatorics on words deals with problems that can be stated in a non-commutative monoid, such as subword complexity of finite or infinite words, construction and properties of infinite words, unavoidable regularities or patterns. When considering some numeration systems, any integer can be represented as a finite word over an alphabet of digits. This simple observation leads to the study of the relationship between the arithmetical properties of the integers and the syntactical properties of the corresponding representations. One of the most profound results in this direction is given by the celebrated theorem by Cobham. Surprisingly, a recent extension of this result to complex numbers led to the famous Four Exponentials Conjecture. This is just one example of the fruitful relationship between formal language theory (including the theory of automata) and number theory.
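For reference, the celebrated result reads as follows: if $k, l \ge 2$ are multiplicatively independent integers (that is, $k^m \ne l^n$ for all $m, n \ge 1$), then any set of natural numbers that is recognizable by finite automata in both base $k$ and base $l$ is ultimately periodic, i.e., a finite union of arithmetic progressions.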

Journal ArticleDOI
TL;DR: A simulation model is presented that makes it possible to find optimal values for various building parameters and the associated impacts that reduce the energy demand or consumption of the building.

Posted ContentDOI
TL;DR: This work formalizes and analyzes a new automata-theoretic problem termed control improvisation, and shows how symbolic techniques based on SAT solvers can be used to approximately solve some of the intractable cases.
Abstract: We formalize and analyze a new problem in formal language theory termed control improvisation. Given a specification language, the problem is to produce an improviser, a probabilistic algorithm that randomly generates words in the language, subject to two additional constraints: the satisfaction of a quantitative soft constraint, and the exhibition of a specified amount of randomness. Control improvisation has many applications, including, for example, systematically generating random test vectors satisfying format constraints or preconditions while being similar to a library of seed inputs. Other applications include robotic surveillance, machine improvisation of music, and randomized variants of the supervisory control problem. We describe a general framework for solving the control improvisation problem, and use it to give efficient algorithms for several practical classes of instances with finite automaton and context-free grammar specifications. We also provide a detailed complexity analysis, establishing #P-hardness of the problem in many other cases. For these intractable cases, we show how symbolic techniques based on Boolean satisfiability (SAT) solvers can be used to find approximate solutions. Finally, we discuss an extension of control improvisation to multiple soft constraints that is useful in some applications.
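One ingredient of such improvisers, for finite automaton specifications, is uniform random generation of accepted words of a given length, which the standard path-counting dynamic program provides. The sketch below assumes a complete DFA given as a transition dictionary; the paper's actual algorithms additionally handle the soft constraint and the required amount of randomness.

```python
import random

def sample_word(delta, start, accepting, n):
    """Uniformly sample a length-n word accepted by a complete DFA.
    delta: {state: {symbol: state}}, accepting: set of states."""
    # count[i][q] = number of length-i words leading from q to acceptance
    count = [{q: 0 for q in delta} for _ in range(n + 1)]
    for q in delta:
        count[0][q] = 1 if q in accepting else 0
    for i in range(1, n + 1):
        for q in delta:
            count[i][q] = sum(count[i - 1][r] for r in delta[q].values())
    if count[n][start] == 0:
        return None                      # no accepted word of length n
    word, q = [], start
    for i in range(n, 0, -1):
        # pick the next symbol with probability proportional to the
        # number of accepting completions it leaves available
        pick = random.randrange(count[i][q])
        for a, r in delta[q].items():
            if pick < count[i - 1][r]:
                word.append(a)
                q = r
                break
            pick -= count[i - 1][r]
    return "".join(word)

# Example: DFA over {a, b} accepting words with an even number of b's.
delta = {"even": {"a": "even", "b": "odd"},
         "odd":  {"a": "odd",  "b": "even"}}
print(sample_word(delta, "even", {"even"}, 6))
```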

Book ChapterDOI
05 Aug 2014
TL;DR: The closure/non-closure properties of the languages accepted by these machines as well as the decidability/undecidability of decision problems concerning these devices are discussed.
Abstract: We survey the properties of automata augmented with reversal-bounded counters. In particular, we discuss the closure/non-closure properties of the languages accepted by these machines as well as the decidability/undecidability of decision problems concerning these devices. We also give applications to several problems in automata theory and formal languages.