
Showing papers on "Formal grammar" published in 2017


Journal ArticleDOI
TL;DR: JADEL is designed to support the effective implementation of JADE agents and multi-agent systems in the scope of real-world model-driven development because it concretely helps developers by natively supporting agent-oriented abstractions, and because it is based on mature industrial-strength technologies.

30 citations


Journal ArticleDOI
Ray Jackendoff
TL;DR: It is argued that formal theories of mental representation are crucial in any mental domain, not just for their own sake, but to guide experimental inquiry, as well as to integrate the domain into the mind as a whole.

27 citations


Journal ArticleDOI
TL;DR: This work introduces several specialized algorithms that handle subclasses of ML-Rules more efficiently; the algorithms are compared in a performance study that sheds light on the relation between the expressive power and the computational complexity of rule-based modeling languages.
Abstract: The domain-specific modeling and simulation language ML-Rules is aimed at facilitating the description of cell biological systems at different levels of organization. Model states are chemical solutions that consist of dynamically nested, attributed entities. The model dynamics are described by rules that are constrained by arbitrary functions, which can operate on the entities’ attributes, (nested) solutions, and the reaction kinetics. Thus, ML-Rules supports an expressive hierarchical, variable-structure modeling of cell biological systems. The formal syntax and semantics of ML-Rules show that it is firmly rooted in continuous-time Markov chains. In addition to a generic stochastic simulation algorithm for ML-Rules, we introduce several specialized algorithms that are able to handle subclasses of ML-Rules more efficiently. The algorithms are compared in a performance study, leading to conclusions on the relation between expressive power and computational complexity of rule-based modeling languages.
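
Because ML-Rules is rooted in continuous-time Markov chains, its generic simulator is at heart a Gillespie-style stochastic simulation loop. The Python sketch below shows that core loop for a flat, two-reaction toy system; the species, rates, and reactions are invented for illustration, and ML-Rules' nested, attributed entities and arbitrary rate functions are deliberately omitted.

```python
import random

# Gillespie-style stochastic simulation: the CTMC semantics that rule-based
# languages such as ML-Rules build on. Species, rate constants, and reactions
# are invented placeholders for a flat (non-nested) system.
state = {"A": 100, "B": 0}
reactions = [
    # (propensity function over the state, state update)
    (lambda s: 0.10 * s["A"], {"A": -1, "B": +1}),  # A -> B
    (lambda s: 0.05 * s["B"], {"B": -1, "A": +1}),  # B -> A
]

t, t_end = 0.0, 50.0
while t < t_end:
    propensities = [f(state) for f, _ in reactions]
    total = sum(propensities)
    if total == 0:
        break                               # no reaction can fire
    t += random.expovariate(total)          # exponentially distributed delay
    r = random.uniform(0, total)            # choose a reaction proportionally
    for p, (_, update) in zip(propensities, reactions):
        if r < p:
            for species, delta in update.items():
                state[species] += delta
            break
        r -= p

print(t, state)
```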

26 citations


Book ChapterDOI
21 Oct 2017
TL;DR: LDScript, a Linked Data script language built on top of the SPARQL filter expression language, is defined, together with the formal grammar of its syntax and the Natural Semantics inference rules of its semantics.
Abstract: In addition to the existing standards dedicated to representation or querying, Semantic Web programmers could really benefit from a dedicated programming language enabling them to directly define functions on RDF terms, RDF graphs or SPARQL results. This is especially the case, for instance, when defining SPARQL extension functions. The ability to capitalize complex SPARQL filter expressions into extension functions or to define and reuse dedicated aggregates are real cases where a dedicated language can support modularity and maintenance of the code. Other families of use cases include the definition of functional properties associated to RDF resources or the definition of procedural attachments as functions assigned to RDFS or OWL classes with the selection of the function to be applied to a resource depending on the type of the resource. To address these needs we define LDScript, a Linked Data script language on top of the SPARQL filter expression language. We provide the formal grammar of the syntax and the Natural Semantics inference rules of the semantics of the language. We also provide a benchmark and perform an evaluation using real test bases from W3C with different implementations and approaches comparing, in particular, script interpretation and Java compilation.
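
LDScript itself is implemented in the Corese engine; as a rough analogue of packaging a filter expression as a reusable SPARQL extension function, here is a sketch using rdflib's custom-function registration in Python. The function URI and its logic are invented, and the availability of register_custom_function (present in recent rdflib releases) is an assumption.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.plugins.sparql.operators import register_custom_function

EX = Namespace("http://example.org/fn/")   # hypothetical function namespace

# Hypothetical extension function: write the logic once, reuse it in queries
# instead of repeating the same FILTER expression.
def shout(x):
    return Literal(str(x).upper())

register_custom_function(EX.shout, shout)

g = Graph()
q = 'SELECT (<http://example.org/fn/shout>("hello") AS ?y) WHERE {}'
for row in g.query(q):
    print(row.y)   # -> "HELLO"
```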

20 citations


Journal ArticleDOI
TL;DR: As discussed in this paper, sign language anaphora is often realized very differently from its spoken-language counterpart: an antecedent is associated with a position or locus in signing space, and an anaphoric link is obtained by pointing toward that locus to recover its semantic value.
Abstract: Sign language anaphora is often realized very differently from its spoken language counterpart. In simple cases, an antecedent is associated with a position or “locus” in signing space, and an anaphoric link is obtained by pointing toward that locus to recover its semantic value. This mechanism may sometimes be an overt realization of coindexation in formal syntax and semantics. I discuss two kinds of insights that sign language research can bring to the foundations of anaphora. First, in some cases the overt nature of indices in sign language allows one to bring overt evidence to bear on classic debates in semantics. I consider two: the availability of situation-denoting variables in natural language and the availability of binding without c-command. Second, in some cases sign language pronouns raise new challenges for formal semantics. Loci may function simultaneously as formal variables and as simplified depictions of what they denote, requiring the construction of a formal semantics with iconicity to ...

16 citations


Proceedings ArticleDOI
01 Nov 2017
TL;DR: This paper presents a new application of a type of formal grammar — probabilistic/stochastic context-free grammar — in the automatic generation of social media profiles using Facebook as a test case, and describes the implementation and results.
Abstract: One helpful resource to have when presenting delicate/sensitive information is hypothetical data, or placeholders that conceal the identity of concerned parties. This is crucial in environments such as medicine and criminology, as volunteers in medical research, patients with dreaded diseases, and convicts of certain crimes often prefer to remain anonymous, even when they agree to their records being shared. Recently, research based on social media has raised similar ethical concerns about privacy and the use of real users' profiles. In this paper, we present a new application of a type of formal grammar — probabilistic/stochastic context-free grammar — in the automatic generation of social media profiles, using Facebook as a test case. First, we present a grammar-based formalism for describing the rules governing the formulation of reasonable user attributes (e.g. full names, dates of birth, addresses, phone numbers, etc.). These grammar rules are specified with associated probabilistic weights that decide when (if at all) a rule is used or chosen. Second, we describe the implementation of these grammar rules. Our implementation produced one million unique Facebook profiles within three hours of execution time — with a negligible probability that any profile recurs. 100,000 of these synthesised profiles can be viewed at: tinyurl.com/synthesisedprofiles2017. These profiles may find applications in role-playing games in health and in social media research, and the described technique may find wider application in the generation of hypothetical profiles for data anonymisation in different domains.
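
To make the mechanism concrete, here is a minimal stochastic context-free grammar expander in Python. Every rule, name, and weight below is an invented placeholder, not the authors' grammar; it only illustrates how weighted alternatives drive generation.

```python
import random

# Toy stochastic CFG: each nonterminal maps to weighted alternatives, and
# expansion picks an alternative with probability proportional to its weight.
GRAMMAR = {
    "Profile":   [(1.0, ["Name", " | b. ", "BirthYear"])],
    "Name":      [(0.5, ["First", " ", "Last"]),
                  (0.5, ["First", " ", "Middle", " ", "Last"])],
    "First":     [(0.4, ["Ada"]), (0.3, ["Kofi"]), (0.3, ["Mei"])],
    "Middle":    [(1.0, ["J."])],
    "Last":      [(0.5, ["Okoro"]), (0.5, ["Tanaka"])],
    "BirthYear": [(1.0, ["19", "Digit", "Digit"])],
    "Digit":     [(0.1, [str(d)]) for d in range(10)],
}

def expand(symbol):
    if symbol not in GRAMMAR:                 # terminal: emit as-is
        return symbol
    alternatives = GRAMMAR[symbol]
    r = random.uniform(0, sum(w for w, _ in alternatives))
    for weight, rhs in alternatives:
        if r < weight:
            return "".join(expand(s) for s in rhs)
        r -= weight
    return ""                                 # unreachable for positive weights

print(expand("Profile"))   # e.g. "Ada Tanaka | b. 1973"
```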

14 citations


Journal ArticleDOI
01 Mar 2017
TL;DR: In this paper, four case studies from different corners of Dutch grammar are discussed (on cardinal numerals, on the Big Mess construction, on bare infinitive complements of auxiliaries, and on the hortative).
Abstract: Structuralism and formal grammar have, in the course of the 20th century, rightfully taken issue with more vague and unfalsifiable just-so stories of some of their predecessors. For all its merits, though, the structuralist-formal strand of linguistics has its drawbacks as well. The classical Saussurean distinction between synchrony and diachrony can be harmful: a purely synchronic description is often inferior to the insight gained from diachrony, not only because grammar is laden with heirlooms and debris of prior structures and because languages draw on a wide variety of pathways to generate new grammar, but also because variation can often only be understood fully in the light of its history. Many cases of synchronic variation are the result of competition between an innovative mutant encroaching on an obsolescent construction. In such cases, the synchronic skew in the proportion of one variant to the other is not arbitrary, but is a reflection of how far the change has progressed. To the extent that one wants to incorporate variation in grammatical description – and there are sound theoretical reasons to do so – the historical perspective is indispensable. In this article four case studies from different corners of Dutch grammar are discussed (on cardinal numerals, on the Big Mess Construction, on bare infinitive complements of auxiliaries, and on the hortative). The case studies together form a plea for the historisation of the science of linguistics, just like biology has been historised, and indeed, as is shown in this article, there are numerous parallels between linguistics and biology.

9 citations


Proceedings ArticleDOI
09 Nov 2017
TL;DR: The authors propose an original approach to text mining that builds a parse tree for each sentence using a regular grammar and creates an ontology, and they demonstrate the system in a constrained scenario.
Abstract: This work describes an investigation of formal grammar with application to text mining. This is an important area, since text is the most widespread type of data and contains a lot of potentially useful information. The unstructured nature of text requires processing methods different from those used in other types of data mining. In this work, the authors propose an original approach to text mining that builds a parse tree for each sentence using a regular grammar and creates an ontology, and they demonstrate the system in a constrained scenario. The ontology can be used for different tasks, ranging from expert systems to automatic machine translation. The ontology is a network consisting of concepts linked by relations. The authors developed a new system that implements the proposed approach and works in different languages.
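
As a deliberately tiny illustration of the pipeline (regular pattern, per-sentence parse, ontology edges), the sketch below chunks "X <verb>s Y" sentences with one regular expression and accumulates concept-relation-concept triples. The pattern is an invented stand-in for the authors' full regular grammar.

```python
import re

# One regular pattern plays the role of the grammar: it chunks a sentence
# into (subject, relation, object), and the triples form an ontology-like
# network of concepts linked by relations.
PATTERN = re.compile(r"^(?P<subj>\w+)\s+(?P<rel>\w+s)\s+(?P<obj>\w+)\.?$")

ontology = []   # (concept, relation, concept) edges

for sentence in ["Water fills kettles.", "Copper conducts electricity."]:
    m = PATTERN.match(sentence)
    if m:
        ontology.append((m["subj"], m["rel"], m["obj"]))

print(ontology)
# [('Water', 'fills', 'kettles'), ('Copper', 'conducts', 'electricity')]
```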

8 citations


Journal ArticleDOI
TL;DR: The problem of bootstrapping knowledge in language and vision for autonomous robots is addressed through novel techniques in grammar induction and in grounding words in the perceptual world, applied in a cognitively plausible, loosely supervised manner to raw linguistic and visual data.

7 citations


Journal ArticleDOI
TL;DR: Two novel methods, based on scannerless generalized LR (SGLR) and Parsing Expression Grammars (PEGs), are presented to address these drawbacks and to mine structured fragments within unstructured data.

6 citations


Proceedings ArticleDOI
21 Jun 2017
TL;DR: This paper analyzes the problems of parsing texts containing linguistic phenomena of a controversial nature, which are rarely encountered in NLP projects focusing on Indo-European languages but are quite frequent in other languages.
Abstract: This article analyzes the problems of parsing texts that contain linguistic phenomena of a controversial nature. Such phenomena are rarely encountered in NLP projects focusing on Indo-European languages but are quite frequent in other languages, e.g. in the corpus of Tibetan Indigenous Grammatical Treatises; parsing texts that contain them is therefore necessary for the completeness of automatic morphosyntactic annotation of textual corpora. Development of the morphosyntactic analyzer for the Tibetan language started in 2016, and the analyzer has already proved useful for dealing with specific phenomena of Tibetan and with previously unsolvable issues of tokenization. The ultimate goal of the project is a consistent formal grammatical description (formal grammar) of the Tibetan language, covering all grammar levels of the language system from morphosyntax (the syntactics of morphemes) to the syntax of composite sentences and supra-phrasal entities. The previously published version of the automatic morphosyntactic annotation was created on the basis of morphologically tagged corpora of Tibetan texts and had high, but not 100 percent, coverage (the ratio of the number of atoms covered by parse trees to the total number of atoms), precision, and recall. This article describes the problems that subsequently had to be solved to develop the current version of the morphosyntactic parser, which achieves complete and correct automatic annotation of the corpus, and the chosen ways of solving them. This made it possible to obtain a complete morphosyntactic annotation of units previously treated as tokens (lexical tokens, words, or other atomic parse elements), but required a substantial refactoring (restructuring existing code without changing its external behavior) of the formal grammar. Thus, not only the frequent constructions but all constructions turned out to be important in building the formal model.

Book ChapterDOI
14 Jun 2017
TL;DR: The problem that this formalization tries to solve is the management of time inaccuracies between the components of a network, with the aim of avoiding the malfunctions that derive from them.
Abstract: We specify the behavior of a sensor network with different sensor stations distributed across the region of Andalusia (South of Spain). The main goal of this network is the measurement of air quality, taking into account the maximum levels of certain pollutants. The problem that we try to solve with this formalization is the management of time inaccuracies between the components of a network, with the aim of avoiding the malfunctions that derive from them. We present the formal syntax and semantics of our variant of fuzzy-timed automata and define all the automata corresponding to the different parts of the network.

Book ChapterDOI
07 Aug 2017
TL;DR: The class of languages generated by permitting semi-conditional grammars with no erasing rules is strictly included in the class of context-sensitive languages.
Abstract: Permitting semi-conditional grammars are extensions of context-free grammars in which each rule is associated with a word v, and such a rule can be applied to a sentential form u only if v is a subword of u. In this paper we show that the class of languages generated by permitting semi-conditional grammars with no erasing rules is strictly included in the class of context-sensitive languages.
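
The derivation relation is easy to state operationally; the toy Python check below applies a context-free rule only when its permitting word occurs in the current sentential form. Grammar and strings are invented for illustration.

```python
# One derivation step of a permitting semi-conditional grammar: the rule
# A -> w may rewrite the sentential form u only if the permitting word v
# occurs in u as a subword.
def apply_rule(u, nonterminal, w, v):
    """Rewrite the leftmost `nonterminal` in `u` by `w` if `v` is a subword
    of `u`; return None when the rule is not applicable."""
    if v not in u or nonterminal not in u:
        return None
    return u.replace(nonterminal, w, 1)

print(apply_rule("aAbB", "A", "ab", v="bB"))   # 'aabbB' (bB present: allowed)
print(apply_rule("aAb",  "A", "ab", v="bB"))   # None    (bB absent: blocked)
```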

Proceedings ArticleDOI
24 Apr 2017
TL;DR: This article proposes a way to complete the natural language text of requirements by giving a formal syntax to this text by introducing and using an example to illustrate the ideas.
Abstract: Natural language is currently the basis of the majority of system specifications, even though it has several drawbacks. In particular, natural language is inherently ambiguous. In this article, we propose a way to complete the natural-language text of requirements by giving this text a formal syntax. We introduce and use an example to illustrate our ideas.
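
As an invented illustration of the idea (the paper's actual syntax is richer), the sketch below accepts only requirements matching one rigid sentence shape, so a requirement either parses or is flagged as ambiguous free text.

```python
import re

# Hypothetical requirement syntax: "The <subject> shall <capability>."
# A requirement that matches is structurally unambiguous; anything else is
# rejected and sent back for rewriting.
REQ = re.compile(r"^The (?P<subject>[a-z ]+) shall (?P<capability>[a-z ,]+)\.$")

for text in ["The gateway shall log every request.",
             "Requests should probably be logged somehow"]:
    print("OK" if REQ.match(text) else "REJECTED", "-", text)
```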

Book ChapterDOI
18 May 2017
TL;DR: The role of semantic predicates built from the lexicon, and of the syntactic structures in which they are placed within business-communication contexts, is explored in order to build a formal grammar for business language.
Abstract: In recent years, interest in the use of language for business has grown. It is recognized that hidden persuasive linguistic potential improves a company's positioning in the public consciousness. The language of the business world is multifarious: we try to identify its features and behaviour, considering the evolution it has faced, above all with the globalization of markets. Business activities are so complex that they require the application of several disciplines at the same time, and therefore the use of specific languages and technical terminology. To analyze business language effectively, this study explores the role of semantic predicates built from the lexicon and the syntactic structures in which they are placed within business-communication contexts. From the point of view of the LG framework, a set of lexical-syntactic structures defines the value of semantic predicates, while the arguments selected by each semantic predicate are given the value of actants, subjects included. The features of each verb are expressed by applying rules of co-occurrence and selection restriction, through which verbs semantically select their arguments to construct acceptable simple sentences. In this way, the entries belonging to electronic dictionaries can be classified according to their similarity and proximity. Even though the list of semantic tags is not easily identifiable, grammars can be built for individual sets of semantic predicates. LG descriptions assign correlated predicates and arguments by applying electronic dictionaries of Italian. Using the NooJ environment and Italian linguistic resources for automatic natural language processing, we will process a corpus of business documents. We will show and describe the syntactic structures and the semantic and syntactic properties of predicates, in order to build a formal grammar for business language.

Proceedings ArticleDOI
01 Jun 2017
TL;DR: A formal specification of the heart is presented that can help to detect abnormal patterns of behavior; it takes into account the age and gender of the patient, where age is treated as a fuzzy parameter.
Abstract: In this paper we introduce a formalism to specify the behavior of biological systems. Our formalism copes with uncertainty, via fuzzy logic constraints, an important characteristic of these systems. We present the formal syntax and semantics of our variant of fuzzy automata. The bulk of the paper is devoted to presenting an application of our formalism: a formal specification of the heart that can help to detect abnormal patterns of behavior. Specifically, our model analyzes a patient's heartbeats per minute and the length of the RR intervals. The model takes into account the age and gender of the patient, where age is considered to be a fuzzy parameter. Finally, we use real data to analyze the reliability of the model concerning the diagnosis and prediction of potential illnesses.
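
The fuzzy treatment of age can be pictured with a simple membership function that blends an alarm threshold rather than switching it at a crisp cutoff. The membership shape and thresholds below are invented, not the paper's calibration.

```python
# Fuzzy age parameter: a patient is "elderly" to a degree in [0, 1], and the
# resting-heart-rate alarm threshold is blended accordingly instead of
# jumping at a crisp age cutoff. Numbers are illustrative only.
def elderly_degree(age):
    if age <= 55:
        return 0.0
    if age >= 75:
        return 1.0
    return (age - 55) / 20.0            # linear ramp between 55 and 75

def max_normal_bpm(age):
    mu = elderly_degree(age)
    return (1 - mu) * 100 + mu * 90     # stricter threshold as mu -> 1

for age in (40, 65, 80):
    print(age, round(elderly_degree(age), 2), max_normal_bpm(age))
```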

Book ChapterDOI
01 Jan 2017
TL;DR: This chapter introduces the formal syntax and operational semantics of a simple, structured imperative language called IMP, with static variable allocation and no sophisticated declaration constructs for data types, functions, classes, methods and the like.
Abstract: This chapter introduces the formal syntax and operational semantics of a simple, structured imperative language called IMP, with static variable allocation and no sophisticated declaration constructs for data types, functions, classes, methods and the like. The operational semantics is defined in the natural style and it assumes an abstract machine with a very basic form of memory to associate integer values with variables. The operational semantics is used to derive a notion of program equivalence and several examples of (in)equivalence proofs are shown.
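
A natural-style (big-step) semantics of this kind is short enough to execute directly. The Python sketch below evaluates a small IMP-like fragment over a dictionary memory; the tuple encoding is ours, and boolean conditions are collapsed into "nonzero arithmetic expression" for brevity.

```python
# Big-step evaluator for an IMP-like fragment: memory maps variables to
# integers; commands and expressions are nested tuples. Conditions are
# arithmetic expressions tested against zero, a simplification of IMP's
# separate boolean syntax.
def eval_aexp(a, mem):
    kind = a[0]
    if kind == "num": return a[1]
    if kind == "var": return mem[a[1]]
    if kind == "add": return eval_aexp(a[1], mem) + eval_aexp(a[2], mem)

def exec_com(c, mem):
    kind = c[0]
    if kind == "skip":
        return mem
    if kind == "assign":                       # x := a
        mem = dict(mem)
        mem[c[1]] = eval_aexp(c[2], mem)
        return mem
    if kind == "seq":                          # c1 ; c2
        return exec_com(c[2], exec_com(c[1], mem))
    if kind == "while":                        # while b do c
        if eval_aexp(c[1], mem) != 0:
            return exec_com(c, exec_com(c[2], mem))
        return mem

prog = ("seq", ("assign", "x", ("num", 3)),
               ("while", ("var", "x"),
                ("assign", "x", ("add", ("var", "x"), ("num", -1)))))
print(exec_com(prog, {}))   # {'x': 0}
```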

Journal ArticleDOI
01 Jan 2017
TL;DR: The authors propose a definition of rough fuzzy automata that accept rough fuzzy regular languages, considering both uncertainty and approximation to define rough fuzzy grammars and rough fuzzy languages.
Abstract: Automata theory plays a key role in computational theory, as many computational problems can be solved with its help. A formal grammar is a formalism closely related to automata and designed for linguistic purposes. Formal grammars generate formal languages. Rough grammars and rough languages were introduced to incorporate the imprecision of real languages into formal languages. These languages have limitations in handling uncertainty. The authors consider both uncertainty and approximation to define rough fuzzy grammars and rough fuzzy languages. Under certain restrictions, their grammar reduces to a formal grammar. Furthermore, the authors propose a definition of rough fuzzy automata that accept rough fuzzy regular languages.
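
Setting the rough-approximation layer aside, the fuzzy half of the construction can be shown in a few lines: a fuzzy automaton accepts a word to a degree computed by max-min composition. The transition memberships below are invented.

```python
# Acceptance degree of a fuzzy finite automaton by max-min composition:
# a run's strength is the minimum of its transition memberships, and the
# word's degree is the maximum over all runs. The rough-set approximation
# layer of the paper is omitted here.
delta = {                 # (state, symbol, next_state) -> membership
    ("q0", "a", "q0"): 0.9,
    ("q0", "a", "q1"): 0.4,
    ("q1", "b", "q1"): 0.7,
}
final = {"q1": 1.0}

def accept_degree(word, state="q0", strength=1.0):
    if not word:
        return min(strength, final.get(state, 0.0))
    best = 0.0
    for (s, a, t), mu in delta.items():
        if s == state and a == word[0]:
            best = max(best, accept_degree(word[1:], t, min(strength, mu)))
    return best

print(accept_degree("ab"))   # 0.4 via q0 -a(0.4)-> q1 -b(0.7)-> q1
```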

27 Jan 2017
TL;DR: It is argued that transformations were introduced by Chomsky, the founder of TGG, as a way to preserve, within the new paradigm of formal grammar theory, established insights of a purely linguistic nature.
Abstract: This paper deals with a dramatic change in the presentation of linguistic examples in the linguistics literature of the twentieth century, a change coinciding (not accidentally) with the introduction of transformational generative grammar (TGG) in the 1950s. Our investigation of this change and the circumstances that gave rise to it leads us to reconsider the question of continuity and discontinuity in the history of linguistics in this crucial period, focusing on the question of the audience to which the earliest publications in TGG were directed. We argue that this was an audience of nonlinguists (information theorists and mathematical logicians) and that transformations were introduced by Chomsky, the founder of TGG, as a way to preserve, within the new paradigm of formal grammar theory, established insights of a purely linguistic nature.

Book ChapterDOI
11 Jun 2017
TL;DR: The paper presents a dependency parser for the Polish language that uses a simple chain of word-combining rules operating on fully morphosyntactically tagged input, instead of a formal grammar model or statistical learning, to generate robust dependency trees.
Abstract: The paper presents a dependency parser for the Polish language. It uses a simple chain of word-combining rules operating on fully morphosyntactically tagged input instead of a formal grammar model or statistical learning. The proposed approach generates robust dependency trees and allows parsing of uncommon texts, such as poetry. This gives it a significant advantage over current state-of-the-art dependency parsers.
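
The rule-chain idea can be pictured with a toy fragment: each rule looks at adjacent tagged words and attaches one under the other, and the chain reruns until no rule fires. Tags and rules below are invented stand-ins for the parser's Polish rule set.

```python
# Toy word-combining chain: when a (tag, tag) pattern matches two adjacent
# items, the right item becomes the head, the left one is marked consumed,
# and the pass repeats until nothing changes.
words = [("red", "adj"), ("kite", "noun"), ("flies", "verb")]
arcs = []                       # (head_index, dependent_index)

RULES = {("adj", "noun"),       # adjective attaches to the following noun
         ("noun", "verb")}      # noun attaches to the following verb

changed = True
while changed:
    changed = False
    for i in range(len(words) - 1):
        (wa, ta), (wb, tb) = words[i], words[i + 1]
        if (ta, tb) in RULES:
            arcs.append((i + 1, i))
            words[i] = (wa, "attached")   # dependent no longer combines
            changed = True

print(arcs)   # [(1, 0), (2, 1)]: red <- kite <- flies
```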

Posted Content
TL;DR: A formal study of some of the most common forms of inductive definitions found in scientific text: monotone inductive definitions, definitions by induction over a well-founded order, and iterated inductive definitions.
Abstract: The definition is a common form of human expert knowledge, a building block of formal science and mathematics, and a foundation for database theory; it is supported in various forms in many knowledge representation and formal specification languages and systems. This paper is a formal study of some of the most common forms of inductive definitions found in scientific text: monotone inductive definitions, definitions by induction over a well-founded order, and iterated inductive definitions. We define a logic of definitions offering a uniform formal syntax to express definitions of the different sorts, and we define its semantics by a faithful formalization of the induction process. Several fundamental properties of definition by induction emerge: the non-determinism of the induction process, the confluence of induction processes, the role of the induction order and its relation to the inductive rules, how the induction order constrains the induction process and, ultimately, that the induction order is irrelevant: the defined set does not depend on the induction order. We propose an inductive construction capable of constructing the defined set without using the induction order. We investigate borderline definitions of the sort that appears in definitional paradoxes.
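
The monotone case can be summarized as a least fixed point; the following formulation is a standard textbook rendering (ours, not quoted from the paper):

```latex
% Rules \varphi_i(\bar{x}) \Rightarrow P(\bar{x}) induce a monotone operator
\Gamma(S) \;=\; \{\, \bar{x} \mid \text{some rule body } \varphi_i(\bar{x})
                \text{ holds when } P \text{ is read as } S \,\},
% and the defined set is its least fixed point, reached by iterating
% from the empty set:
P \;=\; \operatorname{lfp}(\Gamma) \;=\; \bigcup_{\alpha} \Gamma^{\alpha}(\emptyset).
```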

Proceedings ArticleDOI
20 Oct 2017
TL;DR: A mathematical model of userspace-based process-tree reconstruction from syscall sequences is constructed on the basis of a type-0 formal grammar and prototyped as a two-stage grammar analyser with three heuristics for grammar shortening; the results indicate that grammatical analysis can be applied competitively for metadata reconstruction in checkpoint-restore tools.
Abstract: A mathematical model of userspace-based process-tree reconstruction from syscall sequences is constructed on the basis of a type-0 formal grammar and prototyped as a two-stage grammar analyser with three heuristics for grammar shortening. The prototype was developed for comparison with profile-based techniques of syscall collection. The results indicate that grammatical analysis can be applied competitively for metadata reconstruction in checkpoint-restore tools.
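
Stripped of the grammar machinery, the reconstruction target is easy to picture: an ordered syscall trace determines a process tree. The toy Python sketch below uses an invented (pid, call, arg) trace format; real traces carry far more detail, which is what motivates the paper's type-0 grammar treatment.

```python
# Rebuild a process tree from an ordered syscall trace. Trace format
# (pid, call, arg) is invented: for "fork", arg is the child pid.
def build_tree(trace):
    children = {}                        # pid -> list of child pids
    for pid, call, arg in trace:
        children.setdefault(pid, [])
        if call == "fork":
            children[pid].append(arg)
            children.setdefault(arg, [])
    return children

trace = [(1, "fork", 2), (2, "fork", 3), (3, "exit", 0), (2, "exit", 0)]
print(build_tree(trace))   # {1: [2], 2: [3], 3: []}
```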

Posted Content
TL;DR: The Plant Model Inference Tool (PMIT) as mentioned in this paper infers deterministic context-free L-systems from an initial sequence of strings generated by the system using a genetic algorithm.
Abstract: Lindenmayer systems (L-systems) are a formal grammar system that iteratively rewrites all symbols of a string, in parallel. When visualized with a graphical interpretation, the images have self-similar shapes that appear frequently in nature, and they have been particularly successful as a concise, reusable technique for simulating plants. The L-system inference problem is to find an L-system to simulate a given plant. This is currently done mainly by experts, but this process is limited by the availability of experts, the complexity that may be solved by humans, and time. This paper introduces the Plant Model Inference Tool (PMIT) that infers deterministic context-free L-systems from an initial sequence of strings generated by the system using a genetic algorithm. PMIT is able to infer more complex systems than existing approaches. Indeed, while existing approaches are limited to L-systems with a total sum of 20 combined symbols in the productions, PMIT can infer almost all L-systems tested where the total sum is 140 symbols. This was validated using a test bed of 28 previously developed L-system models, in addition to models created artificially by bootstrapping larger models.
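
A deterministic context-free L-system is a few lines to run forward, which is what makes the inverse (inference) problem the interesting part. The sketch below iterates Lindenmayer's classic algae system, the kind of small model a tool like PMIT re-infers from its strings.

```python
# D0L-system: every symbol is rewritten in parallel at each step.
# Productions are Lindenmayer's classic algae example.
productions = {"A": "AB", "B": "A"}

def step(s):
    return "".join(productions.get(c, c) for c in s)

s = "A"
for _ in range(5):
    s = step(s)
    print(s)
# AB, ABA, ABAAB, ABAABABA, ABAABABAABAAB
```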

Journal ArticleDOI
01 Sep 2017
TL;DR: Systems of polynomial equations over a semiring are interpreted as formal grammars and solved as formal power series; a discrete analogue of the implicit mapping theorem is proved, with a nonzero Jacobian of the system's commutative image guaranteeing existence and uniqueness of the solution.
Abstract: Systems of polynomial equations over a semiring (with respect to symbols under noncommutative multiplication and commutative addition) are studied. Such systems of equations are interpreted as grammars of formal languages and are solved with respect to the nonterminal symbols in the form of formal power series depending on the terminal symbols. The commutative image of a system of equations is considered under the assumption that the symbols are variables taking values in the field of complex numbers. Connections are established between the solutions of a system of noncommutative symbolic equations and those of its commutative image, thereby bringing the methods of multidimensional complex analysis into the theory of formal languages and grammars. A discrete analogue of the implicit mapping theorem for formal grammars is proved: a sufficient condition for the existence and uniqueness of a solution of a system of noncommutative equations in the form of formal power series is that the Jacobian of the commutative image of this system be nonzero. A new method is also proposed for the syntactic analysis of monomials of a context-free language, viewed as a model of programming languages, based on an integral representation of the syntactic polynomial of a program. It is shown that an integral of fixed multiplicity over a cycle makes it possible to find the syntactic polynomial of a monomial (program) with an unbounded number of symbols, which gives a new approach to the parsing problem.
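
A worked one-variable instance (our illustration, not taken from the paper) shows how the pieces fit together:

```latex
% The grammar S -> aSb | \varepsilon is the symbolic equation S = aSb + 1.
% Its commutative image over \mathbb{C} is s = ab\,s + 1, i.e.
% F(s, a, b) = s - ab\,s - 1 = 0, with \partial F / \partial s = 1 - ab,
% which is nonzero at a = b = 0; the discrete implicit-mapping theorem
% then gives a unique formal power series solution
s \;=\; \frac{1}{1 - ab} \;=\; \sum_{n \ge 0} (ab)^n ,
% the commutative image of the series \sum_{n \ge 0} a^n b^n that
% enumerates L(S) = \{\, a^n b^n \mid n \ge 0 \,\}.
```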

01 Jan 2017
TL;DR: This research suggests that BIM provides not only vocabulary but also syntactical tools that can be used to capture an architectural language, and could improve the consistency of architectural designs and their coherence to defined styles.
Abstract: Beyond its widespread use for representing technical aspects and matters of building and construction science, Building Information Modeling (BIM) can be used to represent architectural relationships and rules drawn from aesthetic theory. This research suggests that BIM provides not only vocabulary but also syntactical tools that can be used to capture an architectural language. In a case study using Richard Meier’s language for single-family detached houses, a BIM template has been devised to represent the aesthetic concepts and relations therein. The template employs parameterized conceptual mass objects, syntactical rules, and a library of architectonic elements, such as walls, roofs, columns, windows, doors, and railings. It constrains any design produced using the template to a grammatically consistent expression or style. The template has been used as the starting point for modeling the Smith House, the Douglas House, and others created by the authors, demonstrating that the aesthetic template is general to many variations. Designing with the template to produce a unique but conforming design further illustrates the generality and expressiveness of the language. Having made the formal language explicit, in terms of syntactical rules and vocabulary, it becomes easier to vary the formal grammar and concrete vocabulary to produce variant languages and styles. Accordingly, this approach is not limited to a specific style, such as Richard Meier’s. Future research can be conducted to demonstrate how designing with BIM can support stylistic change. Adoption of this approach in practice could improve the consistency of architectural designs and their coherence to defined styles, potentially increasing the general level of aesthetic expression in our built environment.

Proceedings Article
12 Feb 2017
TL;DR: In this paper, the authors introduce a rich modeling language, for which an interior-point method computes approximate solutions in a generic way, exploiting and caching local structure using algebraic decision diagrams (ADDs).
Abstract: A recent trend in probabilistic inference emphasizes the codification of models in a formal syntax, with suitable high-level features such as individuals, relations, and connectives, enabling descriptive clarity, succinctness and circumventing the need for the modeler to engineer a custom solver. Unfortunately, bringing these linguistic and pragmatic benefits to numerical optimization has proven surprisingly challenging. In this paper, we turn to these challenges: we introduce a rich modeling language, for which an interior-point method computes approximate solutions in a generic way. While logical features easily complicate the underlying model, often yielding intricate dependencies, we exploit and cache local structure using algebraic decision diagrams (ADDs). Indeed, standard matrix-vector algebra is efficiently realizable in ADDs, but we argue and show that well-known optimization methods are not ideal for ADDs. Our engine, therefore, invokes a sophisticated matrix-free approach. We demonstrate the flexibility of the resulting symbolic-numeric optimizer on decision making and compressed sensing tasks with millions of non-zero entries.
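
The matrix-free requirement is the key interface: the solver may touch the matrix only through a matvec closure, which is exactly what lets a structured backend such as ADDs stand in for an explicit matrix. Below is a generic conjugate-gradient sketch in Python with a plain-list stand-in for A; it is not the authors' engine.

```python
# Matrix-free conjugate gradient for SPD systems: A appears only through the
# matvec closure, so any structured representation (e.g. ADD-backed) can be
# plugged in without changing the solver.
def conjugate_gradient(matvec, b, iters=50, tol=1e-10):
    x = [0.0] * len(b)
    r = b[:]                            # residual b - A x for x = 0
    p = r[:]
    rs = sum(v * v for v in r)
    for _ in range(iters):
        Ap = matvec(p)
        alpha = rs / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(v * v for v in r)
        if rs_new < tol:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

A = [[4.0, 1.0], [1.0, 3.0]]            # explicit stand-in for the demo
matvec = lambda v: [sum(a * vi for a, vi in zip(row, v)) for row in A]
print(conjugate_gradient(matvec, [1.0, 2.0]))   # ~[0.0909, 0.6364]
```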

01 Jan 2017
TL;DR: In this article, the main aim of the present study was to identify examples of practical grammar instruction methods in an EFL/ESL classroom in Swedish upper secondary school in Sweden.
Abstract: The main aim of the present study was to identify examples of practical grammar instruction methods in an EFL/ESL classroom at a Swedish upper secondary school. Data was collected through classroom o ...

Journal ArticleDOI
TL;DR: In this note, a simple condition under which a formal grammar produces a context-free language is presented.

Posted Content
TL;DR: This paper proposes Monte Carlo Action Programming, a programming language framework for autonomous systems that act in large probabilistic state spaces with high branching factors that comprises formal syntax and semantics of a nondeterministic action programming language.
Abstract: This paper proposes Monte Carlo Action Programming, a programming language framework for autonomous systems that act in large probabilistic state spaces with high branching factors. It comprises formal syntax and semantics of a nondeterministic action programming language. The language is interpreted stochastically via Monte Carlo Tree Search. Effectiveness of the approach is shown empirically.
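
The interpretation of a nondeterministic choice point by sampling can be pictured with flat Monte Carlo action selection, a simplified cousin of the Monte Carlo Tree Search the framework uses. The toy domain (a walk toward a goal on a line) is invented.

```python
import random

# Flat Monte Carlo action selection: score each action by random rollouts
# and commit to the best-scoring one, a simplified stand-in for the UCT-style
# search used to interpret nondeterministic action programs.
GOAL = 5

def rollout(pos, depth=20):
    for _ in range(depth):
        if pos == GOAL:
            return 1.0                   # goal reached during the rollout
        pos += random.choice((-1, +1))
    return 0.0

def choose(pos, actions=(-1, +1), samples=200):
    def value(a):
        return sum(rollout(pos + a) for _ in range(samples)) / samples
    return max(actions, key=value)

print(choose(0))   # usually +1: stepping toward the goal scores better
```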

Proceedings ArticleDOI
08 Feb 2017
TL;DR: An approach is developed that significantly improves the automation of the training process of artificial intelligence, which as a result gives the system a higher level of self-development skills, independent of its users.
Abstract: For more flexible environmental perception by artificial intelligence, supporting software modules are needed that can automate the creation of a specific language syntax and perform further analysis for relevant decisions based on semantic functions. Under our proposed approach, pairs of formal rules for given sentences (in the case of natural languages) or statements (in the case of special languages) can be created with the help of computer vision, speech recognition, or an editable-text conversion system, for further automatic improvement. In other words, we have developed an approach that significantly improves the automation of the training process of artificial intelligence, which as a result gives the system a higher level of self-development skills, independent of its users. On the basis of this approach we have developed a software demo version, which includes the algorithm and software code implementing all of the above-mentioned components (computer vision, speech recognition, and an editable-text conversion system). The program can work in multi-stream mode and simultaneously create a syntax based on information received from several sources.