Open Access Journal Article

Formal properties of XML grammars and languages

TLDR
The authors consider XML documents described by a document type definition (DTD), show that every XML-language has an essentially unique XML-grammar, and give two characterizations of the languages generated by XML-grammars: one set-theoretic, the other by a kind of saturation property.
Abstract
We consider XML documents described by a document type definition (DTD). An XML-grammar is a formal grammar that captures the syntactic features of a DTD. We investigate properties of this family of grammars. We show that every XML-language basically has a unique XML-grammar. We give two characterizations of languages generated by XML-grammars: one is set-theoretic, the other is by a kind of saturation property. We investigate decidability problems and prove that some properties that are undecidable for general context-free languages become decidable for XML-languages. We also characterize those XML-grammars that generate regular XML-languages.
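As a rough illustration of what "capturing the syntactic features of a DTD" can mean, the sketch below assumes the usual reading of a DTD: each element name comes with a regular content model over the names of its children, and a document is a well-nested sequence of opening and closing tags. The element names, the `CONTENT_MODELS` table, and the `is_valid` function are hypothetical and chosen only for this example; this is not the construction used in the paper.

```python
import re

# Hypothetical content models: element name -> regex over child element names.
# Each child name is followed by a single space so names cannot run together.
CONTENT_MODELS = {
    "book": r"(title )(chapter )+",
    "title": r"",
    "chapter": r"(title )(para )*",
    "para": r"",
}


def is_valid(tags, root="book"):
    """Check a list of tags such as ["<book>", "<title>", "</title>", ...].

    Text content and attributes are ignored in this sketch.
    """
    stack = []  # each entry: [element name, child names seen so far]
    seen_root = False
    for tag in tags:
        if tag.startswith("</"):
            name = tag[2:-1]
            # A closing tag must match the most recently opened element.
            if not stack or stack[-1][0] != name:
                return False
            _, children = stack.pop()
            word = "".join(child + " " for child in children)
            # The sequence of children must belong to the element's content model.
            if re.fullmatch(CONTENT_MODELS[name], word) is None:
                return False
        else:
            name = tag[1:-1]
            if name not in CONTENT_MODELS:
                return False
            if stack:
                stack[-1][1].append(name)
            elif seen_root or name != root:
                # Only one top-level element is allowed, and it must be the root.
                return False
            else:
                seen_root = True
            stack.append([name, []])
    return seen_root and not stack
```

The stack enforces the matching of opening and closing tags, while the per-element regular expressions play the role of the regular sets of allowed child sequences.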


Citations
Journal Article

Algorithms for learning regular expressions from positive data

TL;DR: Algorithms that directly infer very simple forms of 1-unambiguous regular expressions from positive data are described, both in terms of regular expressions and of (not necessarily minimal) deterministic finite automata.
Book Chapter

Deterministic automata on unranked trees

TL;DR: In this article, it was shown that for an appropriate definition of bottom-up deterministic automata, it is possible to minimize the number of states efficiently and to obtain a unique canonical representative of the accepted tree language.
Book Chapter

Regularity problems for visibly pushdown languages

TL;DR: It is decidable for a given visibly pushdown automaton whether it is equivalent to a visibly counter automaton, i.e. an automaton that uses its stack only as a counter.
Book Chapter

XML validation for context-free grammars

TL;DR: The validation of a context-free grammar obtained by the analysis against XML schemas is considered and two algorithms for deciding inclusion L(G1)⊆L(G2) are developed, which are efficient in practice although they have exponential complexity.
Journal Article

Algorithms for learning regular expressions

TL;DR: In this article, the authors describe algorithms that directly infer regular expressions from positive data and characterize the regular language classes that can be learned this way, based on a regular language class model.
References
Book

Introduction to formal language theory

TL;DR: This volume is intended to serve as a text for upper undergraduate and graduate level students. Special emphasis is given to the role of algebraic techniques in formal language theory through a chapter devoted to the fixed point approach to the analysis of context-free languages.
Book Chapter

The Algebraic Theory of Context-Free Languages

TL;DR: This chapter discusses the several classes of sentence-generating devices that are closely related, in various ways, to the grammars of both natural languages and artificial languages of various kinds.
Journal Article

One-unambiguous regular languages

TL;DR: A Kleene theorem is able to prove the decidability of whether a given regular expression denotes a 1-unambiguous language; if it does, then it can be proved that an equivalent 1-unambiguous regular expression can be constructed in worst-case optimal time.
Journal Article

Comparative analysis of six XML schema languages

TL;DR: A comparative analysis of six noteworthy XML schema languages is presented and it is shown that there is a substantial increase of the amount of data in XML format.
Journal Article

Parenthesis Grammars

TL;DR: A decision procedure is given which determines whether the languages defined by two parenthesis grammars are equal.