Showing papers in "arXiv: Formal Languages and Automata Theory in 2017"


Posted Content
TL;DR: The best known upper bound on the length of the shortest reset words of synchronizing automata is improved, using the approach of Trahtman from 2011 combined with the well-known Frankl theorem from 1982.
Abstract: We improve the best known upper bound on the length of the shortest reset words of synchronizing automata. The new bound is slightly better than $114 n^3 / 685 + O(n^2)$. The Černý conjecture states that $(n-1)^2$ is an upper bound. So far, the best general upper bound was $(n^3-n)/6-1$ obtained by J.-E. Pin and P. Frankl in 1982. Despite a number of efforts, it remained unchanged for about 35 years. To obtain the new upper bound we utilize avoiding words. A word is avoiding for a state $q$ if after reading the word the automaton cannot be in $q$. We obtain upper bounds on the length of the shortest avoiding words, and using the approach of Trahtman from 2011 combined with the well-known Frankl theorem from 1982, we improve the general upper bound on the length of the shortest reset words. For all the bounds, there exist polynomial algorithms finding a word of length not exceeding the bound.

44 citations
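
To make the notion of a reset word in the entry above concrete, here is a minimal Python sketch that finds a shortest reset word by breadth-first search over subsets of states. It is a brute-force illustration that takes exponential time in the worst case, not the polynomial algorithm mentioned in the abstract. The function names and the choice of the Černý automaton as input are assumptions made for illustration.

```python
from collections import deque

def cerny_automaton(n):
    """Transition tables of the Cerny automaton on states 0..n-1 (illustrative input)."""
    return {
        "a": tuple((q + 1) % n for q in range(n)),             # cyclic shift
        "b": tuple(0 if q == n - 1 else q for q in range(n)),  # sends n-1 to 0, fixes the rest
    }

def image(delta, subset, word):
    """Set of states reached from `subset` after reading `word`."""
    for letter in word:
        subset = frozenset(delta[letter][q] for q in subset)
    return subset

def shortest_reset_word(n, delta):
    """Breadth-first search on subsets of states; returns None if not synchronizing."""
    start = frozenset(range(n))
    parent = {start: None}          # subset -> (previous subset, letter read)
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        if len(subset) == 1:
            # Walk the parent links back to the full state set.
            word = []
            while parent[subset] is not None:
                subset, letter = parent[subset]
                word.append(letter)
            return "".join(reversed(word))
        for letter, table in delta.items():
            nxt = frozenset(table[q] for q in subset)
            if nxt not in parent:
                parent[nxt] = (subset, letter)
                queue.append(nxt)
    return None

if __name__ == "__main__":
    n = 4
    word = shortest_reset_word(n, cerny_automaton(n))
    print(word, len(word), (n - 1) ** 2)
```

For the Černý automaton with $n$ states the shortest reset word is known to have length exactly $(n-1)^2$, which is why this family is the standard lower-bound example for the conjecture.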


Posted Content
TL;DR: In this article, the authors investigate the computational complexity of various problems for simple recurrent neural networks (RNNs) as formal models for recognizing weighted languages, including consistency, equivalence, minimization, and the determination of the highest weighted string.
Abstract: We investigate the computational complexity of various problems for simple recurrent neural networks (RNNs) as formal models for recognizing weighted languages. We focus on the single-layer, ReLU-activation, rational-weight RNNs with softmax, which are commonly used in natural language processing applications. We show that most problems for such RNNs are undecidable, including consistency, equivalence, minimization, and the determination of the highest-weighted string. However, for consistent RNNs the last problem becomes decidable, although the solution length can surpass all computable bounds. If additionally the string is limited to polynomial length, the problem becomes NP-complete and APX-hard. In summary, this shows that approximations and heuristic algorithms are necessary in practical applications of those RNNs.

20 citations


Book Chapter
TL;DR: The Franek-Jennings-Smyth (FJS) string matching algorithm, a recent variant of the Boyer-Moore (BM) algorithm, is extended to timed pattern matching, targeting online use in embedded applications where it is vital to process a vast amount of incoming data in a timely manner.
Abstract: The timed pattern matching problem is an actively studied topic because of its relevance in monitoring of real-time systems. There one is given a log $w$ and a specification $\mathcal{A}$ (given by a timed word and a timed automaton in this paper), and one wishes to return the set of intervals for which the log $w$, when restricted to the interval, satisfies the specification $\mathcal{A}$. In our previous work we presented an efficient timed pattern matching algorithm: it adopts a skipping mechanism inspired by the classic Boyer-Moore (BM) string matching algorithm. In this work we tackle the problem of online timed pattern matching, towards embedded applications where it is vital to process a vast amount of incoming data in a timely manner. Specifically, we start with the Franek-Jennings-Smyth (FJS) string matching algorithm, a recent variant of the BM algorithm, and extend it to timed pattern matching. Our experiments indicate the efficiency of our FJS-type algorithm in online and offline timed pattern matching.

20 citations


Journal Article
TL;DR: A new translation from linear temporal logic to deterministic Emerson-Lei automata with a Muller acceptance condition symbolically expressed as a Boolean formula is introduced, which is an enhanced product construction that exploits knowledge of its components to reduce the number of states.
Abstract: We introduce a new translation from linear temporal logic (LTL) to deterministic Emerson-Lei automata, which are omega-automata with a Muller acceptance condition symbolically expressed as a Boolean formula. The richer acceptance condition structure allows the shift of complexity from the state space to the acceptance condition. Conceptually the construction is an enhanced product construction that exploits knowledge of its components to reduce the number of states. We identify two fragments of LTL, for which one can easily construct deterministic automata and show how knowledge of these components can reduce the number of states. We extend this idea to a general LTL framework, where we can use arbitrary LTL to deterministic automata translators for parts of formulas outside the mentioned fragments. Further, we show succinctness of the translation compared to existing constructions. The construction is implemented in the tool Delag, which we evaluate on several benchmarks of LTL formulas and probabilistic model checking case studies.

18 citations


Posted Content
TL;DR: A novel variant of two-way multihead automata is introduced, which reveals that the complexity of the match test is determined by a hidden combinatorial property of extended regular expressions, and it shows that a restriction of the corresponding parameter leads to rich classes with a polynomial time match test.
Abstract: In the present paper, we study the match test for extended regular expressions. We approach this NP-complete problem by introducing a novel variant of two-way multihead automata, which reveals that the complexity of the match test is determined by a hidden combinatorial property of extended regular expressions, and it shows that a restriction of the corresponding parameter leads to rich classes with a polynomial time match test. For presentational reasons, we use the concept of pattern languages in order to specify extended regular expressions. While this decision, formally, slightly narrows the scope of our results, an extension of our concepts and results to more general notions of extended regular expressions is straightforward.

17 citations


Posted Content
TL;DR: It is proved that every valid universal equality over pomset languages using these operations is a consequence of the equational theory of regular languages plus that of the commutative-regular languages.
Abstract: Pomsets constitute one of the most basic models of concurrency. A pomset is a generalisation of a word over an alphabet in that letters may be partially ordered. A term $t$ using the bi-Kleene operations $0,1, +, \cdot\, ,^*, \parallel, ^{(*)}$ defines a language $ \mathopen{[\![ } t \mathclose{]\!] } $ of pomsets in a natural way. We prove that every valid universal equality over pomset languages using these operations is a consequence of the equational theory of regular languages (in which parallel multiplication and iteration are undefined) plus that of the commutative-regular languages (in which sequential multiplication and iteration are undefined). We also show that the class of $\textit{rational}$ pomset languages (that is, those languages generated from singleton pomsets using the bi-Kleene operations) is closed under all Boolean operations. An $ \textit{ideal}$ of a pomset $p$ is a pomset using the letters of $p$, but having an ordering at least as strict as $p$. A bi-Kleene term $t$ thus defines the set $ \textbf{Id} (\mathopen{[\![ } t \mathclose{]\!] }) $ of ideals of pomsets in $ \mathopen{[\![ } t \mathclose{]\!] } $. We prove that if $t$ does not contain commutative iteration $^{(*)}$ (in our terminology, $t$ is bw-rational) then $\textbf{Id} (\mathopen{[\![ } t \mathclose{]\!] }) \cap \textbf{Pom}_{sp}$, where $ \textbf{Pom}_{sp}$ is the set of pomsets generated from singleton pomsets using sequential and parallel multiplication ($ \cdot$ and $ \parallel$), is defined by a bw-rational term, and if two such terms $t,t'$ define the same ideal language, then $t'=t$ is provable from the Kleene axioms for $0,1, +, \cdot\, ,^*$ plus the commutative idempotent semiring axioms for $0,1, +, \parallel$ plus the exchange law $ (u \parallel v)\cdot ( x \parallel y) \le v \cdot y \parallel u \cdot x $.

16 citations


Posted Content
TL;DR: It is shown that GFG automata enjoy the benefits of typeness, similarly to the case of deterministic automata, and the place of GFG automata in between deterministic and nondeterministic ones is studied further.
Abstract: In GFG automata, it is possible to resolve nondeterminism in a way that only depends on the past and still accepts all the words in the language. The motivation for GFG automata comes from their adequacy for games and synthesis, wherein general nondeterminism is inappropriate. We continue the ongoing effort of studying the power of nondeterminism in GFG automata. Initial indications have hinted that every GFG automaton embodies a deterministic one. Today we know that this is not the case, and in fact GFG automata may be exponentially more succinct than deterministic ones. We focus on the typeness question, namely the question of whether a GFG automaton with a certain acceptance condition has an equivalent GFG automaton with a weaker acceptance condition on the same structure. Beyond the theoretical interest in studying typeness, its existence implies efficient translations among different acceptance conditions. This practical issue is of special interest in the context of games, where the Büchi and co-Büchi conditions admit memoryless strategies for both players. Typeness is known to hold for deterministic automata and not to hold for general nondeterministic automata. We show that GFG automata enjoy the benefits of typeness, similarly to the case of deterministic automata. In particular, when Rabin or Streett GFG automata have equivalent Büchi or co-Büchi GFG automata, respectively, then such equivalent automata can be defined on a substructure of the original automata. Using our typeness results, we further study the place of GFG automata in between deterministic and nondeterministic ones. Specifically, considering automata complementation, we show that GFG automata lean toward nondeterministic ones, admitting an exponential state blow-up in the complementation of a Streett automaton into a Rabin automaton, as opposed to the constant blow-up in the deterministic case.

16 citations


Posted Content
TL;DR: A categorical version of Birkhoff's theorem for (finite) algebras is proved, establishing a one-to-one correspondence between (pseudo)varieties of T-algebras and (pseudo)equational T-theories; by duality this yields a correspondence with (pseudo)coequational B-theories, which is shown to be exactly the nature of Eilenberg-type correspondences.
Abstract: The purpose of the present paper is to show that: Eilenberg-type correspondences = Birkhoff's theorem for (finite) algebras + duality. We consider algebras for a monad T on a category D and we study (pseudo)varieties of T-algebras. Pseudovarieties of algebras are also known in the literature as varieties of finite algebras. Two well-known theorems that characterize varieties and pseudovarieties of algebras play an important role here: Birkhoff's theorem and Birkhoff's theorem for finite algebras, the latter also known as Reiterman's theorem. We prove, under mild assumptions, a categorical version of Birkhoff's theorem for (finite) algebras to establish a one-to-one correspondence between (pseudo)varieties of T-algebras and (pseudo)equational T-theories. Now, if C is a category that is dual to D and B is the comonad on C that is the dual of T, we get a one-to-one correspondence between (pseudo)equational T-theories and their dual, (pseudo)coequational B-theories. Particular instances of (pseudo)coequational B-theories have been already studied in language theory under the name of "varieties of languages" to establish Eilenberg-type correspondences. All in all, we get a one-to-one correspondence between (pseudo)varieties of T-algebras and (pseudo)coequational B-theories, which will be shown to be exactly the nature of Eilenberg-type correspondences.

15 citations


Posted Content
TL;DR: This work introduces a simple category-theoretic formalism that provides an appropriately abstract foundation for studying automata learning and establishes formal relations between algorithms for learning, testing, and minimization.
Abstract: Automata learning is a technique that has successfully been applied in verification, with the automaton type varying depending on the application domain. Adaptations of automata learning algorithms for increasingly complex types of automata have to be developed from scratch because there was no abstract theory offering guidelines. This makes it hard to devise such algorithms, and it obscures their correctness proofs. We introduce a simple category-theoretic formalism that provides an appropriately abstract foundation for studying automata learning. Furthermore, our framework establishes formal relations between algorithms for learning, testing, and minimization. We illustrate its generality with two examples: deterministic and weighted automata.

15 citations


Posted Content
TL;DR: A short proof of correctness is given for the quasi-polynomial time algorithm for parity games proposed by Calude, Jain, Khoussainov, Li and Stephan.
Abstract: Recently Cristian S. Calude, Sanjay Jain, Bakhadyr Khoussainov, Wei Li and Frank Stephan proposed a quasi-polynomial time algorithm for parity games. This paper proposes a short proof of correctness of their algorithm.

14 citations


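Posted Content
TL;DR: This work addresses the verification of safety properties of concurrent programs running over the Total Store Order (TSO) memory model by introducing an alternative, equivalent semantics that uses load buffers instead of store buffers, which simplifies the safety analysis and extends the decision procedure to the parametric case.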
Abstract: We address the problem of verifying safety properties of concurrent programs running over the Total Store Order (TSO) memory model. Known decision procedures for this model are based on complex encodings of store buffers as lossy channels. These procedures assume that the number of processes is fixed. However, it is important in general to prove the correctness of a system/algorithm in a parametric way with an arbitrarily large number of processes. In this paper, we introduce an alternative (yet equivalent) semantics to the classical one for the TSO semantics that is more amenable to efficient algorithmic verification and for the extension to parametric verification. For that, we adopt a dual view where load buffers are used instead of store buffers. The flow of information is now from the memory to load buffers. We show that this new semantics allows (1) to simplify drastically the safety analysis under TSO, (2) to obtain a spectacular gain in efficiency and scalability compared to existing procedures, and (3) to extend easily the decision procedure to the parametric case, which allows obtaining a new decidability result, and more importantly, a verification algorithm that is more general and more efficient in practice than the one for bounded instances.

Posted Content
TL;DR: This paper presents natural characterizations for the constant and logarithmic space classes, establishes tight relationships to the concept of language growth, and considers the decision problem whether a language given by a DFA/NFA admits a sliding window algorithm using logarithmic/constant space.
Abstract: In a recent paper we analyzed the space complexity of streaming algorithms whose goal is to decide membership of a sliding window to a fixed language. For the class of regular languages we proved a space trichotomy theorem: for every regular language the optimal space bound is either constant, logarithmic or linear. In this paper we continue this line of research: We present natural characterizations for the constant and logarithmic space classes and establish tight relationships to the concept of language growth. We also analyze the space complexity with respect to automata size and prove almost matching lower and upper bounds. Finally, we consider the decision problem whether a language given by a DFA/NFA admits a sliding window algorithm using logarithmic/constant space.

Posted Content
TL;DR: In this article, it was shown that any non-deterministic two-way transducer can be made reversible through a single exponential blow-up in the number of states.
Abstract: Deterministic two-way transducers define the robust class of regular functions which is, among other good properties, closed under composition. However, the best known algorithms for composing two-way transducers cause a double exponential blow-up in the size of the inputs. In this paper, we introduce a class of transducers for which the composition has polynomial complexity. It is the class of reversible transducers, for which the computation steps can be reversed deterministically. While in the one-way setting this class is not very expressive, we prove that any two-way transducer can be made reversible through a single exponential blow-up. As a consequence, we prove that the composition of two-way transducers can be done with a single exponential blow-up in the number of states. A uniformization of a relation is a function with the same domain and which is included in the original relation. Our main result actually states that we can uniformize any non-deterministic two-way transducer by a reversible transducer with a single exponential blow-up, improving the known result by de Souza which has a quadruple exponential complexity. As a side result, our construction also gives a quadratic transformation from copyless streaming string transducers to two-way transducers, improving the exponential previous bound.

Posted Content
TL;DR: A novel automata model over the alphabet of rational numbers is proposed. It extends the well-known register automata over infinite alphabets, allows both equality and ordering tests between values, and supports linear arithmetic between certain variables.
Abstract: We propose a novel automata model over the alphabet of rational numbers, which we call register automata over the rationals (RA-Q). It reads a sequence of rational numbers and outputs another rational number. RA-Q is an extension of the well-known register automata (RA) over infinite alphabets, which are finite automata equipped with a finite number of registers/variables for storing values. Like in the standard RA, the RA-Q model allows both equality and ordering tests between values. Moreover, it allows linear arithmetic to be performed between certain variables. The model is quite expressive: in addition to the standard RA, it also generalizes other well-known models such as affine programs and arithmetic circuits. The main feature of RA-Q is that despite the use of linear arithmetic, the so-called invariant problem (a generalization of the standard non-emptiness problem) is decidable. We also investigate other natural decision problems, namely, commutativity, equivalence, and reachability. For deterministic RA-Q, commutativity and equivalence are polynomial-time inter-reducible with the invariant problem.

Journal Article
TL;DR: A new model of sensing 5'->3' Watson-Crick automata is investigated that works without the sensing parameter, which is achieved by an appropriate change of the concept of configuration. Consequently, the accepted language classes of the variants are also changed.
Abstract: Watson-Crick (WK) finite automata work on a Watson-Crick tape, that is, on a DNA molecule. Therefore, they have two reading heads. While in traditional WK automata both heads read the whole input in the same physical direction, in 5'->3' WK automata the heads start from the two extremes and read the input in opposite directions. In sensing 5'->3' WK automata the process on the input is finished when the heads meet. Since the heads of a WK automaton may read longer strings in a transition, in previous models a so-called sensing parameter took care of the proper meeting of the heads (not allowing them to read the same positions of the input in the last step). In this paper, a new model is investigated, which works without the sensing parameter (it is done by an appropriate change of the concept of configuration). Consequently, the accepted language classes of the variants are also changed. Various hierarchy results are proven in the paper.

Journal Article
TL;DR: This work proves upper bounds on the descriptional complexity of union, intersection, complementation, and inverse homomorphism of semilinear sets.
Abstract: We investigate the descriptional complexity of operations on semilinear sets. Roughly speaking, a semilinear set is the finite union of linear sets, which are built by constant and period vectors. The interesting parameters of a semilinear set are: (i) the maximal value that appears in the vectors of periods and constants and (ii) the number of such sets of periods and constants necessary to describe the semilinear set under consideration. More precisely, we prove upper bounds on the union, intersection, complementation, and inverse homomorphism. In particular, our result on the complementation upper bound answers an open problem from [G. J. LAVADO, G. PIGHIZZINI, S. SEKI: Operational State Complexity of Parikh Equivalence, 2014].
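
The informal definition of a semilinear set in the abstract above can be illustrated with a small membership test. This is a minimal sketch assuming vectors over the natural numbers, where a linear set is written as a constant vector plus non-negative integer combinations of period vectors. The function names are hypothetical, and the search is a naive bounded exploration, not a construction from the paper.

```python
def in_linear_set(target, constant, periods):
    """Is target = constant + sum(k_i * periods[i]) for some naturals k_i?

    All vectors are tuples of non-negative integers, so any partial sum
    that exceeds the target in some component can be discarded.
    """
    target = tuple(target)
    seen = set()
    stack = [tuple(constant)]
    while stack:
        v = stack.pop()
        if v == target:
            return True
        if v in seen or any(a > b for a, b in zip(v, target)):
            continue
        seen.add(v)
        for p in periods:
            stack.append(tuple(a + b for a, b in zip(v, p)))
    return False

def in_semilinear_set(target, linear_sets):
    """A semilinear set is a finite union of linear sets (constant, periods)."""
    return any(in_linear_set(target, c, ps) for c, ps in linear_sets)

if __name__ == "__main__":
    # One linear set with constant (1, 0) and periods (2, 1), (0, 3).
    example = [((1, 0), [(2, 1), (0, 3)])]
    print(in_semilinear_set((5, 2), example))
    print(in_semilinear_set((5, 3), example))
```

In this representation, the two parameters named in the abstract correspond to the largest entry occurring in the constant and period vectors and to the number of `(constant, periods)` pairs in the list.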

Posted Content
TL;DR: The main result is that both majority and biased majority cellular automata on a two-dimensional torus exhibit a threshold behavior with two phase transitions.
Abstract: Consider a graph $G=(V,E)$ and a random initial vertex-coloring, where each vertex is blue independently with probability $p_{b}$, and red with probability $p_r=1-p_b$. In each step, all vertices change their current color synchronously to the most frequent color in their neighborhood and in case of a tie, a vertex conserves its current color; this model is called the majority model. If in case of a tie a vertex always chooses blue, it is called the biased majority model. We are interested in the behavior of these deterministic processes, especially in a two-dimensional torus (i.e., cellular automaton with (biased) majority rule). In the present paper, as a main result we prove that both majority and biased majority cellular automata exhibit a threshold behavior with two phase transitions. More precisely, it is shown that for a two-dimensional torus $T_{n,n}$, there are two thresholds $0\leq p_1, p_2\leq 1$ such that $p_b \ll p_1$, $p_1 \ll p_b \ll p_2$, and $p_2 \ll p_b$ result in monochromatic configuration by red, stable coexistence of both colors, and monochromatic configuration by blue, respectively, in $\mathcal{O}(n^2)$ steps.
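
The update rule described in the abstract above can be sketched as a single synchronous step on a torus. This is a minimal illustration in which the neighborhood of a cell is assumed to be its four nearest neighbors (up, down, left, right) with wrap-around. The abstract does not specify the neighborhood, so that choice, like the function names and the `"b"`/`"r"` encoding, is an assumption.

```python
import random

def random_coloring(n, p_b, rng):
    """n x n grid in which each cell is blue ('b') independently with probability p_b."""
    return [["b" if rng.random() < p_b else "r" for _ in range(n)] for _ in range(n)]

def step(grid, biased=False):
    """One synchronous (biased) majority update on a torus.

    Assumption: each cell looks at its 4 nearest neighbours with wrap-around.
    On a tie the cell keeps its colour, or turns blue in the biased model.
    """
    rows, cols = len(grid), len(grid[0])
    new = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            neighbours = [
                grid[(i - 1) % rows][j], grid[(i + 1) % rows][j],
                grid[i][(j - 1) % cols], grid[i][(j + 1) % cols],
            ]
            blue = neighbours.count("b")
            red = len(neighbours) - blue
            if blue > red:
                new[i][j] = "b"
            elif red > blue:
                new[i][j] = "r"
            else:
                new[i][j] = "b" if biased else grid[i][j]
    return new

if __name__ == "__main__":
    grid = random_coloring(8, 0.5, random.Random(0))
    for _ in range(5):
        grid = step(grid)
    print("\n".join("".join(row) for row in grid))
```

Vertical stripes of width one show the difference between the two models: every cell sees a 2-2 tie, so the majority model leaves the configuration unchanged while the biased model turns the whole torus blue in one step.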

Posted Content
TL;DR: Continuing the development of a general framework for automata learning based on category theory, this work develops a class of optimizations and an accompanying correctness proof for learning algorithms; the new algorithm is parametric on a monad, which provides a rich algebraic structure to capture non-determinism and other side-effects.
Abstract: Automata learning has been successfully applied in the verification of hardware and software. The size of the automaton model learned is a bottleneck for scalability and hence optimizations that enable learning of compact representations are important. In this paper, we continue the development of a general framework for automata learning based on category theory and develop a class of optimizations and an accompanying correctness proof for learning algorithms. The new algorithm is parametric on a monad, which provides a rich algebraic structure to capture non-determinism and other side-effects. These side-effects are used to learn more compact automaton models and the abstract categorical approach enables us to capture several possible optimizations under the same (p)roof.

Book Chapter
TL;DR: In this paper, it was shown that the axioms for CKA with bounded parallelism are complete for the semantics proposed in the original paper; consequently, these semantics are the free model for this fragment.
Abstract: Concurrent Kleene Algebra (CKA) was introduced by Hoare, Moeller, Struth and Wehrman in 2009 as a framework to reason about concurrent programs. We prove that the axioms for CKA with bounded parallelism are complete for the semantics proposed in the original paper; consequently, these semantics are the free model for this fragment. This result settles a conjecture of Hoare and collaborators. Moreover, the techniques developed along the way are reusable; in particular, they allow us to establish pomset automata as an operational model for CKA.

Posted Content
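TL;DR: In this article, the complexity of problems related to the synchronization of weakly acyclic automata is studied, including finding a synchronizing set of states of maximum size, and several problems related to recognizing a synchronizing subset of states are shown to be NP-complete.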
Abstract: We study the computational complexity of various problems related to synchronization of weakly acyclic automata, a subclass of widely studied aperiodic automata. We provide upper and lower bounds on the length of a shortest word synchronizing a weakly acyclic automaton or, more generally, a subset of its states, and show that the problem of approximating this length is hard. We investigate the complexity of finding a synchronizing set of states of maximum size. We also show inapproximability of the problem of computing the rank of a subset of states in a binary weakly acyclic automaton and prove that several problems related to recognizing a synchronizing subset of states in such automata are NP-complete.

Posted Content
TL;DR: This work introduces a new setting where a population of agents, each modelled by a finite-state system, are controlled uniformly: the controller applies the same action to every agent, and the whole system is encoded as a 2-player game.
Abstract: We introduce a new setting where a population of agents, each modelled by a finite-state system, are controlled uniformly: the controller applies the same action to every agent. The framework is largely inspired by the control of a biological system, namely a population of yeasts, where the controller may only change the environment common to all cells. We study a synchronisation problem for such populations: no matter how individual agents react to the actions of the controller, the controller aims at driving all agents synchronously to a target state. The agents are naturally represented by a non-deterministic finite state automaton (NFA), the same for every agent, and the whole system is encoded as a 2-player game. The first player (Controller) chooses actions, and the second player (Agents) resolves non-determinism for each agent. The game with m agents is called the m-population game. This gives rise to a parameterized control problem (where control refers to 2-player games), namely the population control problem: can Controller control the m-population game for all $m \in \mathbb{N}$ whatever Agents does? In this paper, we prove that the population control problem is decidable, and it is an EXPTIME-complete problem. As far as we know, this is one of the first results on parameterized control. Our algorithm, not based on cutoff techniques, produces winning strategies which are symbolic, that is, they do not need to count precisely how the population is spread between states. We also show that if there is no winning strategy, then there is a population size M such that Controller wins the m-population game if and only if $m \le M$. Surprisingly, M can be doubly exponential in the number of states of the NFA, with tight upper and lower bounds.

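Journal Article
TL;DR: In this article, the authors introduce inverse Lyndon factorizations and prove that any nonempty word w admits a canonical one, ICFL(w), which maintains the main properties of the Lyndon factorization of w: it can be computed in linear time, it is uniquely determined, and it preserves a compatibility property for sorting suffixes.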
Abstract: Motivated by applications to string processing, we introduce variants of the Lyndon factorization called inverse Lyndon factorizations. Their factors, named inverse Lyndon words, are in a class that strictly contains anti-Lyndon words, that is Lyndon words with respect to the inverse lexicographic order. The Lyndon factorization of a nonempty word w is unique but w may have several inverse Lyndon factorizations. We prove that any nonempty word w admits a canonical inverse Lyndon factorization, named ICFL(w), that maintains the main properties of the Lyndon factorization of w: it can be computed in linear time, it is uniquely determined, it preserves a compatibility property for sorting suffixes. In particular, the compatibility property of ICFL(w) is a consequence of another result: any factor in ICFL(w) is a concatenation of consecutive factors of the Lyndon factorization of w with respect to the inverse lexicographic order.
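
As background for the abstract above, the classical Lyndon factorization can be computed in linear time with Duval's algorithm, sketched below. The sketch takes the letter order as a parameter, so it also yields the Lyndon factorization with respect to the inverse lexicographic order, whose factors the abstract relates to those of ICFL(w). It does not compute ICFL(w) itself, and the function name and the `less` parameter are assumptions made for illustration.

```python
def lyndon_factorization(w, less=lambda a, b: a < b):
    """Duval's algorithm: factor w into a non-increasing sequence of Lyndon words.

    `less` is the strict order on letters; passing the reversed order gives
    the factorization with respect to the inverse lexicographic order.
    """
    factors = []
    i, n = 0, len(w)
    while i < n:
        j, k = i + 1, i
        # Extend while w[k] <= w[j] in the chosen order.
        while j < n and not less(w[j], w[k]):
            if less(w[k], w[j]):
                k = i          # strictly larger letter: still one Lyndon word
            else:
                k += 1         # equal letter: continue matching the period
            j += 1
        # Emit the repeated Lyndon word of length j - k.
        while i <= k:
            factors.append(w[i:i + j - k])
            i += j - k
    return factors

if __name__ == "__main__":
    print(lyndon_factorization("banana"))
    print(lyndon_factorization("banana", less=lambda a, b: a > b))
```

Concatenating the returned factors always reproduces the input word, which gives a quick sanity check for any letter order.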

Posted Content
TL;DR: This work investigates three formalisms to specify graph languages, i.e. sets of graphs, based on type graphs, and presents decidability results and closure properties for each of the formalisms.
Abstract: We investigate three formalisms to specify graph languages, i.e. sets of graphs, based on type graphs. First, we are interested in (pure) type graphs, where the corresponding language consists of all graphs that can be mapped homomorphically to a given type graph. In this context, we also study languages specified by restriction graphs and their relation to type graphs. Second, we extend this basic approach to a type graph logic and, third, to type graphs with annotations. We present decidability results and closure properties for each of the formalisms.

Journal Article
TL;DR: Conditions characterizing k-reversible languages, for each fixed k, and weakly irreversible languages (those that are k-reversible for some k) are obtained, yielding a procedure that, given a finite automaton, decides whether the accepted language is weakly or strongly (i.e., not weakly) irreversible.
Abstract: Finite automata whose computations can be reversed, at any point, by knowing the last k symbols read from the input, for a fixed k, are considered. These devices and their accepted languages are called k-reversible automata and k-reversible languages, respectively. The existence of k-reversible languages which are not (k-1)-reversible is known, for each k>1. This gives an infinite hierarchy of weakly irreversible languages, i.e., languages which are k-reversible for some k. Conditions characterizing the class of k-reversible languages, for each fixed k, and the class of weakly irreversible languages are obtained. From these conditions, a procedure that given a finite automaton decides if the accepted language is weakly or strongly (i.e., not weakly) irreversible is described. Furthermore, a construction which allows to transform any finite automaton which is not k-reversible, but which accepts a k-reversible language, into an equivalent k-reversible finite automaton, is presented.

Posted Content
Thomas Place, Marc Zeitoun
TL;DR: It is shown that for suitable logically defined classes, separation for the logic enriched with the successor relation reduces to separation for the original logic; the result also applies to a problem that is stronger than separation: covering.
Abstract: Given a class C of word languages, the C-separation problem asks for an algorithm that, given as input two regular languages, decides whether there exists a third language in C containing the first language, while being disjoint from the second. Separation is usually investigated as a means to obtain a deep understanding of the class C. In the paper, we are mainly interested in classes defined by logical formalisms. Such classes are often built on top of each other: given some logic, one builds a stronger one by adding new predicates to its signature. A natural construction is to enrich a logic with the successor relation. In this paper, we present a transfer result applying to this construction: we show that for suitable logically defined classes, separation for the logic enriched with the successor relation reduces to separation for the original logic. Our theorem also applies to a problem that is stronger than separation: covering. Moreover, we actually present two reductions: one for languages of finite words and the other for languages of infinite words.

Posted Content
TL;DR: It is proved that $m+n$ is a tight upper bound on the state complexity of the overlap assembly of unary languages, and that there are binary languages whose overlap assembly has exponential state complexity of at least $m(2^{n-1}-2)+2$.
Abstract: The state complexity of a regular language $L_m$ is the number $m$ of states in a minimal deterministic finite automaton (DFA) accepting $L_m$. The state complexity of a regularity-preserving binary operation on regular languages is defined as the maximal state complexity of the result of the operation where the two operands range over all languages of state complexities $\le m$ and $\le n$, respectively. We find a tight upper bound on the state complexity of the binary operation overlap assembly on regular languages. This operation was introduced by Csuhaj-Varjú, Petre, and Vaszil to model the process of self-assembly of two linear DNA strands into a longer DNA strand, provided that their ends "overlap". We prove that the state complexity of the overlap assembly of languages $L_m$ and $L_n$, where $m\ge 2$ and $n\ge1$, is at most $2 (m-1) 3^{n-1} + 2^n$. Moreover, for $m \ge 2$ and $n \ge 3$ there exist languages $L_m$ and $L_n$ over an alphabet of size $n$ whose overlap assembly meets the upper bound, and this bound cannot be met with smaller alphabets. Finally, we prove that $m+n$ is a tight upper bound on the state complexity of the overlap assembly of unary languages, and that there are binary languages whose overlap assembly has exponential state complexity of at least $m(2^{n-1}-2)+2$.
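
A minimal sketch of the operation on finite languages may help make the abstract concrete. It assumes the usual definition of overlap assembly, in which two words $uv$ and $vw$ with a non-empty overlap $v$ assemble into $uvw$; the function names `overlap_assembly`, `overlap_assembly_languages`, and `upper_bound` are illustrative and do not come from the paper. The last function only evaluates the bound $2 (m-1) 3^{n-1} + 2^n$ stated in the abstract.

```python
def overlap_assembly(x: str, y: str) -> set[str]:
    """Return all words u+v+w with x = u+v, y = v+w and v non-empty.

    Assumption: this is the standard string-level definition of
    overlap assembly; it is not code taken from the paper.
    """
    results = set()
    for k in range(1, min(len(x), len(y)) + 1):
        # v is the length-k suffix of x and must equal the length-k prefix of y
        if x[len(x) - k:] == y[:k]:
            results.add(x + y[k:])
    return results


def overlap_assembly_languages(lang1: set[str], lang2: set[str]) -> set[str]:
    """Lift the operation to finite languages by taking all pairs."""
    return {z for x in lang1 for y in lang2 for z in overlap_assembly(x, y)}


def upper_bound(m: int, n: int) -> int:
    """Evaluate the bound 2(m-1)3^(n-1) + 2^n, stated for m >= 2 and n >= 1."""
    if m < 2 or n < 1:
        raise ValueError("the bound is stated for m >= 2 and n >= 1")
    return 2 * (m - 1) * 3 ** (n - 1) + 2 ** n


if __name__ == "__main__":
    print(overlap_assembly("abc", "bcd"))
    print(overlap_assembly_languages({"aa"}, {"aa"}))
    print(upper_bound(3, 3))
```

The sketch works on explicit finite sets of words only. The state complexity results in the abstract concern DFAs for arbitrary regular languages, which this brute-force enumeration does not model.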

Posted Content
TL;DR: In this paper, a new bisimulation (pseudo)metric for weighted finite automata (WFA) is proposed, which generalizes Boreale's linear bisimulation relation.
Abstract: We develop a new bisimulation (pseudo)metric for weighted finite automata (WFA) that generalizes Boreale's linear bisimulation relation. Our metrics are induced by seminorms on the state space of WFA. Our development is based on spectral properties of sets of linear operators. In particular, the joint spectral radius of the transition matrices of WFA plays a central role. We also study continuity properties of the bisimulation pseudometric, establish an undecidability result for computing the metric, and give a preliminary account of applications to spectral learning of weighted automata.

Posted Content
TL;DR: This work provides an algorithm with double exponential space complexity that decides whether the transduction realized by a two-way transducer can be implemented by a one-way transducer, and applies the same technique to decide implementability by a sweeping transducer, with either a known or an unknown number of passes.
Abstract: Functional transductions realized by two-way transducers (equivalently, by streaming transducers and by MSO transductions) are the natural and standard notion of "regular" mappings from words to words. It was shown recently (LICS'13) that it is decidable if such a transduction can be implemented by some one-way transducer, but the given algorithm has non-elementary complexity. We provide an algorithm of different flavor solving the above question, that has double exponential space complexity. We further apply our technique to decide whether the transduction realized by a two-way transducer can be implemented by a sweeping transducer, with either known or unknown number of passes.

Posted Content
TL;DR: In this article, the result of Crochemore et al. on the minimality of the DFA built from the minimal forbidden factors of a single word is generalized to circular words, and a formal definition of the factor automaton of a circular word is obtained as a byproduct.
Abstract: Minimal forbidden factors are a useful tool for investigating properties of words and languages. Two factorial languages are distinct if and only if they have different (antifactorial) sets of minimal forbidden factors. There exist algorithms for computing the minimal forbidden factors of a word, as well as of a regular factorial language. Conversely, Crochemore et al. [IPL, 1998] gave an algorithm that, given the trie recognizing a finite antifactorial language $M$, computes a DFA recognizing the language whose set of minimal forbidden factors is $M$. In the same paper, they showed that the obtained DFA is minimal if the input trie recognizes the minimal forbidden factors of a single word. We generalize this result to the case of a circular word. We discuss several combinatorial properties of the minimal forbidden factors of a circular word. As a byproduct, we obtain a formal definition of the factor automaton of a circular word. Finally, we investigate the case of minimal forbidden factors of the circular Fibonacci words.
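
As an illustration of the central notion, the following brute-force sketch computes the minimal forbidden factors of an ordinary (linear) word. It assumes the standard definition: a word is a minimal forbidden factor of $x$ if it is not a factor of $x$ while all of its proper factors are. The function names are illustrative, and the sketch covers neither the circular case studied in the paper nor the trie-based algorithm of Crochemore et al.

```python
def factors(x: str) -> set[str]:
    """All factors (contiguous substrings) of x, including the empty word."""
    return {x[i:j] for i in range(len(x) + 1) for j in range(i, len(x) + 1)}


def minimal_forbidden_factors(x: str, alphabet: str) -> set[str]:
    """Brute-force minimal forbidden factors of the linear word x.

    A candidate w = p + c extends a factor p of x by one letter c, so its
    longest proper prefix is a factor by construction. It is minimal
    forbidden when w itself is not a factor but its longest proper
    suffix w[1:] is.
    """
    fac = factors(x)
    result = set()
    for p in fac:
        for c in alphabet:
            w = p + c
            if w not in fac and w[1:] in fac:
                result.add(w)
    return result


if __name__ == "__main__":
    print(sorted(minimal_forbidden_factors("aab", "ab")))
```

For the word `aab` over the alphabet `{a, b}` this yields `ba`, `bb`, and `aaa`: the words avoiding these three factors are exactly the factors of `aab`.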

Posted ContentDOI
TL;DR: In this article, the authors present efficient algorithms to reduce the size of nondeterministic Buchi word automata (NBA) and nondeterministic finite word automata (NFA) while retaining their languages, using criteria based on combinations of backward and forward trace inclusions and simulation relations.
Abstract: We present efficient algorithms to reduce the size of nondeterministic Buchi word automata (NBA) and nondeterministic finite word automata (NFA), while retaining their languages. Additionally, we describe methods to solve PSPACE-complete automata problems like language universality, equivalence, and inclusion for much larger instances than was previously possible ($\ge 1000$ states instead of 10-100). This can be used to scale up applications of automata in formal verification tools and decision procedures for logical theories. The algorithms are based on new techniques for removing transitions (pruning) and adding transitions (saturation), as well as extensions of classic quotienting of the state space. These techniques use criteria based on combinations of backward and forward trace inclusions and simulation relations. Since trace inclusion relations are themselves PSPACE-complete, we introduce lookahead simulations as good polynomial time computable approximations thereof. Extensive experiments show that the average-case time complexity of our algorithms scales slightly above quadratically. (The space complexity is worst-case quadratic.) The size reduction of the automata depends very much on the class of instances, but our algorithm consistently reduces the size far more than all previous techniques. We tested our algorithms on NBA derived from LTL-formulae, NBA derived from mutual exclusion protocols and many classes of random NBA and NFA, and compared their performance to the well-known automata tool GOAL.
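
To give a flavor of the simulation relations mentioned above, here is a minimal sketch of plain forward (direct) simulation on an NFA, computed as a greatest fixpoint. It is a textbook construction used as an assumption for illustration, not the lookahead simulation or the pruning and saturation techniques of the paper. The function name `forward_simulation` and the dictionary encoding of the transition relation are hypothetical choices.

```python
def forward_simulation(states, alphabet, delta, accepting):
    """Return the set of pairs (q, r) such that r forward-simulates q.

    delta maps (state, symbol) to a set of successor states; missing
    keys mean no transition. This is the plain direct simulation
    preorder, not the lookahead simulation from the paper.
    """
    # Start from all pairs that respect acceptance, then refine.
    sim = {
        (q, r)
        for q in states
        for r in states
        if q not in accepting or r in accepting
    }
    changed = True
    while changed:
        changed = False
        for q, r in list(sim):
            # Every move of q must be matched by some move of r
            # leading to a pair that is still in the relation.
            matched = all(
                any((q2, r2) in sim for r2 in delta.get((r, a), ()))
                for a in alphabet
                for q2 in delta.get((q, a), ())
            )
            if not matched:
                sim.discard((q, r))
                changed = True
    return sim


if __name__ == "__main__":
    example_delta = {(0, "a"): {1, 2}, (2, "a"): {2}}
    relation = forward_simulation({0, 1, 2}, "a", example_delta, {1, 2})
    print(sorted(relation))
```

In the small example, state 2 simulates state 1 but not the other way around, since state 1 has no outgoing transition to match the loop on state 2. States that simulate each other could be merged (quotienting) without changing the language of the NFA; the naive fixpoint loop shown here is polynomial but not tuned for large automata.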