scispace - formally typeset

Showing papers on "Equivalence class" published in 2019


Journal ArticleDOI
TL;DR: This paper introduces structured rough set approximations, which retain the composition of an approximation in terms of equivalence classes, for both complete and incomplete information tables.
Abstract: A major application of rough set theory is concept analysis for deciding if an object is an instance of a concept based on its description. Objects with the same description form an equivalence class and the family of equivalence classes is used to define rough set approximations. When deriving the decision rules from approximations, the description of an equivalence class is the left-hand-side of a decision rule. Therefore, it is useful to retain structural information of approximations, that is, the composition of an approximation in terms of equivalence classes. However, existing studies do not explicitly consider the structural information. To address this issue, we introduce structured rough set approximations in both complete and incomplete information tables, which serve as a basis for three-way decisions with rough sets. In a complete table, we define a family of conjunctively definable concepts. The structured three-way approximations are three structured positive, boundary and negative regions given by three sets of conjunctively definable concepts. By adopting a possible-world semantics, we introduce the notion of conjunctively definable interval concepts in an incomplete table, which is used to construct the structured three-way approximations. The internal structure of structured approximations contributes to sound semantics of rough set approximations and is directly and explicitly related to three-way decision rules.
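The idea of keeping an approximation as a family of equivalence classes, rather than a flat set of objects, can be sketched for a complete table as follows. This is a minimal illustration, not the paper's formal construction: the table, the attribute names and the target concept are hypothetical, and the incomplete-table (possible-world) case is not covered.

```python
from collections import defaultdict

def equivalence_classes(table, attributes):
    """Group object ids by their description on the given attributes."""
    classes = defaultdict(set)
    for obj, row in table.items():
        description = tuple(row[a] for a in attributes)
        classes[description].add(obj)
    return dict(classes)

def structured_regions(table, attributes, concept):
    """Return the positive, boundary and negative regions.

    Each region maps a description (the left-hand side of a decision
    rule) to its equivalence class, so the structure is retained.
    """
    pos, bnd, neg = {}, {}, {}
    for description, members in equivalence_classes(table, attributes).items():
        if members <= concept:      # class lies entirely inside the concept
            pos[description] = members
        elif members & concept:     # class overlaps the concept only partly
            bnd[description] = members
        else:                       # class is disjoint from the concept
            neg[description] = members
    return pos, bnd, neg

# Hypothetical complete information table and target concept.
table = {
    "o1": {"colour": "red", "size": "small"},
    "o2": {"colour": "red", "size": "small"},
    "o3": {"colour": "blue", "size": "small"},
    "o4": {"colour": "blue", "size": "small"},
    "o5": {"colour": "blue", "size": "large"},
}
concept = {"o1", "o2", "o3"}
pos, bnd, neg = structured_regions(table, ["colour", "size"], concept)
```

Each key of `pos`, `bnd` and `neg` is a conjunction of attribute values, so it can be read directly as the condition part of an acceptance, deferment or rejection rule.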

66 citations


Journal ArticleDOI
TL;DR: The Local Logical Disjunction Double-quantitative Rough Sets (LLDDRS) model is proposed based on the importance, completeness and complementary nature of relative and absolute quantitative information for describing an approximation space; it provides an effective tool for discovering knowledge and making decisions in relation to large data sets.

37 citations


Proceedings Article
01 Jan 2019
TL;DR: The FCI algorithm is extended to combine observational and interventional datasets, including new orientation rules particular to this setting, and a novel notion of interventional equivalence class of causal graphs with latent variables based on these invariances is introduced.
Abstract: The challenge of learning the causal structure underlying a certain phenomenon is undertaken by connecting the set of conditional independences (CIs) readable from the observational data, on the one side, with the set of corresponding constraints implied over the graphical structure, on the other, which are tied through a graphical criterion known as d-separation (Pearl, 1988). In this paper, we investigate the more general scenario where multiple observational and experimental distributions are available. We start with the simple observation that the invariances given by CIs/d-separation are just one special type of a broader set of constraints, which follow from the careful comparison of the different distributions available. Remarkably, these new constraints are intrinsically connected with do-calculus (Pearl, 1995) in the context of soft-interventions. We introduce a novel notion of interventional equivalence class of causal graphs with latent variables based on these invariances, which associates each graphical structure with a set of interventional distributions that respect the do-calculus rules. Given a collection of distributions, two causal graphs are called interventionally equivalent if they are associated with the same family of interventional distributions, where the elements of the family are indistinguishable using the invariances obtained from a direct application of the calculus rules. We introduce a graphical representation that can be used to determine if two causal graphs are interventionally equivalent. We provide a formal graphical characterization of this equivalence. Finally, we extend the FCI algorithm, which was originally designed to operate based on CIs, to combine observational and interventional datasets, including new orientation rules particular to this setting.

35 citations


Journal ArticleDOI
TL;DR: A new k-anonymous method that differs from traditional k-anonymous methods is proposed to solve the problem of privacy protection; the results show that the proposed method is more efficient and that the information loss of the anonymous dataset is much smaller.
Abstract: A new k-anonymous method which is different from traditional k-anonymous was proposed to solve the problem of privacy protection. Specifically, numerical data achieves k-anonymous by adding noises, and categorical data achieves k-anonymous by using randomization. Using the above two methods, the drawback that at least k elements must have the same quasi identifier in the k-anonymous data set has been solved. Since the process of finding anonymous equivalence is very time consuming, a two-step clustering method is used to divide the original data set into equivalence classes. First, the original data set is divided into several different sub-datasets, and then the equivalence classes are formed in the sub-datasets, thus greatly reducing the computational cost of finding anonymous equivalence classes. The experiments are conducted on three different data sets, and the results show that the proposed method is more efficient and the information loss of anonymous dataset is much smaller.

27 citations


Proceedings ArticleDOI
08 Jun 2019
TL;DR: The comprehensive evaluation of the tool SemCluster on benchmarks drawn from solutions to small programming assignments shows that it generates far fewer clusters, precisely identifies distinct solution strategies, and boosts the performance of clustering-based program repair, all within a reasonable amount of time.
Abstract: A fundamental challenge in automated reasoning about programming assignments at scale is clustering student submissions based on their underlying algorithms. State-of-the-art clustering techniques are sensitive to control structure variations, cannot cluster buggy solutions with similar correct solutions, and either require expensive pair-wise program analyses or training efforts. We propose a novel technique that can cluster small imperative programs based on their algorithmic essence: (A) how the input space is partitioned into equivalence classes and (B) how the problem is uniquely addressed within individual equivalence classes. We capture these algorithmic aspects as two quantitative semantic program features that are merged into a program's vector representation. Programs are then clustered using their vector representations. The computation of our first semantic feature leverages model counting to identify the number of inputs belonging to an input equivalence class. The computation of our second semantic feature abstracts the program's data flow by tracking the number of occurrences of a unique pair of consecutive values of a variable during its lifetime. The comprehensive evaluation of our tool SemCluster on benchmarks drawn from solutions to small programming assignments shows that SemCluster (1) generates far fewer clusters than other clustering techniques, (2) precisely identifies distinct solution strategies, and (3) boosts the performance of clustering-based program repair, all within a reasonable amount of time.

22 citations


Journal ArticleDOI
17 Jul 2019
TL;DR: This work considers counting and uniform sampling of DAGs that are Markov equivalent to a given DAG, and gives two algorithms that enable uniform sampling from the equivalence class at a computational cost linear in the graph size.
Abstract: Exploring directed acyclic graphs (DAGs) in a Markov equivalence class is pivotal to infer causal effects or to discover the causal DAG via appropriate interventional data. We consider counting and uniform sampling of DAGs that are Markov equivalent to a given DAG. These problems efficiently reduce to counting the moral acyclic orientations of a given undirected connected chordal graph on $n$ vertices, for which we give two algorithms. Our first algorithm requires $O(2^n n^4)$ arithmetic operations, improving a previous superexponential upper bound. The second requires $O(k!\, 2^k k^2 n)$ operations, where $k$ is the size of the largest clique in the graph; for bounded-degree graphs this bound is linear in $n$. After a single run, both algorithms enable uniform sampling from the equivalence class at a computational cost linear in the graph size. Empirical results indicate that our algorithms are superior to previously presented algorithms over a range of inputs; graphs with hundreds of vertices and thousands of edges are processed in a second on a desktop computer.
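To make the counted quantity concrete, the sketch below counts moral acyclic orientations by brute force: it tries every orientation of the undirected edges and keeps those that are acyclic and have no v-structure (two non-adjacent parents of a common child). This is an exponential-time illustration only, not either of the paper's algorithms, and the function names are hypothetical.

```python
from itertools import product

def is_acyclic(vertices, arcs):
    """Kahn's algorithm: True if the directed graph has no cycle."""
    indegree = {v: 0 for v in vertices}
    children = {v: [] for v in vertices}
    for u, v in arcs:
        indegree[v] += 1
        children[u].append(v)
    queue = [v for v in vertices if indegree[v] == 0]
    removed = 0
    while queue:
        u = queue.pop()
        removed += 1
        for v in children[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return removed == len(vertices)

def has_v_structure(vertices, arcs, adjacent):
    """True if some vertex has two parents that are not adjacent."""
    parents = {v: [] for v in vertices}
    for u, v in arcs:
        parents[v].append(u)
    for ps in parents.values():
        for i in range(len(ps)):
            for j in range(i + 1, len(ps)):
                if frozenset((ps[i], ps[j])) not in adjacent:
                    return True
    return False

def count_moral_acyclic_orientations(vertices, edges):
    """Brute-force count over all 2^|edges| orientations."""
    adjacent = {frozenset(e) for e in edges}
    count = 0
    for flips in product((False, True), repeat=len(edges)):
        arcs = [(v, u) if flip else (u, v) for (u, v), flip in zip(edges, flips)]
        if is_acyclic(vertices, arcs) and not has_v_structure(vertices, arcs, adjacent):
            count += 1
    return count

path = count_moral_acyclic_orientations("abc", [("a", "b"), ("b", "c")])
triangle = count_moral_acyclic_orientations("abc", [("a", "b"), ("b", "c"), ("a", "c")])
```

For the path a-b-c only the orientation a→b←c is excluded, and for the triangle only the two directed cycles are excluded, which is a useful sanity check on any faster implementation.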

17 citations


Journal ArticleDOI
TL;DR: Experimental results show that the approach to learning tractable Bayesian networks from data successfully bounds the inference complexity of the learned models, while it is competitive with other state-of-the-art methods in terms of fitting to data.

17 citations


Journal ArticleDOI
TL;DR: A complete model-based equivalence class testing strategy recently developed by the authors is experimentally evaluated, and a strategy extension based on randomised data selection from input equivalence classes is presented, which promises a higher test strength when applied against systems whose behaviour lies outside the fault domain.
Abstract: In this paper, a complete model-based equivalence class testing strategy recently developed by the authors is experimentally evaluated. This black-box strategy applies to deterministic systems with infinite input domains and finite internal state and output domains. It is complete with respect to a given fault model. This means that conforming behaviours will never be rejected, and all non-conforming behaviours inside a given fault domain will be uncovered. We investigate the question how this strategy performs for systems under test whose behaviours lie outside the fault domain. Furthermore, a strategy extension is presented, that is based on randomised data selection from input equivalence classes. While this extension is still complete with respect to the given fault domain, it also promises a higher test strength when applied against members outside this domain. This is confirmed by an experimental evaluation that compares mutation coverage achieved by the original and the extended strategy with the coverage obtained by random testing. For mutation generation, not only typical software errors, but also critical HW/SW integration errors are considered. The latter can be caused by mismatches between hardware and software design, even in the presence of totally correct software.
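The difference between a fixed representative per input equivalence class and randomised selection from each class can be sketched as below. The input classes are hypothetical integer ranges chosen for illustration; the paper derives its classes from a model of the system under test, which this sketch does not attempt.

```python
import random

# Hypothetical input equivalence classes for a single numeric input.
INPUT_CLASSES = {
    "low": range(0, 50),
    "normal": range(50, 120),
    "high": range(120, 300),
}

def fixed_representatives(classes):
    """Always test the same (first) value of every class."""
    return {name: values[0] for name, values in classes.items()}

def random_representatives(classes, rng):
    """Draw a fresh value from every class for each test run."""
    return {name: rng.choice(values) for name, values in classes.items()}

fixed = fixed_representatives(INPUT_CLASSES)
picked = random_representatives(INPUT_CLASSES, random.Random(0))
```

Both selections cover every class once; the randomised one varies the concrete value between runs, which is the property the abstract credits with higher test strength outside the fault domain.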

16 citations


Journal ArticleDOI
17 Jul 2019
TL;DR: In this paper, the authors propose a new technique for counting the number of DAGs in a Markov equivalence class, based on the clique tree representation of chordal graphs.
Abstract: A directed acyclic graph (DAG) is the most common graphical model for representing causal relationships among a set of variables. When restricted to using only observational data, the structure of the ground truth DAG is identifiable only up to Markov equivalence, based on conditional independence relations among the variables. Therefore, the number of DAGs equivalent to the ground truth DAG is an indicator of the causal complexity of the underlying structure–roughly speaking, it shows how many interventions or how much additional information is further needed to recover the underlying DAG. In this paper, we propose a new technique for counting the number of DAGs in a Markov equivalence class. Our approach is based on the clique tree representation of chordal graphs. We show that in the case of bounded degree graphs, the proposed algorithm is polynomial time. We further demonstrate that this technique can be utilized for uniform sampling from a Markov equivalence class, which provides a stochastic way to enumerate DAGs in the equivalence class and may be needed for finding the best DAG or for causal inference given the equivalence class as input. We also extend our counting and sampling method to the case where prior knowledge about the underlying DAG is available, and present applications of this extension in causal experiment design and estimating the causal effect of joint interventions.

14 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: This paper introduces EQREL, a specialized parallel union-find data structure for scalable equivalence relations, and its integration into a Datalog compiler, and shows that the new data structure scales on shared-memory multi-core architectures storing up to a half-billion pairs for a static program analysis scenario.
Abstract: Modern parallelizing Datalog compilers are employed in industrial applications such as networking and static program analysis. These applications regularly reason about equivalences, e.g., computing bitcoin user groups, fast points-to analyses, and optimal network routes. State-of-the-art Datalog engines represent equivalence relations verbatim by enumerating all possible pairs in an equivalence class. This approach inhibits scalability for large datasets. In this paper, we introduce EQREL, a specialized parallel union-find data structure for scalable equivalence relations, and its integration into a Datalog compiler. Our data structure provides a quadratic worst-case speed-up and space improvement. We demonstrate the efficacy of our data structure in SOUFFLE, which is a Datalog compiler that synthesizes parallel C++ code. We use real-world benchmarks and show that the new data structure scales on shared-memory multi-core architectures storing up to a half-billion pairs for a static program analysis scenario.
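The space argument can be illustrated with a plain sequential union-find. This is a textbook sketch, not EQREL itself (which is parallel and integrated into a compiler); the class and method names are assumptions made for the example.

```python
class UnionFind:
    """Sequential union-find storing one parent entry per element."""

    def __init__(self):
        self.parent = {}
        self.size = {}

    def find(self, x):
        """Return the representative of x, adding x if it is new."""
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1
            return x
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        """Merge the classes of a and b (union by size)."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]

    def same(self, a, b):
        return self.find(a) == self.find(b)

    def explicit_pair_count(self):
        """Number of pairs an explicit relation would have to store."""
        roots = {self.find(x) for x in list(self.parent)}
        return sum(self.size[r] ** 2 for r in roots)

uf = UnionFind()
uf.union("a", "b")
uf.union("b", "c")
uf.union("d", "e")
```

Here five stored parent entries stand for 3² + 2² = 13 pairs of the explicit relation; the gap grows quadratically with class size, which is the improvement the abstract refers to.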

11 citations


Journal ArticleDOI
TL;DR: Experimental results obtained via seven complex real-life test problems show that using semantics-based equivalence classes for symbolic regression problems in genetic programming is a promising idea and that filters are generally helpful for improving the systems' performance.
Abstract: In this paper, we introduce the concept of semantics-based equivalence classes for symbolic regression problems in genetic programming. The idea is implemented by means of two different genetic programming systems, in which two different definitions of equivalence are used. In both systems, whenever a solution in an equivalence class is found, it is possible to generate any other solution in that equivalence class analytically. As such, these two systems allow us to shift the objective of genetic programming: instead of finding a globally optimal solution, the objective is now to find any solution that belongs to the same equivalence class as a global optimum. Further, we propose improvements to these genetic programming systems in which, once a solution that belongs to a particular equivalence class is generated, no other solution in that class is accepted in the population during the evolution anymore. We call these improved versions filtered systems. Experimental results obtained via seven complex real-life test problems show that using equivalence classes is a promising idea and that filters are generally helpful for improving the systems' performance. Furthermore, the proposed methods produce individuals with a much smaller size with respect to geometric semantic genetic programming. Finally, we show that filters are also useful to improve the performance of a state-of-the-art method, not explicitly based on semantic equivalence classes, like linear scaling.

Posted Content
TL;DR: In this article, the authors show that the underlying geometry of Koopman eigenfunctions involves an extreme multiplicity, whereby infinitely many eigenfunctions correspond to each eigenvalue; this multiplicity is resolved by a quotient set of functions defined in terms of matched level sets.
Abstract: Representation of a dynamical system in terms of simplifying modes is a central premise of reduced order modelling and a primary concern of the increasingly popular DMD (dynamic mode decomposition) empirical interpretation of Koopman operator analysis of complex systems. In the spirit of optimal approximation and reduced order modelling, the goal of DMD methods and variants is to describe the dynamical evolution as a linear evolution in an appropriately transformed lower rank space, as best as possible. However, as far as we know, there has not been an in-depth study regarding the underlying geometry as related to an efficient representation. To this end, we show that to build a good dictionary, quite different from others' constructions, we need only construct optimal initial data functions on a transverse co-dimension one set. The eigenfunctions on a subdomain then follow from the method of characteristics. The underlying geometry of Koopman eigenfunctions involves an extreme multiplicity whereby infinitely many eigenfunctions correspond to each eigenvalue, which we resolve by our new concept of a quotient set of functions, in terms of matched level sets. We call this equivalence class of functions a "primary eigenfunction" to further help us resolve the relationship between the large number of eigenfunctions in a perhaps otherwise low dimensional phase space. This construction allows us to understand the geometric relationships between the numerous eigenfunctions in a useful way. We also discuss how the underlying spectral decomposition into the point spectrum and continuous spectrum fundamentally relates to the domain of the eigenfunctions.

Journal ArticleDOI
TL;DR: An extension of FDR, the model checker for the process algebra CSP, that exploits symmetry to reduce the size of the state space searched and proves a powerful syntactic result, identifying conditions under which a process will be symmetric in a particular type.
Abstract: We present an extension of FDR, the model checker for the process algebra CSP, that exploits symmetry to reduce the size of the state space searched. We define what it means for a process to be symmetric with respect to a group of permutations on the transition labels. We factor the state space of the search by symmetry equivalence, mapping each state to a representative of its equivalence class, thereby considering all symmetric states together. We prove a powerful syntactic result, identifying conditions under which a process will be symmetric in a particular type. We show how to implement such a search using the powerful technique of supercombinators used in the implementation of FDR: we identify conditions on a supercombinator for it to be symmetric and explain how to apply a permutation to a state. Finally, we present a novel efficient technique for calculating representatives of equivalence classes, which normally finds unique representatives; our experiments suggest that this technique typically works faster than other techniques and in particular scales better.

Journal ArticleDOI
TL;DR: The proposed method labels crime reports in an incremental way using existing labelled clusters with the help of rough set theory and is validated by various indices and compared with many state-of-the-art clustering and classification algorithms.

Journal ArticleDOI
TL;DR: In this paper, the electroencephalogram (EEG)-based N400 component is often described as an index of a semantic relation and has been used as an electrophysiological measure of equivalence class formation.
Abstract: The electroencephalogram (EEG)-based N400 component is often described as an index of a semantic relation. Recent studies suggested that the N400 component can also be used as an electrophysiological measure of equivalence class formation, yet more research is needed to clarify the effects of experimental conditions on the N400 response. In Experiment 1 of the present study, the participants were trained on six conditional discriminations and tested for the formation of three 3-member classes. If they formed equivalence classes with half of the possible emerged relations, the participants were given a priming test with the other half. Related and unrelated stimulus pairs were presented, and the participant had to decide whether the stimuli were related or not. The results showed that a nonsignificant N400 response was observed after unrelated stimulus pairs were presented but not when related stimulus pairs were presented. The strength of the N400 response weakened over the number of stimuli presentations. In Experiment 2, we examined whether changes to the methodology of Experiment 1 would produce stronger N400 responses. The participants also underwent a word priming procedure, which has been shown to produce robust N400 responses. We found more robust N400 responses in Experiment 2 than in Experiment 1, and the N400 response was larger in transitivity/equivalence relations than in symmetry relations. There was also a significant relation effect in the word priming procedure. Together, these findings support the notion that the N400 component can be used as an electrophysiological measure of equivalence class formation and illustrate how experimental conditions can influence the N400 response.

Journal ArticleDOI
TL;DR: The authors evaluated the combined effects of nodal number and relational type on the relatedness of stimuli within equivalence classes and found that, consistent with prior research, 1-node transitive relations were always preferred to 1-node equivalence relations.
Abstract: In equivalence classes, stimulus relatedness is an inverse function of the nodal number for the same type of derived relation. Also, transitive relations are preferred to equivalence relations when the nodal number is held constant. The current study evaluated the combined effects of nodal number and relational type on relatedness of stimuli within equivalence classes. After eight college students formed two 7-node, 9-member equivalence classes, participants were presented with trials during within-class relational preference tests that pitted 1-node equivalence relations against 1- to 5-node transitive relations. Consistent with prior research, 1-node transitive relations were always preferred to 1-node equivalence relations. For the six participants who formed classes rapidly, preference for the 1-node equivalence relation increased as a direct function of increases in the number of nodes in the competing transitive relation. In addition, the 1-node equivalence relation was equally preferred to an approximately 2-node transitive relation. Because equivalence classes remained intact after preference testing, performances documented the coexistence of equal and differential relatedness of class members. Two participants formed the classes on a delayed basis and produced inverted U-shaped preference functions instead of monotonic preference functions. Because the preference functions differed in terms of speed of emergence, the same nominal equivalence classes were not functionally equivalent to each other with regard to stimulus relatedness.

Book ChapterDOI
TL;DR: This review presents two algorithms based on an algorithm of Bodlaender and Kloks that computes an optimal tree decomposition given a non-optimal tree decomposition of bounded width and further simplifies Perkovic and Reed's simplification.
Abstract: In 1996, Bodlaender showed the celebrated result that an optimal tree decomposition of a graph of bounded treewidth can be found in linear time. The algorithm is based on an algorithm of Bodlaender and Kloks that computes an optimal tree decomposition given a non-optimal tree decomposition of bounded width. Both algorithms, in particular the second, are hardly accessible. We present the second algorithm in a much simpler way in this paper and refer to an extended version for the first. In our description of the second algorithm, we start by explaining how all tree decompositions of subtrees defined by the nodes of the given tree decomposition can be enumerated. We group tree decompositions into equivalence classes depending on the current node of the given tree decomposition, such that it suffices to enumerate one tree decomposition per equivalence class and, for each node of the given tree decomposition, there are only a constant number of classes which can be represented in constant space.

Book ChapterDOI
08 Dec 2019
TL;DR: Structure-preserving signatures on equivalence classes (SPS-EQ) introduced at ASIACRYPT 2014 are a variant of SPS where a message is considered as a projective equivalence class, and a new representative of the same class can be obtained by multiplying a vector by a scalar.
Abstract: Structure-preserving signatures on equivalence classes (SPS-EQ) introduced at ASIACRYPT 2014 are a variant of SPS where a message is considered as a projective equivalence class, and a new representative of the same class can be obtained by multiplying a vector by a scalar. Given a message and corresponding signature, anyone can produce an updated and randomized signature on an arbitrary representative from the same equivalence class. SPS-EQ have proven to be a very versatile building block for many cryptographic applications.
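The notion of a projective equivalence class can be illustrated with plain vectors over the integers modulo a prime. This is a toy model only: in SPS-EQ the message components are group elements rather than exposed integers, no signing is shown, and the modulus `P` and all function names are assumptions made for the example.

```python
P = 101  # small prime modulus, chosen only for illustration

def change_representative(vector, mu, p=P):
    """Multiply every component by a nonzero scalar mu (mod p)."""
    if mu % p == 0:
        raise ValueError("the scalar must be nonzero modulo p")
    return tuple((mu * x) % p for x in vector)

def canonical_representative(vector, p=P):
    """Scale so that the first nonzero component becomes 1."""
    for x in vector:
        if x % p != 0:
            inverse = pow(x, p - 2, p)  # Fermat inverse, valid because p is prime
            return tuple((inverse * y) % p for y in vector)
    raise ValueError("the zero vector has no projective class")

def same_class(u, v, p=P):
    """Two vectors are equivalent iff one is a scalar multiple of the other."""
    return len(u) == len(v) and canonical_representative(u, p) == canonical_representative(v, p)

message = (3, 5, 7)
rerandomized = change_representative(message, 10)
```

In this toy setting class membership is easy to test because the components are visible integers; that is a property of the illustration, not of the cryptographic scheme.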

Journal ArticleDOI
TL;DR: An alternative/complementary method is proposed that reduces the search space by defining large equivalence classes of topologically identical matrices through row and column permutations using additive, structural, and multiplicative transformations.
Abstract: The construction of a quasi-cyclic low density parity-check (QC-LDPC) matrix is usually carried out in two steps. In the first step, a prototype matrix is defined according to certain criteria (size, girth, check and variable node degrees, and so on). The second step involves the expansion of the prototype matrix. During this last phase, an integer value is assigned to each non-null position in the prototype matrix corresponding to the right-rotation of the identity matrix. The problem of determining these integer values is complex. The state-of-the-art solutions use either some mathematical constructions to guarantee a given girth of the final QC-LDPC code, or a random search of values until the target girth is satisfied. In this paper, we propose an alternative/complementary method that reduces the search space by defining large equivalence classes of topologically identical matrices through row and column permutations using additive, structural, and multiplicative transformations. Selecting only a single element per equivalence class can reduce the search space by a few orders of magnitude. Then, we use the formalism of constraint programming to list the exhaustive sets of solutions for a given girth and a given expansion factor. An example is presented in all sections of the paper to illustrate the methodology.

Posted Content
TL;DR: In this article, a complete list of cobounded actions of solvable Baumslag-Solitar groups on hyperbolic metric spaces up to a natural equivalence relation is given.
Abstract: We give a complete list of the cobounded actions of solvable Baumslag-Solitar groups on hyperbolic metric spaces up to a natural equivalence relation. The set of equivalence classes carries a natural partial order first introduced by Abbott-Balasubramanya-Osin, and we describe the resulting poset completely. There are finitely many equivalence classes of actions, and each equivalence class contains the action on a point, a tree, or the hyperbolic plane.

Proceedings ArticleDOI
Naixuan Guo, Ming Yang, Qiyuan Gong, Zhouguo Chen, Junzhou Luo
06 May 2019
TL;DR: This paper proposes a novel clustering-based anonymization algorithm, which tries to cluster records without separating any natural equivalent class, and proves that the natural equivalent class can effectively reduce the computational complexity of clustering algorithms as well as information loss.
Abstract: Data anonymization is widely used to preserve the utility of published datasets without compromising privacy. The state-of-the-art data anonymization approaches are mainly single-record-based algorithms. They group similar records together one by one, then form equivalence classes through generalization. However, these algorithms didn't utilize equivalence classes which exist in the raw dataset. In this paper, we propose a new concept named natural equivalent class. It refers to the record set with the same quasi-identifier values naturally existing in the raw dataset. We theoretically prove that the natural equivalent class can effectively reduce the computational complexity of clustering algorithms as well as information loss. Then, we propose a novel clustering-based anonymization algorithm, which tries to cluster records without separating any natural equivalent class. Extensive experiments on real world datasets show that our approach outperforms the previous clustering-based anonymization algorithms in terms of efficiency and data utility.
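A natural equivalent class, as defined in this abstract, is simply the set of records that already share the same quasi-identifier values. The sketch below only performs that grouping; the records, the quasi-identifier names and the value of `k` are hypothetical, and the paper's clustering algorithm is not reproduced.

```python
from collections import defaultdict

def natural_equivalent_classes(records, quasi_identifiers):
    """Group records that share identical quasi-identifier values."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        groups[key].append(record)
    return dict(groups)

# Hypothetical raw dataset: age and zip are quasi-identifiers.
records = [
    {"age": 34, "zip": "13053", "disease": "flu"},
    {"age": 34, "zip": "13053", "disease": "cold"},
    {"age": 34, "zip": "13053", "disease": "asthma"},
    {"age": 51, "zip": "14850", "disease": "flu"},
]
classes = natural_equivalent_classes(records, ["age", "zip"])

k = 3  # hypothetical anonymity parameter
already_k_anonymous = [key for key, members in classes.items() if len(members) >= k]
```

Treating each group as a single unit means a clustering step works on as many items as there are distinct quasi-identifier combinations instead of one item per record.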

Journal ArticleDOI
TL;DR: This paper proposes HcRPC, a Highly compact Reachability Preserving Graph compression algorithm with Corrections, which preserves the reachability relations between the nodes of the original graph.
Abstract: Graphs are used in numerous applications to model real-world systems and phenomena. The ever increasing size of graphs makes them difficult to query and analyze. In this paper, we propose HcRPC, a Highly compact Reachability Preserving Graph compression algorithm with Corrections, which is capable of preserving the reachability relations between the nodes in the original graph. The highly compressed representation of a given graph consists of a compressed graph and a set of corrections. The original graph is compressed on the basis of equivalence classes obtained via the reachability relations between nodes in the original graph. In the compressed graph, each node corresponds to a set of nodes from the original graph with similar ancestors and descendants, and each edge represents linkage between the original nodes in any two node sets. The corrections portion specifies the set of corrections, including equivalent class-node corrections and node-node corrections. The MinHash technique is utilized to speed up checking whether equivalence classes are structure-similar, and pairs of equivalence classes with high similarity are then merged to obtain a highly compressed graph. In addition, we develop an algorithm for maintaining the compressed graph and its set of corrections in response to changes to the original graph. We evaluate our algorithms on real-life graph data sets and the results indicate that graph data sets can be highly compressed while preserving the reachability relations between nodes.
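The starting point of such a compression, grouping nodes whose ancestors and descendants coincide exactly, can be sketched as follows. This is an assumption-laden illustration: it handles only the exact case on a small hypothetical DAG, and it omits the MinHash-based merging of similar classes and the corrections that the abstract describes.

```python
from collections import defaultdict

def reachable_from(graph, start):
    """Nodes reachable from start by a path of length at least 1."""
    seen = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen

def reachability_classes(graph):
    """Group nodes that have identical ancestor and descendant sets."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    reverse = defaultdict(list)
    for u, targets in graph.items():
        for v in targets:
            reverse[v].append(u)
    classes = defaultdict(set)
    for node in nodes:
        ancestors = frozenset(reachable_from(reverse, node))
        descendants = frozenset(reachable_from(graph, node))
        classes[(ancestors, descendants)].add(node)
    return sorted(sorted(members) for members in classes.values())

# Hypothetical DAG: b and c have the same ancestors {a} and descendants {d}.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
classes = reachability_classes(graph)
```

Replacing each class by one node preserves every reachability answer between classes, since members of a class are indistinguishable by what reaches them and what they reach.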

Journal ArticleDOI
TL;DR: The chaotic behavior of 1-dimensional non-uniform cellular automata (CAs) is studied and a parametrization technique is developed based on the transitivity, communication class and blocking words.

Journal ArticleDOI
TL;DR: This work considers the unextendible product bases (UPBs) of fixed cardinality m in quantum systems of n qubits, and uses this partial order to study the topological closure of an equivalence class of UPBs.
Abstract: We consider the unextendible product bases (UPBs) of fixed cardinality $m$ in quantum systems of $n$ qubits. These UPBs are divided into finitely many equivalence classes with respect to an equivalence relation introduced by N. Johnston. There is a natural partial order "$\le$" on the set of these equivalence classes for fixed $m$, and we use this partial order to study the topological closure of an equivalence class of UPBs. In the case of four qubits, for $m=8,9,10$, we construct explicitly the Hasse diagram of this partial order.

Proceedings ArticleDOI
01 Aug 2019
TL;DR: A novel hybrid algorithm that maps a Boolean function to a representative function for the equivalence class containing the original function is presented and can be used to determine a sequence of translations that maps one function to an equivalent function.
Abstract: The equivalence of Boolean functions with respect to five invariance (aka translation) operations has been well considered with respect to the Rademacher-Walsh spectral domain. In this paper, we introduce a hybrid approach that uses both the Reed-Muller and the Rademacher-Walsh spectra. A novel hybrid algorithm that maps a Boolean function to a representative function for the equivalence class containing the original function is presented. The algorithm can be used to determine a sequence of translations that maps one function to an equivalent function. We present experimental results that show the hybrid algorithm can determine the equivalence classes for 5 variables much more efficiently than before. We also show that for 6 variables where there are 150,357 equivalence classes, 8 are very difficult, a further 58 are difficult and the remainder are straightforward in terms of the CPU time required by the hybrid algorithm.

Proceedings ArticleDOI
01 Dec 2019
TL;DR: An efficient canonical labelling algorithm (via partial Latin squares, PLSs) is designed which does not require graph conversion, facilitates compression, and produces labels that are more humanly meaningful.
Abstract: Latin squares are combinatorial matrices that are widely used in diverse areas of research such as codes and cryptography, software testing, mathematical research, and experimental designs. All of these fields would benefit from a search engine for Latin squares. One major obstacle to developing a Latin-square search engine is that any Latin square has a large number of equivalent Latin squares, which are contained in multiple equivalence classes, and thus we need an efficient online method for canonically labelling Latin squares. Canonical labelling usually proceeds via the Nauty graph isomorphism software, but this incurs conversion costs. Moreover, the canonical labels are practically random members of their equivalence classes. A second obstacle is how large amounts of searchable Latin-square data may be stored efficiently. In this paper, we design data structures and algorithms suitable for a Latin-square search engine. We use a tree-based data structure for storing large numbers of Latin squares that also enables efficient search capabilities. We design an efficient canonical labelling algorithm (via partial Latin squares, PLSs) which does not require graph conversion, facilitates compression, and produces labels that are more humanly meaningful. We implement and experiment with a skeletal prototype of the Latin-square search engine. Experimental results confirm that the PLS method is faster than Nauty, and has reduced space requirements.

Book ChapterDOI
01 Dec 2019
TL;DR: Theoretical analysis and the experimental results demonstrate that the improved L-diversity algorithm can not only improve the privacy protection degree of sensitive data, but also effectively reduce the information loss.
Abstract: Massive amounts of data are collected and released every day, and the published data contains a lot of sensitive information about individuals. K-anonymity privacy-preserving mechanisms can prevent the disclosure of individual private information when data is published. L-diversity further considers the distribution of sensitive attributes in equivalence classes to avoid homogeneity attacks. In this paper, we propose an improved L-diversity algorithm based on clustering, which takes the L-diversity requirement on sensitive attributes into account while clustering to achieve K-anonymity. We minimize the total information loss of each equivalence class by choosing records that have minimal information loss, regardless of whether they have different sensitive attributes, until the number of distinct values of the sensitive attribute in the equivalence class reaches L. We conduct experiments on the UCI Adult data set and compare the algorithm with the traditional (K,L)-member algorithm. Theoretical analysis and the experimental results demonstrate that the improved L-diversity algorithm can not only improve the privacy protection degree of sensitive data, but also effectively reduce the information loss.
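
The two privacy notions in this abstract can be made concrete with a short Python sketch. It only measures K and L for an already generalized table, by grouping records into equivalence classes on their quasi-identifiers. It does not implement the clustering algorithm proposed in the paper, and the function name `anonymity_levels` and the sample records are hypothetical.

```python
from collections import defaultdict


def anonymity_levels(records, quasi_identifiers, sensitive):
    """Return (k, l) for a table given as a list of dicts.

    k is the size of the smallest equivalence class (records sharing the
    same quasi-identifier values); l is the smallest number of distinct
    sensitive values found in any equivalence class.
    """
    classes = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        classes[key].append(rec[sensitive])

    k = min(len(values) for values in classes.values())
    l = min(len(set(values)) for values in classes.values())
    return k, l


if __name__ == "__main__":
    # Hypothetical generalized records, for illustration only.
    table = [
        {"age": "30-39", "zip": "130**", "disease": "flu"},
        {"age": "30-39", "zip": "130**", "disease": "cancer"},
        {"age": "40-49", "zip": "148**", "disease": "flu"},
        {"age": "40-49", "zip": "148**", "disease": "flu"},
    ]
    print(anonymity_levels(table, ["age", "zip"], "disease"))
```

The sample table is 2-anonymous but only 1-diverse: every record in the second equivalence class has the same sensitive value, which is the homogeneity attack that L-diversity is meant to prevent.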

Journal ArticleDOI
TL;DR: In this paper, a universal action of a countable locally finite group (the Hall group) on a separable metric space by isometries is constructed, which contains all actions of all locally finite groups as subactions.
Abstract: We construct a universal action of a countable locally finite group (the Hall group) on a separable metric space by isometries. This single action contains all actions of all countable locally finite groups on all separable metric spaces as subactions. The main ingredient is the amalgamation of actions by isometries. We show that an equivalence class of this universal action is generic. We show that the restriction to locally finite groups in our results is necessary as analogous results do not hold for infinite non-locally finite groups. We discuss the problem also for actions by linear isometries on Banach spaces.

01 Jan 2019
TL;DR: In this article, it was shown that the independence equivalence class of every odd path has size 1, while the class can contain arbitrarily many graphs for even paths.
Abstract: The independence polynomial of a graph is the generating polynomial for the number of independent sets of each size. Two graphs are said to be independence equivalent if they have equivalent independence polynomials. We extend previous work by showing that the independence equivalence class of every odd path has size 1, while the class can contain arbitrarily many graphs for even paths. We also prove that the independence equivalence class of every even cycle consists of two graphs when $n\ge 2$, except for the independence equivalence class of $C_6$, which consists of three graphs. The odd case remains open, although, using irreducibility results from algebra, we were able to show that for a prime $p \geq 5$ and $n\ge 1$, the independence equivalence class of $C_{p^n}$ consists of only two graphs.
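
To make the definition of independence equivalence concrete, here is a minimal brute-force Python sketch that computes the independence polynomial of a small graph as a list of coefficients. It is an illustration only, takes exponential time, and is not the method used in the paper. The function name `independence_polynomial` and the choice of example graphs are assumptions made for this sketch.

```python
from itertools import combinations


def independence_polynomial(n, edges):
    """Coefficients [i_0, i_1, ...] where i_k counts independent sets of size k.

    Vertices are 0..n-1. Brute force over all subsets, so only suitable
    for very small graphs.
    """
    edge_set = {frozenset(e) for e in edges}
    coeffs = [0] * (n + 1)
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            # A subset is independent if no pair inside it is an edge.
            if all(frozenset(pair) not in edge_set
                   for pair in combinations(subset, 2)):
                coeffs[size] += 1
    # Drop trailing zeros so equal polynomials compare equal as lists.
    while len(coeffs) > 1 and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs


if __name__ == "__main__":
    cycle_c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
    # Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
    triangle_with_pendant = [(0, 1), (1, 2), (0, 2), (2, 3)]
    print(independence_polynomial(4, cycle_c4))
    print(independence_polynomial(4, triangle_with_pendant))
```

Both graphs yield the coefficients `[1, 4, 2]`, that is, the polynomial $1 + 4x + 2x^2$, so the 4-cycle and the triangle with a pendant vertex are independence equivalent even though they are not isomorphic.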

Book ChapterDOI
22 Jul 2019
TL;DR: A characterization, using rational cones, is given of the linear sets of vectors with a finite number of positive Myhill-Nerode classes; a crucial role is played by lattices, special semi-linear sets defined as a natural way to extend “the pattern” of a linear set to the whole set of vectors of natural numbers in a given dimension.
Abstract: Right one-way jumping automata (ROWJFAs) are an automaton model that was recently introduced for processing the input in a discontinuous way. In [S. Beier, M. Holzer: Properties of right one-way jumping finite automata. In Proc. 20th DCFS, number 10952 in LNCS, 2018] it was shown that the permutation closed languages accepted by ROWJFAs are exactly those with a finite number of positive Myhill-Nerode classes. Here a Myhill-Nerode equivalence class \([w]_L\) of a language L is said to be positive if w belongs to L. Obviously, this notion of positive Myhill-Nerode classes generalizes to sets of vectors of natural numbers. We give a characterization of the linear sets of vectors with a finite number of positive Myhill-Nerode classes, which uses rational cones. Furthermore, we investigate when a set of vectors can be decomposed as a finite union of sets of vectors with a finite number of positive Myhill-Nerode classes. A crucial role is played by lattices, which are special semi-linear sets that are defined as a natural way to extend “the pattern” of a linear set to the whole set of vectors of natural numbers in a given dimension. We show connections of lattices to the Myhill-Nerode relation and to rational cones. Some of these results will be used to give characterization results about ROWJFAs with multiple initial states. For binary alphabets we show connections of these and related automata to counter automata.