
Showing papers on "Logical matrix" published in 2020


Journal ArticleDOI
TL;DR: A Boolean-network encryption algorithm with a synchronous update process is proposed; a matrix semi-tensor product technique generates the encrypted image in a second round of diffusion, and the scheme shows good security characteristics.

295 citations


Journal ArticleDOI
TL;DR: This work proposes a novel Semantic Guided Hashing method coupled with binary matrix factorization to perform more effective nearest neighbor image search by simultaneously exploring the weakly-supervised rich community-contributed information and the underlying data structures.
Abstract: Hashing has been widely investigated for large-scale image retrieval due to its search effectiveness and computation efficiency. In this work, we propose a novel Semantic Guided Hashing method coupled with binary matrix factorization to perform more effective nearest neighbor image search by simultaneously exploring the weakly-supervised rich community-contributed information and the underlying data structures. To uncover the underlying semantic information from the weakly-supervised user-provided tags, the binary matrix factorization model is leveraged for learning the binary features of images while the problem of imperfect tags is well addressed. The uncovered semantic information effectively guides the discrete hash code learning. The underlying data structures are discovered by adaptively learning a discriminative data graph, which makes the learned hash codes preserve the meaningful neighbors. To the best of our knowledge, the proposed method is the first work that incorporates the hash code learning, the semantic information mining and the data structure discovering into one unified framework. Besides, the proposed method is extended to a deep approach for the optimal compatibility of discriminative feature learning and hash code learning. Experiments are conducted on two widely-used social image datasets and the proposed method achieves encouraging performance compared with the state-of-the-art hashing methods.

129 citations


Journal ArticleDOI
TL;DR: This paper proposes a joint model to simultaneously compute the optimal real matrix and the binary cluster indicator matrix for spectral clustering, using an orthonormal scaled indicator matrix, and demonstrates the effectiveness of the proposed method on benchmark datasets.
Abstract: Spectral clustering is an important clustering method widely used for pattern recognition and image segmentation. Classical spectral clustering algorithms consist of two separate stages: 1) solving a relaxed continuous optimization problem to obtain a real matrix followed by 2) applying K-means or spectral rotation to round the real matrix (i.e., continuous clustering result) into a binary matrix called the cluster indicator matrix. Such a separate scheme is not guaranteed to achieve a jointly optimal result because of the loss of useful information. To obtain a better clustering result, in this paper, we propose a joint model to simultaneously compute the optimal real matrix and binary matrix. The existing joint model adopts an orthonormal real matrix to approximate the orthogonal but nonorthonormal cluster indicator matrix. It is noted that only in a very special case (i.e., all clusters have the same number of samples), the cluster indicator matrix is an orthonormal matrix multiplied by a real number. The error of approximating a nonorthonormal matrix is inevitably large. To overcome the drawback, we propose replacing the nonorthonormal cluster indicator matrix with a scaled cluster indicator matrix which is an orthonormal matrix. Our method is capable of obtaining better performance because it is easy to minimize the difference between two orthonormal matrices. Experimental results on benchmark datasets demonstrate the effectiveness of the proposed method (called JSESR).

57 citations
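
As a small aside on the scaled cluster indicator matrix described in the JSESR abstract above, the following minimal numpy sketch (with invented toy labels) illustrates why dividing each column of the binary indicator matrix by the square root of its cluster size yields an orthonormal matrix:

```python
import numpy as np

# Invented toy labels: 6 samples in 3 clusters of unequal size.
labels = np.array([0, 0, 0, 1, 1, 2])
n, k = len(labels), 3

# Binary cluster indicator matrix H: columns are orthogonal to each other but
# not orthonormal, because each column's norm is the square root of its cluster size.
H = np.zeros((n, k))
H[np.arange(n), labels] = 1.0

# Scaled cluster indicator matrix F = H (H^T H)^{-1/2}: dividing each column by
# the square root of its cluster size makes F^T F the identity.
F = H / np.sqrt(H.sum(axis=0))

print(H.T @ H)   # diag(3, 2, 1): orthogonal but not orthonormal
print(F.T @ F)   # identity: orthonormal
```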


Journal ArticleDOI
TL;DR: From the new perspective of logical matrix equations, observability of Boolean networks (BNs) is investigated, and it is shown that a BN is locally observable on the set of reachable states if and only if the constructed matrix equations have a unique canonical solution.

49 citations


Journal ArticleDOI
TL;DR: This article investigates the partial stabilization problem of probabilistic Boolean control networks (PBCNs) under sampled-data state-feedback control (SDSFC) with a control Lyapunov function (CLF) approach and finds that the existence of a CLF is equivalent to that of SDSFC.
Abstract: This article investigates the partial stabilization problem of probabilistic Boolean control networks (PBCNs) under sampled-data state-feedback control (SDSFC) with a control Lyapunov function (CLF) approach. First, the probability structure matrix of the considered PBCN is represented by a Boolean matrix, based on which a new algebraic form of the system is obtained. Second, we convert the partial stabilization problem of PBCNs into the global set stabilization one. Third, we define CLF and its structural matrix under SDSFC. It is found that the existence of a CLF is equivalent to that of SDSFC. Then, a necessary and sufficient condition is obtained for the existence of CLF under SDSFC, based on which all possible sampled-data state-feedback controllers and corresponding structural matrices of CLF are designed by two different methods. Finally, examples are given to illustrate the efficiency of the obtained results.

38 citations
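
The abstract above relies on the algebraic form of a logical network obtained via the semi-tensor product (STP) of matrices. Below is a minimal sketch of the STP and of a structure (logical) matrix acting on vector-encoded truth values; the AND example is illustrative only and not taken from the paper:

```python
import numpy as np
from math import lcm

def stp(A, B):
    """Semi-tensor product A ⋉ B = (A ⊗ I_{t/n})(B ⊗ I_{t/p}), with t = lcm(n, p),
    where n is the number of columns of A and p the number of rows of B."""
    n, p = A.shape[1], B.shape[0]
    t = lcm(n, p)
    return np.kron(A, np.eye(t // n)) @ np.kron(B, np.eye(t // p))

# Logical values as vectors: True = [1, 0]^T, False = [0, 1]^T.
T = np.array([[1], [0]])
F = np.array([[0], [1]])

# Structure (logical) matrix of AND in this vector representation.
M_and = np.array([[1, 0, 0, 0],
                  [0, 1, 1, 1]])

print(stp(M_and, stp(T, F)))   # [[0], [1]] -> False, i.e. True AND False
```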


Journal ArticleDOI
Mingbao Lin, Rongrong Ji, Hong Liu, Xiaoshuai Sun, Shen Chen, Qi Tian
TL;DR: A novel supervised online hashing scheme termed Hadamard Matrix Guided Online Hashing (HMOH) is proposed, which introduces the Hadamard matrix, an orthogonal binary matrix built via the Sylvester method that satisfies several desired properties of hashing codes.
Abstract: Online image hashing has attracted increasing research attention recently, which receives large-scale data in a streaming manner to update the hash functions on-the-fly. Its key challenge lies in the difficulty of balancing the learning timeliness and model accuracy. To this end, most works follow a supervised setting, i.e., using class labels to boost the hashing performance, which is deficient in two aspects: first, strong constraints, e.g., orthogonal or similarity preserving, are used, which however are typically relaxed and lead to large accuracy drops. Second, large amounts of training batches are required to learn the up-to-date hash functions, which largely increase the learning complexity. To handle the above challenges, a novel supervised online hashing scheme termed Hadamard Matrix Guided Online Hashing (HMOH) is proposed in this paper. Our key innovation lies in introducing the Hadamard matrix, which is an orthogonal binary matrix built via the Sylvester method. In particular, to relieve the need for strong constraints, we regard each column of the Hadamard matrix as the target code for each class label, which by nature satisfies several desired properties of hashing codes. To accelerate the online training, LSH is first adopted to align the lengths of target code and to-be-learned binary code. We then treat the learning of hash functions as a set of binary classification problems to fit the assigned target code. Finally, extensive experiments on four widely-used benchmarks demonstrate the superior accuracy and efficiency of HMOH over various state-of-the-art methods. Code is available at https://github.com/lmbxmu/mycode .

29 citations
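
To illustrate the Sylvester construction of the Hadamard matrix that HMOH builds its target codes from, here is a short sketch; the 8-bit code length and the class-to-column assignment are assumptions for illustration, not the paper's configuration:

```python
import numpy as np

def sylvester_hadamard(k):
    """Hadamard matrix of order 2**k via the Sylvester construction."""
    H = np.array([[1]])
    for _ in range(k):
        H = np.block([[H, H], [H, -H]])
    return H

# Assumed toy setting: 8-bit codes, one Hadamard column per class label.
H = sylvester_hadamard(3)                      # 8 x 8, entries in {+1, -1}
print(np.array_equal(H @ H.T, 8 * np.eye(8)))  # True: rows/columns are orthogonal
target_code = {label: H[:, label] for label in range(H.shape[1])}
```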


Proceedings ArticleDOI
09 Jul 2020
TL;DR: A concise summary is given of the efforts of all the communities studying Boolean Matrix Factorization, and some open questions that, in the authors' opinion, require further investigation are raised.
Abstract: The goal of Boolean Matrix Factorization (BMF) is to approximate a given binary matrix as the product of two low-rank binary factor matrices, where the product of the factor matrices is computed under the Boolean algebra. While the problem is computationally hard, it is also attractive because the binary nature of the factor matrices makes them highly interpretable. In the last decade, BMF has received a considerable amount of attention in the data mining and formal concept analysis communities and, more recently, the machine learning and the theory communities also started studying BMF. In this survey, we give a concise summary of the efforts of all of these communities and raise some open questions which in our opinion require further investigation.

24 citations
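
As a concrete reminder of the Boolean product underlying BMF, the sketch below computes a Boolean matrix product and the Hamming reconstruction error for a small invented example:

```python
import numpy as np

def boolean_product(A, B):
    """Boolean product: (A o B)[i, j] = OR_k (A[i, k] AND B[k, j])."""
    return ((A.astype(int) @ B.astype(int)) > 0).astype(int)

# Invented 4x4 matrix together with an exact rank-2 Boolean factorization.
X = np.array([[1, 1, 0, 0],
              [1, 1, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 1, 1]])
A = np.array([[1, 0],
              [1, 1],
              [0, 1],
              [0, 1]])
B = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1]])

R = boolean_product(A, B)
print(int(np.sum(R != X)))   # number of mismatched entries (0 here)
```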


Journal ArticleDOI
TL;DR: A matrix-based dynamic framework is presented for updating three-way regions (positive, boundary, and negative regions) in TWNDM, based on a data-driven neighborhood relation defined in terms of two pseudo-distance functions satisfying only reflexivity.

24 citations


Journal ArticleDOI
TL;DR: In this article, the problem of compressed sensing using binary measurement matrices and basis pursuit as the recovery algorithm is studied, and upper and lower bounds on the number of measurements required to achieve robust sparse recovery with binary matrices are derived.
Abstract: In this paper, we study the problem of compressed sensing using binary measurement matrices and $\ell_1$-norm minimization (basis pursuit) as the recovery algorithm. We derive new upper and lower bounds on the number of measurements to achieve robust sparse recovery with binary matrices. We establish sufficient conditions for a column-regular binary matrix to satisfy the robust null space property (RNSP) and show that the associated sufficient conditions for robust sparse recovery obtained using the RNSP are better by a factor of $(3 \sqrt{3})/2 \approx 2.6$ compared to the sufficient conditions obtained using the restricted isometry property (RIP). Next we derive universal lower bounds on the number of measurements that any binary matrix needs to have in order to satisfy the weaker sufficient condition based on the RNSP and show that bipartite graphs of girth six are optimal. Then we display two classes of binary matrices, namely parity check matrices of array codes and Euler squares, which have girth six and are nearly optimal in the sense of almost satisfying the lower bound. In principle, randomly generated Gaussian measurement matrices are “order-optimal.” So we compare the phase transition behavior of the basis pursuit formulation using binary array codes and Gaussian matrices and show that (i) there is essentially no difference between the phase transition boundaries in the two cases and (ii) the CPU time of basis pursuit with binary matrices is hundreds of times faster than with Gaussian matrices and the storage requirements are less. Therefore it is suggested that binary matrices are a viable alternative to Gaussian matrices for compressed sensing using basis pursuit.

23 citations
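
A minimal sketch of basis pursuit with a binary measurement matrix is given below; it uses a small random column-regular matrix and scipy's linear-programming solver, whereas the paper works with array-code parity-check matrices and Euler squares, so this is only illustrative:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Small random instance (illustrative only): a column-regular binary
# measurement matrix with 3 ones per column, and a 2-sparse signal.
m, n, ones_per_col, k = 12, 30, 3, 2
A = np.zeros((m, n))
for j in range(n):
    A[rng.choice(m, size=ones_per_col, replace=False), j] = 1.0
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.normal(size=k)
y = A @ x_true

# Basis pursuit  min ||x||_1  s.t.  A x = y,  written as a linear program
# with the split x = u - v, u >= 0, v >= 0.
c = np.ones(2 * n)
res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=y, bounds=[(0, None)] * (2 * n))
x_hat = res.x[:n] - res.x[n:]
print(res.status, np.max(np.abs(x_hat - x_true)))  # near zero when recovery succeeds
```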


Journal ArticleDOI
TL;DR: This work starts the systematic algorithmic study of low-rank binary matrix approximation from the perspective of parameterized complexity and shows in which cases and under what conditions the problem is fixed-parameter tractable, admits a polynomial kernel and can be solved in parameterized subexponential time.
Abstract: Low-rank binary matrix approximation is a generic problem where one seeks a good approximation of a binary matrix by another binary matrix with some specific properties. A good approximation means that the difference between the two matrices in some matrix norm is small. The properties of the approximation binary matrix could be: a small number of different columns, a small binary rank or a small Boolean rank. Unfortunately, most variants of these problems are NP-hard. Due to this, we initiate the systematic algorithmic study of low-rank binary matrix approximation from the perspective of parameterized complexity. We show in which cases and under what conditions the problem is fixed-parameter tractable, admits a polynomial kernel and can be solved in parameterized subexponential time.

19 citations


Posted Content
TL;DR: Simulated annealing can outperform QAOA for BLLS at a QAOA depth of $p \leq 3$ in terms of the probability of sampling the ground state, and some of the challenges involved in current-day experimental implementations of this technique on cloud-based quantum computers are pointed out.
Abstract: The Quantum Approximate Optimization Algorithm (QAOA) by Farhi et al. is a quantum computational framework for solving quantum or classical optimization tasks. Here, we explore using QAOA for Binary Linear Least Squares (BLLS), a problem that can serve as a building block of several other hard problems in linear algebra, such as the Non-negative Binary Matrix Factorization (NBMF) and other variants of the Non-negative Matrix Factorization (NMF) problem. Most of the previous efforts in quantum computing for solving these problems were done using the quantum annealing paradigm. For the scope of this work, our experiments were done on noiseless quantum simulators, a simulator including a device-realistic noise-model, and two IBM Q 5-qubit machines. We highlight the possibilities of using QAOA and QAOA-like variational algorithms for solving such problems, where trial solutions can be obtained directly as samples, rather than being amplitude-encoded in the quantum wavefunction. Our numerics show that Simulated Annealing can outperform QAOA for BLLS at a QAOA depth of $p \leq 3$ for the probability of sampling the ground state. Finally, we point out some of the challenges involved in current-day experimental implementations of this technique on cloud-based quantum computers.
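
Independently of the quantum pipeline, BLLS can be written as a QUBO, which is the kind of formulation QAOA or simulated annealing would sample from. A minimal sketch with an invented toy instance (brute-forced just to check the encoding) is shown below:

```python
import numpy as np
from itertools import product

def blls_qubo(A, b):
    """QUBO matrix Q such that, for binary x, x^T Q x = ||A x - b||^2 - ||b||^2
    (the linear term is folded into the diagonal because x_i^2 = x_i)."""
    Q = (A.T @ A).astype(float)
    Q[np.diag_indices_from(Q)] -= 2.0 * (A.T @ b)
    return Q

# Invented toy instance, solved by brute force over all binary vectors.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0]])
b = np.array([1.0, 2.0, 1.0])
Q = blls_qubo(A, b)

best = min(product([0, 1], repeat=3),
           key=lambda x: np.asarray(x) @ Q @ np.asarray(x))
print(best, float(np.linalg.norm(A @ np.asarray(best) - b) ** 2))  # (0, 1, 1) 0.0
```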

Journal ArticleDOI
TL;DR: An improved parameter-adaptive and regional query density clustering algorithm is proposed, which can effectively delete redundant data in the high-level complex data space while retaining the internal nonlinear structure of the IoT data.

Journal ArticleDOI
TL;DR: This paper presents an extended rough set model, named multi-source composite rough sets (MCRS), by integrating different types of attributes and fusing multiple composite relations derived from different information sources, and develops incremental algorithms for updating composite rough approximations.

Journal ArticleDOI
03 Apr 2020
TL;DR: MEBF (Median Expansion for Boolean Factorization) demonstrated lower reconstruction error, higher computational efficiency, and more accurate density patterns than popular methods such as ASSO, PANDA and Message Passing.
Abstract: Boolean matrices have been used to represent digital information in many fields, including bank transactions, crime records, natural language processing, protein-protein interaction, etc. Boolean matrix factorization (BMF) aims to find an approximation of a binary matrix as the Boolean product of two low rank Boolean matrices, which could generate vast amounts of information about the patterns of relationships between the features and samples. Inspired by binary matrix permutation theories and geometric segmentation, we developed a fast and efficient BMF approach, called MEBF (Median Expansion for Boolean Factorization). Overall, MEBF adopted a heuristic approach to locate binary patterns presented as submatrices that are dense in 1's. At each iteration, MEBF permutes the rows and columns such that the permuted matrix is approximately Upper Triangular-Like (UTL) with so-called Simultaneous Consecutive-ones Property (SC1P). The largest submatrix dense in 1's would lie in the upper triangular area of the permuted matrix, and its location was determined based on a geometric segmentation of a triangle. We compared MEBF with other state-of-the-art approaches on data scenarios with different density and noise levels. MEBF demonstrated lower reconstruction error, higher computational efficiency, and more accurate density patterns than popular methods such as ASSO, PANDA and Message Passing. We demonstrated the application of MEBF on both binary and non-binary data sets, and revealed its further potential in knowledge retrieval and data denoising.

Journal ArticleDOI
TL;DR: In this paper, the stabilization with a probability of one for PFAs is investigated under the framework of semi-tensor product (STP) of matrices, and an optimal state feedback controller is established by a constructive logical matrix.
Abstract: Probabilistic finite automata (PFAs) can exhibit stochastic behavior, and their stabilization is a basic and important problem in control theory. In this paper, the stabilization with a probability of one for PFAs is investigated under the framework of semi-tensor product (STP) of matrices. First, a specific state is selected to be marked for the PFA, and using the STP of matrices, a necessary and sufficient condition to verify its controllability with a probability of one is derived in a simple and convenient way. Second, based on the controllability condition, a novel sequence of prereachability sets is defined, and a necessary and sufficient condition for stabilization with a probability of one is provided by discussing the prereachability set. Meanwhile, an optimal state feedback controller is established by a constructive logical matrix. In the end, an example is given to illustrate the obtained results.

Journal ArticleDOI
TL;DR: The proposed metagenomic binning method, MetaBMF, can bin DNA fragments accurately not only at the species level but also at the strain level, and can accurately identify the Shiga-toxigenic E. coli O104:H4 strain which led to the 2011 German E. coli outbreak.
Abstract: Motivation: Metagenomics studies microbial genomes in an ecosystem such as the gastrointestinal tract of a human. Identification of novel microbial species and quantification of their distributional variations among different samples that are sequenced using next-generation-sequencing technology hold the key to the success of most metagenomic studies. To achieve these goals, we propose a simple yet powerful metagenomic binning method, MetaBMF. The method does not require prior knowledge of reference genomes and produces highly accurate results, even at a strain level. Thus, it can be broadly used to identify disease-related microbial organisms that are not well-studied. Results: Mathematically, we count the number of mapped reads on each assembled genomic fragment across different samples as our input matrix and propose a scalable stratified angle regression algorithm to factorize this count matrix into a product of a binary matrix and a nonnegative matrix. The binary matrix can be used to separate microbial species and the nonnegative matrix quantifies the species distributions in different samples. In simulation and empirical studies, we demonstrate that MetaBMF has a high binning accuracy. It can bin DNA fragments accurately not only at the species level but also at the strain level. As shown in our example, we can accurately identify the Shiga-toxigenic Escherichia coli O104:H4 strain which led to the 2011 German E. coli outbreak. Our efforts in these areas should lead to (i) fundamental advances in metagenomic binning, (ii) development and refinement of technology for the rapid identification and quantification of microbial distributions and (iii) finding of potential probiotics or reliable pathogenic bacterial strains. Availability and implementation: The software is available at https://github.com/didi10384/MetaBMF.

Journal ArticleDOI
03 May 2020
TL;DR: By systematically running weighted rank-one binary matrix factorization, one can effectively perform various binary data analysis tasks, like compression, clustering, and pattern discovery.
Abstract: Many applications use data that are better represented in the binary matrix form, such as click-stream data, market basket data, document-term data, user-permission data in access control, and others. Matrix factorization methods have been widely used tools for the analysis of high-dimensional data, as they automatically extract sparse and meaningful features from data vectors. However, existing matrix factorization methods do not work well for the binary data. One crucial limitation is interpretability, as many matrix factorization methods decompose an input matrix into matrices with fractional or even negative components, which are hard to interpret in many real settings. Some matrix factorization methods, like binary matrix factorization, do limit decomposed matrices to binary values. However, these models are not flexible enough to accommodate some data analysis tasks, like trading off summary size with quality and discriminating different types of approximation errors. To address those issues, this article presents weighted rank-one binary matrix factorization, which approximates a binary matrix by the product of two binary vectors, with parameters controlling different types of approximation errors. By systematically running weighted rank-one binary matrix factorization, one can effectively perform various binary data analysis tasks, like compression, clustering, and pattern discovery. Theoretical properties of weighted rank-one binary matrix factorization are investigated and its connections to problems in other research domains are examined. As weighted rank-one binary matrix factorization in general is NP-hard, efficient and effective algorithms are presented. Extensive studies on applications of weighted rank-one binary matrix factorization are also conducted.
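
A minimal sketch of the weighted error that rank-one binary matrix factorization trades off is shown below; the toy matrix, vectors, and weights are invented for illustration:

```python
import numpy as np

def weighted_rank_one_error(X, u, v, w_fp=1.0, w_fn=1.0):
    """Weighted error of approximating binary X by the rank-one binary matrix
    u v^T: 0-entries covered by a 1 (false positives) and 1-entries left
    uncovered (false negatives) can be penalized differently."""
    R = np.outer(u, v)
    fp = int(np.sum((R == 1) & (X == 0)))
    fn = int(np.sum((R == 0) & (X == 1)))
    return w_fp * fp + w_fn * fn

# Invented toy matrix with one dominant block pattern plus a stray 1.
X = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 0, 0]])
u = np.array([1, 1, 0])
v = np.array([1, 1, 0])
print(weighted_rank_one_error(X, u, v, w_fp=2.0, w_fn=1.0))   # 0 FP, 1 FN -> 1.0
```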

Journal ArticleDOI
TL;DR: By refining the interleaving relation in a behavioral profile into six types, this paper proposes the notion of a relation profile based on the behavioral profile, and presents a new formula to measure the behavior similarity of two WF-nets.
Abstract: This paper focuses on the behavior similarity of workflow nets (WF-nets). The similarity of two WF-nets reflects their degree of behavioral consistency. It explores the behavioral relations of subsets of transitions based on the interleaving semantics, and more accurate relations are defined than in existing work. Therefore, a more accurate similarity of two WF-nets (in their behaviors) can be obtained than in the existing work, which usually does not consider loops and complex correspondences. By refining the interleaving relation in a behavioral profile into six types, this paper proposes the notion of a relation profile based on the behavioral profile. Based on the relation profile of a WF-net, a behavioral relation matrix can be constructed. Additionally, we refine the complex correspondence and generate a group of behavioral relation submatrices from the behavioral relation matrix. By using them, we present a new formula to measure the behavior similarity of two WF-nets. Finally, examples illustrate that our method can measure the similarity degree more accurately.

Journal ArticleDOI
TL;DR: This work studies the problem of constructing superimposed codes under the additional constraint that the number of 1's in each column of the matrix is constant and equal to an input parameter w, and improves on the known literature in the area.

Journal ArticleDOI
06 Oct 2020
TL;DR: In this paper, the dimension of every simple module for the algebra of the monoid of all relations on a finite set is determined, which is in fact the same question as the determination of the dimension of every evaluation of every simple correspondence functor.
Abstract: We determine the dimension of every simple module for the algebra of the monoid of all relations on a finite set (i.e. Boolean matrices). This is in fact the same question as the determination of the dimension of every evaluation of a simple correspondence functor. The method uses the theory of such functors developed in [BT2, BT3], as well as some new ingredients in the theory of finite lattices.

Journal ArticleDOI
TL;DR: This work proposes a unified framework for Bayesian mean-parameterized nonnegative binary matrix factorization models (NBMF) and derives a novel collapsed Gibbs sampler and a collapsed variational algorithm to infer the posterior distribution of the factors.
Abstract: Binary data matrices can represent many types of data such as social networks, votes, or gene expression. In some cases, the analysis of binary matrices can be tackled with nonnegative matrix factorization (NMF), where the observed data matrix is approximated by the product of two smaller nonnegative matrices. In this context, probabilistic NMF assumes a generative model where the data is usually Bernoulli-distributed. Often, a link function is used to map the factorization to the [0, 1] range, ensuring a valid Bernoulli mean parameter. However, link functions have the potential disadvantage of leading to uninterpretable models. Mean-parameterized NMF, on the contrary, overcomes this problem. We propose a unified framework for Bayesian mean-parameterized nonnegative binary matrix factorization models (NBMF). We analyze three models which correspond to three possible constraints that respect the mean-parameterization without the need for link functions. Furthermore, we derive a novel collapsed Gibbs sampler and a collapsed variational algorithm to infer the posterior distribution of the factors. Next, we extend the proposed models to a nonparametric setting where the number of used latent dimensions is automatically driven by the observed data. We analyze the performance of our NBMF methods in multiple datasets for different tasks such as dictionary learning and prediction of missing data. Experiments show that our methods provide similar or superior results than the state of the art, while automatically detecting the number of relevant components.
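
As an illustration of mean parameterization without a link function, the sketch below simulates binary data from one possible constraint choice (W entries in [0, 1], columns of H summing to one); this particular constraint is an assumption for illustration and not necessarily one of the three models analyzed in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# One possible mean-parameterized construction (an assumption for illustration):
# W has entries in [0, 1] and every column of H sums to one, so each entry of
# W @ H is a convex combination of numbers in [0, 1] and is a valid Bernoulli
# mean -- no link function is needed.
m, n, k = 20, 15, 3
W = rng.uniform(size=(m, k))
H = rng.dirichlet(np.ones(k), size=n).T      # shape (k, n), columns sum to 1

P = W @ H                                    # Bernoulli means
assert P.min() >= 0.0 and P.max() <= 1.0
X = rng.binomial(1, P)                       # simulated binary data matrix
print(X.shape, float(X.mean()))
```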

Posted Content
TL;DR: This work develops a data-driven approach that does not require hand-crafted rules but learns the RoR by itself, using Graph Neural Networks and a relation matrix transformer, and shows that it outperforms the state-of-the-art approaches.
Abstract: In natural language, often multiple entities appear in the same text. However, most previous works in Relation Extraction (RE) limit the scope to identifying the relation between two entities at a time. Such an approach induces a quadratic computation time, and also overlooks the interdependency between multiple relations, namely the relation of relations (RoR). Due to the significance of RoR in existing datasets, we propose a new paradigm of RE that considers as a whole the predictions of all relations in the same context. Accordingly, we develop a data-driven approach that does not require hand-crafted rules but learns by itself the RoR, using Graph Neural Networks and a relation matrix transformer. Experiments show that our model outperforms the state-of-the-art approaches by +1.12% on the ACE05 dataset and +2.55% on SemEval 2018 Task 7.2, which is a substantial improvement on the two competitive benchmarks.


Proceedings ArticleDOI
01 Jan 2020
TL;DR: A novel FPM algorithm using a Boolean matrix is proposed: the transactional dataset is transformed into a Boolean matrix, which is further decomposed vertically into multiple loads, improving efficiency in terms of computational time.
Abstract: Frequent Pattern Mining (FPM) plays an essential role in many data analytic tasks. In the literature, many techniques have been proposed to discover interesting patterns. However, the complexity in terms of computational time is a major concern. In this paper, we propose a novel FPM algorithm using a Boolean matrix. Here the transactional dataset is transformed into a Boolean matrix, which is further decomposed vertically into multiple loads. Our proposed algorithm improves efficiency in terms of computational time, and it may further be used in a distributed environment to deal with huge datasets.
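
A minimal sketch of the Boolean-matrix representation of transactions and of support counting over it is given below; the toy transactions and threshold are invented, and the vertical decomposition into loads is not shown:

```python
import numpy as np
from itertools import combinations

# Invented toy transactions; in the paper the Boolean matrix is additionally
# decomposed vertically into multiple loads.
transactions = [{"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
items = sorted(set().union(*transactions))

# Boolean matrix: one row per transaction, one column per item.
M = np.array([[item in t for item in items] for t in transactions], dtype=bool)

min_support = 2
# Support of an itemset = number of rows where all of its columns are True.
for size in (1, 2):
    for combo in combinations(range(len(items)), size):
        support = int(np.all(M[:, list(combo)], axis=1).sum())
        if support >= min_support:
            print({items[i] for i in combo}, support)
```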

Journal ArticleDOI
TL;DR: It is demonstrated that using dictionaries (Beylkin and Discrete Cosine Transform (DCT)) may improve performance tangibly only for a high compression ratio (CR) of 80% and with smaller block sizes, as compared to using no dictionaries.
Abstract: Online monitoring of electroencephalogram (EEG) signals is challenging due to the high volume of data and power requirements. Compressed sensing (CS) may be employed to address these issues. Compressed sensing using a sparse binary matrix, owing to its low power features, and reconstruction/decompression using spatiotemporal sparse Bayesian learning have been shown to constitute a robust framework for fast, energy efficient and accurate multichannel bio-signal monitoring. The EEG signal, however, does not show a strong temporal correlation. Therefore, the use of sparsifying dictionaries has been proposed to exploit the sparsity in a transformed domain instead. Assuming sparsification adds value, a challenge, therefore, in employing this CS framework for the EEG signal is to identify the suitable dictionary. Using real multichannel EEG data from 15 subjects, in this paper, we systematically evaluated the performance of the framework when using various wavelet bases while considering their key attributes of number of vanishing moments and coherence with the sensing matrix. We identified Beylkin as the wavelet dictionary leading to the best performance. Using the same dataset, we then compared the performance of Beylkin with the discrete cosine basis, often used in the literature, and the case of using no sparsifying dictionary. We further demonstrate that using dictionaries (Beylkin and DCT) may improve performance tangibly only for a high compression ratio (CR) of 80% and with smaller block sizes, as compared to when using no dictionaries.

Journal ArticleDOI
Dong Xu, Wu-Jun Li
03 Apr 2020
TL;DR: A novel method, called hashing based answer selection (HAS), is proposed, which adopts a hashing strategy to learn a binary matrix representation for each answer and can dramatically reduce the memory cost of storing the matrix representations of answers.
Abstract: Answer selection is an important subtask of question answering (QA), in which deep models usually achieve better performance than non-deep models. Most deep models adopt question-answer interaction mechanisms, such as attention, to get vector representations for answers. When these interaction based deep models are deployed for online prediction, the representations of all answers need to be recalculated for each question. This procedure is time-consuming for deep models with complex encoders like BERT which usually have better accuracy than simple encoders. One possible solution is to store the matrix representation (encoder output) of each answer in memory to avoid recalculation. But this will bring large memory cost. In this paper, we propose a novel method, called hashing based answer selection (HAS), to tackle this problem. HAS adopts a hashing strategy to learn a binary matrix representation for each answer, which can dramatically reduce the memory cost for storing the matrix representations of answers. Hence, HAS can adopt complex encoders like BERT in the model, but the online prediction of HAS is still fast with a low memory cost. Experimental results on three popular answer selection datasets show that HAS can outperform existing models to achieve state-of-the-art performance.

Posted Content
TL;DR: This work proposes a binary data denoising framework, namely BIND, which optimizes the detection of true patterns by estimating the row- or column-wise mixture distribution of patterns and disparate background, and eliminating the binary attributes that are more likely to come from the background.
Abstract: Low-rank representation of a binary matrix is powerful in disentangling sparse individual-attribute associations, and has received wide applications. Existing binary matrix factorization (BMF) or co-clustering (CC) methods often assume i.i.d. background noise. However, this assumption could be easily violated in real data, where heterogeneous row- or column-wise probability of binary entries results in disparate element-wise background distribution, and paralyzes the rationality of existing methods. We propose a binary data denoising framework, namely BIND, which optimizes the detection of true patterns by estimating the row- or column-wise mixture distribution of patterns and disparate background, and eliminating the binary attributes that are more likely to come from the background. BIND is supported by thoroughly derived mathematical properties of the row- and column-wise mixture distributions. Our experiments on synthetic and real-world data demonstrated that BIND effectively removes background noise and drastically increases the fairness and accuracy of state-of-the-art BMF and CC methods.

Journal ArticleDOI
TL;DR: The Rectangle Loop algorithm is proposed, a Markov chain Monte Carlo algorithm to sample binary matrices with fixed margins uniformly, and theoretical and empirical studies show it is better than the swap algorithm in Peskun's order.
Abstract: Uniform sampling of binary matrices with fixed margins is an important and difficult problem in statistics, computer science, ecology and so on. The well-known swap algorithm would be inefficient when the size of the matrix becomes large or when the matrix is too sparse/dense. Here we propose the Rectangle Loop algorithm, a Markov chain Monte Carlo algorithm to sample binary matrices with fixed margins uniformly. Theoretically the Rectangle Loop algorithm is better than the swap algorithm in Peskun’s order. Empirical studies also demonstrate that the Rectangle Loop algorithm is remarkably more efficient than the swap algorithm.
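
For reference, the sketch below implements one move of the classical swap (checkerboard) chain that the abstract compares against, which preserves row and column sums by construction; it is not the Rectangle Loop algorithm itself:

```python
import numpy as np

rng = np.random.default_rng(0)

def swap_move(M):
    """One move of the classical swap chain: pick two rows and two columns; if
    the 2x2 submatrix is a checkerboard, flip it. Margins are preserved."""
    r = rng.choice(M.shape[0], size=2, replace=False)
    c = rng.choice(M.shape[1], size=2, replace=False)
    sub = M[np.ix_(r, c)]
    if sub[0, 0] == sub[1, 1] and sub[0, 1] == sub[1, 0] and sub[0, 0] != sub[0, 1]:
        M[np.ix_(r, c)] = 1 - sub
    return M

M = np.array([[1, 0, 1],
              [0, 1, 0],
              [1, 1, 0]])
row_sums, col_sums = M.sum(axis=1).copy(), M.sum(axis=0).copy()
for _ in range(1000):
    M = swap_move(M)
assert np.array_equal(M.sum(axis=1), row_sums)
assert np.array_equal(M.sum(axis=0), col_sums)
print(M)
```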

Proceedings ArticleDOI
14 Jun 2020
TL;DR: This work implements a matrix-based CFPQ algorithm by using appropriate high-performance libraries for linear algebra and integrates it with the RedisGraph graph database, and introduces a new CFPQ algorithm with single-path query semantics that allows one found path to be extracted for each pair of nodes.
Abstract: Context-Free Path Querying (CFPQ) allows one to use context-free grammars as path constraints in navigational graph queries. Many algorithms for CFPQ have been proposed, but it was recently shown that the state-of-the-art CFPQ algorithms are still not performant enough for practical use. One promising way to achieve high-performance solutions for graph querying problems is to reduce them to linear algebra operations. Recently, two CFPQ solutions have been formulated in terms of linear algebra: the one based on the Boolean matrix multiplication operation proposed by Azimov et al. (2018) and the Kronecker product-based CFPQ algorithm proposed by Orachev et al. (2020). However, the algorithm based on matrix multiplication still does not support the most expressive all-path query semantics and cannot be truly compared with the Kronecker product-based CFPQ algorithm. In this work, we introduce a new matrix-based CFPQ algorithm with all-path query semantics that allows us to extract all found paths for each pair of vertices. Also, we implement our algorithm by using appropriate high-performance libraries for linear algebra. Finally, we provide a comparison of the most performant linear algebra-based CFPQ algorithms for different query semantics.
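
The Boolean-matrix-multiplication idea behind matrix-based CFPQ (Azimov et al.) can be sketched as follows for plain reachability semantics, without any path extraction; the grammar and labeled graph are invented toy inputs:

```python
import numpy as np

def bool_mult(A, B):
    """Boolean matrix product: entry (i, j) is True iff some k has A[i,k] and B[k,j]."""
    return (A.astype(int) @ B.astype(int)) > 0

# Invented toy input: a CNF grammar for the language a^n b^n
#   S -> A S1 | A B,  S1 -> S B,  A -> 'a',  B -> 'b'
# and a small edge-labelled graph.
n = 4
a_edges = [(0, 1), (1, 2)]
b_edges = [(2, 3), (3, 0)]

M = {N: np.zeros((n, n), dtype=bool) for N in ("A", "B", "S", "S1")}
for u, v in a_edges:
    M["A"][u, v] = True
for u, v in b_edges:
    M["B"][u, v] = True

rules = [("S", "A", "S1"), ("S", "A", "B"), ("S1", "S", "B")]   # head -> left right

changed = True
while changed:                     # iterate Boolean products to a fixpoint
    changed = False
    for head, left, right in rules:
        updated = M[head] | bool_mult(M[left], M[right])
        if not np.array_equal(updated, M[head]):
            M[head], changed = updated, True

print(np.argwhere(M["S"]))         # vertex pairs connected by an S-derivable path
```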

Journal ArticleDOI
TL;DR: This research constructs new differentially 4-uniform permutations from known ones and finds some clues about the existence of APN permutations of $\mathbb{F}_{2^{n}}$ for even n ≥ 8.
Abstract: In this paper, we study the differential uniformity of the composition of two functions with the help of Boolean matrix theory. Based on the result of our research, we can construct new differentially 4-uniform permutations from known ones. In addition, we find some clues about the existence of APN permutations of $\mathbb {F}_{2^{n}}$ for even n ≥ 8.
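
As background for the abstract above, the differential uniformity of a function over F_2^n can be computed directly from its lookup table; the sketch below uses an assumed toy 3-bit permutation, not an S-box from the paper:

```python
def differential_uniformity(sbox, n):
    """delta(F) = max over a != 0 and b of #{x in F_2^n : F(x ^ a) ^ F(x) = b}."""
    best = 0
    for a in range(1, 2 ** n):
        counts = [0] * (2 ** n)
        for x in range(2 ** n):
            counts[sbox[x ^ a] ^ sbox[x]] += 1
        best = max(best, max(counts))
    return best

# Assumed toy 3-bit permutation of {0, ..., 7}, purely for illustration.
sbox = [0, 1, 5, 6, 7, 2, 3, 4]
print(differential_uniformity(sbox, 3))
```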