
Showing papers on "Logical matrix" published in 2020


Journal ArticleDOI
TL;DR: A Boolean-network encryption algorithm with a synchronous update process is proposed; a matrix semi-tensor product technique generates the encrypted image in a second round of diffusion, and the scheme shows good security characteristics.

295 citations


Journal ArticleDOI
TL;DR: This work proposes a novel Semantic Guided Hashing method coupled with binary matrix factorization to perform more effective nearest neighbor image search by simultaneously exploring the weakly-supervised rich community-contributed information and the underlying data structures.
Abstract: Hashing has been widely investigated for large-scale image retrieval due to its search effectiveness and computation efficiency. In this work, we propose a novel Semantic Guided Hashing method coupled with binary matrix factorization to perform more effective nearest neighbor image search by simultaneously exploring the weakly-supervised rich community-contributed information and the underlying data structures. To uncover the underlying semantic information from the weakly-supervised user-provided tags, the binary matrix factorization model is leveraged for learning the binary features of images while the problem of imperfect tags is well addressed. The uncovered semantic information effectively guides the discrete hash code learning. The underlying data structures are discovered by adaptively learning a discriminative data graph, which makes the learned hash codes preserve the meaningful neighbors. To the best of our knowledge, the proposed method is the first work that incorporates the hash code learning, the semantic information mining and the data structure discovering into one unified framework. Besides, the proposed method is extended to a deep approach for the optimal compatibility of discriminative feature learning and hash code learning. Experiments are conducted on two widely-used social image datasets and the proposed method achieves encouraging performance compared with the state-of-the-art hashing methods.

129 citations


Journal ArticleDOI
TL;DR: This paper proposes a joint model to simultaneously compute the optimal real matrix and the binary cluster indicator matrix for spectral clustering, using an orthonormal scaled indicator matrix, and demonstrates the effectiveness of the proposed method on benchmark datasets.
Abstract: Spectral clustering is an important clustering method widely used for pattern recognition and image segmentation. Classical spectral clustering algorithms consist of two separate stages: 1) solving a relaxed continuous optimization problem to obtain a real matrix followed by 2) applying K-means or spectral rotation to round the real matrix (i.e., continuous clustering result) into a binary matrix called the cluster indicator matrix. Such a separate scheme is not guaranteed to achieve a jointly optimal result because of the loss of useful information. To obtain a better clustering result, in this paper, we propose a joint model to simultaneously compute the optimal real matrix and binary matrix. The existing joint model adopts an orthonormal real matrix to approximate the orthogonal but nonorthonormal cluster indicator matrix. It is noted that only in a very special case (i.e., all clusters have the same number of samples), the cluster indicator matrix is an orthonormal matrix multiplied by a real number. The error of approximating a nonorthonormal matrix is inevitably large. To overcome the drawback, we propose replacing the nonorthonormal cluster indicator matrix with a scaled cluster indicator matrix which is an orthonormal matrix. Our method is capable of obtaining better performance because it is easy to minimize the difference between two orthonormal matrices. Experimental results on benchmark datasets demonstrate the effectiveness of the proposed method (called JSESR).

57 citations
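
As a small aside on the scaled cluster indicator matrix described in the JSESR abstract above, the following minimal numpy sketch (with invented toy labels) illustrates why dividing each column of the binary indicator matrix by the square root of its cluster size yields an orthonormal matrix:

```python
import numpy as np

# Invented toy labels: 6 samples in 3 clusters of unequal size.
labels = np.array([0, 0, 0, 1, 1, 2])
n, k = len(labels), 3

# Binary cluster indicator matrix H: columns are orthogonal to each other but
# not orthonormal, because each column's norm is the square root of its cluster size.
H = np.zeros((n, k))
H[np.arange(n), labels] = 1.0

# Scaled cluster indicator matrix F = H (H^T H)^{-1/2}: dividing each column by
# the square root of its cluster size makes F^T F the identity.
F = H / np.sqrt(H.sum(axis=0))

print(H.T @ H)   # diag(3, 2, 1): orthogonal but not orthonormal
print(F.T @ F)   # identity: orthonormal
```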


Journal ArticleDOI
TL;DR: From the new perspective of logical matrix equations, observability of Boolean networks (BNs) is investigated, and it is shown that a BN is locally observable on the set of reachable states if and only if the constructed matrix equations have a unique canonical solution.

49 citations


Journal ArticleDOI
TL;DR: This article investigates the partial stabilization problem of probabilistic Boolean control networks (PBCNs) under sampled-data state-feedback control (SDSFC) with a control Lyapunov function (CLF) approach and finds that the existence of a CLF is equivalent to that of SDSFC.
Abstract: This article investigates the partial stabilization problem of probabilistic Boolean control networks (PBCNs) under sampled-data state-feedback control (SDSFC) with a control Lyapunov function (CLF) approach. First, the probability structure matrix of the considered PBCN is represented by a Boolean matrix, based on which a new algebraic form of the system is obtained. Second, we convert the partial stabilization problem of PBCNs into the global set stabilization one. Third, we define CLF and its structural matrix under SDSFC. It is found that the existence of a CLF is equivalent to that of SDSFC. Then, a necessary and sufficient condition is obtained for the existence of CLF under SDSFC, based on which all possible sampled-data state-feedback controllers and corresponding structural matrices of CLF are designed by two different methods. Finally, examples are given to illustrate the efficiency of the obtained results.

38 citations
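
The abstract above relies on the algebraic form of a logical network obtained via the semi-tensor product (STP) of matrices. Below is a minimal sketch of the STP and of a structure (logical) matrix acting on vector-encoded truth values; the AND example is illustrative only and not taken from the paper:

```python
import numpy as np
from math import lcm

def stp(A, B):
    """Semi-tensor product A ⋉ B = (A ⊗ I_{t/n})(B ⊗ I_{t/p}), with t = lcm(n, p),
    where n is the number of columns of A and p the number of rows of B."""
    n, p = A.shape[1], B.shape[0]
    t = lcm(n, p)
    return np.kron(A, np.eye(t // n)) @ np.kron(B, np.eye(t // p))

# Logical values as vectors: True = [1, 0]^T, False = [0, 1]^T.
T = np.array([[1], [0]])
F = np.array([[0], [1]])

# Structure (logical) matrix of AND in this vector representation.
M_and = np.array([[1, 0, 0, 0],
                  [0, 1, 1, 1]])

print(stp(M_and, stp(T, F)))   # [[0], [1]] -> False, i.e. True AND False
```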


Journal ArticleDOI
Mingbao Lin, Rongrong Ji, Hong Liu, Xiaoshuai Sun, Shen Chen, Qi Tian
TL;DR: A novel supervised online hashing scheme termed Hadamard Matrix Guided Online Hashing (HMOH) is proposed, which introduces the Hadamard matrix, an orthogonal binary matrix built via the Sylvester method that satisfies several desired properties of hashing codes.
Abstract: Online image hashing has attracted increasing research attention recently, which receives large-scale data in a streaming manner to update the hash functions on-the-fly. Its key challenge lies in the difficulty of balancing the learning timeliness and model accuracy. To this end, most works follow a supervised setting, i.e., using class labels to boost the hashing performance, which is deficient in two aspects: first, strong constraints, e.g., orthogonal or similarity preserving, are used, which however are typically relaxed and lead to large accuracy drops. Second, large amounts of training batches are required to learn the up-to-date hash functions, which largely increase the learning complexity. To handle the above challenges, a novel supervised online hashing scheme termed Hadamard Matrix Guided Online Hashing (HMOH) is proposed in this paper. Our key innovation lies in introducing the Hadamard matrix, which is an orthogonal binary matrix built via the Sylvester method. In particular, to relieve the need for strong constraints, we regard each column of the Hadamard matrix as the target code for each class label, which by nature satisfies several desired properties of hashing codes. To accelerate the online training, LSH is first adopted to align the lengths of target code and to-be-learned binary code. We then treat the learning of hash functions as a set of binary classification problems to fit the assigned target code. Finally, extensive experiments on four widely-used benchmarks demonstrate the superior accuracy and efficiency of HMOH over various state-of-the-art methods. Code is available at https://github.com/lmbxmu/mycode .

29 citations
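
To illustrate the Sylvester construction of the Hadamard matrix that HMOH builds its target codes from, here is a short sketch; the 8-bit code length and the class-to-column assignment are assumptions for illustration, not the paper's configuration:

```python
import numpy as np

def sylvester_hadamard(k):
    """Hadamard matrix of order 2**k via the Sylvester construction."""
    H = np.array([[1]])
    for _ in range(k):
        H = np.block([[H, H], [H, -H]])
    return H

# Assumed toy setting: 8-bit codes, one Hadamard column per class label.
H = sylvester_hadamard(3)                      # 8 x 8, entries in {+1, -1}
print(np.array_equal(H @ H.T, 8 * np.eye(8)))  # True: rows/columns are orthogonal
target_code = {label: H[:, label] for label in range(H.shape[1])}
```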


Proceedings ArticleDOI
09 Jul 2020
TL;DR: A concise summary is given of the efforts of all the communities studying Boolean Matrix Factorization, and some open questions that, in the authors' opinion, require further investigation are raised.
Abstract: The goal of Boolean Matrix Factorization (BMF) is to approximate a given binary matrix as the product of two low-rank binary factor matrices, where the product of the factor matrices is computed under the Boolean algebra. While the problem is computationally hard, it is also attractive because the binary nature of the factor matrices makes them highly interpretable. In the last decade, BMF has received a considerable amount of attention in the data mining and formal concept analysis communities and, more recently, the machine learning and the theory communities also started studying BMF. In this survey, we give a concise summary of the efforts of all of these communities and raise some open questions which in our opinion require further investigation.

24 citations
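
As a concrete reminder of the Boolean product underlying BMF, the sketch below computes a Boolean matrix product and the Hamming reconstruction error for a small invented example:

```python
import numpy as np

def boolean_product(A, B):
    """Boolean product: (A o B)[i, j] = OR_k (A[i, k] AND B[k, j])."""
    return ((A.astype(int) @ B.astype(int)) > 0).astype(int)

# Invented 4x4 matrix together with an exact rank-2 Boolean factorization.
X = np.array([[1, 1, 0, 0],
              [1, 1, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 1, 1]])
A = np.array([[1, 0],
              [1, 1],
              [0, 1],
              [0, 1]])
B = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1]])

R = boolean_product(A, B)
print(int(np.sum(R != X)))   # number of mismatched entries (0 here)
```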


Journal ArticleDOI
TL;DR: A matrix-based dynamic framework is presented for updating three-way regions (positive, boundary, and negative regions) in TWNDM, based on a data-driven neighborhood relation defined in terms of two pseudo-distance functions satisfying only reflexivity.

24 citations


Journal ArticleDOI
TL;DR: In this article, the problem of compressed sensing using binary measurement matrices and basis pursuit as the recovery algorithm is studied, and upper and lower bounds on the number of measurements required to achieve robust sparse recovery with binary matrices are derived.
Abstract: In this paper, we study the problem of compressed sensing using binary measurement matrices and $\ell_1$-norm minimization (basis pursuit) as the recovery algorithm. We derive new upper and lower bounds on the number of measurements to achieve robust sparse recovery with binary matrices. We establish sufficient conditions for a column-regular binary matrix to satisfy the robust null space property (RNSP) and show that the associated sufficient conditions for robust sparse recovery obtained using the RNSP are better by a factor of $(3 \sqrt{3})/2 \approx 2.6$ compared to the sufficient conditions obtained using the restricted isometry property (RIP). Next we derive universal lower bounds on the number of measurements that any binary matrix needs to have in order to satisfy the weaker sufficient condition based on the RNSP and show that bipartite graphs of girth six are optimal. Then we display two classes of binary matrices, namely parity check matrices of array codes and Euler squares, which have girth six and are nearly optimal in the sense of almost satisfying the lower bound. In principle, randomly generated Gaussian measurement matrices are “order-optimal.” So we compare the phase transition behavior of the basis pursuit formulation using binary array codes and Gaussian matrices and show that (i) there is essentially no difference between the phase transition boundaries in the two cases and (ii) the CPU time of basis pursuit with binary matrices is hundreds of times faster than with Gaussian matrices and the storage requirements are less. Therefore it is suggested that binary matrices are a viable alternative to Gaussian matrices for compressed sensing using basis pursuit.

23 citations
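
A minimal sketch of basis pursuit with a binary measurement matrix is given below; it uses a small random column-regular matrix and scipy's linear-programming solver, whereas the paper works with array-code parity-check matrices and Euler squares, so this is only illustrative:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Small random instance (illustrative only): a column-regular binary
# measurement matrix with 3 ones per column, and a 2-sparse signal.
m, n, ones_per_col, k = 12, 30, 3, 2
A = np.zeros((m, n))
for j in range(n):
    A[rng.choice(m, size=ones_per_col, replace=False), j] = 1.0
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.normal(size=k)
y = A @ x_true

# Basis pursuit  min ||x||_1  s.t.  A x = y,  written as a linear program
# with the split x = u - v, u >= 0, v >= 0.
c = np.ones(2 * n)
res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=y, bounds=[(0, None)] * (2 * n))
x_hat = res.x[:n] - res.x[n:]
print(res.status, np.max(np.abs(x_hat - x_true)))  # near zero when recovery succeeds
```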


Journal ArticleDOI
TL;DR: This work starts the systematic algorithmic study of low-rank binary matrix approximation from the perspective of parameterized complexity and shows in which cases and under what conditions the problem is fixed-parameter tractable, admits a polynomial kernel and can be solved in parameterized subexponential time.
Abstract: Low-rank binary matrix approximation is a generic problem where one seeks a good approximation of a binary matrix by another binary matrix with some specific properties. A good approximation means that the difference between the two matrices in some matrix norm is small. The properties of the approximation binary matrix could be: a small number of different columns, a small binary rank or a small Boolean rank. Unfortunately, most variants of these problems are NP-hard. Due to this, we initiate the systematic algorithmic study of low-rank binary matrix approximation from the perspective of parameterized complexity. We show in which cases and under what conditions the problem is fixed-parameter tractable, admits a polynomial kernel and can be solved in parameterized subexponential time.

19 citations


Posted Content
TL;DR: Simulated annealing can outperform QAOA for BLLS at a QAOA depth of $p \leq 3$ in terms of the probability of sampling the ground state, and some of the challenges involved in current-day experimental implementations of this technique on cloud-based quantum computers are pointed out.
Abstract: The Quantum Approximate Optimization Algorithm (QAOA) by Farhi et al. is a quantum computational framework for solving quantum or classical optimization tasks. Here, we explore using QAOA for Binary Linear Least Squares (BLLS), a problem that can serve as a building block of several other hard problems in linear algebra, such as the Non-negative Binary Matrix Factorization (NBMF) and other variants of the Non-negative Matrix Factorization (NMF) problem. Most of the previous efforts in quantum computing for solving these problems were done using the quantum annealing paradigm. For the scope of this work, our experiments were done on noiseless quantum simulators, a simulator including a device-realistic noise-model, and two IBM Q 5-qubit machines. We highlight the possibilities of using QAOA and QAOA-like variational algorithms for solving such problems, where trial solutions can be obtained directly as samples, rather than being amplitude-encoded in the quantum wavefunction. Our numerics show that Simulated Annealing can outperform QAOA for BLLS at a QAOA depth of $p \leq 3$ for the probability of sampling the ground state. Finally, we point out some of the challenges involved in current-day experimental implementations of this technique on cloud-based quantum computers.
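
Independently of the quantum pipeline, BLLS can be written as a QUBO, which is the kind of formulation QAOA or simulated annealing would sample from. A minimal sketch with an invented toy instance (brute-forced just to check the encoding) is shown below:

```python
import numpy as np
from itertools import product

def blls_qubo(A, b):
    """QUBO matrix Q such that, for binary x, x^T Q x = ||A x - b||^2 - ||b||^2
    (the linear term is folded into the diagonal because x_i^2 = x_i)."""
    Q = (A.T @ A).astype(float)
    Q[np.diag_indices_from(Q)] -= 2.0 * (A.T @ b)
    return Q

# Invented toy instance, solved by brute force over all binary vectors.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0]])
b = np.array([1.0, 2.0, 1.0])
Q = blls_qubo(A, b)

best = min(product([0, 1], repeat=3),
           key=lambda x: np.asarray(x) @ Q @ np.asarray(x))
print(best, float(np.linalg.norm(A @ np.asarray(best) - b) ** 2))  # (0, 1, 1) 0.0
```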

Journal ArticleDOI
TL;DR: An improved parameter-adaptive and regional query density clustering algorithm is proposed, which can effectively delete redundant data in the high-level complex data space while retaining the internal nonlinear structure of the IoT data.

Journal ArticleDOI
TL;DR: This paper presents an extended rough set model, named multi-source composite rough sets (MCRS), by integrating different types of attributes and fusing multiple composite relations derived from different information sources, and develops incremental algorithms for updating composite rough approximations.

Journal ArticleDOI
03 Apr 2020
TL;DR: MEBF (Median Expansion for Boolean Factorization) demonstrated lower reconstruction error, higher computational efficiency, and more accurate density patterns than popular methods such as ASSO, PANDA and Message Passing.
Abstract: Boolean matrices have been used to represent digital information in many fields, including bank transactions, crime records, natural language processing, protein-protein interaction, etc. Boolean matrix factorization (BMF) aims to find an approximation of a binary matrix as the Boolean product of two low rank Boolean matrices, which could generate vast amounts of information about the patterns of relationships between the features and samples. Inspired by binary matrix permutation theories and geometric segmentation, we developed a fast and efficient BMF approach, called MEBF (Median Expansion for Boolean Factorization). Overall, MEBF adopted a heuristic approach to locate binary patterns presented as submatrices that are dense in 1's. At each iteration, MEBF permutes the rows and columns such that the permuted matrix is approximately Upper Triangular-Like (UTL) with so-called Simultaneous Consecutive-ones Property (SC1P). The largest submatrix dense in 1's would lie in the upper triangular area of the permuted matrix, and its location was determined based on a geometric segmentation of a triangle. We compared MEBF with other state-of-the-art approaches on data scenarios with different density and noise levels. MEBF demonstrated lower reconstruction error, higher computational efficiency, and more accurate density patterns than popular methods such as ASSO, PANDA and Message Passing. We demonstrated the application of MEBF on both binary and non-binary data sets, and revealed its further potential in knowledge retrieval and data denoising.

Journal ArticleDOI
TL;DR: In this paper, the stabilization with a probability of one for PFAs is investigated under the framework of semi-tensor product (STP) of matrices, and an optimal state feedback controller is established by a constructive logical matrix.
Abstract: Probabilistic finite automata (PFAs) can exhibit stochastic behavior, and their stabilization is a basic and important problem in control theory. In this paper, the stabilization with a probability of one for PFAs is investigated under the framework of semi-tensor product (STP) of matrices. First, a specific state is selected to be marked for the PFA, and using the STP of matrices, a necessary and sufficient condition to verify its controllability with a probability of one is derived in a simple and convenient way. Second, based on the controllability condition, a novel sequence of prereachability sets is defined, and a necessary and sufficient condition for stabilization with a probability of one is provided by discussing the prereachability set. Meanwhile, an optimal state feedback controller is established by a constructive logical matrix. In the end, an example is given to illustrate the obtained results.

Journal ArticleDOI
TL;DR: The proposed metagenomic binning method, MetaBMF, can bin DNA fragments accurately not only at the species level but also at the strain level, and can accurately identify the Shiga-toxigenic E. coli O104:H4 strain which led to the 2011 German E. coli outbreak.
Abstract: Motivation: Metagenomics studies microbial genomes in an ecosystem such as the gastrointestinal tract of a human. Identification of novel microbial species and quantification of their distributional variations among different samples that are sequenced using next-generation-sequencing technology hold the key to the success of most metagenomic studies. To achieve these goals, we propose a simple yet powerful metagenomic binning method, MetaBMF. The method does not require prior knowledge of reference genomes and produces highly accurate results, even at a strain level. Thus, it can be broadly used to identify disease-related microbial organisms that are not well-studied. Results: Mathematically, we count the number of mapped reads on each assembled genomic fragment across different samples as our input matrix and propose a scalable stratified angle regression algorithm to factorize this count matrix into a product of a binary matrix and a nonnegative matrix. The binary matrix can be used to separate microbial species and the nonnegative matrix quantifies the species distributions in different samples. In simulation and empirical studies, we demonstrate that MetaBMF has a high binning accuracy. It can bin DNA fragments accurately not only at the species level but also at the strain level. As shown in our example, we can accurately identify the Shiga-toxigenic Escherichia coli O104:H4 strain which led to the 2011 German E. coli outbreak. Our efforts in these areas should lead to (i) fundamental advances in metagenomic binning, (ii) development and refinement of technology for the rapid identification and quantification of microbial distributions and (iii) finding of potential probiotics or reliable pathogenic bacterial strains. Availability and implementation: The software is available at https://github.com/didi10384/MetaBMF.

Journal ArticleDOI
03 May 2020
TL;DR: By systematically running weighted rank-one binary matrix factorization, one can effectively perform various binary data analysis tasks, like compression, clustering, and pattern discovery.
Abstract: Many applications use data that are better represented in the binary matrix form, such as click-stream data, market basket data, document-term data, user-permission data in access control, and others. Matrix factorization methods have been widely used tools for the analysis of high-dimensional data, as they automatically extract sparse and meaningful features from data vectors. However, existing matrix factorization methods do not work well for the binary data. One crucial limitation is interpretability, as many matrix factorization methods decompose an input matrix into matrices with fractional or even negative components, which are hard to interpret in many real settings. Some matrix factorization methods, like binary matrix factorization, do limit decomposed matrices to binary values. However, these models are not flexible enough to accommodate some data analysis tasks, like trading off summary size with quality and discriminating different types of approximation errors. To address those issues, this article presents weighted rank-one binary matrix factorization, which approximates a binary matrix by the product of two binary vectors, with parameters controlling different types of approximation errors. By systematically running weighted rank-one binary matrix factorization, one can effectively perform various binary data analysis tasks, like compression, clustering, and pattern discovery. Theoretical properties of weighted rank-one binary matrix factorization are investigated and its connections to problems in other research domains are examined. As weighted rank-one binary matrix factorization in general is NP-hard, efficient and effective algorithms are presented. Extensive studies on applications of weighted rank-one binary matrix factorization are also conducted.
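
A minimal sketch of the weighted error that rank-one binary matrix factorization trades off is shown below; the toy matrix, vectors, and weights are invented for illustration:

```python
import numpy as np

def weighted_rank_one_error(X, u, v, w_fp=1.0, w_fn=1.0):
    """Weighted error of approximating binary X by the rank-one binary matrix
    u v^T: 0-entries covered by a 1 (false positives) and 1-entries left
    uncovered (false negatives) can be penalized differently."""
    R = np.outer(u, v)
    fp = int(np.sum((R == 1) & (X == 0)))
    fn = int(np.sum((R == 0) & (X == 1)))
    return w_fp * fp + w_fn * fn

# Invented toy matrix with one dominant block pattern plus a stray 1.
X = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 0, 0]])
u = np.array([1, 1, 0])
v = np.array([1, 1, 0])
print(weighted_rank_one_error(X, u, v, w_fp=2.0, w_fn=1.0))   # 0 FP, 1 FN -> 1.0
```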

Journal ArticleDOI
TL;DR: By refining the interleaving relation in a behavioral profile into six types, this paper proposes the notion of a relation profile based on the behavioral profile, and presents a new formula to measure the behavior similarity of two WF-nets.
Abstract: This paper focuses on the behavior similarity of workflow nets (WF-nets). The similarity of two WF-nets reflects their degree of behavioral consistency. It explores the behavioral relations of subsets of transitions based on the interleaving semantics, and more accurate relations are defined than in existing work. Therefore, a more accurate similarity of two WF-nets (in their behaviors) can be obtained than in the existing work, which usually does not consider loops and complex correspondences. By refining the interleaving relation in a behavioral profile into six types, this paper proposes the notion of a relation profile based on the behavioral profile. Based on the relation profile of a WF-net, a behavioral relation matrix can be constructed. Additionally, we refine the complex correspondence and generate a group of behavioral relation submatrices from the behavioral relation matrix. By using them, we present a new formula to measure the behavior similarity of two WF-nets. Finally, examples illustrate that our method can measure the similarity degree more accurately.

Journal ArticleDOI
TL;DR: This work studies the problem of constructing superimposed codes under the additional constraint that the number of 1's in each column of the matrix is constant and equal to an input parameter w, and improves on the known literature in the area.

Journal ArticleDOI
06 Oct 2020
TL;DR: In this paper, the dimension of every simple module for the algebra of the monoid of all relations on a finite set is determined, which is in fact the same question as the determination of the dimension of every evaluation of every simple correspondence functor.
Abstract: We determine the dimension of every simple module for the algebra of the monoid of all relations on a finite set (i.e. Boolean matrices). This is in fact the same question as the determination of the dimension of every evaluation of a simple correspondence functor. The method uses the theory of such functors developed in [BT2, BT3], as well as some new ingredients in the theory of finite lattices.

Journal ArticleDOI
TL;DR: This work proposes a unified framework for Bayesian mean-parameterized nonnegative binary matrix factorization models (NBMF) and derives a novel collapsed Gibbs sampler and a collapsed variational algorithm to infer the posterior distribution of the factors.
Abstract: Binary data matrices can represent many types of data such as social networks, votes, or gene expression. In some cases, the analysis of binary matrices can be tackled with nonnegative matrix factorization (NMF), where the observed data matrix is approximated by the product of two smaller nonnegative matrices. In this context, probabilistic NMF assumes a generative model where the data is usually Bernoulli-distributed. Often, a link function is used to map the factorization to the [0, 1] range, ensuring a valid Bernoulli mean parameter. However, link functions have the potential disadvantage of leading to uninterpretable models. Mean-parameterized NMF, on the contrary, overcomes this problem. We propose a unified framework for Bayesian mean-parameterized nonnegative binary matrix factorization models (NBMF). We analyze three models which correspond to three possible constraints that respect the mean-parameterization without the need for link functions. Furthermore, we derive a novel collapsed Gibbs sampler and a collapsed variational algorithm to infer the posterior distribution of the factors. Next, we extend the proposed models to a nonparametric setting where the number of used latent dimensions is automatically driven by the observed data. We analyze the performance of our NBMF methods in multiple datasets for different tasks such as dictionary learning and prediction of missing data. Experiments show that our methods provide similar or superior results than the state of the art, while automatically detecting the number of relevant components.
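
As an illustration of mean parameterization without a link function, the sketch below simulates binary data from one possible constraint choice (W entries in [0, 1], columns of H summing to one); this particular constraint is an assumption for illustration and not necessarily one of the three models analyzed in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# One possible mean-parameterized construction (an assumption for illustration):
# W has entries in [0, 1] and every column of H sums to one, so each entry of
# W @ H is a convex combination of numbers in [0, 1] and is a valid Bernoulli
# mean -- no link function is needed.
m, n, k = 20, 15, 3
W = rng.uniform(size=(m, k))
H = rng.dirichlet(np.ones(k), size=n).T      # shape (k, n), columns sum to 1

P = W @ H                                    # Bernoulli means
assert P.min() >= 0.0 and P.max() <= 1.0
X = rng.binomial(1, P)                       # simulated binary data matrix
print(X.shape, float(X.mean()))
```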

Posted Content
TL;DR: This work develops a data-driven approach that does not require hand-crafted rules but learns the RoR by itself, using Graph Neural Networks and a relation matrix transformer, and shows that it outperforms the state-of-the-art approaches.
Abstract: In natural language, often multiple entities appear in the same text. However, most previous works in Relation Extraction (RE) limit the scope to identifying the relation between two entities at a time. Such an approach induces a quadratic computation time, and also overlooks the interdependency between multiple relations, namely the relation of relations (RoR). Due to the significance of RoR in existing datasets, we propose a new paradigm of RE that considers as a whole the predictions of all relations in the same context. Accordingly, we develop a data-driven approach that does not require hand-crafted rules but learns by itself the RoR, using Graph Neural Networks and a relation matrix transformer. Experiments show that our model outperforms the state-of-the-art approaches by +1.12% on the ACE05 dataset and +2.55% on SemEval 2018 Task 7.2, which is a substantial improvement on the two competitive benchmarks.


Proceedings ArticleDOI
01 Jan 2020
TL;DR: A novel FPM algorithm using a Boolean matrix is proposed: the transactional dataset is transformed into a Boolean matrix, which is further decomposed vertically into multiple loads, improving efficiency in terms of computational time.
Abstract: Frequent Pattern Mining (FPM) plays an essential role in many data analytic tasks. In the literature, many techniques have been proposed to discover interesting patterns. However, the complexity in terms of computational time is a major concern. In this paper, we propose a novel FPM algorithm using a Boolean matrix. Here the transactional dataset is transformed into a Boolean matrix, which is further decomposed vertically into multiple loads. Our proposed algorithm improves efficiency in terms of computational time, and it may further be used in a distributed environment to deal with huge datasets.
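
A minimal sketch of the Boolean-matrix representation of transactions and of support counting over it is given below; the toy transactions and threshold are invented, and the vertical decomposition into loads is not shown:

```python
import numpy as np
from itertools import combinations

# Invented toy transactions; in the paper the Boolean matrix is additionally
# decomposed vertically into multiple loads.
transactions = [{"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
items = sorted(set().union(*transactions))

# Boolean matrix: one row per transaction, one column per item.
M = np.array([[item in t for item in items] for t in transactions], dtype=bool)

min_support = 2
# Support of an itemset = number of rows where all of its columns are True.
for size in (1, 2):
    for combo in combinations(range(len(items)), size):
        support = int(np.all(M[:, list(combo)], axis=1).sum())
        if support >= min_support:
            print({items[i] for i in combo}, support)
```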

Journal ArticleDOI
TL;DR: It is demonstrated that using dictionaries (Beylkin and Discrete Cosine Transform (DCT)) may improve performance tangibly only for a high compression ratio (CR) of 80% and with smaller block sizes, as compared to using no dictionaries.
Abstract: Online monitoring of electroencephalogram (EEG) signals is challenging due to the high volume of data and power requirements. Compressed sensing (CS) may be employed to address these issues. Compressed sensing using a sparse binary matrix, owing to its low power features, and reconstruction/decompression using spatiotemporal sparse Bayesian learning have been shown to constitute a robust framework for fast, energy efficient and accurate multichannel bio-signal monitoring. The EEG signal, however, does not show a strong temporal correlation. Therefore, the use of sparsifying dictionaries has been proposed to exploit the sparsity in a transformed domain instead. Assuming sparsification adds value, a challenge, therefore, in employing this CS framework for the EEG signal is to identify the suitable dictionary. Using real multichannel EEG data from 15 subjects, in this paper, we systematically evaluated the performance of the framework when using various wavelet bases while considering their key attributes of number of vanishing moments and coherence with the sensing matrix. We identified Beylkin as the wavelet dictionary leading to the best performance. Using the same dataset, we then compared the performance of Beylkin with the discrete cosine basis, often used in the literature, and the case of using no sparsifying dictionary. We further demonstrate that using dictionaries (Beylkin and DCT) may improve performance tangibly only for a high compression ratio (CR) of 80% and with smaller block sizes, as compared to when using no dictionaries.

Journal ArticleDOI
Dong Xu, Wu-Jun Li
03 Apr 2020
TL;DR: A novel method, called hashing based answer selection (HAS), is proposed, which adopts a hashing strategy to learn a binary matrix representation for each answer and can dramatically reduce the memory cost of storing the matrix representations of answers.
Abstract: Answer selection is an important subtask of question answering (QA), in which deep models usually achieve better performance than non-deep models. Most deep models adopt question-answer interaction mechanisms, such as attention, to get vector representations for answers. When these interaction based deep models are deployed for online prediction, the representations of all answers need to be recalculated for each question. This procedure is time-consuming for deep models with complex encoders like BERT which usually have better accuracy than simple encoders. One possible solution is to store the matrix representation (encoder output) of each answer in memory to avoid recalculation. But this will bring large memory cost. In this paper, we propose a novel method, called hashing based answer selection (HAS), to tackle this problem. HAS adopts a hashing strategy to learn a binary matrix representation for each answer, which can dramatically reduce the memory cost for storing the matrix representations of answers. Hence, HAS can adopt complex encoders like BERT in the model, but the online prediction of HAS is still fast with a low memory cost. Experimental results on three popular answer selection datasets show that HAS can outperform existing models to achieve state-of-the-art performance.

Posted Content
TL;DR: This work proposes a binary data denoising framework, namely BIND, which optimizes the detection of true patterns by estimating the row- or column-wise mixture distribution of patterns and disparate background, and eliminating the binary attributes that are more likely to come from the background.
Abstract: Low-rank representation of a binary matrix is powerful in disentangling sparse individual-attribute associations, and has received wide applications. Existing binary matrix factorization (BMF) or co-clustering (CC) methods often assume i.i.d. background noise. However, this assumption could be easily violated in real data, where heterogeneous row- or column-wise probability of binary entries results in disparate element-wise background distribution, and paralyzes the rationality of existing methods. We propose a binary data denoising framework, namely BIND, which optimizes the detection of true patterns by estimating the row- or column-wise mixture distribution of patterns and disparate background, and eliminating the binary attributes that are more likely to come from the background. BIND is supported by thoroughly derived mathematical properties of the row- and column-wise mixture distributions. Our experiments on synthetic and real-world data demonstrated that BIND effectively removes background noise and drastically increases the fairness and accuracy of state-of-the-art BMF and CC methods.

Journal ArticleDOI
TL;DR: The Rectangle Loop algorithm is proposed, a Markov chain Monte Carlo algorithm to sample binary matrices with fixed margins uniformly, and theoretical and empirical studies show it is better than the swap algorithm in Peskun's order.
Abstract: Uniform sampling of binary matrices with fixed margins is an important and difficult problem in statistics, computer science, ecology and so on. The well-known swap algorithm would be inefficient when the size of the matrix becomes large or when the matrix is too sparse/dense. Here we propose the Rectangle Loop algorithm, a Markov chain Monte Carlo algorithm to sample binary matrices with fixed margins uniformly. Theoretically the Rectangle Loop algorithm is better than the swap algorithm in Peskun’s order. Empirical studies also demonstrate that the Rectangle Loop algorithm is remarkably more efficient than the swap algorithm.
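
For reference, the sketch below implements one move of the classical swap (checkerboard) chain that the abstract compares against, which preserves row and column sums by construction; it is not the Rectangle Loop algorithm itself:

```python
import numpy as np

rng = np.random.default_rng(0)

def swap_move(M):
    """One move of the classical swap chain: pick two rows and two columns; if
    the 2x2 submatrix is a checkerboard, flip it. Margins are preserved."""
    r = rng.choice(M.shape[0], size=2, replace=False)
    c = rng.choice(M.shape[1], size=2, replace=False)
    sub = M[np.ix_(r, c)]
    if sub[0, 0] == sub[1, 1] and sub[0, 1] == sub[1, 0] and sub[0, 0] != sub[0, 1]:
        M[np.ix_(r, c)] = 1 - sub
    return M

M = np.array([[1, 0, 1],
              [0, 1, 0],
              [1, 1, 0]])
row_sums, col_sums = M.sum(axis=1).copy(), M.sum(axis=0).copy()
for _ in range(1000):
    M = swap_move(M)
assert np.array_equal(M.sum(axis=1), row_sums)
assert np.array_equal(M.sum(axis=0), col_sums)
print(M)
```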

Proceedings ArticleDOI
14 Jun 2020
TL;DR: This work implements a matrix-based CFPQ algorithm by using appropriate high-performance libraries for linear algebra and integrates it with the RedisGraph graph database, and introduces a new CFPQ algorithm with single-path query semantics that allows one found path to be extracted for each pair of nodes.
Abstract: Context-Free Path Querying (CFPQ) allows one to use context-free grammars as path constraints in navigational graph queries. Many algorithms for CFPQ have been proposed, but it was recently shown that the state-of-the-art CFPQ algorithms are still not performant enough for practical use. One promising way to achieve high-performance solutions for graph querying problems is to reduce them to linear algebra operations. Recently, two CFPQ solutions have been formulated in terms of linear algebra: the one based on the Boolean matrix multiplication operation proposed by Azimov et al. (2018) and the Kronecker product-based CFPQ algorithm proposed by Orachev et al. (2020). However, the algorithm based on matrix multiplication still does not support the most expressive all-path query semantics and cannot be truly compared with the Kronecker product-based CFPQ algorithm. In this work, we introduce a new matrix-based CFPQ algorithm with all-path query semantics that allows us to extract all found paths for each pair of vertices. Also, we implement our algorithm by using appropriate high-performance libraries for linear algebra. Finally, we provide a comparison of the most performant linear algebra-based CFPQ algorithms for different query semantics.
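
The Boolean-matrix-multiplication idea behind matrix-based CFPQ (Azimov et al.) can be sketched as follows for plain reachability semantics, without any path extraction; the grammar and labeled graph are invented toy inputs:

```python
import numpy as np

def bool_mult(A, B):
    """Boolean matrix product: entry (i, j) is True iff some k has A[i,k] and B[k,j]."""
    return (A.astype(int) @ B.astype(int)) > 0

# Invented toy input: a CNF grammar for the language a^n b^n
#   S -> A S1 | A B,  S1 -> S B,  A -> 'a',  B -> 'b'
# and a small edge-labelled graph.
n = 4
a_edges = [(0, 1), (1, 2)]
b_edges = [(2, 3), (3, 0)]

M = {N: np.zeros((n, n), dtype=bool) for N in ("A", "B", "S", "S1")}
for u, v in a_edges:
    M["A"][u, v] = True
for u, v in b_edges:
    M["B"][u, v] = True

rules = [("S", "A", "S1"), ("S", "A", "B"), ("S1", "S", "B")]   # head -> left right

changed = True
while changed:                     # iterate Boolean products to a fixpoint
    changed = False
    for head, left, right in rules:
        updated = M[head] | bool_mult(M[left], M[right])
        if not np.array_equal(updated, M[head]):
            M[head], changed = updated, True

print(np.argwhere(M["S"]))         # vertex pairs connected by an S-derivable path
```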

Journal ArticleDOI
TL;DR: This research constructs new differentially 4-uniform permutations from known ones and finds some clues about the existence of APN permutations of $\mathbb{F}_{2^{n}}$ for even n ≥ 8.
Abstract: In this paper, we study the differential uniformity of the composition of two functions with the help of Boolean matrix theory. Based on the result of our research, we can construct new differentially 4-uniform permutations from known ones. In addition, we find some clues about the existence of APN permutations of $\mathbb {F}_{2^{n}}$ for even n ≥ 8.
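
As background for the abstract above, the differential uniformity of a function over F_2^n can be computed directly from its lookup table; the sketch below uses an assumed toy 3-bit permutation, not an S-box from the paper:

```python
def differential_uniformity(sbox, n):
    """delta(F) = max over a != 0 and b of #{x in F_2^n : F(x ^ a) ^ F(x) = b}."""
    best = 0
    for a in range(1, 2 ** n):
        counts = [0] * (2 ** n)
        for x in range(2 ** n):
            counts[sbox[x ^ a] ^ sbox[x]] += 1
        best = max(best, max(counts))
    return best

# Assumed toy 3-bit permutation of {0, ..., 7}, purely for illustration.
sbox = [0, 1, 5, 6, 7, 2, 3, 4]
print(differential_uniformity(sbox, 3))
```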