Open Access, Proceedings Article

Dynamic Boolean Matrix Factorizations

Pauli Miettinen
- pp 519-528
TL;DR
This paper proposes a method to dynamically update a Boolean matrix factorization when new data is added to the database, and extends it with a mechanism that improves the factorization at the cost of computation speed.
Abstract
Boolean matrix factorization is a method to decompose a binary matrix into two binary factor matrices. Akin to other matrix factorizations, the factor matrices can be used for various data analysis tasks. Many (if not most) real-world data sets are dynamic, though, meaning that new information is recorded over time. Incorporating this new information into the factorization can require re-computing the factorization, something we cannot do if we want to keep our factorization up to date after each update. This paper proposes a method to dynamically update the Boolean matrix factorization when new data is added to the database. This method is extended with a mechanism that improves the factorization at the cost of computation speed. The method is tested on a number of real-world and synthetic data sets, including a comparison of its efficiency against offline methods. The results show that, with good initialization, the proposed online and dynamic methods can beat state-of-the-art offline Boolean matrix factorization algorithms.
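To make the decomposition concrete, the following is a minimal sketch of the Boolean matrix product (where 1 + 1 = 1) and of the reconstruction error it induces. The function names `boolean_product`, `reconstruction_error`, and `fit_new_row`, as well as the toy matrices, are illustrative assumptions and do not come from the paper. In particular, `fit_new_row` is a naive brute-force way to describe one newly added row with the second factor held fixed. It is not the dynamic update method the paper proposes.

```python
from itertools import product


def boolean_product(B, C):
    """Boolean matrix product: entry (i, j) is OR over l of (B[i][l] AND C[l][j])."""
    k = len(C)
    m = len(C[0])
    return [[int(any(row[l] and C[l][j] for l in range(k))) for j in range(m)]
            for row in B]


def reconstruction_error(A, B, C):
    """Count the cells where A differs from the Boolean product of B and C."""
    P = boolean_product(B, C)
    return sum(a != p for row_a, row_p in zip(A, P) for a, p in zip(row_a, row_p))


def fit_new_row(row, C):
    """Hypothetical helper: try every 0/1 combination of the k rows of C and
    keep the one that reconstructs `row` with the fewest errors.
    Exponential in k, so this is for illustration only."""
    k = len(C)
    best = min(product([0, 1], repeat=k),
               key=lambda u: reconstruction_error([row], [list(u)], C))
    return list(best)


# Toy data (made up for this example).
A = [[1, 1, 0],
     [1, 1, 1],
     [0, 1, 1]]
B = [[1, 0],
     [1, 1],
     [0, 1]]
C = [[1, 1, 0],
     [0, 1, 1]]

print(boolean_product(B, C) == A)     # the factorization is exact here
print(reconstruction_error(A, B, C))  # number of mismatched cells
print(fit_new_row([0, 1, 1], C))      # usage vector for a newly arrived row
```

In the second row of the product, both factors cover the middle column, and the Boolean OR keeps that entry at 1 rather than 2. This is what distinguishes the Boolean product from the ordinary matrix product.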

