
Juha Kärkkäinen

Researcher at Helsinki Institute for Information Technology

Publications -  114
Citations -  4642

Juha Kärkkäinen is an academic researcher at the Helsinki Institute for Information Technology. The author has contributed to research on the topics Suffix array and String (computer science). The author has an h-index of 33 and has co-authored 114 publications receiving 4416 citations. Previous affiliations of Juha Kärkkäinen include the University of Helsinki and Information Technology University.

Papers
Journal Article

Tane: An Efficient Algorithm for Discovering Functional and Approximate Dependencies

TL;DR: TANE is an efficient algorithm for finding functional dependencies from large databases based on partitioning the set of rows with respect to their attribute values, which makes testing the validity of functional dependencies fast even for a large number of tuples.
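
The partition idea in the summary above can be illustrated with a minimal sketch. It is not the TANE implementation: it only shows the underlying test, assuming that a dependency X -> A holds exactly when grouping rows by X yields as many groups as grouping by X together with A. The function names and the sample rows are hypothetical.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into classes that agree on every attribute in attrs."""
    classes = defaultdict(list)
    for index, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].append(index)
    return list(classes.values())


def fd_holds(rows, lhs, rhs):
    """Check lhs -> rhs by comparing partition sizes (illustrative only).

    Adding rhs to the grouping key can only split classes, so equal
    class counts mean no class was split, i.e. rhs is determined by lhs.
    """
    return len(partition(rows, lhs)) == len(partition(rows, list(lhs) + [rhs]))


# Hypothetical sample data, not taken from the paper.
rows = [
    {"city": "Helsinki", "zip": "00100", "country": "FI"},
    {"city": "Helsinki", "zip": "00560", "country": "FI"},
    {"city": "Espoo", "zip": "02100", "country": "FI"},
]

print(fd_holds(rows, ["zip"], "city"))   # zip determines city in this sample
print(fd_holds(rows, ["city"], "zip"))   # city does not determine zip
```

Comparing class counts avoids comparing rows pairwise, which is the kind of saving the summary attributes to partitioning.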
Book Chapter

Simple linear work suffix array construction

TL;DR: Introduces the skew algorithm for suffix array construction over integer alphabets, which can be implemented to run in linear time using integer sorting as its only nontrivial subroutine.
Journal Article

Linear work suffix array construction

TL;DR: Presents a generalized algorithm, DC, that allows a space-efficient implementation and supports the choice of a space-time tradeoff, and that is asymptotically faster than all previous suffix tree or array construction algorithms.
Proceedings Article

Efficient discovery of functional and approximate dependencies using partitions

TL;DR: Presents a new approach for finding functional dependencies from large databases, based on partitioning the set of rows with respect to their attribute values. The approach makes the discovery of approximate functional dependencies easy and efficient, and erroneous or exceptional rows can be identified easily.

Better Filtering with Gapped q-Grams

TL;DR: It is shown that gapped q-grams can provide orders of magnitude faster and/or more efficient filtering than contiguous q-grams, and that a filter parameter called the threshold has to be optimized.
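
As a rough illustration of what a gapped q-gram is, the sketch below samples characters at the positions marked `#` in a shape string and skips the positions marked `.`. The shape notation, the function names, and the example strings are assumptions made for this sketch. It does not reproduce the paper's filter or its threshold computation; it only counts q-grams two strings share.

```python
from collections import Counter


def gapped_qgrams(text, shape):
    """Return the gapped q-grams of text for a shape such as '##.#'.

    '#' marks a sampled position and '.' a skipped one (assumed notation).
    """
    offsets = [i for i, c in enumerate(shape) if c == "#"]
    span = len(shape)
    return [
        "".join(text[start + o] for o in offsets)
        for start in range(len(text) - span + 1)
    ]


def shared_qgrams(a, b, shape):
    """Count q-grams common to a and b, with multiplicity."""
    common = Counter(gapped_qgrams(a, shape)) & Counter(gapped_qgrams(b, shape))
    return sum(common.values())


print(gapped_qgrams("ACGTAC", "##.#"))
print(shared_qgrams("ACGTAC", "ACCTAC", "##.#"))
```

A filter built on this idea would discard a candidate match when the shared count falls below a chosen threshold; choosing that threshold is the optimization the summary refers to.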