Institution

Helsinki Institute for Information Technology

Facility•Espoo, Finland

About: Helsinki Institute for Information Technology is a facility based in Espoo, Finland. It is known for research contributions on the topics Population and Bayesian network. The organization has 630 authors who have published 1962 publications that have received 63426 citations.

Topics: Population, Bayesian network, Mobile computing, The Internet, Approximation algorithm ...

Papers

Book Chapter

Closed non-derivable itemsets

Juho Muhonen¹, Hannu Toivonen¹•Institutions (1)

Helsinki Institute for Information Technology¹

18 Sep 2006

TL;DR: A new pruning method that combines techniques for closed and non-derivable itemsets allows further lossless reduction of itemsets; experiments show that the reduction is significant in some datasets.

Abstract: Itemset mining typically results in large amounts of redundant itemsets. Several approaches such as closed itemsets, non-derivable itemsets and generators have been suggested for losslessly reducing the amount of itemsets. We propose a new pruning method based on combining techniques for closed and non-derivable itemsets that allows further reductions of itemsets. This reduction is done without loss of information, that is, the complete collection of frequent itemsets can still be derived from the collection of closed non-derivable itemsets. The number of closed non-derivable itemsets is bound both by the number of closed and the number of non-derivable itemsets, and never exceeds the smaller of these. Our experiments show that the reduction is significant in some datasets.

31 citations
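
The lossless-reduction idea in the abstract above can be illustrated with a minimal sketch for closed itemsets only. It does not implement non-derivability or the paper's combined pruning method. The toy transactions and the function names `frequent_itemsets`, `closed_itemsets`, and `support_from_closed` are assumptions made for illustration, and the brute-force enumeration is only practical for very small item universes.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of all itemsets with support >= min_support."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                result[cand] = support
    return result

def closed_itemsets(frequent):
    """Keep itemsets that have no proper superset with the same support."""
    return {
        s: sup
        for s, sup in frequent.items()
        if not any(s < other and sup == other_sup
                   for other, other_sup in frequent.items())
    }

def support_from_closed(itemset, closed):
    """Recover a support as the largest support among closed supersets."""
    return max((sup for c, sup in closed.items() if itemset <= c), default=0)

# Hypothetical toy data: four transactions over the items a, b, c.
transactions = [frozenset("abc"), frozenset("ab"), frozenset("ab"), frozenset("c")]
frequent = frequent_itemsets(transactions, min_support=1)
closed = closed_itemsets(frequent)
print(len(frequent), "frequent itemsets,", len(closed), "closed itemsets")
```

In this toy example every frequent itemset's support can be recovered from the closed collection alone, which is the sense in which such a condensed representation loses no information.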

Proceedings Article

Sandwich keyboard: fast ten-finger typing on a mobile device with adaptive touch sensing on the back side

Oliver Schoenleben¹, Antti Oulasvirta²•Institutions (2)

Helsinki Institute for Information Technology¹, Max Planck Society²

27 Aug 2013

TL;DR: Sandwich Keyboard is a prototype that folds any three-row keyboard layout and, by retaining the finger-to-letter assignment, supports transfer; detecting key presses from finger release enhances the performance of touch-typing on a multitouch sensor.

Abstract: This Note introduces a keyboard design that affords ten-finger touch typing by utilizing a touch sensor on the back side of a device. Previous work has used physical buttons. Using a touch sensor has the benefit that it retains the form factor and does not insist on a peripheral device. Moreover, any layout can be used. However, it is difficult to hit targets on a flat surface with no haptic feedback. Sandwich Keyboard is a prototype that folds any three-row keyboard layout and thus, by retaining the finger-to-letter assignment, supports transfer. Sandwich Keyboard includes an algorithm for constant adaptation of key targets in the back. We also learned that the detection of key presses from finger release enhances the performance of touch-typing on a multitouch sensor. After eight hours of training, experienced typists of the QWERTY and of the Dvorak Standard Keyboard (DSK) layout reached 26.1 and 46.2 wpm, respectively. We discuss improvements necessary for further increasing both speed and accuracy.

31 citations

Journal Article

Spatio-chromatic adaptation via higher-order canonical correlation analysis of natural images.

Michael U. Gutmann¹, Valero Laparra², Aapo Hyvärinen¹, Jesús Malo²•Institutions (2)

Helsinki Institute for Information Technology¹, University of Valencia²

12 Feb 2014-PLOS ONE

TL;DR: A statistical method which combines the desirable properties of independent component and canonical correlation analysis is proposed and is used to make a theory-driven testable prediction on how the neural response to colored patterns should change when the illumination changes.

Abstract: Independent component and canonical correlation analysis are two general-purpose statistical methods with wide applicability. In neuroscience, independent component analysis of chromatic natural images explains the spatio-chromatic structure of primary cortical receptive fields in terms of properties of the visual environment. Canonical correlation analysis explains similarly chromatic adaptation to different illuminations. But, as we show in this paper, neither of the two methods generalizes well to explain both spatio-chromatic processing and adaptation at the same time. We propose a statistical method which combines the desirable properties of independent component and canonical correlation analysis: It finds independent components in each data set which, across the two data sets, are related to each other via linear or higher-order correlations. The new method is as widely applicable as canonical correlation analysis, and also to more than two data sets. We call it higher-order canonical correlation analysis. When applied to chromatic natural images, we found that it provides a single (unified) statistical framework which accounts for both spatio-chromatic processing and adaptation. Filters with spatio-chromatic tuning properties as in the primary visual cortex emerged and corresponding-colors psychophysics was reproduced reasonably well. We used the new method to make a theory-driven testable prediction on how the neural response to colored patterns should change when the illumination changes. We predict shifts in the responses which are comparable to the shifts reported for chromatic contrast habituation.

30 citations

Journal Article

Exploration and retrieval of whole-metagenome sequencing samples

Sohan Seth¹, Niko Välimäki¹, Samuel Kaski¹, Antti Honkela¹•Institutions (1)

Helsinki Institute for Information Technology¹

01 Sep 2014-Bioinformatics

TL;DR: A content-based exploration and retrieval method for whole-metagenome sequencing samples that uses a distributed string mining framework to efficiently extract all informative sequence k-mers from a pool of metagenomic samples and uses them to measure the dissimilarity between two samples.

Abstract: Motivation: Over the recent years, the field of whole metagenome shotgun sequencing has witnessed significant growth due to the high-throughput sequencing technologies that allow sequencing genomic samples cheaper, faster, and with better coverage than before. This technical advancement has initiated the trend of sequencing multiple samples in different conditions or environments to explore the similarities and dissimilarities of the microbial communities. Examples include the human microbiome project and various studies of the human intestinal tract. With the availability of ever larger databases of such measurements, finding samples similar to a given query sample is becoming a central operation. Results: In this paper, we develop a content-based exploration and retrieval method for whole metagenome sequencing samples. We apply a distributed string mining framework to efficiently extract all informative sequence k-mers from a pool of metagenomic samples and use them to measure the dissimilarity between two samples. We evaluate the performance of the proposed approach on two human gut metagenome data sets as well as human microbiome project metagenomic samples. We observe significant enrichment for diseased gut samples in results of queries with another diseased sample and very high accuracy in discriminating between different body sites even though the method is unsupervised. Availability: A software implementation of the DSM framework is available at https://github.com/HIITMetagenomics/dsm-framework. Contact: sohan.seth@hiit.fi, antti.honkela@hiit.fi

30 citations
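
As a rough illustration of measuring dissimilarity between two samples from their k-mers, as the abstract above describes, the sketch below compares relative k-mer frequencies with a total-variation distance. It is not the DSM framework: it uses a single fixed k, keeps every k-mer rather than selecting informative ones, and runs on one machine. The reads, the choice of k, and the names `kmer_profile` and `dissimilarity` are assumptions made for illustration.

```python
from collections import Counter

def kmer_profile(reads, k):
    """Relative frequency of every length-k substring across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()} if total else {}

def dissimilarity(profile_p, profile_q):
    """Total-variation distance between two profiles, in the range [0, 1]."""
    kmers = set(profile_p) | set(profile_q)
    return 0.5 * sum(abs(profile_p.get(x, 0.0) - profile_q.get(x, 0.0))
                     for x in kmers)

# Hypothetical toy samples, each a list of reads.
sample_a = ["ACGTACGT"]
sample_b = ["ACGTACGA"]
sample_c = ["TTTTTTTT"]

profiles = {name: kmer_profile(reads, k=3)
            for name, reads in [("a", sample_a), ("b", sample_b), ("c", sample_c)]}
print("a vs b:", dissimilarity(profiles["a"], profiles["b"]))
print("a vs c:", dissimilarity(profiles["a"], profiles["c"]))
```

Given a query sample, retrieval under this sketch would amount to ranking the stored samples by increasing dissimilarity to the query profile.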

Book Chapter

ARTEMIS: assessing the similarity of event-interval sequences

Orestis Kostakis¹, Panagiotis Papapetrou¹, Jaakko Hollmén¹•Institutions (1)

Helsinki Institute for Information Technology¹

05 Sep 2011

TL;DR: Two distance measures for comparing sequences of interval-based events are introduced, which can be used for several data mining tasks such as classification and clustering; experiments show the superiority of Artemis in terms of robustness to high levels of artificially introduced noise.

Abstract: In several application domains, such as sign language, medicine, and sensor networks, events are not necessarily instantaneous but they can have a time duration. Sequences of interval-based events may contain useful domain knowledge; thus, searching, indexing, and mining such sequences is crucial. We introduce two distance measures for comparing sequences of interval-based events which can be used for several data mining tasks such as classification and clustering. The first measure maps each sequence of interval-based events to a set of vectors that hold information about all concurrent events. These sets are then compared using an existing dynamic programming method. The second method, called Artemis, finds correspondence between intervals by mapping the two sequences into a bipartite graph. Similarity is inferred by employing the Hungarian algorithm. In addition, we present a linear-time lower bound for Artemis. The performance of both measures is tested on data from three domains: sign language, medicine, and sensor networks. Experiments show the superiority of Artemis in terms of robustness to high levels of artificially introduced noise.

30 citations
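
The bipartite-matching idea behind the second measure in the abstract above can be sketched as a minimum-cost assignment between the intervals of two sequences. This is not the Artemis measure itself: the abstract does not specify the cost between intervals, so `pair_cost` below is a hypothetical stand-in, and the Hungarian algorithm is replaced by an exhaustive search over permutations that finds the same optimum but is only feasible for a handful of intervals. The `(label, start, end)` event format and all function names are assumptions.

```python
from itertools import permutations

def pair_cost(a, b):
    """Hypothetical cost: 1 for differing labels, else the relative duration gap."""
    (label_a, start_a, end_a), (label_b, start_b, end_b) = a, b
    if label_a != label_b:
        return 1.0
    dur_a, dur_b = end_a - start_a, end_b - start_b
    return abs(dur_a - dur_b) / max(dur_a, dur_b, 1)

def matching_distance(seq_a, seq_b):
    """Average cost of the cheapest one-to-one matching between two sequences."""
    n = max(len(seq_a), len(seq_b))
    if n == 0:
        return 0.0
    # Pad the shorter sequence; an unmatched interval costs 1.
    a = list(seq_a) + [None] * (n - len(seq_a))
    b = list(seq_b) + [None] * (n - len(seq_b))

    def cost(x, y):
        return 1.0 if x is None or y is None else pair_cost(x, y)

    # Exhaustive search over assignments (stand-in for the Hungarian algorithm).
    best = min(sum(cost(a[i], b[j]) for i, j in enumerate(perm))
               for perm in permutations(range(n)))
    return best / n

# Hypothetical event-interval sequences: (label, start, end).
seq_a = [("A", 0, 4), ("B", 2, 6)]
seq_b = [("B", 1, 5), ("A", 0, 2)]
print(matching_distance(seq_a, seq_b))
```

For realistic sequence lengths the permutation search would need to be replaced by a polynomial-time assignment solver.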

Authors

Showing all 632 results

Name	H-index	Papers	Citations
Dimitri P. Bertsekas	94	332	85939
Olli Kallioniemi	90	353	42021
Heikki Mannila	72	295	26500
Jukka Corander	66	411	17220
Jaakko Kangasjärvi	62	146	17096
Aapo Hyvärinen	61	301	44146
Samuel Kaski	58	522	14180
Nadarajah Asokan	58	327	11947
Aristides Gionis	58	292	19300
Hannu Toivonen	56	192	19316
Nicola Zamboni	53	128	11397
Jorma Rissanen	52	151	22720
Tero Aittokallio	52	271	8689
Juha Veijola	52	261	19588
Juho Hamari	51	176	16631

Network Information

Related Institutions (5)

Institution	Papers	Citations	Related
Google	39.8K	2.1M	93%
Microsoft	86.9K	4.1M	
	38.6K	1.3M	92%
Carnegie Mellon University	104.3K	5.9M	91%
Facebook	10.9K	570.1K	91%

Performance

Metrics

Papers: 1,967

Citations: 76,126

No. of papers from the Institution in previous years
Year	Papers
2023	1
2022	4
2021	85
2020	97
2019	140
2018	127