Institution

Helsinki Institute for Information Technology

Facility•Espoo, Finland•

About: Helsinki Institute for Information Technology is a facility organization based out in Espoo, Finland. It is known for research contribution in the topics: Population & Bayesian network. The organization has 630 authors who have published 1962 publications receiving 63426 citations.

...read moreread less

Topics: Population, Bayesian network, Mobile computing, The Internet, Approximation algorithm ...read more

Papers published on a yearly basis

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics

[...]

Michael U. Gutmann¹, Aapo Hyvärinen¹•Institutions (1)

Helsinki Institute for Information Technology¹

01 Jan 2012-Journal of Machine Learning Research

TL;DR: The basic idea is to perform nonlinear logistic regression to discriminate between the observed data and some artificially generated noise and it is shown that the new method strikes a competitive trade-off in comparison to other estimation methods for unnormalized models.

...read moreread less

Abstract: We consider the task of estimating, from observed data, a probabilistic model that is parameterized by a finite number of parameters. In particular, we are considering the situation where the model probability density function is unnormalized. That is, the model is only specified up to the partition function. The partition function normalizes a model so that it integrates to one for any choice of the parameters. However, it is often impossible to obtain it in closed form. Gibbs distributions, Markov and multi-layer networks are examples of models where analytical normalization is often impossible. Maximum likelihood estimation can then not be used without resorting to numerical approximations which are often computationally expensive. We propose here a new objective function for the estimation of both normalized and unnormalized models. The basic idea is to perform nonlinear logistic regression to discriminate between the observed data and some artificially generated noise. With this approach, the normalizing partition function can be estimated like any other parameter. We prove that the new estimation method leads to a consistent (convergent) estimator of the parameters. For large noise sample sizes, the new estimator is furthermore shown to behave like the maximum likelihood estimator. In the estimation of unnormalized models, there is a trade-off between statistical and computational performance. We show that the new method strikes a competitive trade-off in comparison to other estimation methods for unnormalized models. As an application to real data, we estimate novel two-layer models of natural image statistics with spline nonlinearities.

...read moreread less

695 citations

Journal Article•DOI•

SIRIUS 4: a rapid tool for turning tandem mass spectra into metabolite structure information.

[...]

Kai Dührkop¹, Markus Fleischauer¹, Marcus Ludwig¹, Alexander A. Aksenov², Alexey V. Melnik², Marvin Meusel³, Marvin Meusel¹, Pieter C. Dorrestein², Juho Rousu⁴, Sebastian Böcker¹ - Show less +6 more•Institutions (4)

University of Jena¹, University of Montana², Saarland University³, Helsinki Institute for Information Technology⁴

01 Apr 2019-Nature Methods

TL;DR: SIRIUS 4 is a fast and highly accurate tool for molecular structure interpretation from mass-spectrometry-based metabolomics data and integrates CSI:FingerID for searching in molecular structure databases.

...read moreread less

Abstract: Mass spectrometry is a predominant experimental technique in metabolomics and related fields, but metabolite structural elucidation remains highly challenging. We report SIRIUS 4 (https://bio.informatik.uni-jena.de/sirius/), which provides a fast computational approach for molecular structure identification. SIRIUS 4 integrates CSI:FingerID for searching in molecular structure databases. Using SIRIUS 4, we achieved identification rates of more than 70% on challenging metabolomics datasets. SIRIUS 4 is a fast and highly accurate tool for molecular structure interpretation from mass-spectrometry-based metabolomics data.

...read moreread less

620 citations

Proceedings Article•

Noise2Noise: Learning image restoration without clean data

[...]

Jaakko Lehtinen¹, Jaakko Lehtinen², Jacob Munkberg¹, Jon Hasselgren¹, Samuli Laine¹, Tero Karras¹, Miika Aittala³, Timo Aila¹ - Show less +4 more•Institutions (3)

Nvidia¹, Helsinki Institute for Information Technology², Massachusetts Institute of Technology³

01 Jan 2018

TL;DR: In this article, the authors apply basic statistical reasoning to signal reconstruction by machine learning, learning to map corrupted observations to clean signals without explicit image priors or likelihood models of the corruption, and show that a single model learns photographic noise removal, denoising synthetic Monte Carlo images, and reconstruction of undersampled MRI scans.

...read moreread less

Abstract: We apply basic statistical reasoning to signal reconstruction by machine learning -- learning to map corrupted observations to clean signals -- with a simple and powerful conclusion: it is possible to learn to restore images by only looking at corrupted examples, at performance at and sometimes exceeding training using clean data, without explicit image priors or likelihood models of the corruption. In practice, we show that a single model learns photographic noise removal, denoising synthetic Monte Carlo images, and reconstruction of undersampled MRI scans -- all corrupted by different processes -- based on noisy data only.

...read moreread less

610 citations

Journal Article•DOI•

A definition for gamification: anchoring gamification in the service marketing literature

[...]

Kai Huotari¹, Juho Hamari²•Institutions (2)

Helsinki Institute for Information Technology¹, University of Tampere²

01 Feb 2017-Electronic Markets

TL;DR: An attempt to tie in gamification with service marketing theory, which conceptualizes the consumer as a co-producer of the service as well as proposing a definition for gamification, one that emphasizes its experiential nature.

...read moreread less

Abstract: “Gamification” has gained considerable scholarly and practitioner attention; however, the discussion in academia has been largely confined to the human–computer interaction and game studies domains. Since gamification is often used in service design, it is important that the concept be brought in line with the service literature. So far, though, there has been a dearth of such literature. This article is an attempt to tie in gamification with service marketing theory, which conceptualizes the consumer as a co-producer of the service. It presents games as service systems composed of operant and operand resources. It proposes a definition for gamification, one that emphasizes its experiential nature. The definition highlights four important aspects of gamification: affordances, psychological mediators, goals of gamification and the context of gamification. Using the definition the article identifies four possible gamifying actors and examines gamification as communicative staging of the service environment.

...read moreread less

585 citations

Journal Article•DOI•

LoRDEC: accurate and efficient long read error correction.

[...]

Leena Salmela¹, Eric Rivals•Institutions (1)

Helsinki Institute for Information Technology¹

15 Dec 2014-Bioinformatics

TL;DR: LoRDEC, a hybrid error correction method that builds a succinct de Bruijn graph representing the short reads, and seeks a corrective sequence for each erroneous region in the long reads by traversing chosen paths in the graph is presented.

...read moreread less

Abstract: Motivation: PacBio single molecule real-time sequencing is a third-generation sequencing technique producing long reads, with com-paratively lower throughput and higher error rate. Errors include numerous indels and complicate downstream analysis like mapping or de novo assembly. A hybrid strategy that takes advantage of the high accuracy of second-generation short reads has been proposed for correcting long reads. Mapping of short reads on long reads pro-vides sufficient coverage to eliminate up to 99% of errors, however, at the expense of prohibitive running times and considerable amounts of disk and memory space. Results: We present LoRDEC, a hybrid error correction method that builds a succinct de Bruijn graph representing the short reads, and seeks a corrective sequence for each erroneous region in the long reads by traversing chosen paths in the graph. In comparison, LoRDEC is at least six times faster and requires at least 93% less memory or disk space than available tools, while achieving comparable accuracy.

...read moreread less

580 citations

Collapse

Authors

Showing all 632 results

Name	H-index	Papers	Citations
Dimitri P. Bertsekas	94	332	85939
Olli Kallioniemi	90	353	42021
Heikki Mannila	72	295	26500
Jukka Corander	66	411	17220
Jaakko Kangasjärvi	62	146	17096
Aapo Hyvärinen	61	301	44146
Samuel Kaski	58	522	14180
Nadarajah Asokan	58	327	11947
Aristides Gionis	58	292	19300
Hannu Toivonen	56	192	19316
Nicola Zamboni	53	128	11397
Jorma Rissanen	52	151	22720
Tero Aittokallio	52	271	8689
Juha Veijola	52	261	19588
Juho Hamari	51	176	16631

Network Information

Related Institutions (5)

Google

39.8K papers, 2.1M citations

93% related

Microsoft

86.9K papers, 4.1M citations

38.6K papers, 1.3M citations

92% related

Carnegie Mellon University

104.3K papers, 5.9M citations

91% related

Facebook

10.9K papers, 570.1K citations

91% related

Performance

Metrics

1,967

Papers

76,126

Citations

No. of papers from the Institution in previous years
Year	Papers
2023	1
2022	4
2021	85
2020	97
2019	140
2018	127