Institution

Helsinki Institute for Information Technology

Facility•Espoo, Finland•

About: Helsinki Institute for Information Technology is a facility organization based out in Espoo, Finland. It is known for research contribution in the topics: Population & Bayesian network. The organization has 630 authors who have published 1962 publications receiving 63426 citations.

...read moreread less

Topics: Population, Bayesian network, The Internet, Mobile computing, Cluster analysis ...read more

Papers published on a yearly basis

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Projective Inference in High-dimensional Problems: Prediction and Feature Selection

[...]

Juho Piironen¹, Markus Paasiniemi¹, Aki Vehtari¹•Institutions (1)

Helsinki Institute for Information Technology¹

04 Oct 2018-arXiv: Machine Learning

TL;DR: In this paper, a two-stage approach is proposed to construct a possibly non-sparse model that predicts well, and then find a minimal subset of features that characterize the predictions.

...read moreread less

Abstract: This paper discusses predictive inference and feature selection for generalized linear models with scarce but high-dimensional data. We argue that in many cases one can benefit from a decision theoretically justified two-stage approach: first, construct a possibly non-sparse model that predicts well, and then find a minimal subset of features that characterize the predictions. The model built in the first step is referred to as the \emph{reference model} and the operation during the latter step as predictive \emph{projection}. The key characteristic of this approach is that it finds an excellent tradeoff between sparsity and predictive accuracy, and the gain comes from utilizing all available information including prior and that coming from the left out features. We review several methods that follow this principle and provide novel methodological contributions. We present a new projection technique that unifies two existing techniques and is both accurate and fast to compute. We also propose a way of evaluating the feature selection process using fast leave-one-out cross-validation that allows for easy and intuitive model size selection. Furthermore, we prove a theorem that helps to understand the conditions under which the projective approach could be beneficial. The benefits are illustrated via several simulated and real world examples.

...read moreread less

47 citations

Journal Article•DOI•

Erratum to: Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC

[...]

Aki Vehtari¹, Andrew Gelman², Jonah Gabry²•Institutions (2)

Helsinki Institute for Information Technology¹, Columbia University²

01 Sep 2017-Statistics and Computing

TL;DR: An efficient computation of LOO is introduced using Pareto-smoothed importance sampling (PSIS), a new procedure for regularizing importance weights, and it is demonstrated that PSIS-LOO is more robust in the finite case with weak priors or influential observations.

...read moreread less

Abstract: Leave-one-out cross-validation (LOO) and the widely applicable information criterion (WAIC) are methods for estimating pointwise out-of-sample prediction accuracy from a fitted Bayesian model using the log-likelihood evaluated at the posterior simulations of the parameter values. LOO and WAIC have various advantages over simpler estimates of predictive error such as AIC and DIC but are less used in practice because they involve additional computational steps. Here we lay out fast and stable computations for LOO and WAIC that can be performed using existing simulation draws. We introduce an efficient computation of LOO using Pareto-smoothed importance sampling (PSIS), a new procedure for regularizing importance weights. Although WAIC is asymptotically equal to LOO, we demonstrate that PSIS-LOO is more robust in the finite case with weak priors or influential observations. As a byproduct of our calculations, we also obtain approximate standard errors for estimated predictive errors and for comparison of predictive errors between two models. We implement the computations in an R package called loo and demonstrate using models fit with the Bayesian inference package Stan.

...read moreread less

47 citations

Journal Article•DOI•

Probabilistic retrieval and visualization of biologically relevant microarray experiments

[...]

José Manuel Peixoto Caldas¹, Nils Gehlenborg², Nils Gehlenborg³, Ali Faisal¹, Alvis Brazma³, Samuel Kaski¹ - Show less +2 more•Institutions (3)

Helsinki Institute for Information Technology¹, University of Cambridge², European Bioinformatics Institute³

19 Oct 2009-BMC Bioinformatics

TL;DR: Methods that allow for the search to be based upon measurement data, instead of the more customary annotation data, are introduced to retrieve experiments in which the same biological processes are activated.

...read moreread less

Abstract: Motivation: As ArrayExpress and other repositories of genome-wide experiments are reaching a mature size, it is becoming more meaningful to search for related experiments, given a particular study. We introduce methods that allow for the search to be based upon measurement data, instead of the more customary annotation data. The goal is to retrieve experiments in which the same biological processes are activated. This can be due either to experiments targeting the same biological question, or to as yet unknown relationships. Results: We use a combination of existing and new probabilistic machine learning techniques to extract information about the biological processes differentially activated in each experiment, to retrieve earlier experiments where the same processes are activated and to visualize and interpret the retrieval results. Case studies on a subset of ArrayExpress show that, with a sufficient amount of data, our method indeed finds experiments relevant to particular biological questions. Results can be interpreted in terms of biological processes using the visualization techniques. Availability: The code is available from http://www.cis.hut.fi/projects/mi/software/ismb09. Contact: jose.caldas@tkk.fi

...read moreread less

47 citations

Proceedings Article•

Non-Stationary Gaussian Process Regression with Hamiltonian Monte Carlo

[...]

Markus Heinonen, Henrik Mannerström, Juho Rousu¹, Samuel Kaski, Harri Lähdesmäki - Show less +1 more•Institutions (1)

Helsinki Institute for Information Technology¹

02 May 2016

TL;DR: In this article, a gradient-based inference method was proposed to learn the unknown function and the non-stationary model parameters, without requiring any model approximations, where all three key parameters (i.e., noise variance, signal variance and lengthscale) can be simultaneously input-dependent.

...read moreread less

Abstract: We present a novel approach for fully non-stationary Gaussian process regression (GPR), where all three key parameters -- noise variance, signal variance and lengthscale -- can be simultaneously input-dependent. We develop gradient-based inference methods to learn the unknown function and the non-stationary model parameters, without requiring any model approximations. We propose to infer full parameter posterior with Hamiltonian Monte Carlo (HMC), which conveniently extends the analytical gradient-based GPR learning by guiding the sampling with model gradients. We also learn the MAP solution from the posterior by gradient ascent. In experiments on several synthetic datasets and in modelling of temporal gene expression, the nonstationary GPR is shown to be necessary for modeling realistic input-dependent dynamics, while it performs comparably to conventional stationary or previous non-stationary GPR models otherwise.

...read moreread less

47 citations

Journal Article•DOI•

The number of Latin squares of order 11

[...]

Alexander Hulpke¹, Petteri Kaski², Patric R. J. Östergård³•Institutions (3)

Colorado State University¹, Helsinki Institute for Information Technology², Aalto University³

01 May 2011-Mathematics of Computation

TL;DR: In this paper, constructive and non-constructive techniques are employed to enumerate Latin squares and related objects, and it is established that there are (i) 2036029552582883134196099 main classes of Latin squares of order 11; (ii) 6108088657705958932053657 isomorphism classes of one-factorizations of K 11,11 ; (iii) 12216177315369229261482540 isotopy classes of normal Latin squares; (iv) 14781574551580444528

...read moreread less

Abstract: Constructive and nonconstructive techniques are employed to enumerate Latin squares and related objects. It is established that there are (i) 2036029552582883134196099 main classes of Latin squares of order 11; (ii) 6108088657705958932053657 isomorphism classes of one-factorizations of K 11,11 ; (iii) 12216177315369229261482540 isotopy classes of Latin squares of order 11; (iv) 1478157455158044452849321016 isomorphism classes of loops of order 11; and (v) 19464657391668924966791023043937578299025 isomorphism classes of quasigroups of order 11. The enumeration is constructive for the 1151666641 main classes with an autoparatopy group of order at least 3.

...read moreread less

47 citations

Collapse

Authors

Showing all 632 results

Name	H-index	Papers	Citations
Dimitri P. Bertsekas	94	332	85939
Olli Kallioniemi	90	353	42021
Heikki Mannila	72	295	26500
Jukka Corander	66	411	17220
Jaakko Kangasjärvi	62	146	17096
Aapo Hyvärinen	61	301	44146
Samuel Kaski	58	522	14180
Nadarajah Asokan	58	327	11947
Aristides Gionis	58	292	19300
Hannu Toivonen	56	192	19316
Nicola Zamboni	53	128	11397
Jorma Rissanen	52	151	22720
Tero Aittokallio	52	271	8689
Juha Veijola	52	261	19588
Juho Hamari	51	176	16631

Network Information

Related Institutions (5)

Google

39.8K papers, 2.1M citations

93% related

Microsoft

86.9K papers, 4.1M citations

38.6K papers, 1.3M citations

92% related

Carnegie Mellon University

104.3K papers, 5.9M citations

91% related

Facebook

10.9K papers, 570.1K citations

91% related

Performance

Metrics

1,967

Papers

76,126

Citations

No. of papers from the Institution in previous years
Year	Papers
2023	1
2022	4
2021	85
2020	97
2019	140
2018	127