Institution

Helsinki Institute for Information Technology

Facility • Espoo, Finland

About: Helsinki Institute for Information Technology is a facility based in Espoo, Finland. It is known for research contributions on the topics of Population and Bayesian network. The organization has 630 authors who have published 1,962 publications receiving 63,426 citations.

Topics: Population, Bayesian network, The Internet, Mobile computing, Cluster analysis

Papers

Journal Article

Fundamentals and Recent Developments in Approximate Bayesian Computation

Jarno Lintusaari¹, Michael U. Gutmann², Ritabrata Dutta², Samuel Kaski², Jukka Corander²

Aalto University¹, Helsinki Institute for Information Technology²

19 Oct 2016 • Systematic Biology

TL;DR: Approximate Bayesian computation refers to a family of algorithms for approximate inference that makes a minimal set of assumptions by only requiring that sampling from a model is possible.

Abstract: Bayesian inference plays an important role in phylogenetics, evolutionary biology, and in many other branches of science. It provides a principled framework for dealing with uncertainty and quantifying how it changes in the light of new evidence. For many complex models and inference problems, however, only approximate quantitative answers are obtainable. Approximate Bayesian computation (ABC) refers to a family of algorithms for approximate inference that makes a minimal set of assumptions by only requiring that sampling from a model is possible. We explain here the fundamentals of ABC, review the classical algorithms, and highlight recent developments. [ABC; approximate Bayesian computation; Bayesian inference; likelihood-free inference; phylogenetics; simulator-based models; stochastic simulation models; tree-based models.]

221 citations
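
The abstract above describes ABC as needing only the ability to sample from a model. A minimal sketch of the classical rejection variant of that idea follows. The Gaussian simulator, the uniform prior, the sample-mean summary statistic and the tolerance `epsilon` are all illustrative assumptions, not details taken from the paper.

```python
import random
import statistics


def simulate(mu, n, rng):
    """Hypothetical simulator: n draws from Normal(mu, 1)."""
    return [rng.gauss(mu, 1.0) for _ in range(n)]


def rejection_abc(observed, prior_sampler, simulator, summary,
                  epsilon, n_accept, rng, max_tries=1_000_000):
    """Keep prior draws whose simulated summary lands within epsilon
    of the observed summary. No likelihood is ever evaluated."""
    s_obs = summary(observed)
    accepted = []
    tries = 0
    while len(accepted) < n_accept and tries < max_tries:
        tries += 1
        theta = prior_sampler(rng)                # draw a candidate parameter
        fake = simulator(theta, len(observed), rng)
        if abs(summary(fake) - s_obs) <= epsilon:  # compare summaries
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = simulate(2.0, 50, rng)  # stand-in for real data (assumed true mu = 2)
posterior = rejection_abc(
    observed,
    prior_sampler=lambda r: r.uniform(-5.0, 5.0),  # assumed flat prior
    simulator=simulate,
    summary=statistics.fmean,                      # assumed summary statistic
    epsilon=0.1,
    n_accept=200,
    rng=rng,
)
posterior_mean = statistics.fmean(posterior)
```

The accepted values in `posterior` approximate draws from the posterior over `mu`. Shrinking `epsilon` tightens the approximation at the cost of more rejected simulations, which is the basic trade-off that more refined ABC algorithms try to improve on.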

Journal Article

Computational pan-genomics: status, promises and challenges.

Tobias Marschall¹, Manja Marz², Manja Marz¹, Thomas Abeel³, Louis Dijkstra, Bas E. Dutilh⁴, Ali Ghaffaari⁵, Ali Ghaffaari¹, Paul Kersey⁶, Wigard P. Kloosterman, Veli Mäkinen⁷, Adam M. Novak⁸, Benedict Paten⁸, David Porubsky, Eric Rivals, Can Alkan, Jasmijn A. Baaijens, Paul I.W. de Bakker, Valentina Boeva, Raoul J. P. Bonnal, Francesca Chiaromonte, Rayan Chikhi⁹, Francesca D. Ciccarelli, Robin Cijvat, Erwin Datema, Cornelia M. van Duijn, Evan E. Eichler⁸, Evan E. Eichler¹⁰, Corinna Ernst, Eleazar Eskin, Erik Garrison¹¹, Mohammed El-Kebir, Gunnar W. Klau, Jan O. Korbel¹¹, Eric-Wubbo Lameijer¹², Benjamin Langmead, Marcel Martin, Paul Medvedev¹³, John C. Mu¹⁴, Pieter B. Neerincx¹⁵, Klaasjan G. Ouwens, Pierre Peterlongo, Nadia Pisanti, Sven Rahmann, Ben Raphael, Knut Reinert, Dick de Ridder¹⁶, Jeroen de Ridder¹⁷, Matthias Schlesner, Ole Schulz-Trieglaff¹⁸, Ashley D. Sanders, Siavash Sheikhizadeh, Carl Shneider, Sandra Smit, Daniel Valenzuela¹⁹, Jiayin Wang²⁰, Lodewyk F. A. Wessels²¹, Y. Zhang, Victor Guryev, Fabio Vandin²², Kai Ye²⁰, Alexander Schönhuth

01 Jan 2018 • Briefings in Bioinformatics

TL;DR: Already available approaches to construct and use pan-genomes are examined, the potential benefits of future technologies and methodologies are discussed, and open challenges from the vantage point of the above-mentioned biological disciplines are reviewed.

Abstract: Many disciplines, from human genetics and oncology to plant breeding, microbiology and virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes. In case of Homo sapiens, the number of sequenced genomes will approach hundreds of thousands in the next few years. Simply scaling up established bioinformatics pipelines will not be sufficient for leveraging the full potential of such rich genomic data sets. Instead, novel, qualitatively different computational methods and paradigms are needed. We will witness the rapid extension of computational pan-genomics, a new sub-area of research in computational biology. In this article, we generalize existing definitions and understand a pan-genome as any collection of genomic sequences to be analyzed jointly or to be used as a reference. We examine already available approaches to construct and use pan-genomes, discuss the potential benefits of future technologies and methodologies and review open challenges from the vantage point of the above-mentioned biological disciplines. As a prominent example for a computational paradigm shift, we particularly highlight the transition from the representation of reference genomes as strings to representations as graphs. We outline how this and other challenges from different application domains translate into common computational problems, point out relevant bioinformatics techniques and identify open problems in computer science. With this review, we aim to increase awareness that a joint approach to computational pan-genomics can help address many of the problems currently faced in various domains.

220 citations

Journal Article

The Glanville fritillary genome retains an ancient karyotype and reveals selective chromosomal fusions in Lepidoptera

Virpi Ahola¹, Rainer Lehtonen, Panu Somervuo¹, Leena Salmela², Patrik Koskinen¹, Pasi Rastas¹, Niko Välimäki¹, Lars Paulin¹, Jouni Kvist¹, Niklas Wahlberg³, Jaakko Tanskanen¹, Emily A. Hornett⁴, Emily A. Hornett⁵, Laura Ferguson⁶, Shiqi Luo⁷, Zijuan Cao⁷, Maaike A. de Jong¹, Maaike A. de Jong⁸, Anne Duplouy¹, Olli-Pekka Smolander¹, Heiko Vogel⁹, Rajiv C. McCoy¹⁰, Kui Qian¹, Wong Swee Chong¹, Qin Zhang¹¹, Freed Ahmad¹², Jani K. Haukka¹¹, Aruj Joshi¹¹, Jarkko Salojärvi¹, Christopher W. Wheat¹³, Ewald Grosse-Wilde⁹, Daniel S.T. Hughes¹⁴, Daniel S.T. Hughes¹⁵, Riku Katainen¹, Esa Pitkänen¹, Johannes Ylinen², Robert M. Waterhouse¹⁶, Robert M. Waterhouse¹⁷, Robert M. Waterhouse¹⁸, Mikko P. Turunen¹, Anna Vähärautio¹⁹, Anna Vähärautio¹, Sami P. Ojanen¹, Alan H. Schulman¹, Minna Taipale¹⁹, Minna Taipale¹, Daniel Lawson¹⁴, Esko Ukkonen², Veli Mäkinen², Marian R. Goldsmith²⁰, Liisa Holm¹, Petri Auvinen¹, Mikko J. Frilander¹, Ilkka Hanski¹

05 Sep 2014 • Nature Communications

TL;DR: The genome of the Glanville fritillary butterfly, a widely recognized model species in metapopulation biology and eco-evolutionary research, is reported, which shows that fusion chromosomes have retained the ancestral chromosome segments and very few rearrangements have occurred across the fusion sites.

Abstract: Previous studies have reported that chromosome synteny in Lepidoptera has been well conserved, yet the number of haploid chromosomes varies widely from 5 to 223. Here we report the genome (393 Mb) ...

216 citations

Journal Article

Comprehensive Identification of Single Nucleotide Polymorphisms Associated with Beta-lactam Resistance within Pneumococcal Mosaic Genes

Claire Chewapreecha¹, Pekka Marttinen², Pekka Marttinen³, Nicholas J. Croucher⁴, Susannah J. Salter¹, Simon R. Harris¹, Alison E. Mather¹, William P. Hanage², David Goldblatt⁵, François Nosten⁶, Claudia Turner⁷, Paul Turner⁷, Stephen D. Bentley¹, Julian Parkhill¹

Wellcome Trust Sanger Institute¹, Harvard University², Helsinki Institute for Information Technology³, Imperial College London⁴, University College London⁵, University of Oxford⁶, Angkor Hospital for Children⁷

07 Aug 2014 • PLOS Genetics

TL;DR: A genome-wide association study was performed to identify single nucleotide polymorphisms (SNPs) and indels that could confer beta-lactam non-susceptibility, using 3,085 Thai and 616 USA pneumococcal isolates as independent datasets for the variant discovery.

Abstract: Traditional genetic association studies are very difficult in bacteria, as the generally limited recombination leads to large linked haplotype blocks, confounding the identification of causative variants. Beta-lactam antibiotic resistance in Streptococcus pneumoniae arises readily as the bacteria can quickly incorporate DNA fragments encompassing variants that make the transformed strains resistant. However, the causative mutations themselves are embedded within larger recombined blocks, and previous studies have only analysed a limited number of isolates, leading to the description of “mosaic genes” as being responsible for resistance. By comparing a large number of genomes of beta-lactam susceptible and non-susceptible strains, the high frequency of recombination should break up these haplotype blocks and allow the use of genetic association approaches to identify individual causative variants. Here, we performed a genome-wide association study to identify single nucleotide polymorphisms (SNPs) and indels that could confer beta-lactam non-susceptibility using 3,085 Thai and 616 USA pneumococcal isolates as independent datasets for the variant discovery. The large sample sizes allowed us to narrow the source of beta-lactam non-susceptibility from long recombinant fragments down to much smaller loci comprised of discrete or linked SNPs. While some loci appear to be universal resistance determinants, contributing equally to non-susceptibility for at least two classes of beta-lactam antibiotics, some play a larger role in resistance to particular antibiotics. All of the identified loci have a highly non-uniform distribution in the populations. They are enriched not only in vaccine-targeted, but also non-vaccine-targeted lineages, which may raise clinical concerns. Identification of single nucleotide polymorphisms underlying resistance will be essential for future use of genome sequencing to predict antibiotic sensitivity in clinical microbiology.

212 citations

Journal Article

Comparison of Bayesian predictive methods for model selection

Juho Piironen¹, Aki Vehtari¹

Helsinki Institute for Information Technology¹

01 May 2017 • Statistics and Computing

TL;DR: The study demonstrates that the model selection can greatly benefit from using cross-validation outside the searching process both for guiding the model size selection and assessing the predictive performance of the finally selected model.

Abstract: The goal of this paper is to compare several widely used Bayesian model selection methods in practical model selection problems, highlight their differences and give recommendations about the preferred approaches. We focus on the variable subset selection for regression and classification and perform several numerical experiments using both simulated and real world data. The results show that the optimization of a utility estimate such as the cross-validation (CV) score is liable to finding overfitted models due to relatively high variance in the utility estimates when the data is scarce. This can also lead to substantial selection induced bias and optimism in the performance evaluation for the selected model. From a predictive viewpoint, best results are obtained by accounting for model uncertainty by forming the full encompassing model, such as the Bayesian model averaging solution over the candidate models. If the encompassing model is too complex, it can be robustly simplified by the projection method, in which the information of the full model is projected onto the submodels. This approach is substantially less prone to overfitting than selection based on CV-score. Overall, the projection method appears to outperform also the maximum a posteriori model and the selection of the most probable variables. The study also demonstrates that the model selection can greatly benefit from using cross-validation outside the searching process both for guiding the model size selection and assessing the predictive performance of the finally selected model.

207 citations

Authors

Showing all 632 results

Name	H-index	Papers	Citations
Dimitri P. Bertsekas	94	332	85939
Olli Kallioniemi	90	353	42021
Heikki Mannila	72	295	26500
Jukka Corander	66	411	17220
Jaakko Kangasjärvi	62	146	17096
Aapo Hyvärinen	61	301	44146
Samuel Kaski	58	522	14180
Nadarajah Asokan	58	327	11947
Aristides Gionis	58	292	19300
Hannu Toivonen	56	192	19316
Nicola Zamboni	53	128	11397
Jorma Rissanen	52	151	22720
Tero Aittokallio	52	271	8689
Juha Veijola	52	261	19588
Juho Hamari	51	176	16631

Network Information

Related Institutions (5)

Institution	Papers	Citations	Related
Google	39.8K	2.1M	93%
Microsoft	86.9K	4.1M	(not given)
(name not given)	38.6K	1.3M	92%
Carnegie Mellon University	104.3K	5.9M	91%
Facebook	10.9K	570.1K	91%

Performance

Metrics

Papers: 1,967

Citations: 76,126

No. of papers from the Institution in previous years
Year	Papers
2023	1
2022	4
2021	85
2020	97
2019	140
2018	127