Institution

Helsinki Institute for Information Technology

Facility • Espoo, Finland

About: Helsinki Institute for Information Technology is a facility organization based in Espoo, Finland. It is known for research contributions in the topics Population and Bayesian network. The organization has 630 authors who have published 1962 publications receiving 63426 citations.

Topics: Population, Bayesian network, The Internet, Mobile computing, Cluster analysis

Papers

Journal Article•DOI•

Metabolite identification through multiple kernel learning on fragmentation trees.

Huibin Shen¹, Kai Dührkop¹, Sebastian Böcker¹, Juho Rousu¹•Institutions (1)

Helsinki Institute for Information Technology¹

15 Jun 2014-Bioinformatics

TL;DR: This work combines fragmentation tree computations with kernel-based machine learning to predict molecular fingerprints and identify molecular structures. It introduces a family of kernels capturing the similarity of fragmentation trees and combines them using recently proposed multiple kernel learning approaches.

Abstract: Motivation: Metabolite identification from tandem mass spectrometric data is a key task in metabolomics. Various computational methods have been proposed for the identification of metabolites from tandem mass spectra. Fragmentation tree methods explore the space of possible ways in which the metabolite can fragment, and base the metabolite identification on scoring of these fragmentation trees. Machine learning methods have been used to map mass spectra to molecular fingerprints; predicted fingerprints, in turn, can be used to score candidate molecular structures. Results: Here, we combine fragmentation tree computations with kernel-based machine learning to predict molecular fingerprints and identify molecular structures. We introduce a family of kernels capturing the similarity of fragmentation trees, and combine these kernels using recently proposed multiple kernel learning approaches. Experiments on two large reference datasets show that the new methods significantly improve molecular fingerprint prediction accuracy. These improvements result in better metabolite identification, doubling the number of metabolites ranked at the top position of the candidates list. Contact: huibin.shen@aalto.fi Supplementary information: Supplementary data are available at Bioinformatics online.
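To make the idea of combining several kernels concrete, here is a minimal sketch of one common multiple kernel learning heuristic: weighting each kernel by its centered alignment with a target kernel. The abstract does not say which combination schemes were used, so the choice of alignment weighting, the function names, and the toy matrices are all assumptions for illustration, not the authors' method.

```python
from math import sqrt


def center(K):
    """Center a kernel matrix in feature space (subtract row/column means)."""
    n = len(K)
    row = [sum(r) / n for r in K]
    col = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    tot = sum(row) / n
    return [[K[i][j] - row[i] - col[j] + tot for j in range(n)] for i in range(n)]


def frobenius(A, B):
    """Frobenius inner product of two equally sized matrices."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def alignment(K, T):
    """Centered alignment between kernel K and target kernel T."""
    Kc, Tc = center(K), center(T)
    denom = sqrt(frobenius(Kc, Kc) * frobenius(Tc, Tc))
    return frobenius(Kc, Tc) / denom if denom > 0 else 0.0


def combine_kernels(kernels, target):
    """Weight each kernel by its (non-negative) alignment with the target.

    Falls back to uniform weights when no kernel aligns with the target.
    Returns the normalised weights and the weighted-sum kernel.
    """
    raw = [max(alignment(K, target), 0.0) for K in kernels]
    total = sum(raw)
    if total == 0:
        weights = [1.0 / len(kernels)] * len(kernels)
    else:
        weights = [w / total for w in raw]
    n = len(target)
    combined = [
        [sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]
    return weights, combined


# Toy usage: a hypothetical fingerprint bit for four spectra gives the target.
labels = [1, 1, -1, -1]
target = [[a * b for b in labels] for a in labels]
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
weights, combined = combine_kernels([target, identity], target)
print(weights)
```

In this sketch the target kernel is built from a single label vector; predicting a whole fingerprint would repeat the procedure per bit or use a kernel over all bits, which is left open here.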

89 citations

Journal Article•DOI•

Panayiotis Tsaparas¹, Leonardo Mariño-Ramírez², Olivier Bodenreider², Eugene V. Koonin², I. King Jordan²,³•Institutions (3)

Helsinki Institute for Information Technology¹, National Institutes of Health², Georgia Institute of Technology³

12 Sep 2006-BMC Evolutionary Biology

TL;DR: The dissonance between global versus local network divergence suggests that the interspecies similarity of the global network properties is of limited biological significance, at best, and that the biologically relevant aspects of the architectures of gene coexpression are specific and particular, rather than universal.

Abstract: A genome-wide comparative analysis of human and mouse gene expression patterns was performed in order to evaluate the evolutionary divergence of mammalian gene expression. Tissue-specific expression profiles were analyzed for 9,105 human-mouse orthologous gene pairs across 28 tissues. Expression profiles were resolved into species-specific coexpression networks, and the topological properties of the networks were compared between species. At the global level, the topological properties of the human and mouse gene coexpression networks are, essentially, identical. For instance, both networks have topologies with small-world and scale-free properties as well as closely similar average node degrees, clustering coefficients, and path lengths. However, the human and mouse coexpression networks are highly divergent at the local level: only a small fraction (<10%) of coexpressed gene pair relationships are conserved between the two species. A series of controls for experimental and biological variance show that most of this divergence does not result from experimental noise. We further show that, while the expression divergence between species is genuinely rapid, expression does not evolve free from selective (functional) constraint. Indeed, the coexpression networks analyzed here are demonstrably functionally coherent as indicated by the functional similarity of coexpressed gene pairs, and this pattern is most pronounced in the conserved human-mouse intersection network. Numerous dense network clusters show evidence of dedicated functions, such as spermatogenesis and immune response, that are clearly consistent with the coherence of the expression patterns of their constituent gene members. 
The dissonance between global versus local network divergence suggests that the interspecies similarity of the global network properties is of limited biological significance, at best, and that the biologically relevant aspects of the architectures of gene coexpression are specific and particular, rather than universal. Nevertheless, there is substantial evolutionary conservation of the local network structure which is compatible with the notion that gene coexpression networks are subject to purifying selection.
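The comparison described in the abstract (build a coexpression network per species, then compare global properties and local edge conservation) can be sketched as follows. This is a toy illustration only: the Pearson correlation measure, the 0.7 threshold, the intersection-over-union definition of conservation, and the four-gene profiles are all assumptions, not values taken from the paper.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def coexpression_edges(profiles, threshold=0.7):
    """Link two genes when their profiles correlate at or above the threshold.

    profiles maps a gene name to its expression values across tissues.
    The threshold is an illustrative assumption.
    """
    return {
        frozenset((g, h))
        for g, h in combinations(sorted(profiles), 2)
        if pearson(profiles[g], profiles[h]) >= threshold
    }


def mean_degree(edges, genes):
    """Global property: average number of coexpression partners per gene."""
    return 2 * len(edges) / len(genes)


def conserved_fraction(edges_a, edges_b):
    """Local property: share of coexpressed pairs present in both networks."""
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 0.0


# Toy profiles for four orthologous genes across four tissues (made-up data).
human = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 3, 2, 1], "D": [1, 3, 2, 4]}
mouse = {"A": [1, 2, 3, 4], "B": [4, 3, 2, 1], "C": [1, 2, 3, 5], "D": [2, 1, 4, 3]}
human_edges = coexpression_edges(human)
mouse_edges = coexpression_edges(mouse)
print(mean_degree(human_edges, human), mean_degree(mouse_edges, mouse))
print(conserved_fraction(human_edges, mouse_edges))
```

Global summaries such as mean degree are computed per network, while conservation is computed on matched gene pairs, which is why the two levels of comparison can disagree.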

88 citations

Journal Article•DOI•

Metabolic Regulation in Progression to Autoimmune Diabetes

Marko Sysi-Aho¹, Andrey Ermolov², Peddinti Gopalacharyulu¹, Abhishek Tripathi², Tuulikki Seppänen-Laakso¹, Johanna Maukonen¹, Ismo Mattila³, Suvi T. Ruohonen⁴, Laura H. Vähätalo⁴, Laxman Yetukuri¹, Taina Härkönen⁵, Erno Lindfors¹, Janne Nikkilä⁶, Jorma Ilonen⁴,⁷, Olli Simell⁴, Maria Saarela¹, Mikael Knip⁵, Samuel Kaski², Eriika Savontaus⁴,⁸, Matej Orešič³•Institutions (8)

VTT Technical Research Centre of Finland¹, Helsinki Institute for Information Technology², Steno Diabetes Center³, University of Turku⁴, University of Helsinki⁵, Finnish Red Cross⁶, University of Eastern Finland⁷, Turku University Hospital⁸

27 Oct 2011-PLOS Computational Biology

TL;DR: It is shown that female NOD mice who later progress to autoimmune diabetes exhibit the same lipidomic pattern as prediabetic children, and the findings indicate that autoimmune diabetes is preceded by a state of increased metabolic demands on the islets resulting in elevated insulin secretion.

Abstract: Recent evidence from serum metabolomics indicates that specific metabolic disturbances precede β-cell autoimmunity in humans and can be used to identify those children who subsequently progress to type 1 diabetes. The mechanisms behind these disturbances are unknown. Here we show the specificity of the pre-autoimmune metabolic changes, as indicated by their conservation in a murine model of type 1 diabetes. We performed a study in non-obese prediabetic (NOD) mice which recapitulated the design of the human study and derived the metabolic states from longitudinal lipidomics data. We show that female NOD mice who later progress to autoimmune diabetes exhibit the same lipidomic pattern as prediabetic children. These metabolic changes are accompanied by enhanced glucose-stimulated insulin secretion, normoglycemia, upregulation of insulinotropic amino acids in islets, elevated plasma leptin and adiponectin, and diminished gut microbial diversity of the Clostridium leptum group. Together, the findings indicate that autoimmune diabetes is preceded by a state of increased metabolic demands on the islets resulting in elevated insulin secretion and suggest alternative metabolic related pathways as therapeutic targets to prevent diabetes.

87 citations

Journal Article•DOI•

Fast scaffolding with small independent mixed integer programs

Leena Salmela¹, Veli Mäkinen¹, Niko Välimäki¹, Johannes Ylinen¹, Esko Ukkonen¹•Institutions (1)

Helsinki Institute for Information Technology¹

01 Dec 2011-Bioinformatics

TL;DR: A technique is presented that divides the scaffolding problem into smaller subproblems and solves these with mixed integer programming. The resulting tool, MIP Scaffolder, is fast and produces scaffolds as good as or better than those of its competitors on large genomes.

Abstract: Motivation: Assembling genomes from short read data has become increasingly popular, but the problem remains computationally challenging especially for larger genomes. We study the scaffolding phase of sequence assembly where preassembled contigs are ordered based on mate pair data. Results: We present MIP Scaffolder that divides the scaffolding problem into smaller subproblems and solves these with mixed integer programming. The scaffolding problem can be represented as a graph and the biconnected components of this graph can be solved independently. We present a technique for restricting the size of these subproblems so that they can be solved accurately with mixed integer programming. We compare MIP Scaffolder to two state of the art methods, SOPRA and SSPACE. MIP Scaffolder is fast and produces better or as good scaffolds as its competitors on large genomes. Availability: The source code of MIP Scaffolder is freely available at http://www.cs.helsinki.fi/u/lmsalmel/mip-scaffolder/. Contact: leena.salmela@cs.helsinki.fi
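The decomposition step mentioned in the abstract, splitting the scaffolding graph into biconnected components that can be handled independently, can be illustrated with a small sketch. It assumes a simple undirected graph whose nodes are contigs and whose edges are mate-pair links; that encoding, the function name, and the toy graph are assumptions, and the mixed integer program that would then be solved per component is not shown.

```python
def biconnected_components(graph):
    """Return the biconnected components of a simple undirected graph.

    graph maps each node to the set of its neighbours. Each component is
    returned as a set of nodes; articulation nodes appear in several sets.
    Uses the classic depth-first search with an edge stack.
    """
    index = {}        # discovery order of each node
    low = {}          # lowest discovery index reachable from the subtree
    edge_stack = []   # edges of the components currently being built
    components = []
    counter = [0]

    def dfs(u, parent):
        index[u] = low[u] = counter[0]
        counter[0] += 1
        for v in graph[u]:
            if v == parent:
                continue
            if v not in index:
                edge_stack.append((u, v))
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] >= index[u]:
                    # u separates the subtree of v: pop one whole component.
                    comp = set()
                    while True:
                        a, b = edge_stack.pop()
                        comp.update((a, b))
                        if (a, b) == (u, v):
                            break
                    components.append(comp)
            elif index[v] < index[u]:
                # Back edge to an ancestor.
                edge_stack.append((u, v))
                low[u] = min(low[u], index[v])

    for node in graph:
        if node not in index:
            dfs(node, None)
    return components


# Toy contig graph: two triangles joined by a single link (c3, c4).
contig_links = {
    "c1": {"c2", "c3"},
    "c2": {"c1", "c3"},
    "c3": {"c1", "c2", "c4"},
    "c4": {"c3", "c5", "c6"},
    "c5": {"c4", "c6"},
    "c6": {"c4", "c5"},
}
for component in biconnected_components(contig_links):
    print(sorted(component))
```

The recursive search is fine for a toy graph; a real assembly graph would need an iterative version or a graph library to avoid hitting the recursion limit.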

87 citations

Journal Article•DOI•

Empirical evaluation of scoring functions for Bayesian network model selection

Zhifa Liu¹, Brandon Malone¹,², Changhe Yuan¹,³•Institutions (3)

Mississippi State University¹, Helsinki Institute for Information Technology², Queens College³

11 Sep 2012-BMC Bioinformatics

TL;DR: A major finding of this study suggests that the Minimum Description Length (MDL) consistently outperforms other scoring functions such as Akaike's information criterion (AIC), Bayesian Dirichlet equivalence score (BDeu), and factorized normalized maximum likelihood (fNML) in recovering the underlying Bayesian network structures.

Abstract: In this work, we empirically evaluate the capability of various scoring functions of Bayesian networks for recovering true underlying structures. Similar investigations have been carried out before, but they typically relied on approximate learning algorithms to learn the network structures. The suboptimal structures found by the approximation methods have unknown quality and may affect the reliability of their conclusions. Our study uses an optimal algorithm to learn Bayesian network structures from datasets generated from a set of gold standard Bayesian networks. Because all optimal algorithms always learn equivalent networks, this ensures that only the choice of scoring function affects the learned networks. Another shortcoming of the previous studies stems from their use of random synthetic networks as test cases. There is no guarantee that these networks reflect real-world data. We use real-world data to generate our gold-standard structures, so our experimental design more closely approximates real-world situations. A major finding of our study suggests that, in contrast to results reported by several prior works, the Minimum Description Length (MDL) (or equivalently, Bayesian information criterion (BIC)) consistently outperforms other scoring functions such as Akaike's information criterion (AIC), Bayesian Dirichlet equivalence score (BDeu), and factorized normalized maximum likelihood (fNML) in recovering the underlying Bayesian network structures. We believe this finding is a result of using both datasets generated from real-world applications rather than from random processes used in previous studies and learning algorithms to select high-scoring structures rather than selecting random models. Other findings of our study support existing work, e.g., large sample sizes result in learning structures closer to the true underlying structure; the BDeu score is sensitive to the parameter settings; and the fNML performs pretty well on small datasets. 
We also tested a greedy hill climbing algorithm and observed similar results as the optimal algorithm.
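As a rough illustration of what a decomposable scoring function looks like, the sketch below computes the MDL/BIC and AIC scores of a single node given a candidate parent set in a discrete Bayesian network. The data layout (a list of dicts), the function names, and the choice to estimate variable arities from the observed values are assumptions for this example; BDeu and fNML are not shown.

```python
from collections import Counter
from math import log


def family_log_likelihood(data, child, parents):
    """Maximised log-likelihood of child given its parents (natural log)."""
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    return sum(n * log(n / parent_counts[pa]) for (pa, _), n in joint.items())


def num_free_params(data, child, parents):
    """(child arity - 1) times the number of parent configurations.

    Arities are estimated from the values seen in the data, which is an
    assumption of this sketch.
    """
    r = len({row[child] for row in data})
    q = 1
    for p in parents:
        q *= len({row[p] for row in data})
    return (r - 1) * q


def bic_score(data, child, parents):
    """MDL/BIC family score: log-likelihood minus (log N / 2) per parameter."""
    ll = family_log_likelihood(data, child, parents)
    return ll - 0.5 * log(len(data)) * num_free_params(data, child, parents)


def aic_score(data, child, parents):
    """AIC family score: log-likelihood minus one unit per parameter."""
    ll = family_log_likelihood(data, child, parents)
    return ll - num_free_params(data, child, parents)


# Toy data in which Y copies X exactly, so X should be preferred as a parent.
rows = [{"X": x, "Y": x} for x in (0, 1)] * 4
print(bic_score(rows, "Y", ["X"]), bic_score(rows, "Y", []))
print(aic_score(rows, "Y", ["X"]), aic_score(rows, "Y", []))
```

The score of a whole network is the sum of these family scores over all nodes, so a structure learning algorithm only needs to compare candidate parent sets node by node.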

87 citations

Authors

Showing all 632 results

Name	H-index	Papers	Citations
Dimitri P. Bertsekas	94	332	85939
Olli Kallioniemi	90	353	42021
Heikki Mannila	72	295	26500
Jukka Corander	66	411	17220
Jaakko Kangasjärvi	62	146	17096
Aapo Hyvärinen	61	301	44146
Samuel Kaski	58	522	14180
Nadarajah Asokan	58	327	11947
Aristides Gionis	58	292	19300
Hannu Toivonen	56	192	19316
Nicola Zamboni	53	128	11397
Jorma Rissanen	52	151	22720
Tero Aittokallio	52	271	8689
Juha Veijola	52	261	19588
Juho Hamari	51	176	16631

Network Information

Related Institutions (5)

Google

39.8K papers, 2.1M citations

93% related

Microsoft

86.9K papers, 4.1M citations

38.6K papers, 1.3M citations

92% related

Carnegie Mellon University

104.3K papers, 5.9M citations

91% related

Facebook

10.9K papers, 570.1K citations

91% related

Performance

Metrics

Papers: 1,967

Citations: 76,126

No. of papers from the Institution in previous years
Year	Papers
2023	1
2022	4
2021	85
2020	97
2019	140
2018	127