Institution

Helsinki Institute for Information Technology

Facility•Espoo, Finland

About: Helsinki Institute for Information Technology is a research facility based in Espoo, Finland. It is known for research contributions on the topics Population and Bayesian network. The organization has 630 authors who have published 1962 publications receiving 63426 citations.

Topics: Population, Bayesian network, The Internet, Mobile computing, Cluster analysis

Papers

Proceedings Article•DOI•

Linguistic feature extraction using independent component analysis

Timo Honkela¹, Aapo Hyvärinen²

Helsinki University of Technology¹, Helsinki Institute for Information Technology²

25 Jul 2004

TL;DR: Independent component analysis applied to word context data yields distinct features that reflect syntactic and semantic categories, without any human supervision or tagged corpora carrying predetermined morphological, syntactic or semantic information.

Abstract: Our aim is to find syntactic and semantic relationships of words based on the analysis of corpora. We propose the application of independent component analysis, which seems to have clear advantages over two classic methods: latent semantic analysis and self-organizing maps. Latent semantic analysis is a simple method for automatic generation of concepts that are useful, e.g., in encoding documents for information retrieval purposes. However, these concepts cannot easily be interpreted by humans. Self-organizing maps can be used to generate an explicit diagram which characterizes the relationships between words. The resulting map reflects syntactic categories in the overall organization and semantic categories in the local level. The self-organizing map does not, however, provide any explicit distinct categories for the words. Independent component analysis applied on word context data gives distinct features which reflect syntactic and semantic categories. Thus, independent component analysis gives features or categories that are both explicit and can easily be interpreted by humans. This result can be obtained without any human supervision or tagged corpora that would have some predetermined morphological, syntactic or semantic information.

30 citations

Posted Content•DOI•

Crowdsourced mapping extends the target space of kinase inhibitors

Anna Cichonska¹,²,³, Balaguru Ravikumar², Robert J. Allaway, Sungjoon Park⁴, Fangping Wan⁵, Olexandr Isayev⁶, Shuya Li⁵, Michael Mason, Andrew Lamb, Ziaurrehman Tanoli², Minji Jeon⁴, Sunkyu Kim⁴, Mariya Popova⁶, Stephen J. Capuzzi⁷, Jianyang Zeng⁵, Kristen K. Dang, Gregory Koytiger, Jaewoo Kang⁴, Carrow I. Wells⁷, Timothy M. Willson⁷, Tudor I. Oprea⁸, Avner Schlessinger⁹, David H. Drewry⁷, Gustavo Stolovitzky¹⁰, Krister Wennerberg¹¹, Justin Guinney, Tero Aittokallio

University of Turku¹, University of Helsinki², Helsinki Institute for Information Technology³, Korea University⁴, Tsinghua University⁵, Carnegie Mellon University⁶, University of North Carolina at Chapel Hill⁷, University of New Mexico⁸, Icahn School of Medicine at Mount Sinai⁹, IBM¹⁰, University of Copenhagen¹¹

11 Feb 2020-bioRxiv

TL;DR: A crowdsourced benchmarking of the accuracy of machine learning (ML) algorithms at predicting kinase inhibitor potencies across multiple kinase families demonstrated that these models and their ensemble can improve the accuracies of experimental mapping efforts, especially for so far under-studied kinases.

Abstract: Despite decades of intensive search for compounds that modulate the activity of particular targets, there are currently small-molecules available only for a small proportion of the human proteome. Effective approaches are therefore required to map the massive space of unexplored compound-target interactions for novel and potent activities. Here, we carried out a crowdsourced benchmarking of predictive models for kinase inhibitor potencies across multiple kinase families using unpublished bioactivity data. The top-performing predictions were based on kernel learning, gradient boosting and deep learning, and their ensemble resulted in predictive accuracy exceeding that of kinase activity assays. We then made new experiments based on the model predictions, which further improved the accuracy of experimental mapping efforts and identified unexpected potencies even for under-studied kinases. The open-source algorithms together with the novel bioactivities between 95 compounds and 295 kinases provide a resource for benchmarking new prediction algorithms and for extending the druggable kinome.

30 citations

Journal Article•DOI•

Multi-task and multi-view learning of user state

Melih Kandemir¹, Akos Vetek², Mehmet Gönen³, Arto Klami⁴, Samuel Kaski⁴

Heidelberg University¹, Nokia², Sage Bionetworks³, Helsinki Institute for Information Technology⁴

01 Sep 2014-Neurocomputing

TL;DR: This work discusses how two recent machine learning concepts, multi-view learning and multi-task learning, can be adapted for user state recognition, and illustrates how they can be effectively combined in a unified model by introducing a novel algorithm.

30 citations

Journal Article•DOI•

Integrating sequence, evolution and functional genomics in regulatory genomics.

Martin Vingron¹, Alvis Brazma², Richard M.R. Coulson², Jacques van Helden³, Thomas Manke¹, Kimmo Palin⁴, Olivier Sand³, Esko Ukkonen⁵

Max Planck Society¹, Wellcome Trust², Université libre de Bruxelles³, Wellcome Trust Sanger Institute⁴, Helsinki Institute for Information Technology⁵

30 Jan 2009-Genome Biology

TL;DR: With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome.

Abstract: With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome.

30 citations

Journal Article•DOI•

Bayesian network structure learning with integer programming: polytopes, facets and complexity

James Cussens¹, Matti Järvisalo², Janne H. Korhonen³, Mark Bartlett¹

University of York¹, Helsinki Institute for Information Technology², Aalto University³

01 Jan 2017-Journal of Artificial Intelligence Research

TL;DR: This paper proves that the separation problem is NP-hard and analyses the relationship between three key polytopes underlying the IP-based approach to BNSL.

Abstract: The challenging task of learning structures of probabilistic graphical models is an important problem within modern AI research. Recent years have witnessed several major algorithmic advances in structure learning for Bayesian networks--arguably the most central class of graphical models--especially in what is known as the score-based setting. A successful generic approach to optimal Bayesian network structure learning (BNSL), based on integer programming (IP), is implemented in the gobnilp system. Despite the recent algorithmic advances, current understanding of foundational aspects underlying the IP based approach to BNSL is still somewhat lacking. Understanding fundamental aspects of cutting planes and the related separation problem is important not only from a purely theoretical perspective, but also since it holds out the promise of further improving the efficiency of state-of-the-art approaches to solving BNSL exactly. In this paper, we make several theoretical contributions towards these goals: (i) we study the computational complexity of the separation problem, proving that the problem is NP-hard; (ii) we formalise and analyse the relationship between three key polytopes underlying the IP-based approach to BNSL; (iii) we study the facets of the three polytopes both from the theoretical and practical perspective, providing, via exhaustive computation, a complete enumeration of facets for low-dimensional family-variable polytopes; and, furthermore, (iv) we establish a tight connection of the BNSL problem to the acyclic subgraph problem.

30 citations

Authors

Showing all 632 results

Name	H-index	Papers	Citations
Dimitri P. Bertsekas	94	332	85939
Olli Kallioniemi	90	353	42021
Heikki Mannila	72	295	26500
Jukka Corander	66	411	17220
Jaakko Kangasjärvi	62	146	17096
Aapo Hyvärinen	61	301	44146
Samuel Kaski	58	522	14180
Nadarajah Asokan	58	327	11947
Aristides Gionis	58	292	19300
Hannu Toivonen	56	192	19316
Nicola Zamboni	53	128	11397
Jorma Rissanen	52	151	22720
Tero Aittokallio	52	271	8689
Juha Veijola	52	261	19588
Juho Hamari	51	176	16631

Network Information

Related Institutions (5)

Google

39.8K papers, 2.1M citations

93% related

Microsoft

86.9K papers, 4.1M citations

38.6K papers, 1.3M citations

92% related

Carnegie Mellon University

104.3K papers, 5.9M citations

91% related

Facebook

10.9K papers, 570.1K citations

91% related

Performance

Metrics

Papers: 1,967

Citations: 76,126

No. of papers from the Institution in previous years
Year	Papers
2023	1
2022	4
2021	85
2020	97
2019	140
2018	127