Institution

Helsinki Institute for Information Technology

Facility • Espoo, Finland

About: Helsinki Institute for Information Technology is a facility based in Espoo, Finland. It is known for research contributions on the topics of population and Bayesian networks. The organization has 630 authors who have published 1962 publications receiving 63426 citations.

Topics: Population, Bayesian network, The Internet, Mobile computing, Cluster analysis

Papers

Proceedings Article

Iterative Supervised Principal Components

Juho Piironen, Aki Vehtari¹ • Institutions (1)

Helsinki Institute for Information Technology¹

31 Mar 2018

TL;DR: The authors propose iterative supervised principal components (ISPC), a method that combines variable screening and dimension reduction to cut the number of features to a size that Bayesian methods can handle at reasonable computational cost.

Abstract: In high-dimensional prediction problems, where the number of features may greatly exceed the number of training instances, a fully Bayesian approach with a sparsifying prior is known to produce good results but is computationally challenging. To alleviate this computational burden, we propose to use a preprocessing step where we first apply a dimension reduction to the original data to reduce the number of features to something that is computationally conveniently handled by Bayesian methods. To do this, we propose a new dimension reduction technique, called iterative supervised principal components (ISPC), which combines variable screening and dimension reduction and can be considered as an extension to the existing technique of supervised principal components (SPCs). Our empirical evaluations confirm that, although not foolproof, the proposed approach provides very good results on several microarray benchmark datasets with very affordable computation time, and can also be very useful for visualizing high-dimensional data.

12 citations
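
The abstract above builds on supervised principal components (SPC). As a rough illustration of that existing baseline only, and not of the authors' iterative ISPC procedure, the sketch below screens features by absolute correlation with the target and then projects the surviving features onto their first principal direction. The function names, the correlation-based score, the `threshold` parameter, and the use of power iteration are all assumptions made for this example.

```python
import math


def univariate_scores(X, y):
    """Absolute Pearson correlation of each feature column with the target y."""
    n, d = len(X), len(X[0])
    y_mean = sum(y) / n
    y_c = [v - y_mean for v in y]
    y_norm = math.sqrt(sum(v * v for v in y_c))
    scores = []
    for j in range(d):
        col = [row[j] for row in X]
        m = sum(col) / n
        c = [v - m for v in col]
        norm = math.sqrt(sum(v * v for v in c))
        if norm == 0 or y_norm == 0:
            scores.append(0.0)  # constant column: no usable signal
        else:
            dot = sum(a * b for a, b in zip(c, y_c))
            scores.append(abs(dot) / (norm * y_norm))
    return scores


def first_principal_direction(X, iters=200):
    """Leading eigenvector of Xc^T Xc (Xc = column-centered X) by power iteration."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    Xc = [[row[j] - means[j] for j in range(d)] for row in X]
    v = [1.0 / (j + 1) for j in range(d)]  # arbitrary deterministic start
    for _ in range(iters):
        proj = [sum(r[j] * v[j] for j in range(d)) for r in Xc]
        w = [sum(Xc[i][j] * proj[i] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v, means


def supervised_pc(X, y, threshold):
    """Screen features by score, then return (kept indices, first-component scores)."""
    scores = univariate_scores(X, y)
    kept = [j for j, s in enumerate(scores) if s >= threshold]
    if not kept:
        raise ValueError("no feature passed the screening threshold")
    X_kept = [[row[j] for j in kept] for row in X]
    v, means = first_principal_direction(X_kept)
    z = [sum((row[j] - means[j]) * v[j] for j in range(len(kept))) for row in X_kept]
    return kept, z
```

The one-dimensional summary `z` could then be passed to a downstream model in place of the original high-dimensional features. In practice the threshold would typically be chosen by cross-validation.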

Book Chapter

Finding representative nodes in probabilistic graphs

Laura Langohr¹, Hannu Toivonen¹ • Institutions (1)

Helsinki Institute for Information Technology¹

01 Jan 2012

TL;DR: This chapter introduces the problem of identifying representative nodes in probabilistic graphs, motivated by the need to produce different simple views of large BisoNets; the results suggest that clustering-based approaches can find a representative set of nodes.

Abstract: We introduce the problem of identifying representative nodes in probabilistic graphs, motivated by the need to produce different simple views to large BisoNets. We define a probabilistic similarity measure for nodes, and then apply clustering methods to find groups of nodes. Finally, a representative is output from each cluster. We report on experiments with real biomedical data, using both the k-medoids and hierarchical clustering methods in the clustering step. The results suggest that the clustering based approaches are capable of finding a representative set of nodes.

12 citations
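
The three steps named in the abstract above (a probabilistic node similarity, clustering, one representative per cluster) can be sketched as follows. This is only an illustration under stated assumptions: the similarity used here is the probability of the most probable path between two nodes, which the abstract does not specify, and the k-medoids loop is a simple alternating variant with a deterministic start. All function names are hypothetical.

```python
import heapq
import math


def build_graph(edges):
    """Undirected probabilistic graph: node -> {neighbor: edge probability}."""
    graph = {}
    for u, v, p in edges:
        graph.setdefault(u, {})[v] = p
        graph.setdefault(v, {})[u] = p
    return graph


def best_path_costs(graph, source):
    """Dijkstra on -log(p): cost[v] = -log of the best path probability to v."""
    cost = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        c, u = heapq.heappop(heap)
        if c > cost.get(u, math.inf):
            continue  # stale heap entry
        for v, p in graph[u].items():
            nc = c - math.log(p)
            if nc < cost.get(v, math.inf):
                cost[v] = nc
                heapq.heappush(heap, (nc, v))
    return cost


def k_medoids(nodes, dist, k, max_iter=100):
    """Alternate assignment and medoid update; returns {medoid: members}."""
    medoids = list(nodes[:k])  # deterministic initialisation (an assumption)
    clusters = {}
    for _ in range(max_iter):
        clusters = {m: [] for m in medoids}
        for x in nodes:
            if x in clusters:
                nearest = x  # a medoid always stays in its own cluster
            else:
                nearest = min(medoids, key=lambda m: dist[x].get(m, math.inf))
            clusters[nearest].append(x)
        new_medoids = [
            min(members, key=lambda c: sum(dist[c].get(x, math.inf) for x in members))
            for members in clusters.values()
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return clusters


def representative_nodes(edges, k):
    """Cluster nodes by best-path distance and return the clusters keyed by medoid."""
    graph = build_graph(edges)
    nodes = sorted(graph)
    dist = {u: best_path_costs(graph, u) for u in nodes}
    return k_medoids(nodes, dist, k)
```

Each key of the returned dictionary serves as the representative of the nodes listed under it. Hierarchical clustering, which the abstract also mentions, would replace only the `k_medoids` step.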

Book Chapter

Maintaining sliding-window neighborhood profiles in interaction networks

Rohit Kumar¹, Toon Calders¹, Aristides Gionis², Nikolaj Tatti² • Institutions (2)

Université libre de Bruxelles¹, Helsinki Institute for Information Technology²

07 Sep 2015

TL;DR: An online streaming algorithm to maintain neighborhood profiles in the sliding-window model is presented, which is highly scalable as it permits parallel processing and the computation is node centric, hence it scales easily to very large networks on a distributed system, like Apache Giraph.

Abstract: Large networks are being generated by applications that keep track of relationships between different data entities. Examples include online social networks recording interactions between individuals, sensor networks logging information exchanges between sensors, and more. There is a large body of literature on computing exact or approximate properties on large networks, although most methods assume static networks. On the other hand, in most modern real-world applications, networks are highly dynamic and continuous interactions along existing connections are generated. Furthermore, it is desirable to consider that old edges become less important, and their contribution to the current view of the network diminishes over time. We study the problem of maintaining the neighborhood profile of each node in an interaction network. Maintaining such a profile has applications in modeling network evolution and monitoring the importance of the nodes of the network over time. We present an online streaming algorithm to maintain neighborhood profiles in the sliding-window model. The algorithm is highly scalable as it permits parallel processing and the computation is node centric, hence it scales easily to very large networks on a distributed system, like Apache Giraph. We present results from both serial and parallel implementations of the algorithm for different social networks. The summary of the graph is maintained such that query of any window length can be performed.

12 citations
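
To make the sliding-window idea in the abstract above concrete, here is a deliberately simplified sketch that covers only direct (one-hop) neighbors. It is not the authors' algorithm. Each node stores the most recent interaction time per neighbor, which is enough to answer a neighbor-count query for any window length ending at the current time. The class and method names are hypothetical.

```python
from collections import defaultdict


class NeighborhoodProfiles:
    """Per-node summary: latest interaction timestamp for each direct neighbor."""

    def __init__(self):
        self.last_seen = defaultdict(dict)

    def add_interaction(self, u, v, t):
        """Process one streamed interaction (u, v) observed at time t."""
        for a, b in ((u, v), (v, u)):
            prev = self.last_seen[a].get(b)
            if prev is None or t > prev:
                self.last_seen[a][b] = t

    def profile(self, node, now, window):
        """Number of distinct neighbors that interacted with node in (now - window, now]."""
        times = self.last_seen.get(node, {}).values()
        return sum(1 for t in times if now - window < t <= now)


profiles = NeighborhoodProfiles()
for u, v, t in [("a", "b", 1), ("a", "c", 5), ("a", "b", 7), ("b", "c", 8)]:
    profiles.add_interaction(u, v, t)
```

Each update touches only the two endpoints of the interaction, which reflects the node-centric style of computation the abstract describes. Extending the summary to multi-hop neighborhoods is beyond this sketch.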

Proceedings Article

Finding links and initiators: A graph-reconstruction problem

Heikki Mannila¹, Evimaria Terzi² • Institutions (2)

Helsinki Institute for Information Technology¹, IBM²

01 Sep 2008

TL;DR: This work formally defines the problem and proposes an MCMC framework for estimating the links and the initiators given the matrix of observations M and shows how this framework can be extended to incorporate a temporal aspect.

Abstract: Consider a 0–1 observation matrix M, where rows correspond to entities and columns correspond to signals; a value of 1 (or 0) in cell (i, j) of M indicates that signal j has been observed (or not observed) in entity i. Given such a matrix we study the problem of inferring the underlying directed links between entities (rows) and finding which entries in the matrix are initiators. We formally define this problem and propose an MCMC framework for estimating the links and the initiators given the matrix of observations M. We also show how this framework can be extended to incorporate a temporal aspect; instead of considering a single observation matrix M we consider a sequence of observation matrices M1, ..., Mt over time. We show the connection between our problem and several problems studied in the field of social-network analysis. We apply our method to paleontological and ecological data and show that our algorithms work well in practice and give reasonable results.

12 citations

Proceedings Article

Efficient Path Kernels for Reaction Function Prediction

Markus Heinonen¹, Niko Välimäki, Veli Mäkinen, Juho Rousu² • Institutions (2)

Helsinki Institute for Information Technology¹, University of Helsinki²

01 Jan 2012

TL;DR: An effective method for computing path based kernels is introduced that employs a Burrows-Wheeler transform based compressed path index for fast and space-efficient enumeration of paths and surpasses state-of-the-art graph kernels in prediction accuracy.

Abstract: Kernels for structured data are rapidly becoming an essential part of the machine learning toolbox. Graph kernels provide similarity measures for complex relational objects, such as molecules and enzymes. Graph kernels based on walks are popular due to their fast computation but their predictive performance is often not satisfactory, while kernels based on subgraphs suffer from high computational cost and are limited to small substructures. Kernels based on paths offer a promising middle ground between these two extremes. However, the computation of path kernels has so far been assumed computationally too challenging. In this paper we introduce an effective method for computing path based kernels; we employ a Burrows-Wheeler transform based compressed path index for fast and space-efficient enumeration of paths. Unlike many kernel algorithms the index representation retains fast access to individual features. In our experiments with chemical reaction graphs, path based kernels surpass state-of-the-art graph kernels in prediction accuracy.

12 citations

Authors

Showing all 632 results

Name	H-index	Papers	Citations
Dimitri P. Bertsekas	94	332	85939
Olli Kallioniemi	90	353	42021
Heikki Mannila	72	295	26500
Jukka Corander	66	411	17220
Jaakko Kangasjärvi	62	146	17096
Aapo Hyvärinen	61	301	44146
Samuel Kaski	58	522	14180
Nadarajah Asokan	58	327	11947
Aristides Gionis	58	292	19300
Hannu Toivonen	56	192	19316
Nicola Zamboni	53	128	11397
Jorma Rissanen	52	151	22720
Tero Aittokallio	52	271	8689
Juha Veijola	52	261	19588
Juho Hamari	51	176	16631

Network Information

Related Institutions (5)

Google

39.8K papers, 2.1M citations

93% related

Microsoft

86.9K papers, 4.1M citations

38.6K papers, 1.3M citations

92% related

Carnegie Mellon University

104.3K papers, 5.9M citations

91% related

Facebook

10.9K papers, 570.1K citations

91% related

Performance

Metrics

Papers: 1,967

Citations: 76,126

No. of papers from the Institution in previous years
Year	Papers
2023	1
2022	4
2021	85
2020	97
2019	140
2018	127