Institution
Helsinki Institute for Information Technology
Facility • Espoo, Finland
About: Helsinki Institute for Information Technology is a facility based in Espoo, Finland. It is known for research contributions in the topics Population and Bayesian network. The organization has 630 authors who have published 1962 publications receiving 63426 citations.
Papers
31 Mar 2018
TL;DR: In this article, the authors proposed an iterative supervised principal components (ISPC) method, which combines variable screening and dimension reduction to reduce the number of features to something that is computationally conveniently handled by Bayesian methods.
Abstract: In high-dimensional prediction problems, where the number of features may greatly exceed the number of training instances, fully Bayesian approach with a sparsifying prior is known to produce good results but is computationally challenging. To alleviate this computational burden, we propose to use a preprocessing step where we first apply a dimension reduction to the original data to reduce the number of features to something that is computationally conveniently handled by Bayesian methods. To do this, we propose a new dimension reduction technique, called iterative supervised principal components (ISPC), which combines variable screening and dimension reduction and can be considered as an extension to the existing technique of supervised principal components (SPCs). Our empirical evaluations confirm that, although not foolproof, the proposed approach provides very good results on several microarray benchmark datasets with very affordable computation time, and can also be very useful for visualizing high-dimensional data.
12 citations
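The screening-then-reduction idea in the abstract above can be illustrated with the basic supervised principal components (SPC) step that ISPC is described as extending. This is a minimal sketch, not the authors' ISPC method: the function name, the use of absolute Pearson correlation as the screening score, and the parameters `n_keep` and `n_components` are assumptions made for illustration.

```python
import numpy as np


def supervised_principal_components(X, y, n_keep=10, n_components=2):
    """Basic SPC sketch: screen features by |correlation| with y, then PCA.

    X is an (n_samples, n_features) array, y an (n_samples,) array.
    Returns the indices of the kept features and the component scores.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Screening score (assumed here): absolute Pearson correlation with y.
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    scores = np.abs(Xc.T @ yc) / np.where(denom == 0, 1.0, denom)
    keep = np.argsort(scores)[::-1][:n_keep]
    # PCA on the screened, centred features via a thin SVD.
    U, S, _ = np.linalg.svd(Xc[:, keep], full_matrices=False)
    Z = U[:, :n_components] * S[:n_components]
    return keep, Z


# Synthetic data with more features than samples; only the first
# three features carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 300))
y = X[:, 0] + X[:, 1] + X[:, 2] + 0.1 * rng.normal(size=100)
keep, Z = supervised_principal_components(X, y)
print(sorted(keep.tolist()), Z.shape)
```

The low-dimensional scores `Z` could then be passed to a Bayesian model in place of the original features, which is the preprocessing role the abstract describes.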
01 Jan 2012
TL;DR: In this article, the problem of identifying representative nodes in probabilistic graphs was introduced, motivated by the need to produce different simple views to large BisoNets, and the results suggest that the clustering based approaches are capable of finding a representative set of nodes.
Abstract: We introduce the problem of identifying representative nodes in probabilistic graphs, motivated by the need to produce different simple views to large BisoNets. We define a probabilistic similarity measure for nodes, and then apply clustering methods to find groups of nodes. Finally, a representative is output from each cluster. We report on experiments with real biomedical data, using both the k-medoids and hierarchical clustering methods in the clustering step. The results suggest that the clustering based approaches are capable of finding a representative set of nodes.
12 citations
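The three steps in the abstract above (a probabilistic node similarity, clustering, one representative per cluster) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not specify the similarity measure, so the probability of the most probable path is assumed here, and the function names, the naive initialisation, and the toy graph are all hypothetical.

```python
import heapq
import math


def best_path_similarity(graph, source):
    """Return {node: probability of the most probable path from source}.

    Runs Dijkstra on edge costs -log(p), so minimising the cost maximises
    the product of edge probabilities. Assumes every p is in (0, 1].
    """
    cost = {source: 0.0}
    heap = [(0.0, source)]
    done = set()
    while heap:
        c, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, p in graph[u].items():
            new_c = c - math.log(p)
            if new_c < cost.get(v, math.inf):
                cost[v] = new_c
                heapq.heappush(heap, (new_c, v))
    return {v: math.exp(-c) for v, c in cost.items()}


def assign(nodes, sim, medoids):
    """Attach each node to the medoid it is most similar to."""
    clusters = {m: [] for m in medoids}
    for v in nodes:
        nearest = max(medoids, key=lambda m: sim[m].get(v, 0.0))
        clusters[nearest].append(v)
    return clusters


def k_medoids(nodes, sim, k, max_iter=50):
    """Alternate assignment and medoid update; return {medoid: members}."""
    medoids = list(nodes[:k])  # naive deterministic initialisation
    for _ in range(max_iter):
        clusters = assign(nodes, sim, medoids)
        updated = []
        for m, members in clusters.items():
            candidates = members or [m]  # keep the old medoid if a cluster is empty
            updated.append(max(
                candidates,
                key=lambda c: sum(sim[c].get(v, 0.0) for v in members),
            ))
        if set(updated) == set(medoids):
            break
        medoids = updated
    return assign(nodes, sim, medoids)


# Toy undirected probabilistic graph: two tight triangles joined by a weak edge.
edges = [("a", "b", 0.9), ("a", "c", 0.9), ("b", "c", 0.9),
         ("d", "e", 0.9), ("d", "f", 0.9), ("e", "f", 0.9),
         ("c", "d", 0.1)]
graph = {}
for u, v, p in edges:
    graph.setdefault(u, {})[v] = p
    graph.setdefault(v, {})[u] = p

nodes = sorted(graph)
sim = {u: best_path_similarity(graph, u) for u in nodes}
clusters = k_medoids(nodes, sim, k=2)
for representative, members in clusters.items():
    print(representative, members)
```

Each key of `clusters` plays the role of the representative output for its group. The abstract also mentions hierarchical clustering as an alternative for the clustering step, which this sketch does not cover.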
07 Sep 2015
TL;DR: An online streaming algorithm to maintain neighborhood profiles in the sliding-window model is presented, which is highly scalable as it permits parallel processing and the computation is node centric, hence it scales easily to very large networks on a distributed system, like Apache Giraph.
Abstract: Large networks are being generated by applications that keep track of relationships between different data entities. Examples include online social networks recording interactions between individuals, sensor networks logging information exchanges between sensors, and more. There is a large body of literature on computing exact or approximate properties on large networks, although most methods assume static networks. On the other hand, in most modern real-world applications, networks are highly dynamic and continuous interactions along existing connections are generated. Furthermore, it is desirable to consider that old edges become less important, and their contribution to the current view of the network diminishes over time.
We study the problem of maintaining the neighborhood profile of each node in an interaction network. Maintaining such a profile has applications in modeling network evolution and monitoring the importance of the nodes of the network over time. We present an online streaming algorithm to maintain neighborhood profiles in the sliding-window model. The algorithm is highly scalable as it permits parallel processing and the computation is node centric, hence it scales easily to very large networks on a distributed system, like Apache Giraph. We present results from both serial and parallel implementations of the algorithm for different social networks. The summary of the graph is maintained such that query of any window length can be performed.
12 citations
01 Sep 2008
TL;DR: This work formally defines the problem and proposes an MCMC framework for estimating the links and the initiators given the matrix of observations M and shows how this framework can be extended to incorporate a temporal aspect.
Abstract: Consider a 0–1 observation matrix M, where rows correspond to entities and columns correspond to signals; a value of 1 (or 0) in cell (i, j) of M indicates that signal j has been observed (or not observed) in entity i. Given such a matrix we study the problem of inferring the underlying directed links between entities (rows) and finding which entries in the matrix are initiators. We formally define this problem and propose an MCMC framework for estimating the links and the initiators given the matrix of observations M. We also show how this framework can be extended to incorporate a temporal aspect; instead of considering a single observation matrix M we consider a sequence of observation matrices M_1, ..., M_t over time. We show the connection between our problem and several problems studied in the field of social-network analysis. We apply our method to paleontological and ecological data and show that our algorithms work well in practice and give reasonable results.
12 citations
01 Jan 2012
TL;DR: An effective method for computing path based kernels is introduced that employs a Burrows-Wheeler transform based compressed path index for fast and space-efficient enumeration of paths and surpasses state-of-the-art graph kernels in prediction accuracy.
Abstract: Kernels for structured data are rapidly becoming an essential part of the machine learning toolbox. Graph kernels provide similarity measures for complex relational objects, such as molecules and enzymes. Graph kernels based on walks are popular due to their fast computation, but their predictive performance is often not satisfactory, while kernels based on subgraphs suffer from high computational cost and are limited to small substructures. Kernels based on paths offer a promising middle ground between these two extremes. However, the computation of path kernels has so far been assumed to be computationally too challenging. In this paper we introduce an effective method for computing path based kernels; we employ a Burrows-Wheeler transform based compressed path index for fast and space-efficient enumeration of paths. Unlike many kernel algorithms, the index representation retains fast access to individual features. In our experiments with chemical reaction graphs, path based kernels surpass state-of-the-art graph kernels in prediction accuracy.
12 citations
Authors
Showing all 632 results
Name | H-index | Papers | Citations |
---|---|---|---|
Dimitri P. Bertsekas | 94 | 332 | 85939 |
Olli Kallioniemi | 90 | 353 | 42021 |
Heikki Mannila | 72 | 295 | 26500 |
Jukka Corander | 66 | 411 | 17220 |
Jaakko Kangasjärvi | 62 | 146 | 17096 |
Aapo Hyvärinen | 61 | 301 | 44146 |
Samuel Kaski | 58 | 522 | 14180 |
Nadarajah Asokan | 58 | 327 | 11947 |
Aristides Gionis | 58 | 292 | 19300 |
Hannu Toivonen | 56 | 192 | 19316 |
Nicola Zamboni | 53 | 128 | 11397 |
Jorma Rissanen | 52 | 151 | 22720 |
Tero Aittokallio | 52 | 271 | 8689 |
Juha Veijola | 52 | 261 | 19588 |
Juho Hamari | 51 | 176 | 16631 |