Institution

Helsinki Institute for Information Technology

Facility, Espoo, Finland
About: Helsinki Institute for Information Technology is a facility based in Espoo, Finland. It is known for research contributions in the topics Population and Bayesian network. The organization has 630 authors who have published 1962 publications receiving 63426 citations.


Papers
Proceedings Article
31 Mar 2018
TL;DR: The authors propose iterative supervised principal components (ISPC), a method that combines variable screening and dimension reduction to cut the number of features down to a size that Bayesian methods can handle at reasonable computational cost.
Abstract: In high-dimensional prediction problems, where the number of features may greatly exceed the number of training instances, a fully Bayesian approach with a sparsifying prior is known to produce good results but is computationally challenging. To alleviate this computational burden, we propose a preprocessing step in which we first apply a dimension reduction to the original data, reducing the number of features to a size that Bayesian methods can handle conveniently. To do this, we propose a new dimension reduction technique, called iterative supervised principal components (ISPC), which combines variable screening and dimension reduction and can be considered an extension of the existing technique of supervised principal components (SPCs). Our empirical evaluations confirm that, although not foolproof, the proposed approach provides very good results on several microarray benchmark datasets with very affordable computation time, and can also be very useful for visualizing high-dimensional data.

12 citations
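
The abstract above builds on supervised principal components: screen the features, then reduce the survivors to a few components. The following is a minimal sketch of that two-step idea only, not the authors' ISPC method, whose iteration scheme the abstract does not describe. The function names, the use of absolute Pearson correlation for screening, and power iteration for the leading component are all assumptions made for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def screen_features(X, y, keep):
    """Screening step (assumed criterion): indices of the `keep` features
    with the largest absolute correlation with the target y."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: -scores[j])
    return sorted(ranked[:keep])

def first_component(X, cols, iters=200):
    """Reduction step: leading principal direction of the centered columns
    `cols`, found by power iteration. Returns (direction, per-row scores)."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in cols]
    Z = [[row[j] - m for j, m in zip(cols, means)] for row in X]
    v = [1.0] * len(cols)
    for _ in range(iters):
        proj = [sum(z * w for z, w in zip(row, v)) for row in Z]  # Z v
        v = [sum(p * row[k] for p, row in zip(proj, Z)) for k in range(len(cols))]  # Z^T (Z v)
        norm = math.sqrt(sum(w * w for w in v)) or 1.0
        v = [w / norm for w in v]
    return v, [sum(z * w for z, w in zip(row, v)) for row in Z]
```

The resulting low-dimensional scores could then be passed to a downstream model in place of the original features; that step is outside the scope of this sketch.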

Book Chapter, DOI
01 Jan 2012
TL;DR: The paper introduces the problem of identifying representative nodes in probabilistic graphs, motivated by the need to produce different simple views of large BisoNets. The results suggest that clustering-based approaches are capable of finding a representative set of nodes.
Abstract: We introduce the problem of identifying representative nodes in probabilistic graphs, motivated by the need to produce different simple views of large BisoNets. We define a probabilistic similarity measure for nodes, and then apply clustering methods to find groups of nodes. Finally, a representative is output from each cluster. We report on experiments with real biomedical data, using both the k-medoids and hierarchical clustering methods in the clustering step. The results suggest that the clustering-based approaches are capable of finding a representative set of nodes.

12 citations
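
The pipeline in the abstract above (similarity measure, clustering, one representative per cluster) maps naturally onto k-medoids, since each medoid is itself a node. Below is a minimal sketch, assuming a precomputed dissimilarity matrix such as 1 minus a similarity; the paper's probabilistic similarity measure is not reproduced here, and the function name and the deterministic initialization are illustrative choices.

```python
def k_medoids(dist, k, max_iter=100):
    """Plain alternating k-medoids on a full dissimilarity matrix.

    dist[i][j] is the dissimilarity between nodes i and j. The first k
    nodes serve as initial medoids (an arbitrary, deterministic choice).
    Returns (medoids, assignment), where assignment[i] is the medoid of node i.
    """
    n = len(dist)
    medoids = list(range(k))

    def assign_all(meds):
        # Attach each node to its nearest medoid (ties go to the earlier medoid).
        return [min(meds, key=lambda m: dist[i][m]) for i in range(n)]

    for _ in range(max_iter):
        assign = assign_all(medoids)
        new_medoids = []
        for m in medoids:
            members = [i for i in range(n) if assign[i] == m]
            if not members:  # degenerate case: keep the old medoid
                new_medoids.append(m)
                continue
            # The member with the smallest total distance to its cluster becomes the medoid.
            new_medoids.append(min(members, key=lambda c: sum(dist[c][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign_all(medoids)
```

The returned medoids play the role of the representative nodes; like any k-medoids variant of this kind, the result depends on the initialization and may be a local optimum.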

Book Chapter, DOI
07 Sep 2015
TL;DR: An online streaming algorithm is presented for maintaining neighborhood profiles in the sliding-window model. It permits parallel processing and its computation is node-centric, so it scales easily to very large networks on a distributed system such as Apache Giraph.
Abstract: Large networks are being generated by applications that keep track of relationships between different data entities. Examples include online social networks recording interactions between individuals, sensor networks logging information exchanges between sensors, and more. There is a large body of literature on computing exact or approximate properties on large networks, although most methods assume static networks. On the other hand, in most modern real-world applications, networks are highly dynamic and continuous interactions along existing connections are generated. Furthermore, it is desirable to consider that old edges become less important, and their contribution to the current view of the network diminishes over time. We study the problem of maintaining the neighborhood profile of each node in an interaction network. Maintaining such a profile has applications in modeling network evolution and monitoring the importance of the nodes of the network over time. We present an online streaming algorithm to maintain neighborhood profiles in the sliding-window model. The algorithm is highly scalable as it permits parallel processing and the computation is node-centric, hence it scales easily to very large networks on a distributed system, like Apache Giraph. We present results from both serial and parallel implementations of the algorithm for different social networks. The summary of the graph is maintained such that a query of any window length can be performed.

12 citations
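
To make the sliding-window idea in the abstract above concrete, here is a much simplified sketch that covers only direct (one-hop) neighbors and stores exact timestamps. It is not the paper's algorithm, which the abstract does not specify in detail. The class and method names are hypothetical. Keeping only the latest interaction time per edge is what allows a query with any window length, provided queries are asked at the current stream time.

```python
from collections import defaultdict

class RecentNeighbors:
    """Tracks, for each node, the latest interaction time with each neighbor."""

    def __init__(self):
        # node -> {neighbor: latest timestamp of an interaction}
        self.last_seen = defaultdict(dict)

    def add_interaction(self, u, v, t):
        """Record an undirected interaction between u and v at time t."""
        for a, b in ((u, v), (v, u)):
            prev = self.last_seen[a].get(b)
            if prev is None or t > prev:
                self.last_seen[a][b] = t

    def neighbors(self, node, now, window):
        """Neighbors that interacted with `node` in the interval (now - window, now]."""
        seen = self.last_seen.get(node, {})
        return {v for v, t in seen.items() if now - window < t <= now}
```

Each update touches only the two endpoints of the interaction, which mirrors the node-centric style of computation mentioned in the abstract.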

Proceedings Article
01 Sep 2008
TL;DR: This work formally defines the problem and proposes an MCMC framework for estimating the links and the initiators given the matrix of observations M and shows how this framework can be extended to incorporate a temporal aspect.
Abstract: Consider a 0–1 observation matrix M, where rows correspond to entities and columns correspond to signals; a value of 1 (or 0) in cell (i, j) of M indicates that signal j has been observed (or not observed) in entity i. Given such a matrix we study the problem of inferring the underlying directed links between entities (rows) and finding which entries in the matrix are initiators. We formally define this problem and propose an MCMC framework for estimating the links and the initiators given the matrix of observations M. We also show how this framework can be extended to incorporate a temporal aspect; instead of considering a single observation matrix M we consider a sequence of observation matrices M_1, ..., M_t over time. We show the connection between our problem and several problems studied in the field of social-network analysis. We apply our method to paleontological and ecological data and show that our algorithms work well in practice and give reasonable results.

12 citations

Proceedings Article, DOI
01 Jan 2012
TL;DR: An effective method for computing path-based kernels is introduced. It employs a Burrows-Wheeler transform based compressed path index for fast and space-efficient enumeration of paths, and it surpasses state-of-the-art graph kernels in prediction accuracy.
Abstract: Kernels for structured data are rapidly becoming an essential part of the machine learning toolbox. Graph kernels provide similarity measures for complex relational objects, such as molecules and enzymes. Graph kernels based on walks are popular due to their fast computation but their predictive performance is often not satisfactory, while kernels based on subgraphs suffer from high computational cost and are limited to small substructures. Kernels based on paths offer a promising middle ground between these two extremes. However, the computation of path kernels has so far been assumed to be computationally too challenging. In this paper we introduce an effective method for computing path-based kernels; we employ a Burrows-Wheeler transform based compressed path index for fast and space-efficient enumeration of paths. Unlike many kernel algorithms, the index representation retains fast access to individual features. In our experiments with chemical reaction graphs, path-based kernels surpass state-of-the-art graph kernels in prediction accuracy.

12 citations
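
As an illustration of what a path-based kernel computes, the sketch below counts the label sequences of simple paths by brute-force enumeration and takes the dot product of the counts. This swaps in naive depth-first enumeration for the compressed Burrows-Wheeler path index described in the abstract above, so it is only practical for small graphs. The graph representation (a label list plus adjacency lists) and the function names are assumptions.

```python
from collections import Counter

def path_features(labels, adj, max_len):
    """Count label sequences of simple paths with at most `max_len` nodes.

    labels[v] is the label of node v; adj[v] lists the neighbors of v.
    Each undirected path is counted once from each end, which is harmless
    for a kernel as long as every graph is treated the same way.
    """
    counts = Counter()

    def extend(path):
        counts[tuple(labels[v] for v in path)] += 1
        if len(path) == max_len:
            return
        for nxt in adj[path[-1]]:
            if nxt not in path:  # simple paths only: no repeated nodes
                extend(path + [nxt])

    for start in range(len(labels)):
        extend([start])
    return counts

def path_kernel(g1, g2, max_len=3):
    """Dot product of the path-count feature vectors of two graphs.

    Each graph is a (labels, adj) pair.
    """
    f1 = path_features(g1[0], g1[1], max_len)
    f2 = path_features(g2[0], g2[1], max_len)
    return sum(count * f2[key] for key, count in f1.items())
```

Because the features are explicit counts keyed by label sequence, individual features remain directly accessible, which is the property the abstract highlights for its index representation.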


Authors


Name                   H-index  Papers  Citations
Dimitri P. Bertsekas   94       332     85939
Olli Kallioniemi       90       353     42021
Heikki Mannila         72       295     26500
Jukka Corander         66       411     17220
Jaakko Kangasjärvi     62       146     17096
Aapo Hyvärinen         61       301     44146
Samuel Kaski           58       522     14180
Nadarajah Asokan       58       327     11947
Aristides Gionis       58       292     19300
Hannu Toivonen         56       192     19316
Nicola Zamboni         53       128     11397
Jorma Rissanen         52       151     22720
Tero Aittokallio       52       271     8689
Juha Veijola           52       261     19588
Juho Hamari            51       176     16631
Network Information
Related Institutions (5)
Google
39.8K papers, 2.1M citations

93% related

Microsoft
86.9K papers, 4.1M citations

93% related

Carnegie Mellon University
104.3K papers, 5.9M citations

91% related

Facebook
10.9K papers, 570.1K citations

91% related

Performance
Metrics
No. of papers from the Institution in previous years
Year  Papers
2023  1
2022  4
2021  85
2020  97
2019  140
2018  127