Institution

Helsinki Institute for Information Technology

Facility: Espoo, Finland

About: Helsinki Institute for Information Technology is a research facility based in Espoo, Finland. It is known for research contributions in the topics of Population and Bayesian network. The organization has 630 authors who have published 1962 publications receiving 63426 citations.


Papers
Proceedings ArticleDOI
01 Dec 2009
TL;DR: This paper presents the interdomain routing layer and its interplay with the other components of the system, and introduces a new data-oriented congestion control scheme that takes into account the use of storage resources on-path and is fair to multicast flows.
Abstract: Data-oriented networking has attracted research recently, but the efficiency of the state-of-the-art solutions can still be improved. Our work towards this goal is set in a clean-slate architecture consisting of modular rendezvous, routing, and forwarding functions. In this paper we present the interdomain routing layer and its interplay with the other components of the system. The proposed system is built around two types of nodes: forwarding nodes and branching nodes. The forwarding nodes are optimized for throughput with no per-subscription state and no need to change passing packets, while branching nodes contain a large memory for caching and can make complex routing decisions. The amount of storage space and bandwidth can be independently scaled to suit the needs of each network. In the background, topology nodes perform load-balancing and configure routes in each domain using a two-dimensional addressing mechanism. The paths taken by packets adapt to the number of active subscribers to keep the amount of in-network state and latency low. A new data-oriented congestion control scheme is introduced, which takes into account the use of storage resources on-path and is fair to multicast flows.

43 citations

Journal ArticleDOI
TL;DR: In this paper, an approximate inference scheme based on Variational Bayes (VB) was proposed and applied to an existing model of transcript expression inference from RNA-seq data, demonstrating a significant increase in speed with only very small loss in accuracy of expression level estimation.
Abstract: Motivation: Assigning RNA-seq reads to their transcript of origin is a fundamental task in transcript expression estimation. Where ambiguities in assignments exist due to transcripts sharing sequence, e.g. alternative isoforms or alleles, the problem can be solved through probabilistic inference. Bayesian methods have been shown to provide accurate transcript abundance estimates compared with competing methods. However, exact Bayesian inference is intractable and approximate methods such as Markov chain Monte Carlo and Variational Bayes (VB) are typically used. While providing a high degree of accuracy and modelling flexibility, standard implementations can be prohibitively slow for large datasets and complex transcriptome annotations. Results: We propose a novel approximate inference scheme based on VB and apply it to an existing model of transcript expression inference from RNA-seq data. Recent advances in VB algorithmics are used to improve the convergence of the algorithm beyond the standard Variational Bayes Expectation Maximization algorithm. We apply our algorithm to simulated and biological datasets, demonstrating a significant increase in speed with only very small loss in accuracy of expression level estimation. We carry out a comparative study against seven popular alternative methods and demonstrate that our new algorithm provides excellent accuracy and inter-replicate consistency while remaining competitive in computation time.
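The read-assignment problem at the heart of this abstract can be sketched with a toy expectation-maximization loop (the paper itself uses Variational Bayes, not plain EM; the compatibility lists and iteration count below are illustrative assumptions, not the authors' model):

```python
# Toy EM for assigning ambiguous RNA-seq reads to transcripts.
# A simplified stand-in for the probabilistic inference described
# above; the actual method uses collapsed Variational Bayes.

def em_abundances(compat, n_iter=200):
    """compat[r] lists the transcript indices that read r maps to."""
    n_tx = max(t for row in compat for t in row) + 1
    theta = [1.0 / n_tx] * n_tx            # uniform initial abundances
    for _ in range(n_iter):
        counts = [0.0] * n_tx
        for row in compat:
            z = sum(theta[t] for t in row)
            for t in row:                  # fractional responsibility of t for this read
                counts[t] += theta[t] / z
        total = sum(counts)
        theta = [c / total for c in counts]
    return theta

# Two transcripts share sequence: reads 0-2 are ambiguous, read 3 is
# unique to transcript 1, so transcript 1 should absorb more mass.
abund = em_abundances([[0, 1], [0, 1], [0, 1], [1]])
```

The unique read breaks the symmetry: each iteration shifts ambiguous mass toward transcript 1, which is the same mechanism the VB scheme exploits, just with point estimates instead of posterior distributions.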

43 citations

Journal ArticleDOI
TL;DR: This paper considers two generalizations of the Minimum Path Cover Problem dealing with integrating constraints arising from long reads or paired-end reads, and shows that in the case of long reads (subpaths), the generalized problem can be solved in polynomial-time by a reduction to the classical MPC Problem.
Abstract: Multi-assembly problems have gathered much attention in the last years, as Next-Generation Sequencing technologies have started being applied to mixed settings, such as reads from the transcriptome (RNA-Seq), or from viral quasi-species. One classical model that has resurfaced in many multi-assembly methods (e.g. in Cufflinks, ShoRAH, BRANCH, CLASS) is the Minimum Path Cover (MPC) Problem, which asks for the minimum number of directed paths that cover all the nodes of a directed acyclic graph. The MPC Problem is highly popular because the acyclicity of the graph ensures its polynomial-time solvability. In this paper, we consider two generalizations of it dealing with integrating constraints arising from long reads or paired-end reads; these extensions have also been considered by two recent methods, but not fully solved. More specifically, we study the two problems where also a set of subpaths, or pairs of subpaths, of the graph have to be entirely covered by some path in the MPC. We show that in the case of long reads (subpaths), the generalized problem can be solved in polynomial-time by a reduction to the classical MPC Problem. We also consider the weighted case, and show that it can be solved in polynomial-time by a reduction to a min-cost circulation problem. As a side result, we also improve the time complexity of the classical minimum weight MPC Problem. In the case of paired-end reads (pairs of subpaths), the generalized problem becomes NP-hard, but we show that it is fixed-parameter tractable (FPT) in the total number of constraints. This computational dichotomy between long reads and paired-end reads is also a general insight into multi-assembly problems.
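The polynomial-time solvability mentioned above rests on a classical reduction. As a hedged illustration, the sketch below solves the *vertex-disjoint* variant of path cover (size = n minus a maximum bipartite matching between left and right copies of the nodes); the MPC in the paper allows overlapping paths, so this shows only the textbook special case:

```python
# Classical minimum vertex-disjoint path cover of a DAG via bipartite
# matching: build a bipartite graph with a left and a right copy of
# every node, one edge (u_left, v_right) per arc u -> v; the cover
# size is n minus the maximum matching.

def min_path_cover(n, edges):
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    match = [-1] * n                       # match[v] = left node matched to right copy of v

    def augment(u, seen):
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                if match[v] == -1 or augment(match[v], seen):
                    match[v] = u
                    return True
        return False

    m = sum(augment(u, [False] * n) for u in range(n))
    return n - m

# Path 0 -> 1 -> 2 plus an isolated node 3: one path covers 0,1,2 and
# a second trivial path covers node 3, so the cover has size 2.
cover = min_path_cover(4, [(0, 1), (1, 2)])
```

The subpath-constrained generalizations in the paper reduce to this kind of combinatorial machinery (matching and min-cost circulation), which is why the long-read case stays polynomial.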

43 citations

13 Sep 2017
TL;DR: The Predicting Media Interestingness task, running for the second year as part of the MediaEval 2017 Benchmarking Initiative for Multimedia Evaluation, is presented.
Abstract: In this paper, the Predicting Media Interestingness task, which is running for the second year as part of the MediaEval 2017 Benchmarking Initiative for Multimedia Evaluation, is presented. For the task, participants are expected to create systems that automatically select images and video segments that are considered to be the most interesting for a common viewer. All task characteristics are described, namely the task use case and challenges, the released data set and ground truth, the required participant runs and the evaluation metrics.

43 citations

Proceedings ArticleDOI
22 Oct 2007
TL;DR: A universal conditional NML model is presented, which has minmax optimal properties similar to those of the regular NML model, but which defines a random process which can be used for prediction and also admits a recursive evaluation for data compression.
Abstract: The NML (normalized maximum likelihood) universal model has certain minmax optimal properties but it has two shortcomings: the normalizing coefficient can be evaluated in a closed form only for special model classes, and it does not define a random process so that it cannot be used for prediction. We present a universal conditional NML model, which has minmax optimal properties similar to those of the regular NML model. However, unlike NML, the conditional NML model defines a random process which can be used for prediction. It also admits a recursive evaluation for data compression. The conditional normalizing coefficient is much easier to evaluate, for instance, for tree machines than the integral of the square root of the Fisher information in the NML model. For Bernoulli distributions, the conditional NML model gives a predictive probability, which behaves like the Krichevsky-Trofimov predictive probability, actually slightly better for extremely skewed strings. For some model classes, it agrees with the predictive probability found earlier by Takimoto and Warmuth, as the solution to a different more restrictive minmax problem. We also calculate the CNML models for the generalized Gaussian regression models, and in particular for the cases where the loss function is quadratic, and show that the CNML model achieves asymptotic optimality in terms of the mean ideal code length. Moreover, the quadratic loss, which represents fitting errors as noise rather than prediction errors, can be shown to be smaller than what can be achieved with the NML as well as with the so-called plug-in or the predictive MDL model.
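The Krichevsky-Trofimov predictor that the conditional NML model is compared against has a simple closed form: after seeing k ones in n Bernoulli trials, the next symbol is predicted to be 1 with probability (k + 1/2)/(n + 1). A minimal sketch (the function names are my own, not the paper's):

```python
# Krichevsky-Trofimov (add-1/2) predictive probability for Bernoulli
# sources -- the baseline the conditional NML predictor is said to
# track, and slightly improve on for extremely skewed strings.

from fractions import Fraction

def kt_predict_one(k, n):
    """KT probability that the next symbol is 1, given k ones in n trials."""
    return Fraction(2 * k + 1, 2 * (n + 1))

def kt_sequence_prob(bits):
    """Probability the KT process assigns to a whole binary string,
    obtained by chaining the one-step predictions."""
    p, k = Fraction(1), 0
    for i, b in enumerate(bits):
        p1 = kt_predict_one(k, i)
        p *= p1 if b else 1 - p1
        k += b
    return p

p = kt_sequence_prob([1, 0, 1])   # (1/2) * (1/4) * (1/2) = 1/16
```

Because the chained one-step predictions multiply into a valid process probability, KT (and likewise CNML) can be used both for sequential prediction and for arithmetic-coding-style data compression, which is exactly the property plain NML lacks.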

43 citations


Authors


Name                   | H-index | Papers | Citations
Dimitri P. Bertsekas   | 94      | 332    | 85939
Olli Kallioniemi       | 90      | 353    | 42021
Heikki Mannila         | 72      | 295    | 26500
Jukka Corander         | 66      | 411    | 17220
Jaakko Kangasjärvi     | 62      | 146    | 17096
Aapo Hyvärinen         | 61      | 301    | 44146
Samuel Kaski           | 58      | 522    | 14180
Nadarajah Asokan       | 58      | 327    | 11947
Aristides Gionis       | 58      | 292    | 19300
Hannu Toivonen         | 56      | 192    | 19316
Nicola Zamboni         | 53      | 128    | 11397
Jorma Rissanen         | 52      | 151    | 22720
Tero Aittokallio       | 52      | 271    | 8689
Juha Veijola           | 52      | 261    | 19588
Juho Hamari            | 51      | 176    | 16631
Network Information
Related Institutions (5)
Google
39.8K papers, 2.1M citations

93% related

Microsoft
86.9K papers, 4.1M citations

93% related

Carnegie Mellon University
104.3K papers, 5.9M citations

91% related

Facebook
10.9K papers, 570.1K citations

91% related

Performance Metrics
No. of papers from the Institution in previous years
Year | Papers
2023 | 1
2022 | 4
2021 | 85
2020 | 97
2019 | 140
2018 | 127