Institution
Helsinki Institute for Information Technology
Facility • Espoo, Finland
About: Helsinki Institute for Information Technology is a research facility based in Espoo, Finland. It is known for its research contributions in topics such as Population and Bayesian networks. The organization has 630 authors who have published 1,962 publications receiving 63,426 citations.
Papers published on a yearly basis
Papers
TL;DR: The Long Baseline Campaign (LBC) was carried out from 2014 September to late November, culminating in end-to-end observations, calibrations, and imaging of selected Science Verification (SV) targets.
Abstract: A major goal of the Atacama Large Millimeter/submillimeter Array (ALMA) is to make accurate images with resolutions of tens of milliarcseconds, which at submillimeter (submm) wavelengths requires baselines up to ~15 km. To develop and test this capability, a Long Baseline Campaign (LBC) was carried out from 2014 September to late November, culminating in end-to-end observations, calibrations, and imaging of selected Science Verification (SV) targets. This paper presents an overview of the campaign and its main results, including an investigation of the short-term coherence properties and systematic phase errors over the long baselines at the ALMA site, a summary of the SV targets and observations, and recommendations for science observing strategies at long baselines. Deep ALMA images of the quasar 3C 138 at 97 and 241 GHz are also compared to VLA 43 GHz results, demonstrating an agreement at a level of a few percent. As a result of the extensive program of LBC testing, the highly successful SV imaging at long baselines achieved angular resolutions as fine as 19 mas at ~350 GHz. Observing with ALMA on baselines of up to 15 km is now possible, and opens up new parameter space for submm astronomy.
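The quoted resolutions follow from the diffraction limit θ ≈ λ/B, which ties achievable angular resolution to baseline length. As an illustrative sanity check (not from the paper), the short Python sketch below computes the nominal figure for a 15 km baseline at ~350 GHz:

```python
import math

C = 299_792_458.0  # speed of light, m/s

def angular_resolution_mas(freq_ghz: float, baseline_km: float) -> float:
    """Nominal diffraction-limited resolution theta ~ lambda / B, in milliarcseconds."""
    wavelength_m = C / (freq_ghz * 1e9)
    theta_rad = wavelength_m / (baseline_km * 1e3)
    return math.degrees(theta_rad) * 3600 * 1000  # radians -> mas

# ~350 GHz on a 15 km baseline gives roughly 12 mas, the same order as the
# 19 mas achieved in SV imaging (synthesized beams are somewhat broader than
# lambda/B, depending on uv coverage and weighting).
print(f"{angular_resolution_mas(350, 15):.1f} mas")
```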
106 citations
TL;DR: The proposed error correction method, LoRMA, is the most accurate method relying on long reads only for read sets with high coverage, and when the coverage of the read set is at least 75×, its throughput is at least 20% higher.
Abstract: Motivation: New long read sequencing technologies, like PacBio SMRT and Oxford NanoPore, can produce sequencing reads up to 50 000 bp long but with an error rate of at least 15%. Reducing the error rate is necessary for subsequent utilization of the reads in, e.g. de novo genome assembly. The error correction problem has been tackled either by aligning the long reads against each other or by a hybrid approach that uses the more accurate short reads produced by second generation sequencing technologies to correct the long reads.
Results: We present an error correction method that uses long reads only. The method consists of two phases: first, we use an iterative alignment-free correction method based on de Bruijn graphs with increasing length of k-mers, and second, the corrected reads are further polished using long-distance dependencies that are found using multiple alignments. According to our experiments, the proposed method is the most accurate one relying on long reads only for read sets with high coverage. Furthermore, when the coverage of the read set is at least 75×, the throughput of the new method is at least 20% higher.
Availability and Implementation: LoRMA is freely available at http://www.cs.helsinki.fi/u/lmsalmel/LoRMA/.
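The first phase can be illustrated with a toy version of k-mer-spectrum correction: count k-mers across the read set, treat frequent ("solid") k-mers as trusted, repair weak ones, and repeat with a larger k once earlier rounds have cleaned the data. The sketch below is a deliberately simplified illustration of that idea, not the LoRMA algorithm itself; the k values and solidity threshold are arbitrary:

```python
from collections import Counter

def kmer_counts(reads, k):
    counts = Counter()
    for r in reads:
        for i in range(len(r) - k + 1):
            counts[r[i:i + k]] += 1
    return counts

def correct_read(read, counts, k, solid=3):
    """Patch each weak k-mer by substituting its last base with one that makes it solid."""
    read = list(read)
    for i in range(len(read) - k + 1):
        kmer = "".join(read[i:i + k])
        if counts[kmer] >= solid:
            continue
        for b in "ACGT":
            candidate = kmer[:-1] + b
            if counts[candidate] >= solid:
                read[i + k - 1] = b
                break
    return "".join(read)

def iterative_correction(reads, ks=(15, 25, 40), solid=3):
    """Re-count and re-correct with increasing k, as longer k-mers become trustworthy."""
    for k in ks:
        counts = kmer_counts(reads, k)
        reads = [correct_read(r, counts, k, solid) for r in reads]
    return reads
```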
106 citations
01 Jan 2019
TL;DR: This work presents the Ordinary Differential Equation Variational Auto-Encoder (ODE2VAE), a latent second-order ODE model for high-dimensional sequential data that can simultaneously learn the embedding of high-dimensional trajectories and infer arbitrarily complex continuous-time latent dynamics.
Abstract: We present Ordinary Differential Equation Variational Auto-Encoder (ODE2VAE), a latent second order ODE model for high-dimensional sequential data. Leveraging the advances in deep generative models, ODE2VAE can simultaneously learn the embedding of high dimensional trajectories and infer arbitrarily complex continuous-time latent dynamics. Our model explicitly decomposes the latent space into momentum and position components and solves a second order ODE system, which is in contrast to recurrent neural network (RNN) based time series models and recently proposed black-box ODE techniques. In order to account for uncertainty, we propose probabilistic latent ODE dynamics parameterized by deep Bayesian neural networks. We demonstrate our approach on motion capture, image rotation, and bouncing balls datasets. We achieve state-of-the-art performance in long term motion prediction and imputation tasks.
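The key structural choice is that a second-order ODE over latent positions is solved as a coupled first-order system over position and momentum. Below is a minimal NumPy sketch of that decomposition, with a toy MLP standing in for the paper's deep Bayesian neural network and plain Euler integration in place of a proper ODE solver:

```python
import numpy as np

rng = np.random.default_rng(0)

D = 4  # latent dimension; position and velocity each have D components
# Toy one-hidden-layer MLP acting as the learned acceleration field f(s, v).
W1 = rng.normal(scale=0.3, size=(2 * D, 16))
W2 = rng.normal(scale=0.3, size=(16, D))

def acceleration(s, v):
    h = np.tanh(np.concatenate([s, v]) @ W1)
    return h @ W2

def integrate(s0, v0, dt=0.01, steps=500):
    """Euler integration of the coupled first-order system ds/dt = v, dv/dt = f(s, v)."""
    s, v = s0.copy(), v0.copy()
    trajectory = [s.copy()]
    for _ in range(steps):
        a = acceleration(s, v)
        s = s + dt * v
        v = v + dt * a
        trajectory.append(s.copy())
    return np.stack(trajectory)

traj = integrate(rng.normal(size=D), np.zeros(D))
print(traj.shape)  # (501, 4): a continuous-time latent position trajectory
```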
105 citations
TL;DR: In this paper, a generic approach for reasoning over argumentation frameworks is proposed, based on the concept of complexity-sensitivity. The generic framework is instantiated by iteratively harnessing current sophisticated Boolean satisfiability (SAT) solver technology for solving the considered argumentation reasoning problems.
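For context, the reasoning problems in question are defined over abstract argumentation frameworks: directed graphs whose edges are attacks between arguments. The brute-force sketch below enumerates one standard semantics (preferred extensions); the paper's point is that such problems are better handled by iterated calls to a SAT solver, so this toy version only fixes the semantics being computed, not the proposed method:

```python
from itertools import combinations

def is_admissible(S, args, attacks):
    """S is admissible if it is conflict-free and defends each of its members."""
    S = set(S)
    if any((a, b) in attacks for a in S for b in S):
        return False  # not conflict-free
    for a in S:
        for b in args:
            if (b, a) in attacks and not any((c, b) in attacks for c in S):
                return False  # attacker b of a is not counter-attacked by S
    return True

def preferred_extensions(args, attacks):
    """Preferred extensions: admissible sets that are maximal under set inclusion."""
    admissible = [set(S) for r in range(len(args) + 1)
                  for S in combinations(args, r) if is_admissible(S, args, attacks)]
    return [S for S in admissible if not any(S < T for T in admissible)]

# a attacks b; b and c attack each other
args = {"a", "b", "c"}
attacks = {("a", "b"), ("b", "c"), ("c", "b")}
print(preferred_extensions(args, attacks))  # [{'a', 'c'}]
```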
103 citations
TL;DR: This paper demonstrates empirically and theoretically, using standard regression models, that in order to ensure that decision models are non-discriminatory with respect to, for instance, race, the sensitive racial information needs to be used in the model-building process.
Abstract: Increasing numbers of decisions about everyday life are made using algorithms. By algorithms we mean predictive models (decision rules) captured from historical data using data mining. Such models often decide prices we pay, select ads we see and news we read online, match job descriptions and candidate CVs, decide who gets a loan, who goes through an extra airport security check, or who gets released on parole. Yet growing evidence suggests that decision making by algorithms may discriminate against people, even if the computing process is fair and well-intentioned. This happens due to biased or non-representative learning data in combination with inadvertent modeling procedures. From the regulatory perspective there are two tendencies in relation to this issue: (1) to ensure that data-driven decision making is not discriminatory, and (2) to restrict overall collecting and storing of private data to a necessary minimum. This paper shows that from the computing perspective these two goals are contradictory. We demonstrate empirically and theoretically with standard regression models that in order to make sure that decision models are non-discriminatory, for instance, with respect to race, the sensitive racial information needs to be used in the model-building process. Of course, after the model is ready, race should not be required as an input variable for decision making. From the regulatory perspective this has an important implication: collecting sensitive personal data is necessary in order to guarantee fairness of algorithms, and law making needs to find sensible ways to allow using such data in the modeling process.
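The paper's core claim is easy to reproduce: when a legitimate feature is correlated with the sensitive attribute, omitting the sensitive attribute at training time leaves its effect confounded into the other coefficients, whereas including it isolates the discriminatory component so it can be neutralized at prediction time. A hedged sketch on synthetic data (all coefficients are invented for illustration, not taken from the paper):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 10_000

# Synthetic data: a legitimate feature correlated with the sensitive attribute.
race = rng.binomial(1, 0.5, n).astype(float)        # sensitive attribute
income = 2.0 * race + rng.normal(size=n)            # correlated, legitimate feature
y = 1.0 * income - 3.0 * race + rng.normal(size=n)  # outcomes encode direct discrimination

# "Fairness through unawareness": omit race from training.
naive = LinearRegression().fit(income.reshape(-1, 1), y)

# Train WITH race so the income coefficient is not confounded; at decision
# time, score everyone with race fixed to the same constant.
aware = LinearRegression().fit(np.column_stack([income, race]), y)

print("naive income coefficient:", naive.coef_[0])  # ~0.25, absorbs part of the race effect
print("aware income coefficient:", aware.coef_[0])  # ~1.0, the true legitimate effect
```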
103 citations
Authors
Showing all 632 results
Name | H-index | Papers | Citations |
---|---|---|---|
Dimitri P. Bertsekas | 94 | 332 | 85939 |
Olli Kallioniemi | 90 | 353 | 42021 |
Heikki Mannila | 72 | 295 | 26500 |
Jukka Corander | 66 | 411 | 17220 |
Jaakko Kangasjärvi | 62 | 146 | 17096 |
Aapo Hyvärinen | 61 | 301 | 44146 |
Samuel Kaski | 58 | 522 | 14180 |
Nadarajah Asokan | 58 | 327 | 11947 |
Aristides Gionis | 58 | 292 | 19300 |
Hannu Toivonen | 56 | 192 | 19316 |
Nicola Zamboni | 53 | 128 | 11397 |
Jorma Rissanen | 52 | 151 | 22720 |
Tero Aittokallio | 52 | 271 | 8689 |
Juha Veijola | 52 | 261 | 19588 |
Juho Hamari | 51 | 176 | 16631 |