Institution
Helsinki Institute for Information Technology
Facility • Espoo, Finland
About: Helsinki Institute for Information Technology is a facility based in Espoo, Finland. It is known for research contributions in the topics Population and Bayesian network. The organization has 630 authors who have published 1962 publications receiving 63426 citations.
Papers
TL;DR: A sparse-coding framework for activity recognition in ubiquitous and mobile computing that alleviates two fundamental problems of current supervised learning approaches is proposed, and its practical potential is shown by successfully evaluating its generalization capabilities across both domain and sensor modalities.
115 citations
TL;DR: It is demonstrated that antibiotic resistance in E. coli can be accurately predicted from whole genome sequences without a priori knowledge of mechanisms, and that both genomic and epidemiological data can be informative.
Abstract: The emergence of microbial antibiotic resistance is a global health threat. In clinical settings, the key to controlling the spread of resistant strains is accurate and rapid detection. As traditional culture-based methods are time consuming, genetic approaches have recently been developed for this task. The detection of antibiotic resistance is typically made by measuring a few known determinants previously identified from genome sequencing, and thus requires prior knowledge of its biological mechanisms. To overcome this limitation, we employed machine learning models to predict resistance to 11 compounds across four classes of antibiotics from existing and novel whole genome sequences of 1936 E. coli strains. We considered a range of methods, and examined population structure, isolation year, gene content, and polymorphism information as predictors. Gradient boosted decision trees consistently outperformed alternative models with an average accuracy of 0.91 on held-out data (range 0.81-0.97). While the best models most frequently employed gene content, an average accuracy score of 0.79 could be obtained using population structure information alone. Single nucleotide variation data were less useful, and significantly improved prediction only for two antibiotics, including ciprofloxacin. These results demonstrate that antibiotic resistance in E. coli can be accurately predicted from whole genome sequences without a priori knowledge of mechanisms, and that both genomic and epidemiological data can be informative. This paves the way for integrating machine learning approaches into diagnostic tools in the clinic.
113 citations
TL;DR: The paper provides a full theoretical foundation for the causal discovery procedure first presented by Eberhardt et al. (2010) and Hyttinen et al. (2010), and adapts the procedure to the problem of cellular network inference, applying it to the biologically realistic data of the DREAM challenges.
Abstract: Identifying cause-effect relationships between variables of interest is a central problem in science. Given a set of experiments, we describe a procedure that identifies linear models that may contain cycles and latent variables. We provide a detailed description of the model family, full proofs of the necessary and sufficient conditions for identifiability, a search algorithm that is complete, and a discussion of what can be done when the identifiability conditions are not satisfied. The algorithm is comprehensively tested in simulations, comparing it to competing algorithms in the literature. Furthermore, we adapt the procedure to the problem of cellular network inference, applying it to the biologically realistic data of the DREAM challenges. The paper provides a full theoretical foundation for the causal discovery procedure first presented by Eberhardt et al. (2010) and Hyttinen et al. (2010).
112 citations
TL;DR: In this article, a unified theory for analysis of components in discrete data is presented, and the main families of algorithms discussed are a variational approximation, Gibbs sampling, and Rao-Blackwellised Gibbs sampling.
Abstract: This article presents a unified theory for analysis of components in discrete data, and compares the methods with techniques such as independent component analysis, non-negative matrix factorisation and latent Dirichlet allocation. The main families of algorithms discussed are a variational approximation, Gibbs sampling, and Rao-Blackwellised Gibbs sampling. Applications are presented for voting records from the United States Senate for 2003, and for the Reuters-21578 newswire collection.
111 citations
01 Nov 2016 TL;DR: This article investigates how to separate exploratory and lookup search tasks in an IR system using easily measurable behaviors, and shows that IR systems can distinguish the 2 search categories in the course of a search session.
Abstract: Exploratory search is an increasingly important activity, yet it is challenging for users. Although there exists an ample amount of research into understanding exploration, most of the major information retrieval (IR) systems do not provide tailored and adaptive support for such tasks. One reason is the lack of empirical knowledge on how to distinguish exploratory and lookup search behaviors in IR systems. The goal of this article is to investigate how to separate the 2 types of tasks in an IR system using easily measurable behaviors. In this article, we first review characteristics of exploratory search behavior. We then report on a controlled study of 6 search tasks: 3 exploratory tasks (comparison, knowledge acquisition, planning) and 3 lookup tasks (fact-finding, navigational, question answering). The results are encouraging, showing that IR systems can distinguish the 2 search categories in the course of a search session. The most distinctive indicators that characterize exploratory search behaviors are query length, maximum scroll depth, and task completion time. However, 2 tasks are borderline and exhibit mixed characteristics. We assess the applicability of this finding by reporting on several classification experiments. Our results have valuable implications for designing tailored and adaptive IR systems.
110 citations
Authors
Showing all 632 results
Name | H-index | Papers | Citations |
---|---|---|---|
Dimitri P. Bertsekas | 94 | 332 | 85939 |
Olli Kallioniemi | 90 | 353 | 42021 |
Heikki Mannila | 72 | 295 | 26500 |
Jukka Corander | 66 | 411 | 17220 |
Jaakko Kangasjärvi | 62 | 146 | 17096 |
Aapo Hyvärinen | 61 | 301 | 44146 |
Samuel Kaski | 58 | 522 | 14180 |
Nadarajah Asokan | 58 | 327 | 11947 |
Aristides Gionis | 58 | 292 | 19300 |
Hannu Toivonen | 56 | 192 | 19316 |
Nicola Zamboni | 53 | 128 | 11397 |
Jorma Rissanen | 52 | 151 | 22720 |
Tero Aittokallio | 52 | 271 | 8689 |
Juha Veijola | 52 | 261 | 19588 |
Juho Hamari | 51 | 176 | 16631 |