Showing papers by "Helsinki Institute for Information Technology" published in 2007


Proceedings ArticleDOI
27 Aug 2007
TL;DR: The Data-Oriented Network Architecture (DONA) is proposed, which involves a clean-slate redesign of Internet naming and name resolution to adapt to changes in Internet usage.
Abstract: The Internet has evolved greatly from its original incarnation. For instance, the vast majority of current Internet usage is data retrieval and service access, whereas the architecture was designed around host-to-host applications such as telnet and ftp. Moreover, the original Internet was a purely transparent carrier of packets, but now the various network stakeholders use middleboxes to improve security and accelerate applications. To adapt to these changes, we propose the Data-Oriented Network Architecture (DONA), which involves a clean-slate redesign of Internet naming and name resolution.
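
The paper's key primitive is flat, self-certifying naming: a DONA name has the form P:L, where P is the cryptographic hash of a principal's public key and L is a label chosen by the principal. A minimal sketch of that binding (helper names are hypothetical; SHA-256 is an assumption of this sketch):

```python
# Minimal sketch of DONA-style self-certifying names (helper names are
# hypothetical; SHA-256 is an assumption of this sketch).
import hashlib

def make_name(public_key: bytes, label: str) -> str:
    """Build a flat name P:L, where P is the hash of the owner's key."""
    principal = hashlib.sha256(public_key).hexdigest()
    return f"{principal}:{label}"

def verify_name(name: str, public_key: bytes) -> bool:
    """Retrieved data can be checked against the claimed key without
    trusting the resolution infrastructure."""
    principal, _, _label = name.partition(":")
    return principal == hashlib.sha256(public_key).hexdigest()
```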

1,643 citations


Book
01 Jan 2007
TL;DR: Inspired by Kolmogorov's structure function for finite sets as models of data in the algorithmic theory of information, this work adapts the construct to families of probability models to avoid the noncomputability problem.
Abstract: Inspired by Kolmogorov's structure function for finite sets as models of data in the algorithmic theory of information, we adapt the construct to families of probability models to avoid the noncomputability problem. The picture of modeling then looks as follows: the models in the family have a double index, where the first specifies a structure, ranging over a finite or a countable set, and the second consists of parameter values, ranging over a continuum. An optimal structure index can be determined by the MDL (Minimum Description Length) principle in a two-part code, where the sum of the code lengths for the structure and the data is minimized. The latter is obtained from the universal NML (Normalized Maximum Likelihood) model for the subfamily of models having a specified structure. The determination of the optimal model in the optimized structure is more difficult. It requires a partition of the parameter space into equivalence classes, each associated with a model, in such a way that the Kullback-Leibler distance between any two adjacent models is equal and that the models are optimally distinguishable from the given amount of data. This notion of distinguishability is a modification of a related idea of Balasubramanian. The particular model, specified by the observed data, is the simplest one that incorporates all the properties in the data that can be extracted with the model class considered.
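
To make the two-part code concrete: each candidate structure gamma costs the code length of the structure index plus the NML code length of the data under gamma. A schematic sketch (function names and inputs are hypothetical; computing the NML normalizer log C(gamma) is the model-specific hard part):

```python
# Schematic sketch of two-part MDL structure selection (function names and
# inputs are hypothetical). For structure gamma we pay L(gamma) for the
# structure index plus the NML code length of the data under gamma:
# -log p(x | theta_hat(gamma)) + log C(gamma), with C(gamma) the
# parametric complexity (NML normalizer) of the subfamily.

def nml_code_length(neg_max_log_lik: float, log_C: float) -> float:
    return neg_max_log_lik + log_C

def best_structure(candidates):
    """candidates: iterable of (gamma, L_gamma, neg_max_log_lik, log_C)."""
    return min(candidates,
               key=lambda c: c[1] + nml_code_length(c[2], c[3]))[0]

# Example: best_structure([("k=1", 1.0, 120.3, 2.1), ("k=2", 2.0, 110.7, 5.8)])
```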

352 citations


Journal ArticleDOI
TL;DR: It is shown how to estimate non-normalized models defined on the non-negative real domain R_+^n, and that the score matching estimator can be obtained in closed form for some exponential families.
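
For the mechanics: score matching fits a non-normalized model by matching the model score psi_i = d log p / d x_i to the data, and on R_+^n a weighted variant avoids boundary terms. A hedged sketch assuming the standard generalized objective with weights h_i(x) = x_i^2 (the constants are my assumption and should be checked against the paper):

```python
# Hedged sketch of generalized score matching for data on R_+^n, with
# weights h_i(x) = x_i^2 (my assumption of the standard form). The model
# score psi never involves the normalizing constant of p.
import numpy as np

def sm_objective_nonneg(X, psi, dpsi):
    """X: (T, n) samples; psi(X), dpsi(X): model score and its diagonal
    derivative, each of shape (T, n). Lower is better."""
    P, D = psi(X), dpsi(X)
    return np.mean(np.sum(2.0 * X * P + X**2 * D + 0.5 * X**2 * P**2,
                          axis=1))

# For the exponential model p(x) proportional to exp(-theta * x):
# psi = -theta, dpsi = 0, and minimizing the objective gives
# theta_hat = 2 E[x] / E[x^2] in closed form, illustrating the closed-form
# estimators reported for some exponential families.
```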

233 citations


Proceedings ArticleDOI
29 Apr 2007
TL;DR: A study at a large IT company shows that mobile information workers frequently migrate work across devices, and workers' strategies of coping with these problems center on the physical handling of devices and cross-device synchronization.
Abstract: A study at a large IT company shows that mobile information workers frequently migrate work across devices (here: smartphones, desktop PCs, laptops). While having multiple devices provides new opportunities to work in the face of changing resource deprivations, the management of devices is often problematic. The most salient problems are posed by 1) the physical effort demanded by various management tasks, 2) anticipating what data or functionality will be needed, and 3) aligning these efforts with work, mobility, and social situations. Workers' strategies of coping with these problems center on two interwoven activities: the physical handling of devices and cross-device synchronization. These aim at balancing risk and effort in immediate and subsequent use. Workers also exhibit subtle ways to handle devices in situ, appropriating their physical and operational properties. The design implications are discussed.

133 citations


Journal ArticleDOI
TL;DR: This work proposes a novel model of service provisioning in ad hoc networks based on the concept of context-aware migratory services, and presents TJam, a proof-of-concept migratory service that predicts traffic jams in a given region of a highway using only car-to-car short-range wireless communication.
Abstract: Ad hoc networks can be used not only as data carriers for mobile devices but also as providers of a new class of services specific to ubiquitous computing environments. Building services in ad hoc networks, however, is challenging due to rapidly changing operating contexts, which often lead to situations where a node hosting a certain service becomes unsuitable for hosting the service execution any longer. We propose a novel model of service provisioning in ad hoc networks based on the concept of context-aware migratory services. Unlike a regular service that always executes on the same node, a migratory service can migrate to different nodes in the network in order to accomplish its task. The migration is triggered by changes in the operating context and occurs transparently to the client application. We designed and implemented a framework for developing migratory services. We built TJam, a proof-of-concept migratory service that predicts traffic jams in a given region of a highway by using only car-to-car short-range wireless communication. The experimental results obtained over an ad hoc network of personal digital assistants (PDAs) show the effectiveness of our approach in the presence of frequent disconnections. We also present simulation results that demonstrate the benefits of migratory services in large-scale networks compared to a statically centralized approach.

124 citations


Journal ArticleDOI
TL;DR: An elegant recursion formula is derived which allows efficient computation of the stochastic complexity in the case of n observations of a single multinomial random variable with K values, and the time complexity is O(n + K) as opposed to O(n log n log K) obtained with the previous results.
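
The recursion lets the NML normalizing sum C(K, n) be filled in linearly after an O(n) base case; the stochastic complexity is then the maximized negative log-likelihood plus log C(K, n). A sketch, where the recurrence C(k, n) = C(k-1, n) + n·C(k-2, n)/(k-2) is my reconstruction of the paper's recursion and should be verified before relying on it:

```python
import math

def multinomial_regret(K: int, n: int) -> float:
    """Normalizing sum C(K, n) of the multinomial NML model in O(n + K),
    for n >= 1 observations and K >= 1 values. The recurrence used below
    is a reconstruction; verify it against the paper."""
    if K == 1:
        return 1.0
    c_prev = 1.0                              # C(1, n)
    c = sum(math.comb(n, h)                   # C(2, n), direct O(n) sum
            * (h / n) ** h * ((n - h) / n) ** (n - h)
            for h in range(n + 1))            # 0 ** 0 evaluates to 1
    for k in range(3, K + 1):
        c_prev, c = c, c + n * c_prev / (k - 2)
    return c

# Stochastic complexity of data x: -log p(x | theta_hat) + log C(K, n).
```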

109 citations


Proceedings ArticleDOI
13 Jun 2007
TL;DR: This paper shows how the variety of material features expands communicative resources and provides border resources for action, in their peripheral, evocative, and referential function, and how materiality is part of performative action, looking at temporal frames of relevance and emergence in specific events.
Abstract: This paper seeks to develop a better understanding of the contribution of materiality to creativity in collaborative settings, exploring the ways in which it provides resources for persuasive, narrative and experiential interactions. Based on extensive field studies of architectural design workplaces and on examples from art works, we show: how the variety of material features expands communicative resources and provides border resources for action, in their peripheral, evocative, and referential function; how spatiality supports the public availability of artefacts as well as people's direct, bodily engagement with materiality; and finally how materiality is part of performative action, looking at temporal frames of relevance and emergence in specific events. We conclude with implications for the development of novel interface technologies.

107 citations


Proceedings Article
19 Jul 2007
TL;DR: The solution of the network structure optimization problem is highly sensitive to the chosen α parameter value; explanations of how and why this phenomenon happens are given, and ideas for solving the problem are discussed.
Abstract: The BDeu marginal likelihood score is a popular model selection criterion for selecting a Bayesian network structure based on sample data. This non-informative scoring criterion assigns the same score to network structures that encode the same independence statements. However, before applying the BDeu score, one must determine a single parameter, the equivalent sample size α. Unfortunately, no generally accepted rule for determining the α parameter has been suggested. This is disturbing, since in this paper we show through a series of concrete experiments that the solution of the network structure optimization problem is highly sensitive to the chosen α parameter value. Based on these results, we give explanations for how and why this phenomenon happens, and discuss ideas for solving the problem.
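
To see where α enters, here is the standard BDeu local score for a single variable (textbook Dirichlet-multinomial form; the counts and the sweep are illustrative). Re-running structure search on the same data with different α values is exactly the experiment that exposes the sensitivity:

```python
from math import lgamma

def bdeu_local(counts, alpha):
    """Standard BDeu local score for one variable (illustrative form).
    counts: list over the q parent configurations, each a list of the
    r per-state counts N_ijk; alpha: equivalent sample size."""
    q = len(counts)
    r = len(counts[0])
    score = 0.0
    for row in counts:
        N_ij = sum(row)
        score += lgamma(alpha / q) - lgamma(alpha / q + N_ij)
        for N_ijk in row:
            score += lgamma(alpha / (q * r) + N_ijk) - lgamma(alpha / (q * r))
    return score

# Sweeping alpha and re-scoring candidate parent sets is the experiment
# that exposes the sensitivity:
# for alpha in (0.1, 1.0, 10.0, 100.0): compare bdeu_local(...) per structure.
```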

93 citations


Journal ArticleDOI
TL;DR: With mobile devices becoming ubiquitous, the time is ripe to bring sensor data out of closed-loop networks into the center of daily urban life, moving beyond first-generation deployments of application-specific, static sensors that monitor the sensed environment in real time.
Abstract: With mobile devices becoming ubiquitous, the time is ripe to bring sensor data out of closed-loop networks into the center of daily urban life. The Internet has become a great success because its applications appeal to regular people. This isn't the case with sensor networks, which are generally perceived as "something" remote in the forest or on the battlefield. With few exceptions, first-generation sensor networks address application-specific, static-sensor deployments to accurately monitor the sensed environment in real time.

85 citations


Journal ArticleDOI
01 Apr 2007
TL;DR: This analysis of the organization of experience-related activities in the mass event focuses on the active role of technology-mediated memories in constructing experiences and advocates applications that not only store or capture human experience for sharing or later use but also actively participate in the very construction of experience.
Abstract: To fully appreciate the opportunities provided by interactive and ubiquitous multimedia to record and share experiences, we report on an ethnographic investigation of the settings and nature of human memory and experience at a large-scale event. We studied two groups of spectators at a FIA World Rally Championship in Finland, both equipped with multimedia mobile phones. Our analysis of the organization of experience-related activities in the mass event focuses on the active role of technology-mediated memories in constructing experiences. Continuity, reflexivity with regard to the Self and the group, maintaining and re-creating group identity, protagonism and active spectatorship were important social aspects of the experience and were directly reflected in how multimedia was used. In particular, we witnessed multimedia-mediated forms of expression, such as staging, competition, storytelling, joking, communicating presence, and portraying others; and the motivation for these stemmed from the engaging, processual, and shared nature of experience. Moreover, we observed how temporality and spatiality provided a platform for constructing experiences. The analysis advocates applications that not only store or capture human experience for sharing or later use but also actively participate in the very construction of experience. The approach conveys several valuable design implications.

84 citations


Book ChapterDOI
13 May 2007
TL;DR: The analysis shows how the game succeeds in fostering players' creativity by exploiting ambiguity and how the players were engaged in a fast-paced competition which resulted in 115 stories and 3142 photos in 1.5 hours.
Abstract: We present a large-scale pervasive game called Manhattan Story Mashup that combines the Web, camera phones, and a large public display. The game introduces a new form of interactive storytelling which lets an unlimited number of players author stories on the Web while a large number of players illustrate the stories with camera phones. This paper presents the first deployment of the game and a detailed analysis of its quantitative and qualitative results. We present details on the game implementation and game setup, including practical lessons learnt from this large-scale experiment involving over 300 players in total. The analysis shows how the game succeeds in fostering players' creativity by exploiting ambiguity and how the players were engaged in a fast-paced competition which resulted in 115 stories and 3142 photos in 1.5 hours.

Journal ArticleDOI
TL;DR: This work investigates how users interpret cues of other users' situations as a situation, action, or intention of a remote person and then act on them in everyday social interactions through smartphone-based mobile awareness systems.
Abstract: Mobile awareness systems provide user-controlled and automatic, sensor-derived cues of other users' situations and in that way attempt to facilitate group practices and provide opportunities for social interaction. We are interested in investigating how users interpret these cues as a situation, action, or intention of a remote person and then act on them in everyday social interactions. Three field trials utilizing A-B intervention research methodology were conducted with three types of teenager groups (N = 15, total days = 243). Each trial had a slightly different variation of Context Contacts, a smartphone-based multicue mobile awareness system. We report on several analyses on how the cues were accessed, viewed, monitored, inferred, and acted on.

Journal ArticleDOI
TL;DR: This study exposed the human cell lines A549, Beas-2B and Met5A to crocidolite asbestos, determined time-dependent gene expression profiles using Affymetrix arrays, and identified chromosomal regions enriched with genes potentially contributing to common responses to asbestos in these cell lines.
Abstract: Asbestos has been shown to cause chromosomal damage and DNA aberrations. Exposure to asbestos causes many lung diseases, e.g. asbestosis, malignant mesothelioma, and lung cancer, but the disease-related processes are still largely unknown. We exposed the human cell lines A549, Beas-2B and Met5A to crocidolite asbestos and determined time-dependent gene expression profiles using Affymetrix arrays. The hybridization data was analyzed using an algorithm specifically designed for clustering of short time series expression data. A canonical correlation analysis was applied to identify correlations between the cell lines, and a Gene Ontology analysis method was used to identify enriched, differentially expressed biological processes. We recognized a large number of previously known as well as new potential asbestos-associated genes and biological processes, and identified chromosomal regions enriched with genes potentially contributing to common responses to asbestos in these cell lines. These include genes such as the thioredoxin domain containing gene (TXNDC) and the potential tumor suppressor, BCL2/adenovirus E1B 19kD-interacting protein gene (BNIP3L), GO terms such as "positive regulation of I-kappaB kinase/NF-kappaB cascade" and "positive regulation of transcription, DNA-dependent", and chromosomal regions such as 2p22, 9p13, and 14q21. We present the complete data sets as Additional files. This study identifies several interesting targets for further investigation in relation to asbestos-associated diseases.

Journal ArticleDOI
TL;DR: This model is novel in that it is the first to analyze optimal subspace size and how this size is influenced by contrast normalization; it also shows that the optimal nonlinearity for the pooling is squaring.
Abstract: In previous work, we presented a statistical model of natural images that produced outputs similar to receptive fields of complex cells in primary visual cortex. However, a weakness of that model was that the structure of the pooling was assumed a priori and not learned from the statistical properties of natural images. Here, we present an extended model in which the pooling nonlinearity and the size of the subspaces are optimized rather than fixed, so we make far fewer assumptions about the pooling. Results on natural images indicate that the best probabilistic representation is formed when the size of the subspaces is relatively large, and that the likelihood is considerably higher than for a simple linear model with no pooling. Further, we show that the optimal nonlinearity for the pooling is squaring. We also highlight the importance of contrast gain control for the performance of the model. Our model is novel in that it is the first to analyze optimal subspace size and how this size is influenced by contrast normalization.
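
The pooling itself is simple to state: linear filter responses are squared and summed within each subspace, giving one "energy" per subspace. A sketch of that forward computation (in the actual model the filters W and the subspace size are optimized; here they are inputs to the sketch):

```python
# Forward computation of subspace energies with squaring pooling.
# W and subspace_size are assumptions of this sketch, not learned here.
import numpy as np

def subspace_energies(X, W, subspace_size):
    """X: (T, d) image patches; W: (k, d) linear filters, with k divisible
    by subspace_size. Returns (T, k // subspace_size) pooled energies."""
    S = X @ W.T                                  # linear filter responses
    T, k = S.shape
    S = (S ** 2).reshape(T, k // subspace_size, subspace_size)
    return S.sum(axis=2)                         # sum of squares per subspace
```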

Proceedings ArticleDOI
29 Apr 2007
TL;DR: The relationship of functionalities of the artifact and the development of resources is discussed by presenting how functionalities can be designed to support three ways to appropriate communication technologies: increasing technical mastery, re-channeling existing communication into the new medium and inventing new communicative acts between users.
Abstract: Technologies can be used - or appropriated - in different ways by different users, but how do the use patterns evolve, and how can design facilitate such evolution? This paper approaches these questions in light of a case study in which a group of 8 high school students used Comeks, a mobile comic strip creator that enables users to exchange rich, expressive multimedia messages. A qualitative analysis of the use processes shows how users turned the functionalities embodied in Comeks into particular resources for communication during the 9-week trial period. The paper discusses the relationship between the functionalities of the artifact and the development of resources by presenting how functionalities can be designed to support three ways to appropriate communication technologies: increasing technical mastery, re-channeling existing communication into the new medium and inventing new communicative acts between users.

Journal ArticleDOI
TL;DR: This work describes the construction of a federated database infrastructure for genotype and phenotype information collected in seven European countries and Australia, connected via a network called TwinNET to guarantee effortless data exchange and pooled analyses.
Abstract: Integration of complex data and data management represent major challenges in large-scale biobank-based post-genome era research projects like GenomEUtwin (an international collaboration between eight Twin Registries) with extensive amounts of genotype and phenotype data combined from different data sources located in different countries. The challenge lies not only in data harmonization and constant update of clinical details in various locations, but also in the heterogeneity of data storage and confidentiality of sensitive health-related and genetic data. Solid infrastructure must be built to provide secure, but easily accessible and standardized, data exchange also facilitating statistical analyses of the stored data. Data collection sites desire to have full control of the accumulation of data, and at the same time the integration should facilitate effortless slicing and dicing of the data for different types of data pooling and study designs. Here we describe how we constructed a federated database infrastructure for genotype and phenotype information collected in seven European countries and Australia and connected this database setting via a network called TwinNET to guarantee effortless data exchange and pooled analyses. This federated database system offers a powerful facility for combining different types of information from multiple data sources. The system is transparent to end users and application developers, since it makes the set of federated data sources look like a single system. The user need not be aware of the format or site where the data are stored, the language or programming interface of the data source, how the data are physically stored, whether they are partitioned and/or replicated or what networking protocols are used. The user sees a single standardized interface with the desired data elements for pooled analyses.

Journal ArticleDOI
TL;DR: This work studies the approximability and inapproximability of finding identifying codes and locating-dominating codes of minimum size, and shows that it is possible to approximate both problems within a logarithmic factor, but sublogarithmic approximation ratios are intractable.
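
The logarithmic-factor approximation follows the usual set-cover route: a code vertex must dominate every vertex and separate every pair of vertices whose closed neighborhoods would otherwise look identical inside the code. A greedy sketch of this reduction (not necessarily the paper's exact construction):

```python
# Greedy ln-factor approximation for minimum identifying codes via set cover.
from itertools import combinations

def greedy_identifying_code(adj):
    """adj: dict mapping each vertex to its set of neighbors (undirected)."""
    N = {v: adj[v] | {v} for v in adj}        # closed neighborhoods
    todo = {("dom", v) for v in adj}          # every vertex must be dominated
    todo |= {("sep", u, v)                    # every pair must be separated
             for u, v in combinations(sorted(adj), 2)}

    def covers(c):
        out = {("dom", v) for v in adj if c in N[v]}
        out |= {("sep", u, v) for u, v in combinations(sorted(adj), 2)
                if (c in N[u]) != (c in N[v])}
        return out

    code = set()
    while todo:
        best = max(set(adj) - code, key=lambda c: len(covers(c) & todo),
                   default=None)
        if best is None or not covers(best) & todo:
            raise ValueError("no identifying code exists (twin vertices)")
        code.add(best)
        todo -= covers(best)
    return code
```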

Proceedings ArticleDOI
22 Oct 2007
TL;DR: A universal conditional NML model is presented, which has minmax optimal properties similar to those of the regular NML model but, unlike NML, defines a random process that can be used for prediction and admits a recursive evaluation for data compression.
Abstract: The NML (normalized maximum likelihood) universal model has certain minmax optimal properties but it has two shortcomings: the normalizing coefficient can be evaluated in a closed form only for special model classes, and it does not define a random process so that it cannot be used for prediction. We present a universal conditional NML model, which has minmax optimal properties similar to those of the regular NML model. However, unlike NML, the conditional NML model defines a random process which can be used for prediction. It also admits a recursive evaluation for data compression. The conditional normalizing coefficient is much easier to evaluate, for instance, for tree machines than the integral of the square root of the Fisher information in the NML model. For Bernoulli distributions, the conditional NML model gives a predictive probability, which behaves like the Krichevsky-Trofimov predictive probability, actually slightly better for extremely skewed strings. For some model classes, it agrees with the predictive probability found earlier by Takimoto and Warmuth, as the solution to a different, more restrictive minmax problem. We also calculate the CNML models for the generalized Gaussian regression models, and in particular for the cases where the loss function is quadratic, and show that the CNML model achieves asymptotic optimality in terms of the mean ideal code length. Moreover, the quadratic loss, which represents fitting errors as noise rather than prediction errors, can be shown to be smaller than what can be achieved with the NML as well as with the so-called plug-in or the predictive MDL model.
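
The Bernoulli case mentioned in the abstract is easy to write down. With k ones observed in n trials, one standard formulation of the conditional NML predictor normalizes the maximized likelihoods of the two possible continuations, while the Krichevsky-Trofimov predictor adds 1/2 to the counts. A sketch (formulation details vary between CNML variants; this is the version assumed here):

```python
def f(k, n):
    """Maximized Bernoulli likelihood of a length-n string with k ones
    (0 ** 0 taken as 1; f(., 0) = 1)."""
    return (k / n) ** k * ((n - k) / n) ** (n - k) if n else 1.0

def cnml_prob_one(k, n):
    """P(next symbol = 1 | k ones in n observations), conditional NML."""
    return f(k + 1, n + 1) / (f(k + 1, n + 1) + f(k, n + 1))

def kt_prob_one(k, n):
    """Krichevsky-Trofimov predictive probability."""
    return (k + 0.5) / (n + 1)

# For an extremely skewed string (k = 0, n = 20):
# cnml_prob_one(0, 20) is about 0.018 < kt_prob_one(0, 20) which is
# about 0.024, i.e. CNML wastes less probability on the unseen symbol.
```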

Proceedings ArticleDOI
06 Jun 2007
TL;DR: It is shown that if h is not constant the problem is NP-hard and hard to approximate, and a pseudopolynomial-time algorithm is given for some rectilinear versions of the problem.
Abstract: We study the problem of finding shortest non-crossing thick paths in a polygonal domain, where a thick path is the Minkowski sum of a usual (zero-thickness, or thin) path and a disk. Given K pairs of terminals on the boundary of a simple n-gon, we compute in O(n + K) time a representation of the set of K shortest non-crossing thick paths joining the terminal pairs; using the representation, any particular path can be output in time proportional to its complexity. We compute K shortest thick non-crossing paths in a polygon with h holes in O((K + 1)^h h! poly(n, K)) time, using an efficient method to compute any one of the K thick paths if the "threadings" of all paths amidst the holes are specified. We show that if h is not constant, the problem is NP-hard; we also show hardness of approximation. We give a pseudopolynomial-time algorithm for some rectilinear versions of the problem. We apply our thick-path algorithms to obtain the first algorithmic results for the minimum-cost continuous flow problem, an extension of the standard discrete minimum-cost network flow problem to continuous domains. The results are based on showing a continuous analog of the network flow decomposition theorem.


Book ChapterDOI
09 Sep 2007
TL;DR: A statistical model is presented that learns a nonlinear representation from the data that reflects abstract, invariant properties of the signal without making assumptions about the kind of signal that can be processed.
Abstract: Capturing regularities in high-dimensional data is an important problem in machine learning and signal processing. Here we present a statistical model that learns a nonlinear representation from the data that reflects abstract, invariant properties of the signal without making assumptions about the kind of signal that can be processed. The model has a hierarchy of two layers, with the first layer broadly corresponding to Independent Component Analysis (ICA) and a second layer to represent higher-order structure. We estimate the model using the mathematical framework of Score Matching (SM), a novel method for the estimation of non-normalized statistical models. The model incorporates a squaring nonlinearity, which we propose to be suitable for forming a higher-order code of invariances. Additionally, the squaring can be viewed as modelling subspaces to capture residual dependencies, which linear models cannot capture.

Journal ArticleDOI
TL;DR: A generative mixture model is introduced, based on Hidden Markov Models, for estimating the activities of the individual HERV sequences from EST (expressed sequence tag) databases, and it is shown that 7% of the HERVs are active.
Abstract: Background: Human endogenous retroviruses (HERVs) are surviving traces of ancient retrovirus infections and now reside within the human DNA. Recently HERV expression has been detected in both normal tissues and diseased patients. However, the activities (expression levels) of individual HERV sequences are mostly unknown.

Journal ArticleDOI
TL;DR: In the DYNAMOS project, a system platform and application prototype running on smartphones were designed and implemented to support a hybrid approach that enhances context-aware service provisioning with peer-to-peer social functionalities.

Proceedings Article
11 Mar 2007
TL;DR: A new search strategy is introduced in which the information retrieval (IR) query is inferred from eye movements measured while the user reads text during an IR task; relevance predictions for a large set of unseen documents are then ranked significantly better than by random guessing.
Abstract: We introduce a new search strategy, in which the information retrieval (IR) query is inferred from eye movements measured when the user is reading text during an IR task. In the training phase, we know the users' interest, that is, the relevance of training documents. We learn a predictor that produces a "query" given the eye movements; the target of learning is an "optimal" query that is computed based on the known relevance of the training documents. Assuming the predictor is universal with respect to the users' interests, it can also be applied to infer the implicit query when we have no prior knowledge of the users' interests. The result of an empirical study is that it is possible to learn the implicit query from a small set of read documents, such that relevance predictions for a large set of unseen documents are ranked significantly better than by random guessing.
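
Purely as an illustration of the training setup described (every name here is hypothetical, and the paper's actual predictor may differ): learn a linear map from eye-movement features of read documents to the "optimal" query computed from known relevance, then rank unseen documents by the inferred query.

```python
# Illustrative sketch only; the predictor, features, and targets are
# hypothetical stand-ins for the setup described in the abstract.
import numpy as np

def learn_query_predictor(E, Q, lam=1.0):
    """E: (m, f) eye-movement features per read document; Q: (m, t) target
    'optimal' query term weights derived from known relevance.
    Ridge regression gives W with E @ W approximating Q."""
    f = E.shape[1]
    return np.linalg.solve(E.T @ E + lam * np.eye(f), E.T @ Q)

def rank_documents(e_new, W, D):
    """Infer a query from new eye movements e_new: (f,), then rank the
    unseen documents D: (N, t) by cosine similarity."""
    q = e_new @ W
    scores = D @ q / (np.linalg.norm(D, axis=1) * np.linalg.norm(q) + 1e-12)
    return np.argsort(-scores)
```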

Proceedings ArticleDOI
02 Jul 2007
TL;DR: This work considers the solution of discounted optimal stopping problems using linear function approximation methods, proposes alternative algorithms based on projected value iteration ideas and least squares, and proves the convergence of some of these algorithms.
Abstract: We consider the solution of discounted optimal stopping problems using linear function approximation methods. A Q-learning algorithm for such problems, proposed by Tsitsiklis and Van Roy, is based on the method of temporal differences and stochastic approximation. We propose alternative algorithms, which are based on projected value iteration ideas and least squares. We prove the convergence of some of these algorithms and discuss their properties.
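
A sketch of the flavor of algorithm proposed: approximate the continuation value Q(s) by phi(s)^T r, and repeatedly apply the one-step Bellman operator for stopping followed by a least-squares projection onto the span of the features. This is a rendering of the general projected value iteration idea with least squares, not the authors' exact pseudocode:

```python
import numpy as np

def ls_projected_value_iteration(Phi, Phi_next, g_next, alpha, iters=100):
    """Phi: (T, d) features of visited states; Phi_next: (T, d) features of
    their successors; g_next: (T,) stopping rewards at the successors;
    alpha: discount factor in (0, 1). Returns weights r such that
    phi(s)^T r approximates the value of continuing."""
    A = np.linalg.pinv(Phi.T @ Phi) @ Phi.T    # least-squares projector
    r = np.zeros(Phi.shape[1])
    for _ in range(iters):
        # One-step Bellman operator for stopping: stop or continue at s'.
        targets = alpha * np.maximum(g_next, Phi_next @ r)
        r = A @ targets                        # project onto span of features
    return r
```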

Proceedings ArticleDOI
13 Jun 2007
TL;DR: In this article, the authors introduce the method of dramaturgical reading, which was originally a method of producing different crystallized and associative theatrical and graphical presentations of a role character in a drama context.
Abstract: In this paper we introduce the method of dramaturgical reading, which was originally a method of producing different crystallized and associative theatrical and graphical presentations of a role character in a drama context. We transfer dramaturgical reading into the field of user-centered design in order to understand, analyze and represent user-centered material. We compare a persona created with dramaturgical reading to a user profile and persona. We state that adapting a role character as an embodied and concrete user description in user-centered design improves the designers' ability to empathize and understand the users, thus improving the results of the design process. We believe personas must be enabled to "come to life" and allowed to develop in the minds of the designers using them. The dramaturgical method is one way of accomplishing this.

Proceedings ArticleDOI
27 Aug 2007
TL;DR: Performance measurements of HIP over WLAN on the Nokia 770 Internet Tablet are presented, a comprehensive analysis of the results is provided, and suggestions are made on the suitability of HIP for lightweight clients.
Abstract: The Host Identity Protocol (HIP) is being standardized by the IETF as a new solution for host mobility and multihoming in the Internet. HIP uses self-certifying public-private key pairs in combination with IPsec to authenticate hosts and protect user data. While there are three open-source HIP implementations, no experience is available with running HIP on lightweight hardware such as a PDA or a mobile phone. The limited computational power and battery lifetime of lightweight devices raise concerns about whether HIP can be used there at all. This paper presents performance measurements of HIP over WLAN on the Nokia 770 Internet Tablet. It also provides a comprehensive analysis of the results and makes suggestions on the suitability of HIP for lightweight clients.

Journal ArticleDOI
TL;DR: The aim is to create an understanding of categorization practices in design through a case study of the virtual community Habbo Hotel; a qualitative analysis highlighted not only the meaning of the "average user," but also the work that both the developer and the category contribute to this meaning.
Abstract: The "user" is an ambiguous concept in human-computer interaction and information systems. Analyses of users as social actors, participants, or configured users delineate approaches to studying design-use relationships. Here, a developer's reference to a figure of speech, termed the "average user," is contrasted with design guidelines. The aim is to create an understanding of categorization practices in design through a case study of the virtual community Habbo Hotel. A qualitative analysis highlighted not only the meaning of the "average user," but also the work that both the developer and the category contribute to this meaning. The average user a) represents the unknown, b) influences the boundaries of the target user groups, c) legitimizes the designer to disregard marginal user feedback, and d) keeps the design space open, thus allowing for creativity. The analysis shows how design and use are intertwined and highlights the developers' role in governing different users' interests.

Journal ArticleDOI
TL;DR: XML messaging can be improved by using an asynchronous programming style and a compact serialization format; the design and implementation of a messaging system that addresses these requirements is presented.

Book ChapterDOI
12 Mar 2007
TL;DR: In the experimental comparison of different algorithms, the new algorithms were clearly faster than the naive method and also faster than the well-known lookahead scoring algorithm.
Abstract: Fast search algorithms for finding good instances of patterns given as position-specific scoring matrices are developed, and some empirical results on their performance on DNA sequences are reported. The algorithms basically generalize the Aho-Corasick, filtration, and superalphabet techniques of string matching to the scoring matrix search. As compared to the naive search, our algorithms can be faster by a factor which is proportional to the length of the pattern. In our experimental comparison of different algorithms, the new algorithms were clearly faster than the naive method and also faster than the well-known lookahead scoring algorithm. The Aho-Corasick technique is the fastest for short patterns and high significance thresholds of the search. For longer patterns the filtration method is better, while the superalphabet technique is the best for very long patterns and low significance levels. We also observed that the actual speed of all these algorithms is very sensitive to implementation details.
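
As a reference point for the comparisons above, here is a sketch of the naive window scan augmented with the well-known lookahead bound: precomputed suffix maxima tell how much score the remaining columns can still contribute, so hopeless windows are abandoned early. The scores, threshold, and DNA alphabet are illustrative.

```python
# Naive PSSM scan with lookahead pruning (illustrative inputs).
def pssm_search(seq, pssm, threshold):
    """pssm: list of dicts, one per pattern column, mapping symbol -> score.
    Returns the start positions whose window scores reach the threshold."""
    m = len(pssm)
    # suffix_max[j] = best score still attainable from columns j..m-1
    suffix_max = [0.0] * (m + 1)
    for j in range(m - 1, -1, -1):
        suffix_max[j] = suffix_max[j + 1] + max(pssm[j].values())
    hits = []
    for i in range(len(seq) - m + 1):
        score = 0.0
        for j in range(m):
            score += pssm[j][seq[i + j]]
            if score + suffix_max[j + 1] < threshold:   # lookahead bound
                break
        else:
            if score >= threshold:
                hits.append(i)
    return hits
```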