Showing papers by "Kai Puolamäki" published in 2011


Journal Article
TL;DR: The first use of the prototype augmented reality (AR) platform to develop a pilot application, Virtual Laboratory Guide, and early evaluation results of this application are described.
Abstract: In this paper, we report on a prototype augmented reality (AR) platform for accessing abstract information in real-world pervasive computing environments. Using this platform, objects, people, and the environment serve as contextual channels to more information. The user’s interest with respect to the environment is inferred from eye movement patterns, speech, and other implicit feedback signals, and these data are used for information filtering. The results of proactive context-sensitive information retrieval are augmented onto the view of a handheld or head-mounted display or uttered as synthetic speech. The augmented information becomes part of the user’s context, and if the user shows interest in the AR content, the system detects this and provides progressively more information. In this paper, we describe the first use of the platform to develop a pilot application, Virtual Laboratory Guide, and early evaluation results of this application.

75 citations


Journal Article
TL;DR: In this article, a prototype augmented reality (AR) platform for accessing abstract information in real-world pervasive computing environments is reported, where objects, people, and the environment serve as contextual channels to more information.
Abstract: In this paper, we report on a prototype augmented reality (AR) platform for accessing abstract information in real-world pervasive computing environments. Using this platform, objects, people, and the environment serve as contextual channels to more information. The user’s interest with respect to the environment is inferred from eye movement patterns, speech, and other implicit feedback signals, and these data are used for information filtering. The results of proactive context-sensitive information retrieval are augmented onto the view of a handheld or head-mounted display or uttered as synthetic speech. The augmented information becomes part of the user’s context, and if the user shows interest in the AR content, the system detects this and provides progressively more information. In this paper, we describe the first use of the platform to develop a pilot application, Virtual Laboratory Guide, and early evaluation results of this application.

21 citations


01 Jan 2011
TL;DR: It is found that words obey different spatial patterns in the language, ranging from bursty to non-bursty/uniform, independent of their frequency, showing that the traditional approach leads to many false positives.
Abstract: Comparing frequency counts over texts or corpora is an important task in many applications and scientific disciplines. Given a text corpus, we want to test a hypothesis, such as "word X is frequent", "word X has become more frequent over time", or "word X is more frequent in male than in female speech". For this purpose we need a null model of word frequencies. The commonly used bag-of-words model, which corresponds to a Bernoulli process with fixed parameter, does not account for any structure present in natural languages. Using this model for word frequencies results in large numbers of words being reported as unexpectedly frequent. We address how to take into account the inherent occurrence patterns of words in significance testing of word frequencies. Based on studies of words in two large corpora, we propose two methods for modeling word frequencies that both take into account the occurrence patterns of words and go beyond the bag-of-words assumption. The first method models word frequencies based on the spatial distribution of individual words in the language. The second method is based on bootstrapping and takes into account only word frequency at the text level. The proposed methods are compared to the current gold standard in a series of experiments on both corpora. We find that words obey different spatial patterns in the language, ranging from bursty to non-bursty/uniform, independent of their frequency, showing that the traditional approach leads to many false positives.

16 citations
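
To make the baseline in the abstract above concrete, here is a minimal sketch of a significance test under the bag-of-words assumption, where every token is an independent Bernoulli trial with a fixed rate. The function names, the reference rate, and the example counts are illustrative assumptions, not taken from the paper.

```python
from math import exp, lgamma, log


def log_binom_pmf(i: int, n: int, p: float) -> float:
    """Log of P(X = i) for X ~ Binomial(n, p), with 0 < p < 1."""
    log_choose = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
    return log_choose + i * log(p) + (n - i) * log(1 - p)


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k): chance of seeing at least k hits in n tokens
    if each token is the word independently with probability p."""
    return sum(exp(log_binom_pmf(i, n, p)) for i in range(k, n + 1))


# Hypothetical numbers: the word occurs 8 times in a 1000-token text,
# while the reference rate is 2 occurrences per 1000 tokens.
p_value = binomial_upper_tail(k=8, n=1000, p=0.002)
print(f"bag-of-words p-value: {p_value:.5f}")
```

Because this null model treats all tokens as independent, a word that clusters in a few texts can yield a very small p-value even when its overall rate is unremarkable, which is the false-positive problem the abstract describes.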


Journal Article
TL;DR: It is argued that before computing the correlations, one has to carefully select the underlying base set of locations for which the co-occurrence counts, similarity indices, and their significance are computed.
Abstract: Correlation between occurrences of taxa is a fundamental concept in the analysis of presence-absence data. Such correlations can result from ecologically relevant processes, such as existence and evolution of species communities. Correlations are typically quantified by some sort of similarity index based on co-occurrence counts. We argue that the individual values of a similarity index are not useful as such: rather, we have to be able to estimate the statistical significance of the index value. Secondly, we argue that before computing the correlations, one has to carefully select the underlying base set of locations for which the co-occurrence counts, similarity indices, and their significance are computed. We demonstrate base set selection with synthetic examples and conclude with an analysis of real data from a large database of fossil land mammals.

12 citations
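
The dependence on the base set described in the abstract above can be sketched as follows. This example assumes a Jaccard index as the similarity index and a hypergeometric null model (the second taxon's locations placed uniformly at random within the base set). Both choices, and all names and data, are illustrative assumptions rather than the authors' procedure.

```python
from math import comb


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets of locations."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cooccurrence_p_value(a: set, b: set, base: set) -> float:
    """P(at least the observed co-occurrence count) when taxon b's
    locations are drawn uniformly at random from the base set."""
    a, b = a & base, b & base
    n, n_a, n_b = len(base), len(a), len(b)
    observed = len(a & b)
    tail = sum(comb(n_a, x) * comb(n - n_a, n_b - x)
               for x in range(observed, min(n_a, n_b) + 1))
    return tail / comb(n, n_b)


# Toy presence data: each taxon is the set of locations where it occurs.
taxon_a = {1, 2, 3, 4}
taxon_b = {3, 4, 5, 6}

small_base = set(range(1, 7))    # only locations where either taxon occurs
large_base = set(range(1, 101))  # a much wider study area

print(jaccard(taxon_a, taxon_b))                           # same index either way
print(cooccurrence_p_value(taxon_a, taxon_b, small_base))  # 1.0
print(cooccurrence_p_value(taxon_a, taxon_b, large_base))  # roughly 0.007
```

The index value is identical in both cases, yet the co-occurrence is unremarkable in the small base set and unlikely under the null in the large one.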


Book Chapter
05 Sep 2011
TL;DR: In this paper, the authors propose two methods for modeling word frequencies that both take into account the occurrence patterns of words and go beyond the bag-of-words assumption, based on the spatial distribution of individual words in the language.
Abstract: Comparing frequency counts over texts or corpora is an important task in many applications and scientific disciplines. Given a text corpus, we want to test a hypothesis, such as "word X is frequent", "word X has become more frequent over time", or "word X is more frequent in male than in female speech". For this purpose we need a null model of word frequencies. The commonly used bag-of-words model, which corresponds to a Bernoulli process with fixed parameter, does not account for any structure present in natural languages. Using this model for word frequencies results in large numbers of words being reported as unexpectedly frequent. We address how to take into account the inherent occurrence patterns of words in significance testing of word frequencies. Based on studies of words in two large corpora, we propose two methods for modeling word frequencies that both take into account the occurrence patterns of words and go beyond the bag-of-words assumption. The first method models word frequencies based on the spatial distribution of individual words in the language. The second method is based on bootstrapping and takes into account only word frequency at the text level. The proposed methods are compared to the current gold standard in a series of experiments on both corpora. We find that words obey different spatial patterns in the language, ranging from bursty to non-bursty/uniform, independent of their frequency, showing that the traditional approach leads to many false positives.

10 citations
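
The second method in the abstract above resamples at the text level. The following is a minimal sketch of one plausible reading of that idea, comparing a word's relative frequency between two corpora; it is not the authors' exact procedure. The data layout (one `(word_count, text_length)` pair per text), the function names, and the example numbers are assumptions.

```python
import random


def relative_frequency(texts):
    """texts: list of (word_count, text_length) pairs, one per text."""
    total_hits = sum(count for count, _ in texts)
    total_tokens = sum(length for _, length in texts)
    return total_hits / total_tokens


def bootstrap_p_value(corpus_a, corpus_b, n_resamples=2000, seed=0):
    """One-sided estimate for 'the word is more frequent in A than in B'.

    Whole texts are resampled with replacement, so a word concentrated
    in a few texts produces widely varying resampled frequencies.
    """
    rng = random.Random(seed)
    not_higher = 0
    for _ in range(n_resamples):
        sample_a = rng.choices(corpus_a, k=len(corpus_a))
        sample_b = rng.choices(corpus_b, k=len(corpus_b))
        if relative_frequency(sample_a) <= relative_frequency(sample_b):
            not_higher += 1
    # Add-one smoothing keeps the estimate strictly positive.
    return (not_higher + 1) / (n_resamples + 1)


# Hypothetical corpora: in A the word is bursty (all hits in one text),
# in B it is spread evenly at a lower rate.
corpus_a = [(40, 1000), (0, 1000), (0, 1000), (0, 1000), (0, 1000)]
corpus_b = [(4, 1000)] * 5
print(bootstrap_p_value(corpus_a, corpus_b))
```

In this toy case corpus A has the higher pooled frequency, but any resample that misses the single bursty text gives A a frequency of zero, so the estimated p-value stays far from zero.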


Journal Article
TL;DR: In this paper, the authors used an OLS linear regression model with a quadratic error function to estimate the number of bird species that occur in each grid and found that the best estimator for avian species richness in Finland is the length of the growing season.
Abstract: We used 10-km grid data from the Finnish Bird Atlas and high-resolution data on temperature and rainfall to estimate species richness from climate and environmental variables across spatial scales. We used an ordinary least-squares (OLS) linear-regression model with a quadratic error function to estimate the number of bird species that occur in each grid. As a baseline, we used a simple dummy model that estimated the number of species in each grid to be the average number of species over all grids. We found that the best estimator for avian species richness in Finland is the length of the growing season with R² values from 0.5 to 0.8, depending on the scale. Our results support the energy-water hypothesis, and we suggest that the proximate control of species richness in the present case is productivity, which is in turn controlled by climate. Some of the effects conventionally attributed to scaling may have trivial causes associated with sampling, in particular the completion of missing data as primary units of o...

6 citations
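
The comparison in the abstract above, an OLS fit against a mean-only baseline, can be sketched with a single predictor as follows. The data points are made up for illustration and are not values from the study; the function names are likewise assumptions.

```python
def fit_ols(x, y):
    """Least-squares slope and intercept for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


def r_squared(y, y_pred):
    """1 - SS_res / SS_tot; the mean-only baseline scores exactly 0."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Made-up grid cells: growing-season length (days) and species count.
season_days = [110, 125, 140, 155, 170, 185]
species = [62, 75, 71, 96, 104, 118]

slope, intercept = fit_ols(season_days, species)
ols_pred = [slope * d + intercept for d in season_days]
baseline_pred = [sum(species) / len(species)] * len(species)

print("OLS R^2:     ", round(r_squared(species, ols_pred), 3))
print("baseline R^2:", round(r_squared(species, baseline_pred), 3))
```

Since R² measures the reduction in squared error relative to predicting the mean, the dummy model described in the abstract is the zero point of the scale by construction.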