Showing papers by "Kai Puolamäki" published in 2006


Proceedings Article
20 Aug 2006
TL;DR: This work considers bucket orders, i.e., total orders with ties, which can be used to capture the essential order information without overfitting the data and describes simple and efficient algorithms for finding good bucket orders.
Abstract: Ordering and ranking items of different types are important tasks in various applications, such as query processing and scientific data mining. A total order for the items can be misleading, since there are groups of items that have practically equal ranks. We consider bucket orders, i.e., total orders with ties. They can be used to capture the essential order information without overfitting the data: they form a useful concept class between total orders and arbitrary partial orders. We address the question of finding a bucket order for a set of items, given pairwise precedence information between the items. We also discuss methods for computing the pairwise precedence data. We describe simple and efficient algorithms for finding good bucket orders. Several of the algorithms have a provable approximation guarantee, and they scale well to large datasets. We provide experimental results on artificial and real data that show the usefulness of bucket orders and demonstrate the accuracy and efficiency of the algorithms.

64 citations
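
The bucket-order abstract above can be illustrated with a small sketch. It assumes a pivot-style procedure: pick a random pivot, place items that clearly precede it to the left, items that clearly follow it to the right, and items that are roughly tied with it in the pivot's own bucket, then recurse. The function names, the tie threshold `beta`, and the rule for building the precedence matrix from rankings are all assumptions made for illustration; the abstract does not specify the paper's algorithms, so this should not be read as the authors' method.

```python
import random


def precedence_from_rankings(rankings):
    """Build a pairwise precedence matrix from rankings (hypothetical helper).

    rankings: list of dicts mapping item -> rank (smaller = earlier, equal = tie).
    Returns (items, prec) where prec[a][b] is the fraction of rankings that
    place a before b, with ties counted as 0.5.
    """
    items = sorted(rankings[0])
    prec = {a: {b: 0.5 for b in items} for a in items}
    for a in items:
        for b in items:
            if a == b:
                continue
            score = 0.0
            for r in rankings:
                if r[a] < r[b]:
                    score += 1.0
                elif r[a] == r[b]:
                    score += 0.5
            prec[a][b] = score / len(rankings)
    return items, prec


def bucket_pivot(items, prec, beta=0.25, rng=None):
    """Return a list of buckets (lists of items), earliest bucket first.

    beta is an assumed tie threshold: an item joins the pivot's bucket when
    its precedence relative to the pivot is within beta of 0.5.
    """
    if rng is None:
        rng = random.Random(0)
    if not items:
        return []
    pivot = rng.choice(items)
    left, same, right = [], [pivot], []
    for x in items:
        if x == pivot:
            continue
        p = prec[x][pivot]  # how often x precedes the pivot
        if p > 0.5 + beta:
            left.append(x)
        elif p < 0.5 - beta:
            right.append(x)
        else:
            same.append(x)
    return (bucket_pivot(left, prec, beta, rng)
            + [same]
            + bucket_pivot(right, prec, beta, rng))


if __name__ == "__main__":
    # Toy input: two rankings that disagree only on the order of a and b.
    toy = [
        {"a": 1, "b": 2, "c": 3, "d": 4},
        {"a": 2, "b": 1, "c": 3, "d": 4},
    ]
    items, prec = precedence_from_rankings(toy)
    print(bucket_pivot(items, prec))
```

On the toy input, `a` and `b` precede each other equally often, so they end up in one bucket, while `c` and `d` each get their own. The order of items inside a bucket depends on which pivot is drawn and carries no meaning.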


Journal Article
TL;DR: A full probabilistic model for fossil data that can be used to answer many different questions about the data, including seriation (finding the best ordering of the sites) and outlier detection is described.
Abstract: Given a collection of fossil sites with data about the taxa that occur in each site, the task in biochronology is to find good estimates for the ages or ordering of sites. We describe a full probabilistic model for fossil data. The parameters of the model are natural: the ordering of the sites, the origination and extinction times for each taxon, and the probabilities of different types of errors. We show that the posterior distributions of these parameters can be estimated reliably by using Markov chain Monte Carlo techniques. The posterior distributions of the model parameters can be used to answer many different questions about the data, including seriation (finding the best ordering of the sites) and outlier detection. We demonstrate the usefulness of the model and estimation method on synthetic data and on real data on large late Cenozoic mammals. As an example, for the sites with a large number of occurrences of common genera, our methods give orderings whose correlation with geochronologic ages is 0.95.

60 citations
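
To make the parameters named in the fossil-data abstract above concrete, here is a minimal sketch of how a site ordering, per-taxon origination and extinction positions, and error probabilities could combine into a log-likelihood. It is a simplification under stated assumptions, not the paper's model: each taxon is assumed extant over a contiguous range of positions in the ordering, a missed observation of an extant taxon has probability `false_neg`, and a spurious observation of a non-extant taxon has probability `false_pos`. All names, default values, and the toy data are hypothetical.

```python
import math


def log_likelihood(occ, order, alive, false_pos=0.01, false_neg=0.3):
    """Log-likelihood of an occurrence table under an assumed error model.

    occ[site][taxon] is 1 if the taxon was observed at the site, else 0.
    order lists the sites from oldest to youngest.
    alive[taxon] = (first, last): inclusive positions in `order` between
    which the taxon is assumed to exist (origination to extinction).
    """
    ll = 0.0
    for pos, site in enumerate(order):
        for taxon, (first, last) in alive.items():
            extant = first <= pos <= last
            # Probability of recording the taxon at this site.
            p_present = (1.0 - false_neg) if extant else false_pos
            seen = occ[site][taxon]
            ll += math.log(p_present if seen else 1.0 - p_present)
    return ll


if __name__ == "__main__":
    # Toy data (invented for illustration): t1 dies out as t2 appears.
    occ = {
        "s1": {"t1": 1, "t2": 0},
        "s2": {"t1": 1, "t2": 1},
        "s3": {"t1": 0, "t2": 1},
    }
    alive = {"t1": (0, 1), "t2": (1, 2)}
    print(log_likelihood(occ, ["s1", "s2", "s3"], alive))
    print(log_likelihood(occ, ["s3", "s2", "s1"], alive))
```

With the lifespans held fixed, the ordering `s1, s2, s3` explains every observation without invoking an error, so it scores higher than the reversed ordering. A Markov chain Monte Carlo sampler of the kind the abstract mentions would explore orderings, lifespans, and error rates jointly instead of comparing two fixed candidates.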


01 Jan 2006
TL;DR: Eyetools’ pioneering work in inferring mental state from eye movements and visualizing eyetracking data has led to several key patents in the area, and has enabled eyetracking to be put into use more easily by an ever expanding number of companies and people.
Abstract: Eyetools was born in 2000 out of the Stanford University Advanced Eye Interpretation Project. After seeing the business value of eyetracking resulting from the Stanford-Poynter Project, a collaborative study between the Poynter Institute and Stanford University’s Department of Communications around the viewing of online news sites, founder Greg Edwards spun out Eyetools. Since then, Eyetools’ pioneering work in inferring mental state from eye movements and visualizing eyetracking data has led to several key patents in the area, and has enabled eyetracking to be put into use more easily by an ever expanding number of companies and people. Eyetools’ roots in Human-Computer Interaction began in 1995
Note: PASCAL Invited Talk

5 citations