Showing papers presented at "ACM international conference on Digital libraries in 2008"


Proceedings Article
01 Jan 2008
TL;DR: This paper proposes a method for determining whether data found on the Web describe the same object or different objects, taking into account that attribute values may change over time.
Abstract: We have developed a method for determining whether data found on the Web are for the same or different objects that takes into account the possibility of changes in their attribute values over time. Specifically, we estimate the probability that observed data were generated for the same object that has undergone changes in its attribute values over time and the probability that the data are for different objects, and we define similarities between observed data using these probabilities. By giving a specific form to the distributions of time-varying attributes, we can calculate the similarity between given data and identify objects by using agglomerative clustering on the basis of the similarity. Experiments in which we compared identification accuracies between our proposed method and a method that regards all attribute values as constant showed that the proposed method improves the precision and recall of object identification.
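The abstract does not give the form of the distributions, so the following is only a minimal sketch of the general idea under assumptions of our own: each record is a hypothetical `(timestamp, value)` pair with a single numeric attribute, the "same object" case is modelled as a Gaussian random walk whose variance grows with the time gap, and the "different objects" case as two independent draws from a broad Gaussian population. The parameters `drift`, `noise`, and `spread`, the log-likelihood-ratio similarity, and the average-linkage stopping rule are illustrative choices, not the authors' formulation.

```python
import math
from itertools import combinations

def log_gaussian(x, sigma):
    """Log density of a zero-mean Gaussian with standard deviation sigma."""
    return -0.5 * (x / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

def similarity(a, b, drift=1.0, noise=0.5, spread=50.0):
    """Log-odds that records a and b describe the same object.

    Each record is (timestamp, value). All parameters are assumed values.
    """
    dt = abs(a[0] - b[0])
    dv = a[1] - b[1]
    # Same object: the value may drift, so the tolerated gap widens with dt.
    sigma_same = math.sqrt(noise ** 2 + drift ** 2 * dt)
    # Different objects: the difference of two independent population draws.
    sigma_diff = spread * math.sqrt(2)
    return log_gaussian(dv, sigma_same) - log_gaussian(dv, sigma_diff)

def cluster(records, threshold=0.0):
    """Average-linkage agglomerative clustering on the pairwise similarity.

    Merging stops once the best pair of clusters falls below the threshold.
    """
    clusters = [[i] for i in range(len(records))]
    while len(clusters) > 1:
        best, pair = -math.inf, None
        for i, j in combinations(range(len(clusters)), 2):
            sims = [similarity(records[p], records[q])
                    for p in clusters[i] for q in clusters[j]]
            avg = sum(sims) / len(sims)
            if avg > best:
                best, pair = avg, (i, j)
        if best < threshold:
            break
        i, j = pair
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters

# Toy data: one attribute drifting from 30 to 32, another from 60 to 62.
records = [(0, 30.0), (1, 31.0), (2, 32.0), (0, 60.0), (2, 62.0)]
groups = cluster(records)
print(groups)  # [[0, 1, 2], [3, 4]]
```

Setting `drift=0.0` in `similarity` corresponds roughly to treating the attribute as constant, which is the kind of baseline the abstract compares against.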

10 citations