
Showing papers by "Helsinki Institute for Information Technology" published in 1996


Proceedings Article
01 Feb 1996

2,649 citations


Proceedings Article
03 Sep 1996
TL;DR: New algorithms that considerably reduce database activity by picking a random sample, using it to find all association rules that probably hold in the whole database, and then verifying the results against the rest of the database.
Abstract: Discovery of association rules is an important database mining problem. Current algorithms for finding association rules require several passes over the analyzed database, and obviously the role of I/O overhead is very significant for very large databases. We present new algorithms that reduce the database activity considerably. The idea is to pick a random sample, to find using this sample all association rules that probably hold in the whole database, and then to verify the results with the rest of the database. The algorithms thus produce exact association rules, not approximations based on a sample. The approach is, however, probabilistic, and in those rare cases where our sampling method does not produce all association rules, the missing rules can be found in a second pass. Our experiments show that the proposed algorithms can find association rules very efficiently in only one database

1,245 citations
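
The sample-then-verify idea in the abstract above can be illustrated with a minimal Python sketch. This is not the paper's algorithm: it works on frequent itemsets (the usual first step before deriving rules), uses brute-force counting, and verifies against the whole database instead of only the remainder. The names `frequent_itemsets`, `sample_then_verify`, `lowered_support`, and `max_size` are hypothetical, and the step that detects and recovers missed rules in a second pass is left out.

```python
import random
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Brute force: map each itemset (up to max_size items) whose
    relative frequency is at least min_support to that frequency."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    for k in range(1, max_size + 1):
        for cand in combinations(items, k):
            itemset = frozenset(cand)
            count = sum(1 for t in transactions if itemset <= t)
            if n and count / n >= min_support:
                result[itemset] = count / n
    return result


def sample_then_verify(transactions, min_support, sample_size,
                       lowered_support, seed=0, max_size=3):
    """Mine a random sample with a lowered threshold (assumption: this
    makes misses less likely), then count only those candidates in one
    pass over the full data and keep the ones that really are frequent."""
    rng = random.Random(seed)
    sample = rng.sample(transactions, sample_size)
    candidates = frequent_itemsets(sample, lowered_support, max_size)

    n = len(transactions)
    verified = {}
    for itemset in candidates:
        support = sum(1 for t in transactions if itemset <= t) / n
        if support >= min_support:
            verified[itemset] = support
    return verified


if __name__ == "__main__":
    db = [frozenset(t) for t in
          (["a", "b"], ["a", "b", "c"], ["a", "c"], ["b"], ["a", "b"])]
    print(sample_then_verify(db, min_support=0.6, sample_size=3,
                             lowered_support=0.3))
```

Every itemset the sketch returns has an exact support computed on the full data, which mirrors the abstract's point that the output is exact and not an approximation. What the sketch cannot guarantee is completeness, since an itemset that is absent from the sample's candidates is never counted.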


01 Jan 1996
TL;DR: This article presents a method, WEBSOM, for automatic organization of full-text document collections using the self-organizing map (SOM) algorithm, along with a case study of its use.
Abstract: Powerful methods for interactive exploration and search from collections of free-form textual documents are needed to manage the ever-increasing flood of digital information. In this article we present a method, WEBSOM, for automatic organization of full-text document collections using the self-organizing map (SOM) algorithm. The document collection is ordered onto a map in an unsupervised manner utilizing statistical information of short word contexts. The resulting ordered map where similar documents lie near each other thus presents a general view of the document space. With the aid of a suitable (WWW-based) interface, documents in interesting areas of the map can be browsed. The browsing can also be interactively extended to related topics, which appear in nearby areas on the map. Along with the method we present a case study of its use.

178 citations
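
As a rough illustration of how a self-organizing map orders documents so that similar ones land near each other, here is a minimal pure-Python SOM sketch. It is not the WEBSOM implementation: the abstract encodes documents using statistics of short word contexts, whereas this sketch assumes each document is already a small numeric vector. The function names, the grid size, and the linear decay schedules for the learning rate and neighborhood radius are all assumptions chosen for brevity.

```python
import math
import random


def best_matching_unit(weights, vector):
    """Return the grid position whose weight vector is closest to `vector`."""
    return min(
        weights,
        key=lambda node: sum((w - x) ** 2 for w, x in zip(weights[node], vector)),
    )


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, radius0=None, seed=0):
    """Train a rows x cols map; returns {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    radius0 = radius0 or max(rows, cols) / 2
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                    # learning rate decays to 0
        radius = max(radius0 * (1 - frac), 0.5)  # neighborhood shrinks
        for v in rng.sample(vectors, len(vectors)):  # shuffled pass
            bmu = best_matching_unit(weights, v)
            for node, w in weights.items():
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2 * radius ** 2))  # Gaussian neighborhood
                for i in range(dim):
                    w[i] += lr * h * (v[i] - w[i])
    return weights


if __name__ == "__main__":
    # Toy "documents": hypothetical 3-term frequency vectors.
    docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0], [0.0, 0.1, 0.9]]
    som = train_som(docs, rows=3, cols=3)
    for d in docs:
        print(d, "->", best_matching_unit(som, d))
```

Browsing "nearby areas on the map", as the abstract puts it, would then correspond to looking at documents whose best-matching units are adjacent grid positions.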