
Showing papers by "Anthony Tomasic" published in 2003


Patent
12 Feb 2003
TL;DR: This patent presents a method for comparing the contents of a query document to content on the World Wide Web: the query document is indexed and compared against Web content that is continuously retrieved and indexed.
Abstract: Methods and related systems for indexing the contents of documents for comparison with the contents of other documents to identify matching content. A method for comparing the contents of a query document to the content on the World Wide Web is set forth. The contents of a query document are indexed and compared to content from the World Wide Web which is continuously retrieved and indexed. The method for indexing may comprise selecting substrings from the document, hashing the substrings to generate a plurality of hash values having a known range of values, selecting certain hash values to save from the generated hash values, and sorting the saved hash values. Methods for selecting certain hash values to save are set forth.
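
The indexing steps named in the abstract (select substrings, hash them into a known range, save only certain hash values, sort the saved values) can be illustrated with a minimal Python sketch. This is not the patented method: the abstract does not say how substrings are chosen or which hash values are kept, so the fixed-length sliding window, the SHA-256-based hash, the 32-bit range, and the "keep values divisible by `keep_modulus`" rule are all assumptions made for illustration. The function names are hypothetical.

```python
import hashlib

HASH_RANGE = 2**32  # assumed known range of hash values: [0, 2**32)


def substrings(text, length=20):
    """Yield every overlapping substring of `length` characters.

    Assumption: a sliding window over case-folded, whitespace-normalized text.
    """
    normalized = " ".join(text.lower().split())
    for i in range(max(len(normalized) - length + 1, 0)):
        yield normalized[i:i + length]


def hash_value(s):
    """Map a substring to an integer in [0, HASH_RANGE)."""
    digest = hashlib.sha256(s.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % HASH_RANGE


def index_document(text, length=20, keep_modulus=4):
    """Return the sorted list of saved hash values for a document.

    Assumed selection rule: keep a hash only if it is divisible by
    `keep_modulus`, so identical substrings are kept or dropped
    consistently in every document.
    """
    hashes = (hash_value(s) for s in substrings(text, length))
    saved = {h for h in hashes if h % keep_modulus == 0}
    return sorted(saved)


def shared_hashes(index_a, index_b):
    """Intersect two sorted indexes with a single merge-style pass."""
    i = j = 0
    common = []
    while i < len(index_a) and j < len(index_b):
        if index_a[i] == index_b[j]:
            common.append(index_a[i])
            i += 1
            j += 1
        elif index_a[i] < index_b[j]:
            i += 1
        else:
            j += 1
    return common
```

Under these assumptions, a query document would be indexed once with `index_document`, and its index compared against the index of each retrieved Web page with `shared_hashes`; a large overlap suggests matching content. Sorting the saved values is what makes the linear-time merge comparison possible.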

103 citations