Proceedings Article

COS: A Frame Work for Clustered off-line Search

06 Aug 2008, pp. 569-573
TL;DR: Proposes COS (clustered off-line search), a client-server framework built around a search engine that indexes the WWW. The system takes a master query, identifies clusters in the search results, and packs them as a downloadable file, so that users can access them offline and perform subsequent searches at the client's end.
Abstract: It is common for Web searchers to have difficulty crafting queries that fulfill their information needs. Even when they provide a good query, users often find it challenging to evaluate the results of their Web searches. Sources of these problems include the lack of support for query refinement. This lack of support also increases the overhead on the search engine server, because the same user contacts the server multiple times, refining the same query to get the expected Web search results. To address these issues, we propose a novel framework, COS (clustered off-line search), a client-server architecture built around a search engine that indexes the WWW. COS takes a master query and identifies clusters in the search results. The user selects a cluster, and the system packs the web pages behind that cluster's result URLs, together with their index data, into a downloadable file. Users can then access the results offline: subsequent searches are performed at the client's end, eliminating the need to contact the server again and again.
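The pack-and-search-offline flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the file layout (`pages.json`, `index.json`), the function names `build_pack` and `search_offline`, the boolean-AND matching, and the example URLs are all assumptions made here, and the clustering step that chooses which pages go into a pack is not shown.

```python
import io
import json
import re
import zipfile
from collections import defaultdict


def tokenize(text):
    """Lowercase a string and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_pack(cluster_pages):
    """Server side: pack one cluster's pages plus an inverted index into zip bytes.

    cluster_pages maps URL -> page text (hypothetical input; how the cluster
    is identified is outside this sketch).
    """
    index = defaultdict(list)
    for url in sorted(cluster_pages):
        for term in sorted(set(tokenize(cluster_pages[url]))):
            index[term].append(url)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("pages.json", json.dumps(cluster_pages))
        archive.writestr("index.json", json.dumps(index))
    return buffer.getvalue()


def search_offline(pack_bytes, query):
    """Client side: answer a query from the downloaded pack, with no server call."""
    with zipfile.ZipFile(io.BytesIO(pack_bytes)) as archive:
        index = json.loads(archive.read("index.json"))
    terms = tokenize(query)
    if not terms:
        return []
    # Keep only URLs whose page contains every query term (boolean AND).
    hits = set(index.get(terms[0], []))
    for term in terms[1:]:
        hits &= set(index.get(term, []))
    return sorted(hits)


# Hypothetical cluster of two result pages for the master query "jaguar".
pages = {
    "http://example.org/a": "Jaguar cars and engines",
    "http://example.org/b": "Jaguar habitat in the rainforest",
}
pack = build_pack(pages)
print(search_offline(pack, "jaguar habitat"))  # ['http://example.org/b']
```

Once the pack is downloaded, every refinement of the master query is answered from the local index, which is the server-load saving the abstract argues for.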
Citations
01 Jan 2010
TL;DR: This paper presents a systematic analysis of a variety of different ad hoc network topologies in terms of node placement, node mobility and routing protocols through several simulated scenarios.
Abstract: In this paper we examine the behavior of ad hoc networks through simulations, using different routing protocols and various topologies. Using a CBR application with packets of different sizes, we compare performance across a variety of topologies and show the impact that node placement has on network performance. We show that the choice of routing protocol plays an important role in network performance. We also quantify the effects of node mobility by looking at both static and fully mobile configurations. Our paper presents a systematic analysis of a variety of ad hoc network topologies in terms of node placement, node mobility and routing protocols through several simulated scenarios.

58 citations

01 Jan 2010
TL;DR: The bundle of stored results holds the best results that the user chooses to save. This enhances the system's offline search capability, which in turn reduces the burden on the search engine server.
Abstract: Information retrieval systems (e.g., web search engines) are critical for overcoming information overload. A major deficiency of existing retrieval systems is that they generally lack user modeling and are not adaptive to individual users, resulting in inherently non-optimal retrieval performance [1]. Sources of these problems include the lack of support for query refinement. Web search engines typically provide search results without considering user interests or context. This in turn increases the overhead on the search engine server. To address these issues, we propose a novel interactive guided online/offline search mechanism. The system lets the user choose between normal and combinational search [5] of the query string, and lets the user store the best search results for that query string. The proposed system also provides an option for offline search, which searches the bundle of stored results. Existing systems that implement offline search require the user to download and install the stored bundle of search results before using it. The proposed system is an interactive web-based search facility that works both offline and online. It does not require installing the bundle of saved search results for offline searching, because search results are added to the bundle interactively as the user chooses them. The system is very likely to return the best possible result because it uses combinational search. The results of a combinational search can be stored and searched again offline. Experiments revealed that combinational search over the keywords in a query yields a variety of results. Thus the bundle of stored results consists of the best possible results that the user chooses to save in it. This will enhance the system's offline search capability, which in turn reduces the burden on the search engine server.

7 citations


Additional excerpts

  • ...The main difference between the tools and systems mentioned above and the system proposed in this paper is that there is no need for the user to download the bundle of best search results for offline search just as implemented in COS [14]....

  • ...Some systems like COS [14] have the limitation of downloading and installing the COS pack [14] offline and it has no facility of Combination Search, thus again resulting in inconvenience to the user....

References
Journal Article
TL;DR: A new method for automatic indexing and retrieval to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries.
Abstract: A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term by document matrix is decomposed into a set of ca. 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca. 100 item vectors of factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are returned. Initial tests find this completely automatic method for retrieval to be promising.

12,443 citations
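
The retrieval scheme summarized in the abstract above can be written out compactly. The notation is an assumption chosen here for illustration (X for the term-by-document matrix, k for the number of retained factors, q for the raw query term vector, τ for the cosine threshold); scaling conventions for the document vectors vary between presentations.

```latex
% Rank-k approximation of the t x d term-by-document matrix X (k is about 100 in the abstract)
X \approx X_k = U_k \Sigma_k V_k^{\top}

% Document j is represented by the j-th row of V_k, a vector of k factor weights
\hat{d}_j = (V_k)_{j,:}

% A query term vector q is folded in as a pseudo-document
\hat{q} = \Sigma_k^{-1} U_k^{\top} q

% Documents whose cosine with the query exceeds the threshold \tau are returned
\cos(\hat{q}, \hat{d}_j) = \frac{\hat{q} \cdot \hat{d}_j}{\lVert \hat{q} \rVert \, \lVert \hat{d}_j \rVert} \ge \tau
```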

Journal Article

215 citations

Journal Article
TL;DR: This paper presents a taxonomy of approaches to resource discovery, and uses this taxonomy to compare a number of resource discovery systems, and examine several gateways between existing systems.
Abstract: In the past several years, the number and variety of resources available on the Internet have increased dramatically. With this increase, many new systems have been developed that allow users to search for and access these resources. As these systems begin to interconnect with one another through "information gateways", the conceptual relationships between the systems come into question. Understanding these relationships is important, because they address the degree to which the systems can be made to interoperate seamlessly, without the need for users to learn the details of each system. In this paper we present a taxonomy of approaches to resource discovery. The taxonomy provides insights into the interrelated problems of organizing, browsing, and searching for information. Using this taxonomy, we compare a number of resource discovery systems, and examine several gateways between existing systems.

140 citations


"COS: A Frame Work for Clustered off..." refers background in this paper

  • ...Even when they provide a good query, users often find it challenging to evaluate the results of their web searches....

Book Chapter
03 Nov 2006
TL;DR: A new key-feature clustering (KFC) algorithm is proposed which first extracts the significant keywords from the results as key features and clusters them, then clusters the documents based on these clustered key features.
Abstract: With the increasing number of Web documents on the Internet, the most popular keyword-matching-based search engines, such as Google, often return a long list of search results ranked by their relevance and importance to the query. Clustering the search engine results groups them into several collections, which makes it easier for users to locate the results they really need. In this paper, we propose a new Key-Feature Clustering (KFC) algorithm which first extracts the significant keywords from the results as key features and clusters them, then clusters the documents based on these clustered key features. Finally, the paper presents and analyzes the results of experiments we conducted to test and validate the algorithm.

15 citations

Journal Article
TL;DR: The results show that KDT and the LSI method can successfully be applied for clustering the very volatile and unstructured textual communication on the Internet.
Abstract: Automatic knowledge discovery from texts (KDT) is proving to be a promising method for businesses today to deal with the overload of textual information. In this paper, we first explore the possibilities for KDT to enhance communication in virtual communities, and then we present a practical case study with real-life Internet data. The problem in the case study is to manage the very successful virtual communities known as 'clubs' of the largest Dutch Internet Service Provider. It is possible for anyone to start a club about any subject, resulting in over 10,000 active clubs today. At the beginning, the founder assigns the club to a predefined category. This often results in illogical or inconsistent placements, which means that interesting clubs may be hard to locate for potential new members. The ISP therefore is looking for an automated way to categorize clubs in a logical and consistent manner. The method used is the so-called bag-of-words approach, previously applied mostly to scientific texts and structured documents. Each club is described by a vector of word occurrences of all communications within that club. Latent Semantic Indexing (LSI) is applied to reduce the dimensionality problem prior to clustering. Clustering is done by the Within Groups Clustering method using a cosine distance measure appropriate for texts. The results show that KDT and the LSI method can successfully be applied for clustering the very volatile and unstructured textual communication on the Internet.

12 citations
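
The bag-of-words representation and cosine distance mentioned in the abstract above can be illustrated with a short sketch. It is only a partial illustration under stated assumptions: the LSI dimensionality-reduction step and the Within Groups Clustering step are omitted, and the function names and the sample club messages are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(messages):
    """Count word occurrences over all messages of one club."""
    counts = Counter()
    for message in messages:
        counts.update(re.findall(r"[a-z]+", message.lower()))
    return counts


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat a club with no words as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical clubs, each described by the text of its communications.
clubs = {
    "football": ["great match last night", "the match was great"],
    "soccer": ["what a match", "great goal in the match"],
    "chess": ["opening theory", "endgame study"],
}
vectors = {name: bag_of_words(msgs) for name, msgs in clubs.items()}
print(cosine_distance(vectors["football"], vectors["soccer"]))
print(cosine_distance(vectors["football"], vectors["chess"]))
```

Clubs that share vocabulary end up closer together than clubs with disjoint vocabulary, which is the property a clustering method can exploit to regroup clubs regardless of the category their founder picked.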