Proceedings Article

Research on the Techniques for Effectively Searching and Retrieving Information from Internet

Hongqi Li, +2 more
- pp 99-102
TLDR
This paper presents a review of some interesting, efficient yet implementable techniques from the field of Web content mining and studies their impact on business user needs, focusing on both the customer and the producer.
Abstract
With the rapid growth in business size, today's businesses are orienting towards electronic technologies. Unfortunately, the Web is enormous and its data is hugely unstructured. Extracting valuable information from such ever-increasing data is an extremely tedious task, yet it bears on the success of businesses. Web content mining can play a major role in solving these issues, and its application can be very encouraging in some areas. In this paper we present a review of some very interesting, efficient yet implementable techniques from the field of Web content mining and study their impact in the area specific to business user needs, focusing on both the customer and the producer. These techniques have been analyzed and compared on the basis of their execution time and the relevance of the results they produced for a particular search.


Citations

Content Based Ranking for Search Engines

TL;DR: A novel approach using a weighted technique is introduced to mine web content according to user needs; the results show that the proposed approach achieves higher precision, recall and F-measure than other search engine results.
Journal Article

Signed Approach for Mining Web Content Outliers

TL;DR: This paper focuses on a signed approach with full-word matching against an organized domain dictionary for mining web content outliers; it returns both the relevant web documents and the outlying ones.
Journal Article

A Mathematical Approach for Mining Web Content Outliers using Term Frequency Ranking

TL;DR: Search engine result sets contain irrelevant and redundant data, called outliers; the experimental results show that the proposed method improves the precision, recall, F-score and accuracy of the search engine.

Relevance Ranking and Evaluation of Search Results through Web Content Mining

TL;DR: A correlation algorithm for web content mining that detects redundant documents, ranks more than 90% of the relevant documents accurately, and improves the quality of search results.
References
Monograph

Web Mining: Applications and Techniques

Anthony Scime
TL;DR: This book provides a record of current research and practical applications in Web searching that includes techniques that will improve the utilization of the Web by the design of Websites, as well as the design and application of search agents.
Book Chapter

MARS: Multiplicative Adaptive Refinement Web Search

TL;DR: This chapter reports on the project MARS (Multiplicative Adaptive Refinement Search), which applies a new multiplicative adaptive algorithm for user preference retrieval to Web searches and has provably better performance than the popular Rocchio's similarity-based relevance feedback algorithm.
Book Chapter

Using Context Information to Build a Topic-Specific Crawling System

TL;DR: This chapter proposes an effective approach based on the relevancy context graph to solve the problem of assigning a proper order to unvisited Web pages and shows that this method outperforms both breadth-first crawling and the method using only the context graph.
Book Chapter

Metadata Management: A Requirement for Web Warehousing and Knowledge Management

TL;DR: This chapter introduces the need for the World Wide Web to provide a standard mechanism so that individuals can readily obtain data, reports, research and knowledge about any topic posted to it, along with approaches for providing an integrated interchange of quality metadata.
Book Chapter

Ontology Learning from a Domain Web Corpus

TL;DR: This chapter provides an approach to the problem, defining a method and a tool, OntoLearn, aimed at the extraction of knowledge from Websites, and more generally from documents shared among the members of virtual organizations, to support the construction of a domain ontology.