Research on Web Document Summarization

doi:10.1109/ITAPP.2010.5566074

Proceedings Article

- pp 1-4

TLDR

Presents a sentence-extraction algorithm for WDS in which the structure weight considers both Web formats and hyperlink attributes, and the weight proportion of words and structures is learned by a machine learning approach.

Abstract:

Web document summarization (WDS) is becoming one of the hot subjects in the text summarization field due to the rapidly increasing number of documents on the Web. WDS differs from traditional text summarization because it must process hyperlinked texts. This paper first analyses the features of Web documents, then gives a definition of WDS, and finally presents an algorithm for WDS based on sentence extraction. Each sentence's weight is a weighted sum of its words' weight and its sentence-structure weight. The former is adjusted by a document class graph, and the latter considers both Web formats and hyperlink attributes. The weight proportion of words and structures is learned by a machine learning approach. Experiments on 2,000 Web documents show that the algorithm is feasible.
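
The sentence-weighting scheme described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Sentence`, `sentence_weight`, `extract_summary`, the mixing parameter `alpha`, and the top-k selection step are all hypothetical. The sketch assumes the two component weights are already computed and that the "weight proportion" is a single value in [0, 1]. The paper learns that proportion by machine learning and derives the component weights from a document class graph and from Web format and hyperlink attributes; none of that is reproduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sentence:
    text: str
    word_weight: float       # assumed precomputed (e.g. from term statistics)
    structure_weight: float  # assumed precomputed (e.g. from markup and links)


def sentence_weight(s: Sentence, alpha: float) -> float:
    """Weighted sum of word weight and structure weight.

    alpha is a hypothetical proportion in [0, 1]; it is supplied by the
    caller here, whereas the paper learns it from data.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * s.word_weight + (1.0 - alpha) * s.structure_weight


def extract_summary(sentences: List[Sentence], alpha: float, k: int) -> List[str]:
    """Pick the k highest-weighted sentences, returned in document order."""
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: sentence_weight(sentences[i], alpha),
        reverse=True,
    )
    chosen = sorted(ranked[:k])  # restore original order for readability
    return [sentences[i].text for i in chosen]
```

For example, calling `extract_summary(doc, alpha=0.75, k=2)` on a list of `Sentence` objects returns the two sentences with the largest combined weight, in the order they appear in the document.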

Citations

Journal Article

Automatic Text Summarization for Web Pages on Internet

Chen Jia-jun

- 01 Jan 2006 -

Computer Engineering

TL;DR: This paper discusses the new demands on automatic summarization of Internet text and some related information, and closes with conclusions and prospects for research on automatic text summarization on the Internet.

Journal Article

A Framework for Collaborative Document Classification with GA-SVM

S. Chakraverty, +2 more

- 30 Dec 2016 -

International journal of scientific rese...

TL;DR: A Collaborative Document Classification (CDC) system is developed that adapts, according to a given corpus, the weighted contributions of statistical features, an array of lexical-semantic features derived from the WordNet ontology, and categorical-semantic features obtained from the hierarchical organization of Wikipedia category pages.

Journal Article

Cognos Clustering in IBM Connections Metrics

Jian Xia Du

- 01 Oct 2013 -

Applied Mechanics and Materials

TL;DR: Cognos clustering greatly enhances the load capacity of the report server, improves its performance, effectiveness, and capacity, makes the server more stable, and supports the required number of concurrent users.

Journal Article

Classify the Search Result Based on IBM OmniFind Edition and UIMA

Jian Xia Du, +2 more

- 01 Oct 2013 -

Advanced Materials Research

TL;DR: A method is proposed that combines IBM OmniFind Enterprise Edition with the open-source Unstructured Information Management Architecture (UIMA) to realize semantic search and search result classification.

References

Book Chapter

Automatic Text Summarization with Genetic Algorithm-Based Attribute Selection

Carlos N. Silla, +3 more

TL;DR: The goal of the paper is to investigate the effectiveness of Genetic Algorithm (GA)-based attribute selection in improving the performance of classification algorithms solving the automatic text summarization task.

Proceedings Article

An approach to sentence-selection-based text summarization

Fang Chen, +2 more

TL;DR: This paper introduces a newly developed text summarization system that supports both Chinese and English, and describes two new techniques for processing the topic-sensitive word feature and the sentence length feature.

Journal Article

FIDS: an intelligent financial Web news articles digest system

Wai Lam, +1 more

TL;DR: A system called FIDS (Financial Information Digest System), which can digest online financial news automatically and allows one to perform cross-validation on their contents, so users can have access to more complete information which otherwise would be scattered in different articles.

Journal Article

Automatic Text Summarization for Web Pages on Internet

Chen Jia-jun

- 01 Jan 2006 -

Computer Engineering

TL;DR: This paper discusses the new demands on automatic summarization of Internet text and some related information, and closes with conclusions and prospects for research on automatic text summarization on the Internet.
