Proceedings Article
Research on Web Document Summarization
Zengmin Geng, Jujian Zhang, Xuefei Li, Jianxia Du, Zhengdong Liu +4 more
- pp 1-4
TLDR
An algorithm for WDS based on sentence extraction is presented. It considers both Web formats and hyperlink attributes, and the weight proportion of words and structures is learned by a machine learning approach.
Abstract:
Web document summarization (WDS) is becoming one of the hot subjects in the text summarization field due to the rapidly increasing number of documents on the Web. WDS differs from traditional text summarization because it must process hyperlinked texts. This paper first analyses the features of Web documents, then gives a definition of WDS, and finally presents an algorithm for WDS based on sentence extraction. Each sentence's weight is a weighted sum of its words' weight and its sentence-structure weight. The former weight is adjusted by a document class graph, and the latter weight considers both the Web formats and hyperlink attributes. The weight proportion of words and structures is learned by a machine learning approach. Experiments on 2,000 Web documents show that the algorithm is feasible.
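The sentence-scoring idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract only says the sentence weight is a weighted sum of a word weight and a structure weight, so the convex form `alpha * word + (1 - alpha) * structure`, the names `Sentence`, `sentence_score`, and `summarize`, and the fixed `alpha` (which the paper learns by machine learning) are all assumptions made for illustration. The two component weights are taken as precomputed inputs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sentence:
    text: str
    word_weight: float       # assumed precomputed from the sentence's words
    structure_weight: float  # assumed precomputed from Web format and hyperlink cues


def sentence_score(s: Sentence, alpha: float) -> float:
    """Weighted sum of the two weights; alpha is a hypothetical proportion."""
    return alpha * s.word_weight + (1.0 - alpha) * s.structure_weight


def summarize(sentences: List[Sentence], alpha: float = 0.5, k: int = 2) -> List[str]:
    """Extract the k highest-scoring sentences, kept in document order."""
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: sentence_score(sentences[i], alpha),
        reverse=True,
    )
    chosen = sorted(ranked[:k])  # restore original order for readability
    return [sentences[i].text for i in chosen]


if __name__ == "__main__":
    doc = [
        Sentence("A", 0.9, 0.1),
        Sentence("B", 0.2, 0.2),
        Sentence("C", 0.4, 0.8),
    ]
    print(summarize(doc, alpha=0.5, k=2))
```

Changing `alpha` shifts the balance between word evidence and structural evidence, which is the proportion the abstract says is learned rather than set by hand.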
Citations
Journal Article
Automatic Text Summarization for Web Pages on Internet
TL;DR: This paper discusses the new demands on automatic summarization of text on the Internet and some related information, and draws conclusions and prospects for research on automatic text summarization on the Internet.
Journal Article
A Framework for Collaborative Document Classification with GA-SVM
S. Chakraverty, U. Pandey, P. Dutt +2 more
TL;DR: A Collaborative Document Classification (CDC) system is developed that adapts, according to a given corpus, the weighted contributions of statistical features, an array of lexical-semantic features derived from the WordNet ontology, and categorical-semantic features obtained from the hierarchical organization of Wikipedia category pages.
Journal Article
Cognos Clustering in IBM Connections Metrics
TL;DR: Cognos clustering greatly enhances the load capacity of the report server, improves its performance and effectiveness, makes the server more stable, and supports a larger number of concurrent users.
Journal Article
Classify the Search Result Based on IBM OminiFind Edition and UIMA
TL;DR: A method is proposed that combines IBM OmniFind Enterprise Edition with IBM's open-source Unstructured Information Management Architecture (UIMA) to realize semantic search and classification of search results.
References
Book Chapter
Automatic Text Summarization with Genetic Algorithm-Based Attribute Selection
TL;DR: The goal of the paper is to investigate the effectiveness of Genetic Algorithm (GA)-based attribute selection in improving the performance of classification algorithms solving the automatic text summarization task.
Proceedings Article
An approach to sentence-selection-based text summarization
Fang Chen, Kesong Han, Guilin Chen +2 more
TL;DR: This paper introduces a newly developed text summarization system that supports both Chinese and English, and describes two new techniques for processing the topic-sensitive word feature and the sentence length feature.
Journal Article
FIDS: an intelligent financial Web news articles digest system
Wai Lam, Kei Shiu Ho +1 more
TL;DR: A system called FIDS (Financial Information Digest System), which can digest online financial news automatically and allows one to perform cross-validation on their contents, so users can have access to more complete information which otherwise would be scattered in different articles.