Comparing Twitter Summarization Algorithms
for Multiple Post Summaries
David Inouye* and Jugal K. Kalita+
*School of Electrical and Computer Engineering
Georgia Institute of Technology
Atlanta, Georgia 30332 USA
+Department of Computer Science
University of Colorado
Colorado Springs, CO 80918 USA
dinouye3@gatech.edu, jkalita@uccs.edu
Abstract—Due to the sheer volume of text generated by a
microblog site like Twitter, it is often difficult to fully understand
what is being said about various topics. In an attempt to
understand microblogs better, this paper compares algorithms for
extractive summarization of microblog posts. We present two al-
gorithms that produce summaries by selecting several posts from
a given set. We evaluate the generated summaries by comparing
them to both manually produced summaries and summaries
produced by several leading traditional summarization systems.
In order to shed light on the special nature of Twitter posts,
we include extensive analysis of our results, some of which are
unexpected.
I. INTRODUCTION
Twitter¹, the microblogging site started in 2006, has become a social phenomenon. In February 2011, Twitter had 200 million registered users². There were a total of 25 billion tweets in all of 2010³. While a majority of posts are conversational or not particularly meaningful, about 3.6% of the posts concern topics of mainstream news⁴.
To help people who read Twitter posts or tweets, Twitter
provides two interesting features: an API that allows users
to search for posts that contain a topic phrase and a short
list of popular topics called Trending Topics. A user can
perform a search for a topic and retrieve a list of the most
recent posts that contain the topic phrase. The difficulty in
interpreting the results is that the returned posts are only
sorted by recency, not relevancy. Therefore, the user is forced
to manually read through the posts in order to understand
what users are primarily saying about a particular topic. The
motivation of the summarizer is to automate this process.
In this paper, we discuss our ongoing effort to create automatic summaries of Twitter trending topics. In our recent prior work [1]–[4], we have discussed algorithms that can be used to pick the single post that is representative of, or is the summary of, a number of Twitter posts. Since the posts returned by the Twitter API for a specified topic likely represent several sub-topics or themes, it may be more appropriate to produce summaries that encompass the multiple themes rather than just having one post describe the whole topic. For this reason, this paper extends the work significantly to create summaries that contain multiple posts. We compare our multiple post summaries with ones produced by leading traditional summarizers.

¹ http://www.twitter.com
² http://www.bbc.co.uk/news/business-12889048
³ http://blog.twitter.com/2010/12/hindsight2010-top-trends-on-twitter.html
⁴ http://www.pearanalytics.com/blog/wp-content/uploads/2010/05/Twitter-Study-August-2009.pdf
II. RELATED WORK
Summarizing microblogs can be viewed as an instance of
the more general problem of automated text summarization,
which is the problem of automatically generating a condensed
version of the most important content from one or more
documents. A number of algorithms have been developed
for various aspects of document summarization during recent
years. Notable algorithms include SumBasic [5] and the cen-
troid algorithm [6]. SumBasic’s underlying premise is that
words that occur more frequently across documents have a
higher probability of being selected for human created multi-
document summaries than words that occur less frequently.
The centroid algorithm takes into consideration a centrality
measure of a sentence in relation to the overall topic of the
document cluster or in relation to a document in the case of
single document summarization. The LexRank algorithm [7]
for computing the relative importance of sentences or other
textual units in a document (or a set of documents) creates an
adjacency matrix among the textual units and then computes
the stationary distribution considering it to be a Markov chain.
The TextRank algorithm [8] is also a graph-based approach
that finds the most highly ranked sentences (or keywords) in
a document using the PageRank algorithm [9].
In most cases, text summarization is performed for the pur-
poses of saving users time by reducing the amount of content

to read. However, text summarization has also been performed
for purposes such as reducing the number of features required
for classifying (e.g. [10]) or clustering (e.g. [11]) documents.
Following a different approach, early work by Kalita et al.
generated textual summaries of database query results [12]–
[14]. Instead of presenting a table of data rows as the response
to a database query, they generated textual summaries from
predominant patterns found within the data table.
In the context of the Web, multi-document summarization
is useful in combining information from multiple sources.
Information may have to be extracted from many different
articles and pieced together to form a comprehensive and
coherent summary. One major difference between single docu-
ment summarization and multi-document summarization is the
potential redundancy that comes from using many source texts.
One solution may involve clustering the important sentences
picked out from the various source texts and using only
a representative sentence from each cluster. For example,
McKeown et al. first cluster the text units and then choose
the representative units from the clusters to include in the
final summary [15]. Dhillon models a document collection as
a bipartite graph consisting of words and documents and uses
a spectral co-clustering algorithm to obtain excellent results
[16].
Finally, in the context of multi-document summarization, it
is appropriate to mention MEAD [17], a publicly available, flexible platform for multi-document, multilingual summarization. MEAD implements multiple summarization algorithms
as well as provides metrics for evaluating multi-document
summaries.
III. PROBLEM DESCRIPTION
A Twitter post or tweet is at most 140 characters long
and in this study we only consider English posts. Because
a post is informal, it often has colloquial syntax, non-standard
orthography or non-standard spelling, and it frequently lacks
any punctuation.
The problem considered in this paper can be defined as
follows:
Given a topic keyword or phrase $T$ and the desired length $k$ for the summary, output a set of representative posts $S$ with a cardinality of $k$ such that 1) $\forall s \in S$, $T$ is in the text of $s$, and 2) $\forall s_i, s_j \in S$, $s_i \not\sim s_j$. Here $s_i \not\sim s_j$ means that the two posts provide sufficiently different information, in order to keep the summaries from being redundant.
IV. SELECTED APPROACHES FOR TWITTER SUMMARIES
Among the many algorithms we discuss in prior papers [1]–[4] for single-post summary creation for tweets, a Hybrid TF-IDF algorithm that we developed worked best. Thus, in this paper, we extend this algorithm to obtain multi-post summaries. The contributions of this paper are the introduction of a hybrid TF-IDF based algorithm and a clustering based algorithm for obtaining multi-post summaries of Twitter posts, along with a detailed analysis of the Twitter post domain for text processing, based on comparing these algorithms with several other summarization algorithms. We find some unexpected results when we apply multiple document summarization algorithms to short informal documents.
A. Hybrid TF-IDF with Similarity Threshold
Term Frequency Inverse Document Frequency (TF-IDF) is a statistical weighting technique that assigns each term within a document a weight that reflects the term's saliency within the document. The weight of a post is the summation of the individual term weights within the post. To determine the weight of a term, we use the formula:
$$\mathrm{TF\text{-}IDF} = tf_{ij} \times \log_2\!\left(\frac{N}{df_j}\right) \qquad (1)$$

where $tf_{ij}$ is the frequency of the term $T_j$ within the document $D_i$, $N$ is the total number of documents, and $df_j$ is the number of documents within the set that contain the term $T_j$. We assume that a term corresponds to a word and select the most weighted post as the summary.
The TF-IDF value is composed of two primary parts. The
term frequency component (TF) assigns more weight to words
that occur frequently within a document because important
words are often repeated. The inverse document frequency
component (IDF) compensates for the fact that some words
such as common stop words are frequent. Since these words
do not help discriminate between one sentence or document
over another, these words are penalized proportionally to their
inverse document frequency. The logarithm is taken to balance
the effect of the IDF component in the formula.
Equation (1) defines the weight of a term in the context of
a document. However, a microblog post is not a traditional
document. Therefore, one question we must first answer is
how we define a document. One option is to define a single
document that encompasses all the posts. In this case, the
TF component’s definition is straightforward since we can
compute the frequencies of the terms across all the posts.
However, doing so causes us to lose the IDF component
since we only have a single document. On the other extreme,
we could define each post as a document making the IDF
component’s definition clear. But, the TF component now has
a problem: because each post contains only a handful of words,
most term frequencies will be a small constant for a given post.
To handle this situation, we redefine TF-IDF in terms of
a hybrid document. We primarily define a document as a
single post. However, when computing the term frequencies,
we assume the document is the entire collection of posts.
Therefore, the TF component of the TF-IDF formula uses the
entire collection of posts while the IDF component treats each
post as a separate document. This way, we have differentiated
term frequencies but also do not lose the IDF component.
We next choose a normalization method since otherwise
the TF-IDF algorithm will always bias towards longer posts.
We normalize the weight of a post by dividing it by a
normalization factor. Since common stop words do not help
discriminate the saliency of sentences, we give stop words—
as defined by a prebuilt list—a weight of zero. Given this,

our definition of the TF-IDF summarization algorithm is now
complete for microblogs. We summarize this algorithm below
in Equations (2)-(6).
$$W(s) = \frac{\sum_{i=0}^{\#\text{WordsInPost}} W(w_i)}{nf(s)} \qquad (2)$$

$$W(w_i) = tf(w_i) \times \log_2(idf(w_i)) \qquad (3)$$

$$tf(w_i) = \frac{\#\text{OccurrencesOfWordInAllPosts}}{\#\text{WordsInAllPosts}} \qquad (4)$$

$$idf(w_i) = \frac{\#\text{Posts}}{\#\text{PostsInWhichWordOccurs}} \qquad (5)$$

$$nf(s) = \max[\text{MinimumThreshold},\ \#\text{WordsInPost}] \qquad (6)$$

where $W$ is the weight assigned to a post or a word, $nf$ is a normalization factor, $w_i$ is the $i$th word, and $s$ is a post.
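As a concrete illustration of Equations (2)-(6), the following Python sketch computes Hybrid TF-IDF weights for a list of posts. It assumes whitespace tokenization, a caller-supplied stop-word set, and a hypothetical MIN_THRESHOLD constant standing in for MinimumThreshold in Equation (6), whose value is not specified here; it is a sketch, not the authors' original implementation.

```python
import math
from collections import Counter

MIN_THRESHOLD = 11  # hypothetical MinimumThreshold from Eq. (6); value assumed for illustration

def hybrid_tfidf_weights(posts, stop_words):
    """Score each post with the Hybrid TF-IDF weighting of Eqs. (2)-(6).

    TF is computed over the entire collection of posts (one big document);
    IDF treats each post as a separate document.
    """
    tokenized = [[w.lower() for w in p.split()] for p in posts]

    # TF component: word frequencies over all posts combined (Eq. 4).
    all_words = [w for toks in tokenized for w in toks]
    tf_counts = Counter(all_words)
    total_words = len(all_words)

    # IDF component: each post counts as one document (Eq. 5).
    n_posts = len(posts)
    df = Counter(w for toks in tokenized for w in set(toks))

    def word_weight(w):
        if w in stop_words:
            return 0.0                      # stop words get zero weight
        tf = tf_counts[w] / total_words     # Eq. (4)
        idf = n_posts / df[w]               # Eq. (5)
        return tf * math.log2(idf)          # Eq. (3)

    weights = []
    for toks in tokenized:
        nf = max(MIN_THRESHOLD, len(toks))  # Eq. (6): normalization factor
        weights.append(sum(word_weight(w) for w in toks) / nf)  # Eq. (2)
    return weights
```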
We select the top k most weighted posts. In order to avoid
redundancy, the algorithm selects the next top post and checks
to make sure that it does not have a similarity above a given
threshold t with any of the other previously selected posts
because the top most weighted posts may be very similar
or discuss the same subtopic. This similarity threshold filters
out a possible summary post $s_i$ if it satisfies the following condition:

$$\mathrm{sim}(s_i, s_j) > t \quad \text{for some } s_j \in R,$$

where $R$ is the set of posts already chosen for the final summary and $t$ is the similarity threshold. We use the cosine similarity measure. The threshold was varied from 0 to 0.99 in increments of 0.01, for a total of 100 tests, in order to find the best threshold to be used.
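The selection step with the redundancy check can be sketched as follows, assuming the hybrid_tfidf_weights helper above and a simple bag-of-words cosine similarity; the default threshold of 0.77 is the value reported later in Section V-E and is used here only for illustration.

```python
import math
from collections import Counter

def cosine_sim(a, b):
    """Cosine similarity between two posts represented as word-count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_top_k(posts, weights, k, t=0.77):
    """Pick the k most weighted posts, skipping any post whose cosine
    similarity to an already selected post exceeds the threshold t."""
    selected = []
    for idx in sorted(range(len(posts)), key=lambda i: weights[i], reverse=True):
        if all(cosine_sim(posts[idx], posts[j]) <= t for j in selected):
            selected.append(idx)
        if len(selected) == k:
            break
    return [posts[i] for i in selected]
```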
B. Cluster Summarizer
We develop another method for summarizing a set of Twitter
posts. Similar to [15] and [16], we first cluster the tweets into
k clusters based on a similarity measure and then summarize
each cluster by picking the most weighted post as determined
by the Hybrid TF-IDF weighting described in Section IV-A.
During preliminary tests, we evaluated how well different
clustering algorithms would work on Twitter posts using the
weights computed by the Hybrid TF-IDF algorithm and the
cosine similarity measure. We implemented two variations of
the k-means algorithm: bisecting k-means [18] and k-means++
[19]. The bisecting k-means algorithm initially divides the
input into two clusters and then divides the largest cluster
into two smaller clusters. This splitting is repeated until the
kth cluster is formed. The k-means++ algorithm is similar to
the regular k-means algorithm except that it chooses the initial
centroids differently. It picks an initial centroid $c_1$ from the set of vertices $V$ randomly. It then chooses the next centroid $c_i$, selecting $c_i = v' \in V$ with probability $\frac{D(v')^2}{\sum_{v \in V} D(v)^2}$, where $D(v)$ is the shortest Euclidean distance from $v$ to the closest center which is already known. It repeats this selection process
until k initial centroids have been chosen. After trying these
methods, we found that the bisecting k-means++ algorithm—a
combination of the two algorithms—performed the best, even
though the performance gain above standard k-means was not
very high according to our evaluation methods.
Thus, the cluster summarizer attempts to create k subtopics
by clustering the posts. It then feeds each subtopic cluster to
the Hybrid TF-IDF algorithm discussed in IV-A that selects
the most weighted post for each subtopic.
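The pipeline can be approximated with the sketch below, which substitutes scikit-learn's standard k-means++ initialization and ordinary TF-IDF vectors for the authors' bisecting k-means++ and Hybrid TF-IDF weighting; the function name and the use of scikit-learn are illustrative assumptions, not the system evaluated in this paper.

```python
# A simplified stand-in for the cluster summarizer: cluster the posts,
# then return the most heavily weighted post from each cluster.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

def cluster_summarize(posts, weights, k=4):
    """weights: per-post scores, e.g. from hybrid_tfidf_weights() above."""
    X = TfidfVectorizer(stop_words="english").fit_transform(posts)
    labels = KMeans(n_clusters=k, init="k-means++", n_init=10,
                    random_state=0).fit_predict(X)

    summary = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:  # pick the highest-weighted post in this subtopic cluster
            summary.append(posts[max(members, key=lambda i: weights[i])])
    return summary
```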
C. Additional Summarization Algorithms to Compare Results
We compare the results of summarization of the two
newly introduced algorithms with baseline algorithms and
well-known multi-document summarization algorithms. The
baseline algorithms include a Random summarizer and a Most
Recent summarizer. The other algorithms we compare our
results with are SumBasic, MEAD, LexRank and TextRank.
1) Random Summarizer: This summarizer randomly chooses k posts for each topic as the summary. This method was chosen to provide worst-case performance and to set a lower bound on performance.
2) Most Recent Summarizer: This summarizer chooses the
most recent k posts from the selection pool as a summary.
It is analogous to choosing the first part of a news article
as summary. It was implemented because often intelligent
summarizers cannot perform better than simple summarizers
that just use the first part of the document as summary.
3) SumBasic: SumBasic [5] uses simple word probabilities
with an update function to compute the best k posts. It was
chosen because it depends solely on the frequency of words
in the original text and is conceptually very simple.
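As a rough illustration of the idea behind SumBasic, the sketch below scores each post by the average probability of its words, selects the best post, and then squares the probabilities of its words so that later picks favor words not yet covered; it simplifies some details of [5], such as the restriction that each pick contain the currently most probable word.

```python
from collections import Counter

def sumbasic(posts, k):
    """Simplified SumBasic: pick k posts using word probabilities with
    the squared-probability update after each selection."""
    tokenized = [[w.lower() for w in p.split()] for p in posts]
    counts = Counter(w for toks in tokenized for w in toks)
    total = sum(counts.values())
    prob = {w: c / total for w, c in counts.items()}

    chosen = []
    remaining = set(range(len(posts)))
    while remaining and len(chosen) < k:
        # score each remaining post by the average probability of its words
        def score(i):
            toks = tokenized[i]
            return sum(prob[w] for w in toks) / len(toks) if toks else 0.0
        best = max(remaining, key=score)
        chosen.append(posts[best])
        remaining.discard(best)
        for w in tokenized[best]:   # update: penalize words already covered
            prob[w] = prob[w] ** 2
    return chosen
```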
4) MEAD: This summarizer⁵ [17] is a well-known flexible
and extensible multi-document summarization system and was
chosen to provide a comparison between the more structured
document domain—in which MEAD works fairly well—and
the domain of Twitter posts being studied. In addition, the
default MEAD program is a cluster based summarizer so it
will provide some comparison to our cluster summarizer.
5) LexRank: This summarizer [7] uses a graph based
method that computes pairwise similarity between two
sentences—in our case two posts—and makes the similarity
score the weight of the edge between the two sentences. The
final score of a sentence is computed based on the weights of
the edges that are connected to it. This summarizer was chosen
to provide a baseline for graph based summarization instead
of direct frequency summarization. Though it does depend on
frequency, this system uses the relationships among sentences
to add more information and is therefore a more complex
algorithm than the frequency based ones.
6) TextRank: This summarizer [8] is another graph based
method that uses the PageRank [9] algorithm. This provided
another graph based summarizer that incorporates potentially
more information than LexRank since it recursively changes
the weights of posts. Therefore, the final score of each post
is not only dependent on how it is related to immediately
connected posts but also how those posts are related to other
posts. TextRank incorporates the whole complexity of the
graph rather than just pairwise similarities.
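The shared core of these graph based methods can be illustrated with a short PageRank-style power iteration over a pairwise similarity matrix of the posts; this is a generic sketch of the idea, not the LexRank or TextRank reference implementations, which differ in how edges are thresholded and weighted.

```python
import numpy as np

def graph_rank(sim, damping=0.85, iters=100, tol=1e-6):
    """PageRank-style scores for posts given a symmetric similarity matrix `sim`."""
    n = sim.shape[0]
    M = sim.copy().astype(float)
    np.fill_diagonal(M, 0.0)                  # no self-loops
    row_sums = M.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0             # avoid division by zero for isolated posts
    M = M / row_sums                          # row-stochastic transition matrix
    scores = np.full(n, 1.0 / n)
    for _ in range(iters):
        new = (1 - damping) / n + damping * (M.T @ scores)
        if np.abs(new - scores).sum() < tol:
            break
        scores = new
    return scores  # rank posts by these scores and take the top k
```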
⁵ http://www.summarization.com/mead/

V. EXPERIMENTAL SETUP
A. Data Collection
For five consecutive days, we collected the top ten currently
trending topics from Twitter’s home page at roughly the
same time every evening. For each topic, we downloaded the
maximum number (approximately 1500) of posts. Therefore,
we had 50 trending topics with a set of 1500 posts for each.
B. Preprocessing the Posts
Pre-processing steps included converting any Unicode char-
acters into their ASCII equivalents, filtering out any embedded
URLs, discarding spam using a Naïve Bayes classifier, etc.
These pre-processing steps and their rationale are described
more fully in [1].
C. Evaluation Methods
Summary evaluation is performed using one of two methods: intrinsic or extrinsic. In intrinsic evaluation, the quality
of the summary is judged based on direct analysis using prede-
fined metrics such as grammaticality, fluency, or content [20].
Extrinsic evaluations measure how well a summary enables
a user to perform a task. To perform intrinsic evaluation, a
common approach is to create one or more manual summaries
and to compare the automated summaries against the man-
ual summaries. One popular automatic evaluation metric is
ROUGE, which is a suite of metrics [21]. Both precision and
recall of the automated summaries can be computed using
related formulations of the metric. Given that $MS$ is the set of manual summaries and $u$ is the set of unigrams in a particular manual summary, precision can be defined as

$$p = \frac{\sum_{m \in MS} \sum_{u \in m} match(u)}{\sum_{m \in MS} \sum_{u \in m} count(u)} = \frac{matched}{retrieved}, \qquad (7)$$

where $count(u)$ is the number of unigrams in the automated summary and $match(u)$ is the number of co-occurring unigrams between the manual and automated summaries. The ROUGE metric can be slightly altered so that it measures the recall of the auto summaries such that

$$r = \frac{\sum_{m \in MS} \sum_{u \in m} match(u)}{|MS| \sum_{u \in a} count(u)} = \frac{matched}{relevant}, \qquad (8)$$

where $|MS|$ is the number of manual summaries and $a$ is the auto summary. We also report the F-measure, which is the harmonic mean of precision and recall.
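A minimal sketch of a ROUGE-1 style computation in the spirit of Equations (7) and (8), assuming whitespace tokenization and clipped unigram matching; the official ROUGE toolkit adds stemming, stop-word handling, and other options omitted here.

```python
from collections import Counter

def rouge1(auto_summary, manual_summaries):
    """Unigram precision, recall, and F-measure of one automatic summary
    against a list of manual summaries (clipped counts, ROUGE-1 style)."""
    auto = Counter(auto_summary.lower().split())
    matched = retrieved = relevant = 0
    for manual in manual_summaries:
        ref = Counter(manual.lower().split())
        matched += sum(min(cnt, auto[w]) for w, cnt in ref.items())
        relevant += sum(ref.values())
        retrieved += sum(auto.values())
    p = matched / retrieved if retrieved else 0.0
    r = matched / relevant if relevant else 0.0
    f = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f
```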
Lin’s use of ROUGE with the very short (around 10
words) summary task of DUC 2003 shows that ROUGE-1
and other ROUGEs correlate highly with human judgments
[21]. Since this task is very similar to creating microblog
summaries, we implement ROUGE-1 as a metric. However,
since we want certainty that ROUGE-1 correlates with a
human evaluation, we implemented a human evaluation using
Amazon Mechanical Turk⁶, a paid system that pays human workers small amounts of money for completing a short Human Intelligence Task, or HIT. The HITs used for summary evaluation displayed the summaries to be compared side by side with the topic specified. Then, we asked the user: “The auto-generated summary expresses ___ of the meaning of the human produced summary.” The possible answers were “All”, “Most”, “Some”, “Hardly Any” and “None”, which correspond to a score of 5 through 1, respectively.

⁶ http://www.mturk.com

TABLE I
ANSWERS TO THE SURVEY ABOUT HOW MANY CLUSTERS SEEMED APPROPRIATE FOR EACH TWITTER TOPIC.

Answer   “3 (Less)”   “4 (About Right)”   “5 (More)”
Count    13           28                  9
D. Manual Summarization
1) Choice of k: An initial question that we must answer
before using any multi-post extractive summarizer on a set of
Twitter posts is the question of how many posts are appropriate
in a summary. Though it is possible to choose k automatically
for clustering [22], we decided to focus our experiments on
summaries with a predefined value of k for several reasons.
First, we wanted to explore other summarization algorithms
for which automatically choosing k is not as straightforward
as in the cluster summarization algorithm. For example, the
SumBasic summarization does not have any mechanism for
choosing the right number of posts in the summary. Second,
we thought it would be difficult to perform evaluation where
the manual summaries were two or three posts in length and
the automatic summaries were five or six posts in length—or
vice versa—because the ROUGE evaluation metric is sensitive
to length even with some normalization.
To get a subjective idea of what people thought about the
value of k = 4 after being immersed in manual clustering
for a while, we took a survey of the volunteers after they
performed clustering of 50 topics—2 people for each of the
25 topics—with 100 posts in each topic. We asked them “How
many clusters do you think this should have had?” with the
choices “3 (Less)”, “4 (About Right)” or “5 (More)”. The
results are in Table I. This survey is probably biased towards
“4 (About Right)” because the question does not allow for
numbers other than 3, 4 or 5. Therefore, these results must be
taken tentatively but they at least suggest that there is some
significant variability about the best value for k. Our bias is
also based on the fact that our initial 1500 Twitter posts on
each topic were obtained within a small interval of 15 minutes
so we thought a small number would be good.
Since the volunteers had already clustered the posts into
four clusters, the manual summaries were four-post long as
well. This kept the already onerous manual summary creation
process somewhat simple. However, this also means that being
dependent on a single length for the summaries may impact
our evaluation process described next in an unknown way.
2) Manual Summarization Method: Our manual multi-post
summaries were created by volunteers who were undergrad-
uates from around the US gathered together in an NSF-
supported REU program. Each of the first 25 topics was man-
ually summarized by two different volunteers⁷ by performing steps parallel to the steps of the cluster summarizer.

⁷ A total of 16 volunteers produced manual summaries in such a combination that no volunteer would be compared against another specified volunteer more than once.

[Fig. 1. F-measures of the Hybrid TF-IDF algorithm over different similarity thresholds. x-axis: similarity threshold (0 to 0.96); y-axis: F-measure (approximately 0.29 to 0.36).]

First, the
volunteers clustered the posts into 4 clusters (k = 4). Second,
they chose the most representative post from each cluster. And
finally, they ordered the representative posts in a way that
they thought was most logical or coherent. These steps were
chosen because it was initially thought that a clustering based
solution would be the best way to summarize the Twitter posts
and it seemed simpler for the volunteers to cluster first rather
than simply looking at all the posts at once. These procedures
probably biased the manual summaries—and consequently
the results—towards clustering based solutions but since the
cluster summarizer itself did not perform particularly well in
the evaluations, it seems that this bias was not particularly
strong.
E. Setup of the Summarizers
Like the manual summaries, the automated summaries were
restricted to producing four post summaries. For MEAD, each
post was formatted to be one document. For LexRank—which
is implemented in the standard MEAD distribution—the posts
for each topic were concatenated into one document. Because
the exact implementation of TextRank [8] was unavailable, the
TextRank summarizer was implemented internally.
For the Hybrid TF-IDF summarizer, in order to keep the
posts from being too similar in content, a preliminary test to
determine the best cosine similarity threshold was conducted.
The F-measure scores when varying the similarity threshold t
of the Hybrid TF-IDF summarizer from 0 to 0.99 are shown in
Figure 1. The best performing threshold of t = 0.77 seems to
be reasonable because it allows for some similarity between
final summary posts but does not allow them to be nearly
identical.
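For a single topic, the threshold sweep described above can be expressed as a simple loop that reuses the hybrid_tfidf_weights, select_top_k, and rouge1 sketches given earlier; averaging over all topics, as done in the actual experiments, is omitted for brevity.

```python
def sweep_threshold(posts, manual_summaries, stop_words, k=4):
    """Find the cosine-similarity threshold that maximizes ROUGE-1 F for one topic."""
    weights = hybrid_tfidf_weights(posts, stop_words)
    best_t, best_f = 0.0, -1.0
    for step in range(100):                  # t = 0.00, 0.01, ..., 0.99
        t = step / 100
        summary = " ".join(select_top_k(posts, weights, k, t))
        _, _, f = rouge1(summary, manual_summaries)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f
```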
VI. RESULTS AND ANALYSIS
The average F-measure of all the iterations was computed.
For the summarizers that involve random seeding (e.g., random
summarizer and cluster summarizer), 100 summaries were
produced for each topic to avoid the effects of random seeding.
These numbers can be seen more clearly in Table II. Also, because we realized that the overlap of the topic keywords in the summary is trivial since every post contains the keywords, we ignored keyword overlap in our ROUGE calculations.

TABLE II
EVALUATION NUMBERS FOR ROUGE AND MTURK EVALUATIONS.

Number of summaries                      Randomly seeded*   Others
Number of topics                         25                 25
Summaries per topic                      100                1
Total summaries computed                 2500               25
ROUGE evaluation:
  ROUGE scores computed                  2500               25
MTurk evaluation:
  Number of summaries evaluated          25+                25
  Number of manual summaries per topic   2                  2
  Evaluators per manual summary          2                  2
  Total MTurk evaluations                100                100

* The randomly seeded summaries were the Random Summarizer and the Cluster Summarizer.
+ An average scoring post based on the F-measure for each topic was chosen for the MTurk evaluations because evaluating 2500 summaries would have been impractical.

TABLE III
AVERAGE VALUES OF F-MEASURE, RECALL AND PRECISION ORDERED BY F-MEASURE.

                 F-measure   Recall    Precision
LexRank          0.2027      0.1894    0.2333
Random           0.2071      0.2283    0.1967
Mead             0.2204      0.3050    0.1771
Manual           0.2252      0.2320    0.2320
Cluster          0.2310      0.2554    0.2180
TextRank         0.2328      0.3053    0.1954
MostRecent       0.2329      0.2463    0.2253
Hybrid TF-IDF    0.2524      0.2666    0.2499
SumBasic         0.2544      0.3274    0.2127
For the human evaluations using Amazon Mechanical Turk,
each automatic summary was compared to both manual sum-
maries by two different evaluators. This leads to 100 evalua-
tions per summarizer as can be seen in Table II. The manual
summaries were evaluated against each other by pretending
that one of them was the automatic summary.
A. Results
Our experiments evaluated eight different summarizers:
random, most recent, MEAD, TextRank, LexRank, cluster,
Hybrid TF-IDF and SumBasic. Both the automatic ROUGE
based evaluation and the MTurk human evaluation are reported
for all eight summarizers in Figures 2 and 3, respectively. The
values of average F-measure, recall and precision can be seen
in Table III. The values of average MTurk scores can be seen
at the top of Table V.
B. Analysis of Results
1) General Observations: We see that both the ROUGE
scores and the human evaluation scores do not seem to
obviously differentiate among the summarizers as seen in
Figures 2 and 3. Therefore, we performed a paired two-sided
T-test for each summarizer compared to each other summarizer
for both the ROUGE scores and the human evaluation scores.
For the ROUGE scores, the twenty five average F-measure scores corresponding to each topic were used for the paired T-tests.
References
More filters
Journal ArticleDOI

The anatomy of a large-scale hypertextual Web search engine

TL;DR: This paper provides an in-depth description of Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and looks at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.
Journal Article

The Anatomy of a Large-Scale Hypertextual Web Search Engine.

Sergey Brin, +1 more
- 01 Jan 1998 - 
TL;DR: Google as discussed by the authors is a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems.
Proceedings ArticleDOI

k-means++: the advantages of careful seeding

TL;DR: By augmenting k-means with a very simple, randomized seeding technique, this work obtains an algorithm that is Θ(logk)-competitive with the optimal clustering.
Journal ArticleDOI

Estimating the number of clusters in a data set via the gap statistic

TL;DR: In this paper, the authors proposed a method called the "gap statistic" for estimating the number of clusters (groups) in a set of data, which uses the output of any clustering algorithm (e.g. K-means or hierarchical), comparing the change in within-cluster dispersion with that expected under an appropriate reference null distribution.
Proceedings Article

TextRank: Bringing Order into Text

Rada Mihalcea, +1 more
TL;DR: TextRank, a graph-based ranking model for text processing, is introduced and it is shown how this model can be successfully used in natural language applications.