scispace - formally typeset
Search or ask a question
Author

Ralitsa Angelova

Other affiliations: Google
Bio: Ralitsa Angelova is an academic researcher from Max Planck Society. The author has contributed to research in topics: Cluster analysis & Social network. The author has an hindex of 8, co-authored 13 publications receiving 565 citations. Previous affiliations of Ralitsa Angelova include Google.

Papers
More filters
Proceedings ArticleDOI
28 Jun 2009
TL;DR: In this paper, the authors investigate the value of incorporating the history information available on the interactions (or links) of the current social network state and show that time-stamps of past interactions significantly improve the prediction accuracy of new and recurrent links over rather sophisticated methods proposed recently.
Abstract: Prediction of links - both new as well as recurring - in a social network representing interactions between individuals is an important problem. In the recent years, there is significant interest in methods that use only the graph structure to make predictions. However, most of them consider a single snapshot of the network as the input, neglecting an important aspect of these social networks viz., their evolution over time.In this work, we investigate the value of incorporating the history information available on the interactions (or links) of the current social network state. Our results unequivocally show that time-stamps of past interactions significantly improve the prediction accuracy of new and recurrent links over rather sophisticated methods proposed recently. Furthermore, we introduce a novel testing method which reflects the application of link prediction better than previous approaches.

227 citations

Proceedings ArticleDOI
06 Aug 2006
TL;DR: This paper presents a new method for graph-based classification, with particular emphasis on hyperlinked text documents but broader applicability, based on iterative relaxation labeling and can be combined with either Bayesian or SVM classifiers on the feature spaces of the given data items.
Abstract: Automatic classification of data items, based on training samples, can be boosted by considering the neighborhood of data items in a graph structure (e.g., neighboring documents in a hyperlink environment or co-authors and their publications for bibliographic data entries). This paper presents a new method for graph-based classification, with particular emphasis on hyperlinked text documents but broader applicability. Our approach is based on iterative relaxation labeling and can be combined with either Bayesian or SVM classifiers on the feature spaces of the given data items. The graph neighborhood is taken into consideration to exploit locality patterns while at the same time avoiding overfitting. In contrast to prior work along these lines, our approach employs a number of novel techniques: dynamically inferring the link/class pattern in the graph in the run of the iterative relaxation labeling, judicious pruning of edges from the neighborhood graph based on node dissimilarities and node degrees, weighting the influence of edges based on a distance metric between the classification labels of interest and weighting edges by content similarity measures. Our techniques considerably improve the robustness and accuracy of the classification outcome, as shown in systematic experimental comparisons with previously published methods on three different real-world datasets.

177 citations

01 Jan 2006
TL;DR: This approach is based on iterative relaxation of cluster assignments, and can be built on top of any clustering algorithm, and results in higher cluster purity, better overall accuracy, and make self-organization more robust.
Abstract: This paper addresses the problem of automatically structuring linked document collections by using clustering. In contrast to traditional clustering, we study the clustering problem in the light of available link structure information for the data set (e.g., hyperlinks among web documents or co-authorship among bibliographic data entries). Our approach is based on iterative relaxation of cluster assignments, and can be built on top of any clustering algorithm. This technique results in higher cluster purity, better overall accuracy, and make self-organization more robust.

63 citations

Proceedings ArticleDOI
06 Nov 2006
TL;DR: In this paper, the authors address the problem of automatically structuring linked document collections by using clustering and propose an approach based on iterative relaxation of cluster assignments, which can be built on top of any clustering algorithm.
Abstract: This paper addresses the problem of automatically structuring linked document collections by using clustering. In contrast to traditional clustering, we study the clustering problem in the light of available link structure information for the data set (e.g., hyperlinks among web documents or co-authorship among bibliographic data entries). Our approach is based on iterative relaxation of cluster assignments, and can be built on top of any clustering algorithm. This technique results in higher cluster purity, better overall accuracy, and make self-organization more robust.

39 citations

Journal ArticleDOI
TL;DR: This work addresses the problem of multi-label classification in heterogeneous graphs, where nodes belong to different types and different types have different sets of classification labels, and presents a novel approach that aims to classify nodes based on their neighborhoods.
Abstract: We address the problem of multi-label classification in heterogeneous graphs, where nodes belong to different types and different types have different sets of classification labels. We present a novel approach that aims to classify nodes based on their neighborhoods. We model the mutual influence of nodes as a random walk in which the random surfer aims at distributing class labels to nodes while walking through the graph. When viewing class labels as "colors", the random surfer is essentially spraying different node types with different color palettes; hence the name Graffiti of our method. In contrast to previous work on topic-based random surfer models, our approach captures and exploits the mutual influence of nodes of the same type based on their connections to nodes of other types. We show important properties of our algorithm such as convergence and scalability. We also confirm the practical viability of Graffiti by an experimental study on subsets of the popular social networks Flickr and LibraryThing. We demonstrate the superiority of our approach by comparing it to three other state-of-the-art techniques for graph-based classification.

26 citations


Cited by
More filters
01 Jan 2002

9,314 citations

Proceedings ArticleDOI
22 Jan 2006
TL;DR: Some of the major results in random graphs and some of the more challenging open problems are reviewed, including those related to the WWW.
Abstract: We will review some of the major results in random graphs and some of the more challenging open problems. We will cover algorithmic and structural questions. We will touch on newer models, including those related to the WWW.

7,116 citations

Journal ArticleDOI
TL;DR: Recent progress about link prediction algorithms is summarized, emphasizing on the contributions from physical perspectives and approaches, such as the random-walk-based methods and the maximum likelihood methods.
Abstract: Link prediction in complex networks has attracted increasing attention from both physical and computer science communities. The algorithms can be used to extract missing information, identify spurious interactions, evaluate network evolving mechanisms, and so on. This article summaries recent progress about link prediction algorithms, emphasizing on the contributions from physical perspectives and approaches, such as the random-walk-based methods and the maximum likelihood methods. We also introduce three typical applications: reconstruction of networks, evaluation of network evolving mechanism and classification of partially labeled networks. Finally, we introduce some applications and outline future challenges of link prediction algorithms.

2,530 citations

Journal ArticleDOI
TL;DR: YAGO2 as mentioned in this paper is an extension of the YAGO knowledge base, in which entities, facts, and events are anchored in both time and space, and it contains 447 million facts about 9.8 million entities.

1,186 citations

Proceedings Article
Long Jiang1, Mo Yu2, Ming Zhou1, Xiaohua Liu1, Tiejun Zhao2 
19 Jun 2011
TL;DR: This paper proposes to improve target-dependent Twitter sentiment classification by incorporating target- dependent features; and taking related tweets into consideration; and according to the experimental results, this approach greatly improves the performance of target- dependence sentiment classification.
Abstract: Sentiment analysis on Twitter data has attracted much attention recently. In this paper, we focus on target-dependent Twitter sentiment classification; namely, given a query, we classify the sentiments of the tweets as positive, negative or neutral according to whether they contain positive, negative or neutral sentiments about that query. Here the query serves as the target of the sentiments. The state-of-the-art approaches for solving this problem always adopt the target-independent strategy, which may assign irrelevant sentiments to the given target. Moreover, the state-of-the-art approaches only take the tweet to be classified into consideration when classifying the sentiment; they ignore its context (i.e., related tweets). However, because tweets are usually short and more ambiguous, sometimes it is not enough to consider only the current tweet for sentiment classification. In this paper, we propose to improve target-dependent Twitter sentiment classification by 1) incorporating target-dependent features; and 2) taking related tweets into consideration. According to the experimental results, our approach greatly improves the performance of target-dependent sentiment classification.

911 citations