Proceedings Article

Prediction of Binary Labels for Edges in Signed Networks: A Random-Walk Based Approach

01 Nov 2017, pp. 1-4
TL;DR: A semi-supervised approach is proposed that uses the concept of random-walk for prediction of binary labels for edges in undirected and unweighted networks.
Abstract: Mining of signed networks, where the links/edges between nodes carry a positive or negative sign/label, is attracting the attention of researchers and practitioners because of its wide real-world applicability in various domains. Label prediction for nodes in a network is a well-known and well-explored problem. However, the prediction of labels for edges in a network is a relatively less explored, yet challenging and interesting, problem. In this paper, we consider the problem of binary label prediction for edges in undirected and unweighted networks. The prediction of binary labels has a number of real-world applications, such as friend/foe prediction, recommendation, trust/distrust prediction in social networks, and categorization. We propose a semi-supervised approach that uses the concept of random walks to predict binary labels for edges, and we demonstrate its viability and effectiveness on a real-world network.
Citations
Proceedings Article
11 Feb 2020
TL;DR: The authors propose a Local Community-based Edge Classification (LoCEC) framework that classifies user relationships in a social network into real-world social connection types in three phases: local community detection, community classification, and relationship classification.
Abstract: Relationships in online social networks often imply social connections in real life. An accurate understanding of relationship types benefits many applications, e.g. social advertising and recommendation. Some recent attempts have been proposed to classify user relationships into predefined types with the help of pre-labeled relationships or abundant interaction features on relationships. Unfortunately, both relationship feature data and label data are very sparse in real social platforms like WeChat, rendering existing methods inapplicable. In this paper, we present an in-depth analysis of WeChat relationships to identify the major challenges for the relationship classification task. To tackle the challenges, we propose a Local Community-based Edge Classification (LoCEC) framework that classifies user relationships in a social network into real-world social connection types. LoCEC enforces a three-phase processing, namely local community detection, community classification and relationship classification, to address the sparsity issue of relationship features and relationship labels. Moreover, LoCEC is designed to handle large-scale networks by allowing parallel and distributed processing. We conduct extensive experiments on the real-world WeChat network with hundreds of billions of edges to validate the effectiveness and efficiency of LoCEC.

3 citations

Posted Content
TL;DR: This paper presents an in-depth analysis of WeChat relationships to identify the major challenges for the relationship classification task and proposes a Local Community-based Edge Classification (LoCEC) framework that classifies user relationships in a social network into real-world social connection types.
Abstract: Relationships in online social networks often imply social connections in the real world. An accurate understanding of relationship types benefits many applications, e.g. social advertising and recommendation. Some recent attempts have been proposed to classify user relationships into predefined types with the help of pre-labeled relationships or abundant interaction features on relationships. Unfortunately, both relationship feature data and label data are very sparse in real social platforms like WeChat, rendering existing methods inapplicable. In this paper, we present an in-depth analysis of WeChat relationships to identify the major challenges for the relationship classification task. To tackle the challenges, we propose a Local Community-based Edge Classification (LoCEC) framework that classifies user relationships in a social network into real-world social connection types. LoCEC enforces a three-phase processing, namely local community detection, community classification and relationship classification, to address the sparsity issue of relationship features and relationship labels. Moreover, LoCEC is designed to handle large-scale networks by allowing parallel and distributed processing. We conduct extensive experiments on the real-world WeChat network with hundreds of billions of edges to validate the effectiveness and efficiency of LoCEC.

2 citations


Cites background from "Prediction of Binary Labels for Edg..."

  • ...Some studies work on classifying edges as friends and enemies [7, 8, 9, 27, 28]....

    [...]

References
Book
08 Sep 2000
TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
Abstract: The increasing volume of data in modern business and science calls for more complex and sophisticated tools. Although advances in data mining technology have made extensive data collection much easier, the technology is still evolving and there is a constant need for new techniques and tools that can help us transform this data into useful information and knowledge. Since the previous edition's publication, great advances have been made in the field of data mining. Not only does the third edition of Data Mining: Concepts and Techniques continue the tradition of equipping you with an understanding and application of the theory and practice of discovering patterns hidden in large data sets, it also focuses on new, important topics in the field: data warehouses and data cube technology, mining stream, mining social networks, and mining spatial, multimedia and other complex data. Each chapter is a stand-alone guide to a critical topic, presenting proven algorithms and sound implementations ready to be used directly or with strategic modification against live data. This is the resource you need if you want to apply today's most powerful data mining techniques to meet real business challenges.
  • Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects.
  • Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields.
  • Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.

23,600 citations


"Prediction of Binary Labels for Edg..." refers to this work for background or methods

  • ...Most of the data in the real-world are unstructured/semistructured and can be represented as graphs/networks in their natural setting [1]....

    [...]

  • ...However, their approach requires measuring the similarity between nodes using Jaccard measure [1] which takes only the neighborhood information of nodes in the network which results in a low accuracy of edge labeling....

    [...]
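The Jaccard measure mentioned in the excerpt above compares two nodes using only their immediate neighborhoods. A minimal Python sketch, assuming an undirected graph stored as a dictionary of neighbor sets (the toy graph and the function name `jaccard_similarity` are illustrative and not taken from the paper):

```python
def jaccard_similarity(neighbors, u, v):
    """Jaccard coefficient of the neighbor sets of nodes u and v."""
    nu, nv = neighbors[u], neighbors[v]
    union = nu | nv
    if not union:  # both nodes are isolated, so define the score as 0
        return 0.0
    return len(nu & nv) / len(union)


# Hypothetical toy graph: each node maps to the set of its neighbors.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}

score_ab = jaccard_similarity(toy_graph, "a", "b")  # shared {c}, union {a, b, c, d}
score_ad = jaccard_similarity(toy_graph, "a", "d")  # shared {b}, union {b, c}
```

Because only direct neighbors enter the score, two nodes without a common neighbor always score 0, regardless of how similar their wider structural positions are.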

Proceedings Article
09 Dec 2003
TL;DR: A principled approach to semi-supervised learning is to design a classifying function which is sufficiently smooth with respect to the intrinsic structure collectively revealed by known labeled and unlabeled points.
Abstract: We consider the general problem of learning from labeled and unlabeled data, which is often called semi-supervised learning or transductive inference. A principled approach to semi-supervised learning is to design a classifying function which is sufficiently smooth with respect to the intrinsic structure collectively revealed by known labeled and unlabeled points. We present a simple algorithm to obtain such a smooth solution. Our method yields encouraging experimental results on a number of classification problems and demonstrates effective use of unlabeled data.

4,205 citations


"Prediction of Binary Labels for Edg..." refers to this work for background

  • ...A lot of work has been done for node labeling in networks [5], [6], [7]....

    [...]
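The abstract of this entry asks for a classifying function that is smooth over the structure formed by labeled and unlabeled points. One well-known way to obtain such a function is an iterative label-spreading update, sketched below in Python. This is an illustration under assumptions, not a reproduction of the cited experiments: the symmetric affinity matrix `W`, the mixing weight `alpha`, the fixed iteration count, and the toy path graph are all hypothetical choices.

```python
import math


def label_spreading(W, Y, alpha=0.9, iterations=50):
    """Spread the initial labels Y over a graph with symmetric affinities W.

    W: n x n list of lists with non-negative weights and a zero diagonal.
    Y: n x c list of lists; Y[i][k] = 1 if node i is labeled with class k.
    Returns the predicted class index for every node.
    """
    n, n_classes = len(W), len(Y[0])
    degree = [sum(row) for row in W]
    # Symmetrically normalised affinities: S = D^(-1/2) W D^(-1/2).
    S = [[W[i][j] / math.sqrt(degree[i] * degree[j]) if W[i][j] else 0.0
          for j in range(n)] for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iterations):
        # Each node mixes its neighbors' scores with its own initial label.
        F = [[alpha * sum(S[i][k] * F[k][c] for k in range(n))
              + (1 - alpha) * Y[i][c]
              for c in range(n_classes)] for i in range(n)]
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]


# Hypothetical path graph 0 - 1 - 2 - 3 with only the two end nodes labeled.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
Y = [[1, 0], [0, 0], [0, 0], [0, 1]]
predicted = label_spreading(W, Y)
```

In this toy case each unlabeled node ends up with the class of the labeled end node closest to it, which is the smoothness behavior the abstract describes.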

Journal Article
TL;DR: This article introduces four of the most widely used inference algorithms for classifying networked data and empirically compares them on both synthetic and real-world data.
Abstract: Many real-world applications produce networked data such as the world-wide web (hypertext documents connected via hyperlinks), social networks (for example, people connected by friendship links), communication networks (computers connected via communication links) and biological networks (for example, protein interaction networks). A recent focus in machine learning research has been to extend traditional machine learning classification techniques to classify nodes in such networks. In this article, we provide a brief introduction to this area of research and how it has progressed during the past decade. We introduce four of the most widely used inference algorithms for classifying networked data and empirically compare them on both synthetic and real-world data.

2,937 citations


"Prediction of Binary Labels for Edg..." refers to this work for background

  • ...Among various mining tasks, node classification or labeling is an important mining task in the network as it has numerous real-world applications [3], [4]....

    [...]

Proceedings Article
23 Jul 2002
TL;DR: A complementary approach, applicable in any domain with object-to-object relationships, that measures similarity of the structural context in which objects occur, based on their relationships with other objects is proposed.
Abstract: The problem of measuring "similarity" of objects arises in many applications, and many domain-specific measures have been developed, e.g., matching text across documents or computing overlap among item-sets. We propose a complementary approach, applicable in any domain with object-to-object relationships, that measures similarity of the structural context in which objects occur, based on their relationships with other objects. Effectively, we compute a measure that says "two objects are similar if they are related to similar objects." This general similarity measure, called SimRank, is based on a simple and intuitive graph-theoretic model. For a given domain, SimRank can be combined with other domain-specific similarity measures. We suggest techniques for efficient computation of SimRank scores, and provide experimental results on two application domains showing the computational feasibility and effectiveness of our approach.

2,036 citations


"Prediction of Binary Labels for Edg..." refers to this work for background or methods

  • ...Then for each network, we compute the similarity between nodes using SimRank which is a random-walk based structural similarity measure [15]....

    [...]

  • ...The above equation iteratively computes the structural similarity between nodes [15]....

    [...]

  • ...2) Compute random-walk based structural similarity between nodes using SimRank....

    [...]

  • ...For that, we utilized SimRank which is a random-walk based measure to compute the similarity score between all pairs of nodes [15]....

    [...]

  • ...For computing the similarity between all pairs of nodes, we use the following matrix form of SimRank: $S_k = \max\{C \cdot (A^{T} \cdot S_{k-1} \cdot A),\ I\}$ (2). The above equation iteratively computes the structural similarity between nodes [15]....

    [...]
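The matrix form of SimRank quoted in the excerpts above can be sketched in a few lines of Python. The excerpt does not define its symbols, so this sketch makes the usual assumptions explicit: `A` is the column-normalised adjacency matrix, `C` is a decay factor between 0 and 1, `I` is the identity matrix, the maximum is taken element by element, and the iteration starts from the identity. The helper names, the default `C=0.8`, and the toy graph are hypothetical.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    Yt = transpose(Y)
    return [[sum(x * y for x, y in zip(row, col)) for col in Yt] for row in X]


def simrank_matrix(adj, C=0.8, iterations=10):
    """Iterate S_k = max(C * A^T S_{k-1} A, I), starting from S_0 = I.

    adj: n x n adjacency matrix as a list of lists (adj[i][j] = 1 for edge i -> j).
    """
    n = len(adj)
    # Column-normalise: column j sums to 1 when node j has in-neighbors.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    A = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    S = [row[:] for row in identity]
    At = transpose(A)
    for _ in range(iterations):
        M = matmul(At, matmul(S, A))
        # The element-wise maximum with I keeps every self-similarity at 1.
        S = [[max(C * M[i][j], identity[i][j]) for j in range(n)]
             for i in range(n)]
    return S


# Hypothetical undirected path graph 0 - 1 - 2.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
S = simrank_matrix(path)
```

In this toy graph nodes 0 and 2 share their only neighbor, so their score settles at the decay factor `C`, while nodes 0 and 1 never have in-neighbors that are similar to each other and stay at 0.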

Posted Content
TL;DR: In this article, the authors study online social networks in which relationships can be either positive (indicating relations such as friendship) or negative (indicating relations such as opposition or antagonism) and find that the signs of links in the underlying social networks can be predicted with high accuracy, using models that generalize across this diverse range of sites.
Abstract: We study online social networks in which relationships can be either positive (indicating relations such as friendship) or negative (indicating relations such as opposition or antagonism). Such a mix of positive and negative links arises in a variety of online settings; we study datasets from Epinions, Slashdot and Wikipedia. We find that the signs of links in the underlying social networks can be predicted with high accuracy, using models that generalize across this diverse range of sites. These models provide insight into some of the fundamental principles that drive the formation of signed links in networks, shedding light on theories of balance and status from social psychology; they also suggest social computing applications by which the attitude of one user toward another can be estimated from evidence provided by their relationships with other members of the surrounding social network.

1,253 citations