Journal ArticleDOI

Social Network Analysis With Data Fusion

TL;DR: The reported results shed light on the sensitivity of betweenness, closeness, and degree centrality metrics to fused graph inputs and the role of HVI identification as a test and evaluation tool for fusion process optimization.
Abstract: This paper reports on the utility of social network analysis methods in the data fusion domain. Given fused data that combine multiple intelligence reports from the same environment, social network extraction and high value individual (HVI) identification are of interest. Research on the feasibility of such activities may help not only in methodological developments in network science but also in the testing and evaluation of fusion quality. This paper offers a parallel computing-based methodology to extract a social network of individuals from fused data, captured as a cumulative associated data graph (CDG). To obtain the desired social network, two approaches, a hop count weighted approach and a path salience approach, are developed and compared. A supervised learning framework is implemented for parameterizing the extraction algorithms. Parameters utilized in the extraction algorithm consider paths between individuals within the social network, weighting relationships between these individuals based on the hop count weighted and path salience calculation methodologies. An overall link strength value is then calculated by aggregating path hop count weights and saliences between unique individual pairs for the hop count weighted and path salience approaches, respectively. Ordered centrality-based HVI lists are obtained from the CDGs constructed from the Sunni criminal and Ba'athist resurgence threads of the SYNCOIN data set, under various fusion system settings. The reported results shed light on the sensitivity of betweenness, closeness, and degree centrality metrics to fused graph inputs and the role of HVI identification as a test and evaluation tool for fusion process optimization. The computational results demonstrate the superiority of the path salience approach in identifying HVIs. The insights generated by these approaches and directions for future research are discussed.
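As a rough illustration of the extraction idea described above, the sketch below is not the paper's exact formulation: the 1/hop-count weighting, the toy path data, and the names are assumptions made for the example. It aggregates per-path weights into pairwise link strengths, builds the resulting social network, and ranks candidate HVIs by betweenness centrality.

```python
# Illustrative sketch (not the paper's formulas): aggregate per-path hop-count
# weights into pairwise link strengths, then rank candidate HVIs by centrality.
import networkx as nx

# Hypothetical paths between individual pairs extracted from a CDG:
# ((person_a, person_b), hop count of one connecting path), pairs canonically ordered.
cdg_paths = [
    (("amir", "basim"), 2),
    (("amir", "basim"), 4),
    (("basim", "walid"), 1),
    (("amir", "walid"), 3),
]

# Hop-count-weighted link strength: shorter paths contribute more, and all
# paths between the same pair are aggregated by summation (our assumption).
strength = {}
for pair, hops in cdg_paths:
    strength[pair] = strength.get(pair, 0.0) + 1.0 / hops

# Build the extracted social network; strong ties become short distances so that
# shortest-path-based centralities treat them as close connections.
g = nx.Graph()
for (a, b), w in strength.items():
    g.add_edge(a, b, weight=w, distance=1.0 / w)

betweenness = nx.betweenness_centrality(g, weight="distance")
hvi_ranking = sorted(betweenness, key=betweenness.get, reverse=True)
print(hvi_ranking)
```

The path salience variant would replace the 1/hops term with a per-path salience score; the aggregation and centrality-ranking steps stay the same.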
Citations
Journal ArticleDOI
TL;DR: This article enriches research on networked medical device (MD) systems to increase the efficiency and safety of healthcare.
Abstract: Medical cyber-physical systems (MCPS) are critical healthcare integrations of networks of medical devices. These systems are progressively used in hospitals to achieve continuous, high-quality healthcare. MCPS design faces numerous challenges, including interoperability, security/privacy, and high assurance in the system software. In the current work, the infrastructure of cyber-physical systems (CPS) is reviewed and discussed. This article enriches research on networked medical device (MD) systems to increase the efficiency and safety of healthcare. It can also assist medical device specialists in overcoming crucial issues related to medical devices and the challenges facing the design of medical device networks. The concept of social networking and its security, along with the concept of wireless sensor networks (WSNs), is addressed. Afterward, CPS systems and platforms are surveyed, with particular focus directed toward CPS-based healthcare. The big data framework of CPSs is also included.

134 citations

Journal ArticleDOI
20 Feb 2020
TL;DR: Industrial information integration engineering is a set of foundational concepts and techniques that facilitate the industrial information integration process and in recent years, many applicat...
Abstract: Industrial information integration engineering (IIIE) is a set of foundational concepts and techniques that facilitate the industrial information integration process. In recent years, many applicat...

109 citations

Journal ArticleDOI
TL;DR: This paper provides a comprehensive overview of unsupervised deep learning methods, compares their performance in text categorization, and introduces autoencoders, deconvolutional networks, restricted Boltzmann machines, and deep belief nets.
Abstract: High-dimensional features are extensively accessible in machine learning and computer vision areas. How to learn an efficient feature representation for specific learning tasks is invariably a crucial issue. Due to the absence of class label information, unsupervised feature representation is exceedingly challenging. In the last decade, deep learning has captured growing attention from researchers in a broad range of areas. Most deep learning methods are supervised and must be fed a large amount of accurately labeled data. Nevertheless, acquiring sufficient accurately labeled data is unaffordable in numerous real-world applications, which suggests the need for unsupervised learning. Toward this end, quite a few unsupervised feature representation approaches based on deep learning have been proposed in recent years. In this paper, we attempt to provide a comprehensive overview of unsupervised deep learning methods and compare their performance in text categorization. Our survey starts with the autoencoder and its representative variants, including the sparse autoencoder, stacked autoencoder, contractive autoencoder, denoising autoencoder, variational autoencoder, graph autoencoder, convolutional autoencoder, adversarial autoencoder, and residual autoencoder. Aside from autoencoders, deconvolutional networks, restricted Boltzmann machines, and deep belief nets are introduced. Then, the reviewed unsupervised feature representation methods are compared in terms of text clustering. Extensive experiments on eight publicly available data sets of text documents are conducted to provide a fair test bed for the compared methods.
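For readers unfamiliar with the basic building block this survey starts from, a minimal autoencoder sketch is given below. It is our illustration in PyTorch, not code from the survey; the dimensions, training loop length, and synthetic data are placeholders. The encoder compresses each input vector into a low-dimensional code, and the decoder is trained to reconstruct the input from that code.

```python
# Minimal autoencoder sketch (illustrative; dimensions and data are assumptions).
import torch
from torch import nn

class AutoEncoder(nn.Module):
    def __init__(self, in_dim=100, code_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, code_dim), nn.ReLU())
        self.decoder = nn.Linear(code_dim, in_dim)

    def forward(self, x):
        return self.decoder(self.encoder(x))

x = torch.rand(256, 100)                 # stand-in for TF-IDF document vectors
model = AutoEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(50):                      # reconstruction objective: ||x - x_hat||^2
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), x)
    loss.backward()
    opt.step()
codes = model.encoder(x)                 # learned features, e.g. for text clustering
```

The variants listed in the abstract (sparse, denoising, variational, and so on) change the objective or the noise model but keep this encode-reconstruct structure.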

33 citations


Cites background from "Social Network Analysis With Data F..."

  • ...It is known that an effective feature representation comes with satisfactory effects of learning tasks [1], [2]....


Proceedings ArticleDOI
17 Oct 2018
TL;DR: This paper publishes triangle counts satisfying node-differential privacy with two kinds of histograms, the triangle count distribution and the cumulative distribution, and proposes a novel graph projection method that can be used to obtain an upper bound for sensitivity in different distributions.
Abstract: Triangle count is a critical parameter in mining relationships among people in social networks. However, directly publishing the findings obtained from triangle counts may raise privacy concerns, which presents great challenges and opportunities for privacy-preserving triangle counting. In this paper, we choose to use differential privacy to protect triangle counting for large-scale graphs. To reduce the large sensitivity that arises in large graphs, we propose a novel graph projection method that can be used to obtain an upper bound for sensitivity in different distributions. In particular, we publish the triangle counts satisfying node-differential privacy with two kinds of histograms: the triangle count distribution and the cumulative distribution. Moreover, we extend the research on privacy-preserving triangle counting to one of its applications, the local clustering coefficient. Experimental results show that the cumulative distribution fits the real statistical information better, and our proposed mechanism achieves better accuracy for triangle counts while maintaining the requirement of differential privacy.
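The following sketch illustrates the general recipe rather than the paper's algorithm: truncate node degrees so the sensitivity of the triangle statistic is bounded, then release a noisy count via the Laplace mechanism. The degree bound, epsilon, the crude projection, and the sensitivity expression are assumptions made for the example.

```python
# Simplified illustration of degree-bounded, node-differentially-private
# triangle counting (not the paper's projection or histogram mechanism).
import networkx as nx
import numpy as np

def degree_truncate(g, d_max):
    """Crude projection: drop excess edges so every node keeps degree <= d_max."""
    h = g.copy()
    for node in list(h.nodes):
        excess = h.degree(node) - d_max
        if excess > 0:
            h.remove_edges_from(list(h.edges(node))[:excess])
    return h

g = nx.erdos_renyi_graph(200, 0.05, seed=1)   # stand-in social graph
d_max, epsilon = 10, 1.0
h = degree_truncate(g, d_max)

# After truncation, removing one node changes the total triangle count by at
# most C(d_max, 2), which we use as the sensitivity (an assumption for this sketch).
per_node = np.array(list(nx.triangles(h).values()))
total_triangles = per_node.sum() / 3
sensitivity = d_max * (d_max - 1) / 2
noisy_total = total_triangles + np.random.laplace(scale=sensitivity / epsilon)
print(round(noisy_total))
```

The paper's histogram releases follow the same pattern, with the projection chosen to keep the noise small relative to the published distribution.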

26 citations


Cites background from "Social Network Analysis With Data F..."

  • ...[24] introduced the measurement to determine whether a graph is a small-world network, many work focus on finding the triangles, which are one of the simplest but effective descriptions of a node’s status in a small-world [7, 16, 25]....


Journal ArticleDOI
TL;DR: This work studies how to obtain feature-level ratings of mobile products from customer reviews and review votes to influence decision-making, both for new customers and manufacturers.
Abstract: This work studies how we can obtain feature-level ratings of mobile products from customer reviews and review votes to influence decision-making, both for new customers and manufacturers. Such a rating system gives a more comprehensive picture of the product than a product-level rating system offers. While product-level ratings are too generic, feature-level ratings are specific; we know exactly what is good or bad about the product. There has always been a need to know which features fall short or are doing well according to the customer's perception. It keeps both the manufacturer and the customer well informed in their respective decisions: improving the product and buying it. Different customers are interested in different features, so feature-level ratings can make buying decisions personalized. We analyze customer reviews collected on an online shopping site (Amazon) about various mobile products, along with the review votes. Explicitly, we carry out a feature-focused sentiment analysis for this purpose. Eventually, our analysis yields ratings for 108 features of 4000+ mobiles sold online. It helps in decision-making on how to improve the product (from the manufacturer's perspective) and makes personalized buying decisions (from the buyer's perspective) a possibility. Our analysis has applications in recommender systems, consumer research, and so on.
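A toy version of feature-focused sentiment aggregation is sketched below; it is our illustration, not the paper's pipeline. The feature list, sentiment lexicon, sample reviews, and vote weighting are all assumptions: sentences mentioning a feature contribute a vote-weighted polarity to that feature's rating.

```python
# Toy feature-level rating from reviews and review votes (illustrative only).
import re
from collections import defaultdict

FEATURES = {"battery", "camera", "display", "price"}
POSITIVE = {"good", "great", "excellent", "amazing"}
NEGATIVE = {"bad", "poor", "terrible", "short"}

reviews = [
    ("The battery life is short. The camera is great!", 12),  # (text, review votes)
    ("Excellent display and good battery.", 3),
]

score, weight = defaultdict(float), defaultdict(float)
for text, votes in reviews:
    for sentence in re.split(r"[.!?]", text.lower()):
        words = set(sentence.split())
        polarity = len(words & POSITIVE) - len(words & NEGATIVE)
        for feature in words & FEATURES:
            # Weight each sentence-level opinion by the review's vote count.
            score[feature] += polarity * (1 + votes)
            weight[feature] += 1 + votes

ratings = {f: score[f] / weight[f] for f in score}  # higher = more positive
print(ratings)
```

A production system would replace the hand-written lexicon with a trained sentiment model, but the aggregation step stays the same.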

17 citations


Cites background from "Social Network Analysis With Data F..."

  • ...Such people have to generally go through the entire comments section to know previous customers’ perceptions [1], [2] of the product’s features in which they are interested....


References
Proceedings Article
19 Mar 2009
TL;DR: This work presents several key features of Gephi in the context of interactive exploration and interpretation of networks, and highlights key aspects of dynamic network visualization.
Abstract: Gephi is an open source software for graph and network analysis. It uses a 3D render engine to display large networks in real-time and to speed up the exploration. A flexible and multi-task architecture brings new possibilities to work with complex data sets and produce valuable visual results. We present several key features of Gephi in the context of interactive exploration and interpretation of networks. It provides easy and broad access to network data and allows for spatializing, filtering, navigating, manipulating and clustering. Finally, by presenting dynamic features of Gephi, we highlight key aspects of dynamic network visualization.

7,917 citations


"Social Network Analysis With Data F..." refers methods in this paper

  • ...To visualize the GTSN and CDGSN, an open-source software Gephi [3] is utilized....


Journal ArticleDOI
25 Oct 2002-Science
TL;DR: Network motifs, patterns of interconnections occurring in complex networks at numbers that are significantly higher than those in randomized networks, are defined and may define universal classes of networks.
Abstract: Complex networks are studied across many fields of science. To uncover their structural design principles, we defined “network motifs,” patterns of interconnections occurring in complex networks at numbers that are significantly higher than those in randomized networks. We found such motifs in networks from biochemistry, neurobiology, ecology, and engineering. The motifs shared by ecological food webs were distinct from the motifs shared by the genetic networks of Escherichia coli and Saccharomyces cerevisiae or from those found in the World Wide Web. Similar motifs were found in networks that perform information processing, even though they describe elements as different as biomolecules within a cell and synaptic connections between neurons in Caenorhabditis elegans. Motifs may thus define universal classes of networks.
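The core procedure can be sketched as follows (an illustration, not the authors' implementation): count occurrences of a candidate motif, here the feed-forward loop, in the observed directed network and compare with counts in degree-preserving randomized networks; a pattern qualifies as a motif when it is significantly over-represented. The example graph, null-model construction, and number of randomizations are assumptions.

```python
# Illustrative motif test: feed-forward loops vs. degree-preserving randomizations.
import networkx as nx

def feed_forward_loops(g):
    """Count A->B, B->C, A->C triads in a DiGraph."""
    count = 0
    for a, b in g.edges:
        for c in g.successors(b):
            if c != a and g.has_edge(a, c):
                count += 1
    return count

g = nx.gnp_random_graph(50, 0.08, directed=True, seed=2)  # stand-in network
observed = feed_forward_loops(g)

null = []
for i in range(20):
    # Degree-preserving null model; collapsing multi-edges is a crude simplification.
    r = nx.DiGraph(nx.directed_configuration_model(
        [d for _, d in g.in_degree()],
        [d for _, d in g.out_degree()],
        seed=i,
    ))
    r.remove_edges_from(list(nx.selfloop_edges(r)))
    null.append(feed_forward_loops(r))

print(observed, sum(null) / len(null))  # motif if observed count >> random average
```

A full analysis would also report a Z-score over the null ensemble rather than just the average.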

6,992 citations


"Social Network Analysis With Data F..." refers background in this paper

  • ...Other related research introduces p∗ models (now widely known as exponential random graph models [24]), graph kernels [27], and motif analysis [21]....


Journal ArticleDOI
TL;DR: New algorithms for betweenness are introduced in this paper and require O(n + m) space and run in O(nm) and O(nm + n² log n) time on unweighted and weighted networks, respectively, where m is the number of links.
Abstract: Motivated by the fast-growing need to compute centrality indices on large, yet very sparse, networks, new algorithms for betweenness are introduced in this paper. They require O(n + m) space and run in O(nm) and O(nm + n² log n) time on unweighted and weighted networks, respectively, where m is the number of links. Experimental evidence is provided that this substantially increases the range of networks for which centrality analysis is feasible. The betweenness centrality index is essential in the analysis of social networks, but costly to compute. Currently, the fastest known algorithms require Θ(n³) time and Θ(n²) space, where n is the number of actors in the network.
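For unweighted graphs, Brandes' algorithm runs one breadth-first search per source and back-propagates pair dependencies, which yields the O(nm) bound quoted above. The compact implementation below follows that scheme and is written for illustration; the toy graph at the end is an assumption.

```python
# Brandes' betweenness centrality for unweighted, undirected graphs (illustrative).
from collections import deque

def brandes_betweenness(adj):
    """adj: dict mapping node -> iterable of neighbours."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Single-source BFS, recording predecessors and shortest-path counts.
        stack, pred = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}; sigma[s] = 1
        dist = {v: -1 for v in adj}; dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    pred[w].append(v)
        # Back-propagate dependencies in order of non-increasing distance.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in pred[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected pair is counted from both endpoints, so halve the totals.
    return {v: c / 2 for v, c in bc.items()}

# Example: path a-b-c, where b lies on the only shortest path between a and c.
print(brandes_betweenness({"a": ["b"], "b": ["a", "c"], "c": ["b"]}))
```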

4,190 citations


"Social Network Analysis With Data F..." refers background or methods in this paper

  • ...Degree centrality is easy to compute and reflects the number of direct connections a node has, while betweenness and closeness centralities are based on shortest paths connecting node pairs and reflect the distances between peers [5]....


  • ...For the path-based centrality calculation, this paper employs the very efficient algorithm of Brandes [5]....


Book
29 May 2009
TL;DR: This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters.
Abstract: Hadoop: The Definitive Guide helps you harness the power of your data. Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters. Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you:
  • Use the Hadoop Distributed File System (HDFS) for storing large datasets, and run distributed computations over those datasets using MapReduce
  • Become familiar with Hadoop's data and I/O building blocks for compression, data integrity, serialization, and persistence
  • Discover common pitfalls and advanced features for writing real-world MapReduce programs
  • Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud
  • Use Pig, a high-level query language for large-scale data processing
  • Take advantage of HBase, Hadoop's database for structured and semi-structured data
  • Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems
If you have lots of data -- whether it's gigabytes or petabytes -- Hadoop is the perfect solution. Hadoop: The Definitive Guide is the most thorough book available on the subject. "Now you have the opportunity to learn about Hadoop from a master, not only of the technology but also of common sense and plain talk." -- Doug Cutting, Hadoop Founder, Yahoo!
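To make the MapReduce model concrete, here is a minimal word-count job written as a Hadoop Streaming mapper/reducer pair in Python. This is an illustration only; the book's own examples are mostly in Java, and the script name and invocation are assumptions.

```python
# wc.py (hypothetical name): word count in the Hadoop Streaming style.
import sys

def mapper():
    # Emit (word, 1) for every word read from stdin.
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

def reducer():
    # Hadoop delivers mapper output sorted by key, so counts for the same word
    # arrive consecutively and can be summed with a single running total.
    current, total = None, 0
    for line in sys.stdin:
        word, count = line.rsplit("\t", 1)
        if word != current:
            if current is not None:
                print(f"{current}\t{total}")
            current, total = word, 0
        total += int(count)
    if current is not None:
        print(f"{current}\t{total}")

if __name__ == "__main__":
    mapper() if sys.argv[1:] == ["map"] else reducer()
```

With Hadoop Streaming, such a script is typically passed as both the mapper and the reducer (with different arguments) via the streaming jar; the exact jar path and job options depend on the installation.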

3,797 citations

Journal ArticleDOI
TL;DR: The homogeneous Markov random graph models of Frank and Strauss are not appropriate for many observed networks, whereas the new model specifications of Snijders et al. are more suitable for social networks.

1,875 citations


"Social Network Analysis With Data F..." refers background or methods in this paper

  • ...The p∗ models and motif analysis are based on the presence of small subgraphs in the compared networks [24]....


  • ...Other related research introduces p∗ models (now widely known as exponential random graph models [24]), graph kernels [27], and motif analysis [21]....
