Author

Shaobin Huang

Bio: Shaobin Huang is an academic researcher from Harbin Engineering University. The author has contributed to research in topics: Cluster analysis & Canopy clustering algorithm. The author has an h-index of 6 and has co-authored 47 publications receiving 127 citations.

Papers
Journal ArticleDOI
04 Aug 2014-PLOS ONE
TL;DR: The paper selects the multiple indicators including degree, ego-betweenness centrality and eigenvector centrality to evaluate the importance and the role of a node and shows that the proposed methods perform quite well in evaluating the importance of nodes and in identifying the node role.
Abstract: Evaluating the importance of nodes and identifying the node that takes on the role of core or bridge in a network is a classic topic of social network analysis. Because a single indicator is not sufficient to analyze multiple characteristics of a node, a natural solution is to apply multiple, carefully selected indicators. An intuitive idea is to select indicators with weak correlations so that they efficiently assess different characteristics of a node. However, this paper shows that it is much better to select indicators with strong correlations. Because indicator correlation is based on the statistical analysis of a large number of nodes, the particularity of an important node stands out if its indicator relationship does not comply with the statistical correlation. Therefore, the paper selects degree, ego-betweenness centrality and eigenvector centrality as the indicators for evaluating the importance and the role of a node. The importance of a node is equal to the normalized sum of its three indicators. A candidate for core or bridge is selected from the nodes with high degree or the nodes with high ego-betweenness centrality, respectively. Then, the role of a candidate is determined according to the difference between its indicators' relationship and the statistical correlation of the overall network. Based on 18 real networks and 3 kinds of model networks, the experimental results show that the proposed methods perform quite well in evaluating the importance of nodes and in identifying the node role.
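
The "normalized sum of its three indicators" can be illustrated with a minimal Python sketch. The abstract does not say how the indicators are normalized, so the code assumes each indicator is divided by its maximum over the network and the three values are then averaged. The function names, the shifted power iteration used for eigenvector centrality, and the toy graph are illustrative assumptions, not the paper's implementation.

```python
from itertools import combinations


def degree(adj):
    """Number of neighbours of each node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def ego_betweenness(adj):
    """Betweenness of each node inside its own ego network (node plus neighbours)."""
    scores = {}
    for ego, nbrs in adj.items():
        total = 0.0
        for a, b in combinations(nbrs, 2):
            if b in adj[a]:
                continue  # adjacent alters: their shortest path bypasses the ego
            # Non-adjacent alters are two steps apart; count the midpoints
            # available inside the ego network (shared alters plus the ego).
            midpoints = (adj[a] & adj[b] & nbrs) | {ego}
            total += 1.0 / len(midpoints)
        scores[ego] = total
    return scores


def eigenvector_centrality(adj, iterations=200):
    """Power iteration on (A + I); the shift keeps bipartite graphs from oscillating."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        nxt = {v: x[v] + sum(x[u] for u in adj[v]) for v in adj}
        top = max(nxt.values())
        x = {v: val / top for v, val in nxt.items()}
    return x


def max_normalise(scores):
    """Assumed normalization: divide every score by the largest one."""
    top = max(scores.values())
    return {v: (s / top if top else 0.0) for v, s in scores.items()}


def importance(adj):
    """Average of the three max-normalised indicators (an assumed reading)."""
    indicators = [
        max_normalise(f(adj))
        for f in (degree, ego_betweenness, eigenvector_centrality)
    ]
    return {v: sum(ind[v] for ind in indicators) / len(indicators) for v in adj}


# Hypothetical toy graph: two triangles joined through a bridge node "g".
graph = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "g"},
    "d": {"e", "f", "g"}, "e": {"d", "f"}, "f": {"d", "e"},
    "g": {"c", "d"},
}

if __name__ == "__main__":
    for node, score in sorted(importance(graph).items()):
        print(node, round(score, 3))
```

The graph is stored as a dictionary of neighbour sets, which keeps the sketch free of third-party dependencies. The role-identification step (comparing a candidate's indicators against the network-wide correlation) is not covered here.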

34 citations

Proceedings ArticleDOI
20 Apr 2006
TL;DR: This paper presents a secure association-rule mining algorithm that builds a global hash table to prune item-sets and incorporates cryptographic techniques to minimize the information shared.
Abstract: Association rule mining is one of the most important and fundamental problems in data mining. Recently, driven by the need for security, more and more people are studying privacy-preserving association rule mining in distributed databases. This paper addresses a secure mining algorithm for association rules, which builds a global hash table to prune item-sets and incorporates cryptographic techniques to minimize the information shared.

13 citations

Proceedings ArticleDOI
13 Aug 2007
TL;DR: An interrupt count-control-flow checking by software signatures (IC-CFCSS) algorithm is presented based on the CFCSS, where the total number of instructions running in the basic blocks per machine cycle is calculated during the course of pre-compilation.
Abstract: In radiation environments, alpha particles, cosmic rays and solar wind flux can cause a single event upset (SEU), which is one of the major sources of bit-flips in digital electronics. Control-flow checking is an effective way for running systems to prevent breakdowns caused by SEUs. Control-flow checking by software signatures (CFCSS) and enhanced control-flow checking with assertions (ECCA) are representative pure-software methods that check the control flow of a program by using assigned signatures. However, these assigned-signature algorithms cannot check for intra-block control-flow errors. To overcome this shortcoming, an interrupt count-control-flow checking by software signatures (IC-CFCSS) algorithm is presented based on CFCSS. The total number of instructions running in the basic blocks per machine cycle is calculated during the course of pre-compilation. Whether or not to jump into a given block is judged by setting up interrupt instructions through the basic block running-time. Fault-injection experiments show that the error-detection coverage is increased by the IC-CFCSS algorithm.
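
For readers unfamiliar with the assigned-signature idea that IC-CFCSS builds on, the following Python sketch simulates the basic CFCSS check. It is a simplified illustration and not the paper's IC-CFCSS algorithm: it assumes every basic block has exactly one predecessor (real CFCSS adds an adjusting signature for blocks with several predecessors), and the block names and helper functions are hypothetical.

```python
def assign_signatures(blocks):
    """Give every basic block a unique, non-zero signature (here: index + 1)."""
    return {name: i + 1 for i, name in enumerate(blocks)}


def signature_differences(signatures, predecessor):
    """Precompute d_i = s_pred XOR s_i for each block that has a predecessor."""
    return {
        block: signatures[pred] ^ signatures[block]
        for block, pred in predecessor.items()
    }


def first_violation(path, signatures, differences):
    """Replay a sequence of executed blocks and update the runtime signature G.

    Returns the position in `path` where G first disagrees with the block's
    assigned signature, or None if every transition was legal.
    """
    g = signatures[path[0]]
    for i, block in enumerate(path[1:], start=1):
        # The entry block has no predecessor, so its difference defaults to 0
        # and any jump back into it leaves G at a mismatching value.
        g ^= differences.get(block, 0)
        if g != signatures[block]:
            return i
    return None


# Hypothetical straight-line program: A -> B -> C -> D.
blocks = ["A", "B", "C", "D"]
predecessor = {"B": "A", "C": "B", "D": "C"}
signatures = assign_signatures(blocks)
differences = signature_differences(signatures, predecessor)

if __name__ == "__main__":
    print(first_violation(["A", "B", "C", "D"], signatures, differences))
    print(first_violation(["A", "C", "D"], signatures, differences))
```

On a legal transition G becomes s_pred XOR (s_pred XOR s_i) = s_i, so the check passes; an illegal jump from another block leaves G different from s_i. Because G is only updated at block entry, an erroneous jump that stays inside one block goes unnoticed, which is the gap the abstract says IC-CFCSS targets.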

12 citations

Journal ArticleDOI
TL;DR: This method uses both internal and external semantic information of phrases to generate new phrase embeddings with better semantic expression capabilities, and it can also generate well-represented phrase embeddings when only pre-trained component word embeddings are used as input, which effectively addresses the problem of data sparseness.
Abstract: Phrase embedding can improve the performance of multiple NLP tasks. Most previous phrase-embedding methods use only the external or only the internal semantic information of phrases to learn phrase embeddings; they struggle with the problem of data sparseness and have poor semantic representation ability. To solve these issues, in this paper we propose an autoencoder-based method that combines pre-trained phrase embeddings and phrase component word embeddings into new phrase embeddings through complex non-linear transformations. This method uses both internal and external semantic information of phrases to generate new phrase embeddings with better semantic expression capabilities. This method can also generate well-represented phrase embeddings when only pre-trained component word embeddings are used as input, which effectively addresses the problem of data sparseness. We have designed two models for this method. The first one uses an FCNN (Fully Connected Neural Network) as the encoder and decoder, which we call AE-F. The second one uses the attention mechanism shared by the parameters of encoder and decoder to proportionally allocate the outputs of an LSTM and an FCNN, which we call AE-ALF. We evaluated them in terms of phrase similarity and phrase classification and used two English datasets and two Chinese datasets. Experimental results show that AE-F and AE-ALF methods using pre-trained phrase embeddings and component word embeddings exceed 17 baseline methods, and AE-F and AE-ALF perform similarly. With only pre-trained component word embeddings, AE-F and AE-ALF also exceed most baseline methods, and AE-ALF performs better than AE-F.

9 citations


Cited by
Journal ArticleDOI
01 Mar 1906-Nature
TL;DR: In view of the interest attaching to the vaporisation and diffusion of solids, the following observations may be worthy of record.
Abstract: IN view of the interest attaching to the vaporisation and diffusion of solids, the following observations may be worthy of record.

560 citations

Journal ArticleDOI
TL;DR: NOREVA 2.0 is distinguished for its capability in identifying well-performing normalization method(s) for time-course and multi-class metabolomics, which makes it an indispensable complement to other available tools.
Abstract: Biological processes (like microbial growth & physiological response) are usually dynamic and require the monitoring of metabolic variation at different time-points. Moreover, there is a clear shift from case-control (N=2) studies to multi-class (N>2) problems in current metabolomics, which is crucial for revealing the mechanisms underlying certain physiological processes, disease metastasis, etc. These time-course and multi-class metabolomics have attracted great attention, and data normalization is essential for removing unwanted biological/experimental variations in these studies. However, no tool (including NOREVA 1.0, which focuses only on case-control studies) is available for effectively assessing the performance of normalization methods on time-course/multi-class metabolomic data. Thus, NOREVA was updated to version 2.0 by (i) realizing normalization and evaluation of both time-course and multi-class metabolomic data, (ii) integrating 144 normalization methods of a recently proposed combination strategy and (iii) identifying the well-performing methods by comprehensively assessing the largest set of normalizations (168 in total, significantly larger than the 24 in NOREVA 1.0). The significance of this update was extensively validated by case studies on benchmark datasets. All in all, NOREVA 2.0 is distinguished for its capability in identifying well-performing normalization method(s) for time-course and multi-class metabolomics, which makes it an indispensable complement to other available tools. NOREVA can be accessed at https://idrblab.org/noreva/.

127 citations

Journal ArticleDOI
30 Jul 2015-PLOS ONE
TL;DR: The potential and significance of combining complex social housing and intensive behavioral characterization of group-living animals with the utilization of novel statistical methods is demonstrated to further the understanding of the neurobiological basis of social behavior at the individual, relationship and group levels.
Abstract: Modelling complex social behavior in the laboratory is challenging and requires analyses of dyadic interactions occurring over time in a physically and socially complex environment. In the current study, we approached the analyses of complex social interactions in group-housed male CD1 mice living in a large vivarium. Intensive observations of social interactions during a 3-week period indicated that male mice form a highly linear and steep dominance hierarchy that is maintained by fighting and chasing behaviors. Individual animals were classified as dominant, sub-dominant or subordinate according to their David’s Scores and I&SI ranking. Using a novel dynamic temporal Glicko rating method, we ascertained that the dominance hierarchy was stable across time. Using social network analyses, we characterized the behavior of individuals within 66 unique relationships in the social group. We identified two individual network metrics, Kleinberg’s Hub Centrality and Bonacich’s Power Centrality, as accurate predictors of individual dominance and power. Comparing across behaviors, we establish that agonistic, grooming and sniffing social networks possess their own distinctive characteristics in terms of density, average path length, reciprocity, out-degree centralization and out-closeness centralization. Though grooming ties between individuals were largely independent of other social networks, sniffing relationships were highly predictive of the directionality of agonistic relationships. Individual variation in dominance status was associated with brain gene expression, with more dominant individuals having higher levels of corticotropin releasing factor mRNA in the medial and central nuclei of the amygdala and the medial preoptic area of the hypothalamus, as well as higher levels of hippocampal glucocorticoid receptor and brain-derived neurotrophic factor mRNA. This study demonstrates the potential and significance of combining complex social housing and intensive behavioral characterization of group-living animals with the utilization of novel statistical methods to further our understanding of the neurobiological basis of social behavior at the individual, relationship and group levels.

112 citations

Journal ArticleDOI

91 citations