Author

Hao Guo

Bio: Hao Guo is an academic researcher from Beijing University of Posts and Telecommunications. The author has contributed to research in topics including Computer Science and Deep Learning, has an h-index of 2, and has co-authored 3 publications receiving 21 citations.

Papers
Journal ArticleDOI
TL;DR: A new ensemble strategy is applied to combine the results of different sub-extractors, making the SIE more universal; it outperforms any single sub-extractor and outperforms the state-of-the-art methods on three datasets in different languages.

21 citations

Journal ArticleDOI
TL;DR: It is shown that PSWA outperforms its backbone SGD remarkably during the early stage of the SGD sampling process, demonstrating the hypothesis that there are global-scale geometric structures in the DNN loss landscape which can be discovered by an SGD agent at the early stage of its working period and exploited by the weight-averaging (WA) operation.
Abstract: Averaging neural network weights sampled by a backbone stochastic gradient descent (SGD) is a simple-yet-effective approach to assist the backbone SGD in finding better optima, in terms of generalization. From a statistical perspective, weight averaging contributes to variance reduction. Recently, a well-established stochastic weight-averaging (SWA) method was proposed, which featured the application of a cyclical or high-constant (CHC) learning-rate schedule for generating weight samples for weight averaging. A new insight on weight averaging was then introduced, which stated that weight averaging assisted in discovering wider optima and resulted in better generalization. We conducted extensive experimental studies concerning SWA, involving 12 modern deep neural network model architectures and 12 open-source image, graph, and text datasets as benchmarks. We disentangled the contributions of the weight-averaging operation and the CHC learning-rate schedule in SWA, showing that the weight-averaging operation in SWA still contributed to variance reduction, and that the CHC learning-rate schedule assisted in exploring the parameter space more widely than the backbone SGD, which could be under-fitted due to a lack of training budget. We then presented an algorithm termed periodic SWA (PSWA) that comprises a series of weight-averaging operations to exploit the wide parameter-space structures explored by the CHC learning-rate schedule, and we empirically demonstrated that PSWA outperforms its backbone SGD remarkably.
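As a rough illustration of the weight-averaging idea the abstract describes, here is a minimal sketch using PyTorch's torch.optim.swa_utils. The model constructor make_model() and the train_loader are hypothetical placeholders, and the epoch counts and learning rates are arbitrary choices, not the schedules studied in the paper.

```python
# Minimal SWA sketch on top of a backbone SGD run.
# make_model() and train_loader are hypothetical placeholders; the numbers
# below are arbitrary and not the configuration used in the paper.
import torch
from torch.optim.swa_utils import AveragedModel, SWALR, update_bn

model = make_model()                               # hypothetical model constructor
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
swa_model = AveragedModel(model)                   # running average of sampled weights
swa_scheduler = SWALR(optimizer, swa_lr=0.05)      # high-constant learning-rate schedule

for epoch in range(100):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()
    if epoch >= 75:                                # start averaging after a warm-up phase
        swa_model.update_parameters(model)
        swa_scheduler.step()

update_bn(train_loader, swa_model)                 # recompute batch-norm statistics
```

Periodic SWA, as described in the abstract, repeats such an averaging phase several times during training rather than applying it only once near the end.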

16 citations

Journal ArticleDOI
TL;DR: A word-building method based on a neural network model is proposed that can decompose a Chinese word into a sequence of radicals and learn structural information from these radical-level features, which is a key difference from existing models.
Abstract: Text classification is a foundational task in many natural language processing applications. All traditional text classifiers take words as the basic units and conduct the pre-training process (like ...

6 citations
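As a loose illustration of the radical-level idea in the paper above, the sketch below decomposes a word into a radical sequence and encodes it with an LSTM. The toy RADICALS table and all dimensions are hypothetical stand-ins, not the authors' actual word-building model.

```python
# Hypothetical sketch: build a Chinese word representation from its radicals.
# The RADICALS table and dimensions are illustrative placeholders only.
import torch
import torch.nn as nn

RADICALS = {"好": ["女", "子"], "明": ["日", "月"]}   # toy decomposition table
vocab = {r: i for i, r in enumerate(sorted({r for rs in RADICALS.values() for r in rs}))}

class RadicalEncoder(nn.Module):
    def __init__(self, n_radicals, dim=32):
        super().__init__()
        self.embed = nn.Embedding(n_radicals, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)

    def forward(self, radical_ids):                   # (batch, seq_len)
        h, _ = self.lstm(self.embed(radical_ids))
        return h[:, -1]                               # word vector from the last state

encoder = RadicalEncoder(len(vocab))
ids = torch.tensor([[vocab[r] for r in RADICALS["明"]]])
word_vec = encoder(ids)                               # radical-level word representation
print(word_vec.shape)                                 # torch.Size([1, 32])
```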

Journal ArticleDOI
TL;DR: NormSAGE is introduced, a framework for addressing the novel task of conversation-grounded multi-lingual, multi-cultural norm discovery based on language model prompting and self-verification; it discovers more relevant and insightful norms for conversations on-the-fly compared to baselines.
Abstract: Norm discovery is important for understanding and reasoning about the acceptable behaviors and potential violations in human communication and interactions. We introduce NormSAGE, a framework for addressing the novel task of conversation-grounded multi-lingual, multi-cultural norm discovery, based on language model prompting and self-verification. NormSAGE leverages the expressiveness and implicit knowledge of the pretrained GPT-3 language model backbone to elicit knowledge about norms through directed questions representing the norm discovery task and conversation context. It further addresses the risk of language model hallucination with a self-verification mechanism ensuring that the norms discovered are correct and are substantially grounded in their source conversations. Evaluation results show that our approach discovers significantly more relevant and insightful norms for conversations on-the-fly compared to baselines (>10% in Likert-scale rating). The norms discovered from Chinese conversation are also comparable to the norms discovered from English conversation in terms of insightfulness and correctness (<3% difference). In addition, the culture-specific norms are of promising quality, allowing for 80% accuracy in culture-pair human identification. Finally, our grounding process in norm discovery self-verification can be extended to instantiate the adherence to or violation of any norm for a given conversation on-the-fly, with explainability and transparency. NormSAGE achieves an AUC of 95.4% in grounding, with natural language explanations matching human-written quality.
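A highly simplified sketch of the prompt-then-verify pattern described in this abstract is shown below. The llm() helper is a hypothetical stand-in for a call to a language model such as GPT-3, and the prompt wording is illustrative rather than taken from NormSAGE.

```python
# Hypothetical sketch of conversation-grounded norm discovery with self-verification.
# llm() is a placeholder for any text-completion call (e.g., to a GPT-3-class model);
# the prompts are illustrative, not the prompts used by NormSAGE.
from typing import List

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in a real language-model client here")

def discover_norms(conversation: str) -> List[str]:
    # Directed question eliciting candidate norms grounded in the conversation.
    prompt = (
        "Given the following conversation, list the social or cultural norms "
        f"that the speakers appear to follow or violate:\n{conversation}\nNorms:"
    )
    return [line.strip("- ").strip() for line in llm(prompt).splitlines() if line.strip()]

def self_verify(conversation: str, norm: str) -> bool:
    # Second pass: ask whether the norm is correct and supported by the
    # source conversation, to reduce hallucinated norms.
    prompt = (
        f"Conversation:\n{conversation}\n\nCandidate norm: {norm}\n"
        "Is this norm correct and clearly supported by the conversation? Answer yes or no:"
    )
    return llm(prompt).strip().lower().startswith("yes")

def discover_and_verify(conversation: str) -> List[str]:
    return [n for n in discover_norms(conversation) if self_verify(conversation, n)]
```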

5 citations

Book ChapterDOI
06 Sep 2018
TL;DR: A series of attention strategies is proposed and a CAM-LSTM (Combining Attention Mechanism with LSTM) model is designed based on these strategies to improve aspect-level sentiment classification.
Abstract: Aspect-level sentiment classification, a fine-grained task in sentiment classification that aims to extract the sentiment polarity of opinions towards a specific aspect word, has seen significant improvements in recent years. In this paper, we propose a series of attention strategies and design a CAM-LSTM (Combining Attention Mechanism with LSTM) model based on these strategies to improve aspect-level sentiment classification. Our attention strategies and model capture the correlations between aspect words and their context words more accurately by incorporating more semantic information about the aspect words. We conduct experiments on three English datasets. The experimental results show that our attention strategies and model achieve remarkable improvements and outperform the state-of-the-art baseline models on these datasets.
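For illustration only, the sketch below shows one generic way to combine an attention mechanism with an LSTM for aspect-level sentiment classification: each hidden state is scored against an aspect embedding and the weighted states are pooled into a sentence representation. This is a generic attention-over-LSTM pattern, not the specific CAM-LSTM strategies proposed in the paper; all dimensions are arbitrary.

```python
# Generic attention-over-LSTM sketch for aspect-level sentiment classification.
# This is not the paper's CAM-LSTM; vocabulary size and dimensions are arbitrary.
import torch
import torch.nn as nn

class AspectAttentionLSTM(nn.Module):
    def __init__(self, vocab_size=10000, dim=100, n_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.score = nn.Linear(2 * dim, 1)          # scores [hidden state; aspect]
        self.out = nn.Linear(dim, n_classes)

    def forward(self, context_ids, aspect_ids):
        h, _ = self.lstm(self.embed(context_ids))               # (B, T, dim)
        aspect = self.embed(aspect_ids).mean(dim=1)             # (B, dim)
        aspect_exp = aspect.unsqueeze(1).expand_as(h)           # (B, T, dim)
        weights = torch.softmax(
            self.score(torch.cat([h, aspect_exp], dim=-1)).squeeze(-1), dim=-1
        )                                                       # (B, T)
        sentence = (weights.unsqueeze(-1) * h).sum(dim=1)       # (B, dim)
        return self.out(sentence)                               # sentiment logits

model = AspectAttentionLSTM()
logits = model(torch.randint(0, 10000, (2, 20)), torch.randint(0, 10000, (2, 2)))
print(logits.shape)  # torch.Size([2, 3])
```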

1 citation


Cited by
Journal ArticleDOI
TL;DR: This paper presents a novel model for experts to carry out Group Decision Making processes using free text and pairwise comparisons of alternatives, and introduces two ways of applying consensus measures over the Group Decision Making process.
Abstract: Social networks are the means people most prefer for communication. Therefore, it is quite usual for experts to use them to carry out Group Decision Making processes. One disadvantage of recent Group Decision Making methods is that they do not allow the experts to use free text to express themselves; on the contrary, they force them to follow a specific user–computer communication structure. This goes against the nature of social networks, where experts are free to express themselves using their preferred text structure. This paper presents a novel model for experts to carry out Group Decision Making processes using free text and pairwise comparisons of alternatives. The main advantage of this method is that it is designed to work over social networks. Sentiment analysis procedures are used to analyze the free texts and extract the preferences that the experts provide about the alternatives. Our method also introduces two ways of applying consensus measures over the Group Decision Making process. They can be used to determine whether the experts agree among themselves or whether there are differing positions. In this way, it is possible to promote debate in those cases where consensus is low.
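A rough sketch of the pipeline this abstract describes might look like the following: a sentiment score for each expert's free-text comparison of two alternatives fills a pairwise preference matrix, and consensus is measured from the similarity between the experts' matrices. The sentiment_score() helper is a hypothetical placeholder, and the consensus measure shown is a simple illustrative choice, not the paper's exact formulation.

```python
# Illustrative sketch: free-text pairwise comparisons -> preference matrices -> consensus.
# sentiment_score() is a placeholder; the consensus measure is a simple illustrative one.
import numpy as np

def sentiment_score(text: str) -> float:
    """Placeholder for a sentiment analysis procedure returning a value in [0, 1]."""
    raise NotImplementedError("plug in a real sentiment analyzer here")

def preference_matrix(comments: dict, n_alternatives: int) -> np.ndarray:
    """comments maps (i, j) index pairs to an expert's free-text comparison of i vs. j."""
    p = np.full((n_alternatives, n_alternatives), 0.5)
    for (i, j), text in comments.items():
        p[i, j] = sentiment_score(text)      # preference of alternative i over j
        p[j, i] = 1.0 - p[i, j]              # reciprocal preference
    return p

def consensus(matrices: list) -> float:
    """Mean pairwise agreement between experts' preference matrices, in [0, 1]."""
    pairs = [(a, b) for idx, a in enumerate(matrices) for b in matrices[idx + 1:]]
    distances = [np.abs(a - b).mean() for a, b in pairs]
    return 1.0 - float(np.mean(distances))
```

A low consensus value would then be the signal, per the abstract, for promoting further debate among the experts.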

89 citations

Journal ArticleDOI
TL;DR: This work evaluates existing efforts at language-specific sentiment analysis against a simple yet effective baseline and suggests that simply translating input text from a specific language into English and then applying one of the best existing methods developed for English can outperform the language-specific approaches evaluated.
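A minimal sketch of the baseline this TL;DR describes is shown below. Both translate_to_english() and english_sentiment() are hypothetical placeholders for a machine-translation service and a strong English sentiment classifier, since the specific tools are not named here.

```python
# Hypothetical sketch of the translate-then-classify baseline for
# non-English sentiment analysis; both helpers are placeholders.
def translate_to_english(text: str, source_lang: str) -> str:
    raise NotImplementedError("plug in a machine-translation service here")

def english_sentiment(text: str) -> str:
    raise NotImplementedError("plug in a strong English sentiment classifier here")

def baseline_sentiment(text: str, source_lang: str) -> str:
    # Translate the input to English, then reuse an English-only method.
    return english_sentiment(translate_to_english(text, source_lang))
```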

72 citations

Journal ArticleDOI
TL;DR: In this paper, an end-to-end multi-prototype fusion embedding that fuses context-specific and task-specific information is proposed to solve the problem of word embeddings that are unaware of polysemy.

33 citations