Author

Hao Lu

Other affiliations: Chinese Academy of Sciences
Bio: Hao Lu is an academic researcher from Beijing Institute of Technology. The author has contributed to research in topics: Intelligent transportation system & Meteorological disasters. The author has an h-index of 8 and has co-authored 22 publications receiving 241 citations. Previous affiliations of Hao Lu include Chinese Academy of Sciences.

Papers
Patent
12 Sep 2012
TL;DR: In this patent, a sentiment analysis method for micro-blog short text is presented, which comprises the following steps: step 1, collecting micro-blog data including keywords so as to store in a database; step 2, pre-processing the micro-blog data; step 3, loading associated dictionaries; step 4, processing sentence division and filtering sentences which do not include user configuration keywords; step 5, processing word division to the sentences including the keywords and labeling parts of speech; step 6, processing dependency sentence structure analysis to sentences including subjects by a sentence structure analyzing tool.
Abstract: The invention discloses a sentiment analysis method oriented to a micro-blog short text. The method comprises the following steps: step 1, collecting micro-blog data including keywords so as to store in a database; step 2, pre-processing the micro-blog data; step 3, loading associated dictionaries; step 4, processing sentence division and filtering sentences which do not include user configuration keywords; step 5, processing word division to the sentences including the keywords and labeling parts of speech; step 6, processing dependency sentence structure analysis to the sentences including subjects by a sentence structure analyzing tool; step 7, judging the polarity of each sentence including subject words; and step 8, judging the polarity of a whole micro-blog after judging the polarities of all sentences including the subject words. According to the sentiment analysis method provided by the invention, sentiment analysis is more specific, so that users can know sentiment attitude of concerned aspects from the micro-blog.
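Steps 4, 7, and 8 of the method can be sketched as follows. This is a minimal illustration rather than the patented procedure: the toy lexicons, the regex sentence splitter, the word tokenizer, and the sum-of-signs aggregation are all assumptions made for the example, and the dependency analysis of step 6 is omitted.

```python
import re

# Hypothetical toy lexicons; the patent loads its own dictionaries (step 3).
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "slow", "hate"}


def tokenize(sentence):
    """Lowercase word tokens (a stand-in for the word division of step 5)."""
    return re.findall(r"[a-z']+", sentence.lower())


def split_sentences(text):
    """Step 4, first half: naive sentence division on ., !, ? and ;."""
    return [s.strip() for s in re.split(r"[.!?;]+", text) if s.strip()]


def sentence_polarity(sentence):
    """Step 7 stand-in: return +1, -1 or 0 from lexicon hit counts."""
    words = tokenize(sentence)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (score > 0) - (score < 0)


def microblog_polarity(text, keywords):
    """Steps 4 and 8: keep keyword sentences, then combine their polarities."""
    wanted = {k.lower() for k in keywords}
    kept = [s for s in split_sentences(text) if wanted & set(tokenize(s))]
    total = sum(sentence_polarity(s) for s in kept)
    return (total > 0) - (total < 0)
```

Because only sentences containing a configured keyword are scored, the same post can yield different polarities for different keywords, which is the aspect-specific behaviour the abstract describes.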

78 citations

Journal ArticleDOI
TL;DR: A hybrid recommendation model is proposed that fuses network-structured features, learned with graph neural networks, with user interactive activities, modeled with tensor factorization; it outperforms other existing neural network and matrix factorization models, including xSVD++, RTTF and DSE, with a smaller predictive error as well as better recommendation accuracy.

49 citations

Journal ArticleDOI
TL;DR: A recurrent neural network with an attention mechanism is built, capable of obtaining users’ preferences in the current session and consequently making recommendations, which outperforms the current state-of-the-art short-term music recommendation systems on one real-world dataset.
Abstract: The current existing data in online music service platforms are heterogeneous, extensive, and disorganized. Finding an effective method to use these data in recommending appropriate music to users during a short-term session is a significant challenge. Another serious problem is that most of the data, in reality, obey the long-tailed distribution, which consequently leads to traditional music recommendation systems recommending a lot of popular music that users do not like on a specific occasion. To solve these problems, we propose a heterogeneous knowledge-based attentive neural network model for short-term music recommendations. First, we collect three types of data for modeling entities in user–music interaction network, i.e., graphic, textual, and visual data, and then embed them into high-dimensional spaces using the TransR, distributed memory version of paragraph vector, and variational autoencoder methods, respectively. The concatenation of these embedding results is an abstract representation of the entity. Based on this, a recurrent neural network with an attention mechanism is built, which is capable of obtaining users’ preferences in the current session and consequently making recommendations. The experimental results show that our proposed approach outperforms the current state-of-the-art short-term music recommendation systems on one real-world dataset. In addition, it can also recommend more relatively unpopular songs compared with classic models.

31 citations

Journal ArticleDOI
TL;DR: In this article, a heterogeneous knowledge embedding-based attentive RNN model is proposed to recommend scientific paper citations, which is based on the implicit influence of scholars' previous preferences of writing and citing on his/her new manuscript.
Abstract: Tremendous academic information causes serious information overload problems while supporting scientific research. Scientific paper and citation recommendation systems have been developed to relieve this problem and work as a filter to furnish only relevant papers to researchers. Although previous studies have made comparative progress, this problem is still challenging because current paper recommendation systems rely on heterogeneous and multi-sourced features, thereby requiring a unified learning representation to cover different types and modalities of information. Additionally, the implicit influence of scholars’ previous preferences of writing and citing on his/her new manuscript has not been well considered in the previous studies. Facing the issue from these two aspects, in this paper, a heterogeneous knowledge embedding-based attentive RNN model is proposed to recommend scientific paper citations. First, the preparation of features consists of two parts: (1) building a unified learning representation of structural entities and relations for recommending paper citations; and (2) defining and constructing a bibliographic network comprising five types of entities and five relations. The bibliographic network enables learning a unified representation so that all graphical entities and relations can be vectorized using TransD. To establish textual representations, the PV-DM model is utilized to generate numeric features for the title of each paper. Second, by combining structural and textual representations focusing on the “author-text query” scenario, an attentive bidirectional RNN is constructed to recommend papers and citations based on a user’s identity with a length-limited inquiry to capture the scholars’ previous writing and citing preferences, thereby reducing recommendation error. Through the DBLP dataset, our experiment results show the feasibility and effectiveness of our method, both in terms of the number as well as the quality of the first few recommended items. Specifically, compared with existing models, our model has improved MRR and NDCG by approximately 4.8% and 2.4%, respectively.
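For readers unfamiliar with the two ranking metrics named above, the sketch below shows one common way to compute them. The abstract does not state the relevance scale or the cutoff used, so binary relevance and the `k` parameter are assumptions, and the function names are hypothetical.

```python
import math


def reciprocal_rank(ranked, relevant):
    """Return 1/rank of the first relevant item, or 0.0 if none appears."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries):
    """Average reciprocal rank over (ranked_list, relevant_set) pairs."""
    return sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG@k: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(i + 1)
              for i, item in enumerate(ranked[:k], start=1)
              if item in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(i + 1) for i in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0
```

MRR rewards placing the first correct citation early, while NDCG credits every correct citation in the top `k` with a logarithmic position discount, which is why the two are often reported together for the first few recommended items.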

26 citations

Journal ArticleDOI
TL;DR: This paper proposes a novel way to forecast and generate alerts for city-level traffic incidents based on a social approach rather than traditional physical approaches and considers the news report as an objective measurement to flexibly validate the feasibility of the proposed model from social cyberspace to physical space.
Abstract: Traffic situation awareness and alerting assisted by adverse weather conditions contributes to improving traffic safety, disaster coping mechanisms, and route planning for government agencies, business sectors, and individual travelers. However, at the city level, the physical sensor-generated data are partly held by different transportation and meteorological departments, which causes problems of “isolated information” for data fusion. Furthermore, it makes traffic situation awareness and estimation challenging and ineffective. In this paper, we leverage the power of crowdsourcing knowledge in social media and propose a novel way to forecast and generate alerts for city-level traffic incidents based on a social approach rather than traditional physical approaches. Specifically, we first collect adverse weather topics and reports of traffic incidents from social media. Then, we extract temporal, spatial, and meteorological features as well as labeled traffic reaction values corresponding to the social media “heat” for each city. Afterwards, a regression and alerting model is proposed to estimate the city-level traffic situation and suggest warning levels. The experiments show that the proposed model equipped with gcForest achieves the best root mean square error (RMSE) and mean absolute percentage error (MAPE) scores on the social traffic incidents test dataset. Moreover, we consider the news report as an objective measurement to flexibly validate the feasibility of the proposed model from social cyberspace to physical space. Finally, a prototype system was deployed and applied to government agencies to provide an intuitive visualization solution as well as decision support assistance.
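The two error measures named in the abstract can be computed as in the sketch below. It uses the standard textbook definitions; the function names are hypothetical, and expressing MAPE in percent is an assumption, since the abstract does not say how the score is scaled.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumes every observed value is non-zero, otherwise the ratio is undefined.
    """
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)
```

RMSE is in the units of the target and penalizes large misses more heavily, whereas MAPE is scale-free, so reporting both gives complementary views of a regression model's error.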

21 citations


Cited by
Proceedings Article
01 Jan 2010
TL;DR: In this article, a method to collect, group, rank and track breaking news in Twitter is proposed, where each story is provided with the information of message originator, story development and activity chart.
Abstract: Twitter has been used as one of the communication channels for spreading breaking news. We propose a method to collect, group, rank and track breaking news in Twitter. Since short length messages make similarity comparison difficult, we boost scores on proper nouns to improve the grouping results. Each group is ranked based on popularity and reliability factors. The current detection method is limited to the factual part of messages. We developed an application called “Hotstream” based on the proposed method. Users can discover breaking news from the Twitter timeline. Each story is provided with the information of message originator, story development and activity chart. This provides a convenient way for people to follow breaking news and stay informed with real-time updates.

230 citations

Patent
07 Jul 2014
TL;DR: In this patent, a set of segments from a text field is analyzed to determine at least one of a target subtext or a target meaning associated with the set of segments, and a set of candidate emoticons associated with that subtext or meaning is then identified and presented for selection.
Abstract: Various embodiments provide a method that comprises receiving a set of segments from a text field, analyzing the set of segments to determine at least one of a target subtext or a target meaning associated with the set of segments, and identifying a set of candidate emoticons where each candidate emoticon in the set of candidate emoticons has an association between the candidate emoticon and at least one of the target subtext or the target meaning. The method may further comprise presenting the set of candidate emoticons for entry selection at a current position of an input cursor, receiving an entry selection for a set of selected emoticons from the set of candidate emoticons, and inserting the set of selected emoticons into the text field at the current position of the input cursor.

214 citations

Patent
31 Dec 2013
TL;DR: In this paper, a system for identifying influential users of a social network platform is proposed, where a score for each of multiple users is computed using MapReduce primitives or other constructs that allow the computations to be distributed across multiple parallel processors.
Abstract: A system for identifying influential users of a social network platform. The system may compute a score for each of multiple users. Such a score may be topic-based, leading to a more accurate identification of influential users. Such a topic-based score may indicate authority and/or impact of a user with respect to a topic. The impact may be computed based on authority combined with other factors, such as power of the user. The authority score may be simply computed, in whole or in part, directly from a tweet log without, for example, creating a retweet graph. As a result, the scores may be computed using MapReduce primitives or other constructs that allow the computations to be distributed across multiple parallel processors. Such scores may be used to select users based on impact as part of social trend analysis, marketing or other functions.

181 citations

Journal ArticleDOI
TL;DR: This paper analyzes and presents the notion of trust in the oracles used in blockchain ecosystems, and compares trust-enabling features of the leading blockchain oracle approaches, techniques, and platforms.
Abstract: The essence of blockchain smart contracts lies in the execution of business logic code in a decentralized architecture in which the execution outcomes are trusted and agreed upon by all the executing nodes. Despite the decentralized and trustless architectures of the blockchain systems, smart contracts on their own cannot access data from the external world. Instead, smart contracts interact with off-chain external data sources, called oracles, whose primary job is to collect and provide data feeds and input to smart contracts. However, there is always risk of oracles providing corrupt, malicious, or inaccurate data. In this paper, we analyze and present the notion of trust in the oracles used in blockchain ecosystems. We analyze and compare trust-enabling features of the leading blockchain oracle approaches, techniques, and platforms. Moreover, we discuss open research challenges that should be addressed to ensure secure and trustworthy blockchain oracles.

139 citations

Journal ArticleDOI
TL;DR: A thorough review of the state-of-the-art of recommender systems that leverage multimedia content is presented, by classifying the reviewed papers with respect to their media type, the techniques employed to extract and represent their content features, and the recommendation algorithm.
Abstract: Recommender systems have become a popular and effective means to manage the ever-increasing amount of multimedia content available today and to help users discover interesting new items. Today’s recommender systems suggest items of various media types, including audio, text, visual (images), and videos. In fact, scientific research related to the analysis of multimedia content has made possible effective content-based recommender systems capable of suggesting items based on an analysis of the features extracted from the item itself. The aim of this survey is to present a thorough review of the state-of-the-art of recommender systems that leverage multimedia content, by classifying the reviewed papers with respect to their media type, the techniques employed to extract and represent their content features, and the recommendation algorithm. Moreover, for each media type, we discuss various domains in which multimedia content plays a key role in human decision-making and is therefore considered in the recommendation process. Examples of the identified domains include fashion, tourism, food, media streaming, and e-commerce.

102 citations