Proceedings Article

Incorporating Social Context and Domain Knowledge for Entity Recognition

TLDR
The SOCINST model is proposed. It automatically constructs a context of subtopics for each instance, with each subtopic representing one possible meaning of the instance, and domain knowledge is incorporated into the model using a Dirichlet tree distribution.
Abstract
Recognizing entity instances in documents according to a knowledge base is a fundamental problem in many data mining applications. The problem is extremely challenging for short documents in complex domains such as social media and biomedical domains. Large concept spaces and instance ambiguity are key issues that need to be addressed. Most of the documents are created in a social context by common authors via social interactions, such as replies and citations. Such social contexts are largely ignored in the instance-recognition literature. How can users' interactions help entity instance recognition? How can the social context be modeled so as to resolve the ambiguity of different instances? In this paper, we propose the SOCINST model to formalize the problem as a probabilistic model. Given a set of short documents (e.g., tweets or paper abstracts) posted by users who may connect with each other, SOCINST can automatically construct a context of subtopics for each instance, with each subtopic representing one possible meaning of the instance. The model is also able to incorporate social relationships between users to help build social context. We further incorporate domain knowledge into the model using a Dirichlet tree distribution. We evaluate the proposed model on three different genres of datasets: ICDM'12 Contest, Weibo, and I2B2. In ICDM'12 Contest, the proposed model clearly outperforms (+21.4%; $p \ll 10^{-5}$ with t-test) all the top contestants. In Weibo and I2B2, our results also show that the recognition accuracy of SOCINST is up to 5.3-26.6% better than that of several alternative methods.
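
As an illustration of the Dirichlet tree distribution mentioned in the abstract, the following is a minimal sketch of how such a prior can produce a word distribution: each internal node draws a Dirichlet over its child branches, and a word's probability is the product of the branch probabilities on its path from the root. This is not the SOCINST implementation. The tree layout, the edge weights, the example words, and the function names `sample_dirichlet` and `sample_leaf_probs` are all assumptions made for this example.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one sample from Dirichlet(alphas) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def sample_leaf_probs(node, rng, mass=1.0, out=None):
    """Sample word probabilities from a Dirichlet tree.

    A node is either a leaf (a word, given as a str) or an internal node
    given as a list of (edge_weight, child) pairs. The probability of a
    word is the product of the sampled branch probabilities on the path
    from the root to that word.
    """
    if out is None:
        out = {}
    if isinstance(node, str):
        out[node] = out.get(node, 0.0) + mass
        return out
    weights = [w for w, _ in node]
    branch = sample_dirichlet(weights, rng)
    for p, (_, child) in zip(branch, node):
        sample_leaf_probs(child, rng, mass * p, out)
    return out


# Hypothetical domain knowledge: "iphone" and "ipad" are related, so they
# share a subtree whose large edge weights keep their probabilities close.
tree = [
    (2.0, [(50.0, "iphone"), (50.0, "ipad")]),
    (1.0, "banana"),
    (1.0, "orange"),
]

if __name__ == "__main__":
    probs = sample_leaf_probs(tree, random.Random(0))
    for word, p in sorted(probs.items()):
        print(f"{word}: {p:.3f}")
```

Placing related words under a shared internal node with large edge weights is one common way to encode such knowledge; how the paper builds its tree from the knowledge base is not stated in the abstract.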

