Author

Srikanta Bedathur

Bio: Srikanta Bedathur is an academic researcher from the Indian Institute of Technology Delhi. The author has contributed to research in topics including computer science and SPARQL. The author has an h-index of 21 and has co-authored 108 publications receiving 1680 citations. Previous affiliations of Srikanta Bedathur include IBM and the Indraprastha Institute of Information Technology.


Papers
Proceedings ArticleDOI
24 Oct 2016
TL;DR: The method, coined ESPRESSO, explains the connection between two sets of entities given at query time in terms of a small number of relatedness cores: dense sub-graphs that have strong relations with both query sets.
Abstract: Analyzing and explaining relationships between entities in a knowledge graph is a fundamental problem with many applications. Prior work has been limited to extracting the most informative subgraph connecting two entities of interest. This paper extends and generalizes the state of the art by considering the relationships between two sets of entities given at query time. Our method, coined ESPRESSO, explains the connection between these sets in terms of a small number of relatedness cores: dense sub-graphs that have strong relations with both query sets. The intuition for this model is that the cores correspond to key events in which entities from both sets play a major role. For example, to explain the relationships between US politicians and European politicians, our method identifies events like the PRISM scandal and the Syrian Civil War as relatedness cores. Computing cores of bounded size is NP-hard. This paper presents efficient approximation algorithms. Our experiments with real-life knowledge graphs demonstrate the practical viability of our approach and, through user studies, the superior output quality compared to state-of-the-art baselines.
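The abstract describes relatedness cores as dense subgraphs that are strongly connected to both query sets, computed with approximation algorithms since the exact problem is NP-hard. The sketch below is a minimal greedy illustration of that idea, not the ESPRESSO algorithm itself; the graph, scoring function, and entity names are hypothetical.

```python
# Minimal, illustrative sketch (not the ESPRESSO algorithm): given a knowledge
# graph as an adjacency dict and two query sets, greedily grow a small "core"
# of nodes that are densely connected and touch both sets.
from collections import defaultdict

def greedy_core(adj, set_a, set_b, max_size=5):
    """Pick up to max_size nodes that connect both query sets, favouring density."""
    candidates = set(adj) - set_a - set_b
    def score(node, core):
        neighbours = adj[node]
        return (len(neighbours & set_a) +      # links to the first query set
                len(neighbours & set_b) +      # links to the second query set
                len(neighbours & core))        # internal density of the core
    core = set()
    while len(core) < max_size:
        best = max(candidates - core, key=lambda n: score(n, core), default=None)
        if best is None or score(best, core) == 0:
            break
        core.add(best)
    return core

# Toy example: shared events bridge US and European politicians.
adj = defaultdict(set)
edges = [("obama", "prism_scandal"), ("merkel", "prism_scandal"),
         ("obama", "syria_war"), ("hollande", "syria_war")]
for u, v in edges:
    adj[u].add(v); adj[v].add(u)
print(greedy_core(adj, {"obama"}, {"merkel", "hollande"}))
```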

13 citations

Book ChapterDOI
28 Mar 2010
TL;DR: The NEAT (News Exploration Along Time) prototype system is an attempt to build an intuitive and exploratory timeline interface for search results over large news archives; its demonstration consists of an exploratory search interface showing how queries produce different timelines and how temporal information can be used to discover interesting facts.
Abstract: There are a number of efforts towards building applications that leverage temporal information in documents. The demonstration of our NEAT (News Exploration Along Time) prototype system proposed here is an attempt to build an intuitive and exploratory interface for search results over large news archives using timelines. The demonstration uses the New York Times Annotated Corpus as an illustrative example of such a news archive. The NEAT system consists of two parts: the back-end server extracts all the temporal information from documents, stores it in an index, and discovers important phrases from sentences that carry time-sensitive information; the front-end user interface anchors the results of a keyword search along the timeline, where the user can explore and browse results at different points in time. To aid in this exploration, the interesting phrases discovered from the result documents are displayed on the timeline to provide an overview. Another key feature of NEAT, which distinguishes it from other timeline-based approaches, is the adoption of semantic temporal annotations to anchor results on the timeline. An appropriate choice of personally identifiable temporal annotations can enable users to contextualize results more effectively; for example, Barack Obama was elected in 2008 and Germany hosted the FIFA World Cup in 2006. We gathered temporal annotations at large scale by crowdsourcing them over Amazon Mechanical Turk (AMT). Each HIT (Human Intelligence Task) on AMT asks a worker to expand a temporal expression (such as a year, a time interval, or a decade) with an entity (e.g., a person, country, or organization). Based on the agreement level among workers, we derive key entities for constructing a semantic temporal annotation layer on top of the timeline. The outcome is a manually annotated timeline that is very useful for anchoring search results. Examples of annotations produced by crowdsourcing, at different time granularities, include (1969: Woodstock, Moon landing), (1970: Nixon), and (2003-2009: Iraq war). The demonstration consists of an exploratory search interface where we show how queries can produce different timelines and how one can use temporal information to discover interesting facts.
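As a rough illustration of the timeline-anchoring idea described above (extract temporal expressions, index documents by time, overlay crowdsourced annotations), here is a minimal sketch; it is not the NEAT back-end, and the regex, documents, and annotation layer are stand-ins.

```python
# Hedged sketch of timeline anchoring: pull year mentions out of documents with
# a simple regex, bucket documents by year, and overlay a hand-built annotation
# layer of the kind the paper gathers via Mechanical Turk.
import re
from collections import defaultdict

YEAR = re.compile(r"\b(19|20)\d{2}\b")

def index_by_year(docs):
    """Map each year mentioned in a document to the documents mentioning it."""
    timeline = defaultdict(list)
    for doc_id, text in docs.items():
        for match in YEAR.finditer(text):
            timeline[int(match.group())].append(doc_id)
    return timeline

# Illustrative annotation layer (the paper's is crowdsourced at scale).
annotations = {1969: ["Woodstock", "Moon landing"], 2008: ["Obama elected"]}

docs = {"d1": "Barack Obama was elected president in 2008.",
        "d2": "The festival in 1969 drew huge crowds."}
timeline = index_by_year(docs)
for year in sorted(timeline):
    print(year, timeline[year], annotations.get(year, []))
```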

13 citations

Proceedings ArticleDOI
10 Jun 2018
TL;DR: The DataVizard system is described, which is capable of recommending visual presentations with high accuracy, both when one needs to visualize the results of a structured query such as SQL and when one has acquired a data table with an associated short description.
Abstract: Selecting an appropriate visual presentation of the data, one that not only preserves the semantics but also provides an intuitive summary, is an important and often final step of data analytics. Unfortunately, it is also a step involving significant human effort, from selecting groups of columns in the structured results of earlier analytics stages to choosing the right visualization by experimenting with various alternatives. In this paper, we describe our DataVizard system, aimed at reducing this overhead by automatically recommending the most appropriate visual presentation for a structured result. Specifically, we consider the following two scenarios: first, when one needs to visualize the results of a structured query such as SQL; and second, when one has acquired a data table with an associated short description (e.g., tables from the Web). Using a corpus of real-world database queries (and their results) and a number of statistical tables crawled from the Web, we show that DataVizard is capable of recommending visual presentations with high accuracy.
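As a toy illustration of the recommendation step described above, the sketch below maps the column types of a structured result to a chart type with simple heuristics; it is an assumption-laden stand-in, not DataVizard's actual model, and the function and type names are hypothetical.

```python
# Illustrative heuristic only: inspect the column types of a structured query
# result and suggest a chart type.
def recommend_chart(columns):
    """columns: list of (name, dtype) where dtype is 'categorical', 'numeric', or 'temporal'."""
    dtypes = [d for _, d in columns]
    if "temporal" in dtypes and "numeric" in dtypes:
        return "line chart"          # measure as a trend over time
    if dtypes.count("numeric") >= 2:
        return "scatter plot"        # relationship between two measures
    if "categorical" in dtypes and "numeric" in dtypes:
        return "bar chart"           # measure broken down by category
    return "table"                   # fall back to a plain table

# e.g. the result of: SELECT month, SUM(sales) FROM orders GROUP BY month
print(recommend_chart([("month", "temporal"), ("total_sales", "numeric")]))  # line chart
```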

12 citations

Proceedings ArticleDOI
19 May 2014
TL;DR: This paper describes the implementation of RQ-RDF-3X, a reification- and quad-enhanced RDF-3X, which involved significant re-engineering ranging from the set of indexes and their compression schemes to the query processing pipeline for queries over reified content.
Abstract: Efficient storage and querying of large repositories of RDF content is important due to the widespread growth of Semantic Web and Linked Open Data initiatives. Many novel database systems that store RDF in its native form or within traditional relational storage have demonstrated their ability to scale to large volumes of RDF content. However, it is increasingly obvious that the simple dyadic relationship captured through traditional triples alone is not sufficient for modelling multi-entity relationships, provenance of facts, etc. Such richer models are supported in RDF through two techniques: the first, called reification, retains the triple nature of RDF; the second is a non-standard extension called N-Quads. In this paper, we explore the challenges of supporting such richer semantic data by extending the state-of-the-art RDF-3X system. We describe our implementation of RQ-RDF-3X, a reification- and quad-enhanced RDF-3X, which involved significant re-engineering ranging from the set of indexes and their compression schemes to the query processing pipeline for queries over reified content. Using large RDF repositories such as YAGO2S and DBpedia, and a set of SPARQL queries that utilize the reification model, we demonstrate that RQ-RDF-3X is significantly faster than RDF-3X.
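To make the reification-versus-quads idea concrete, the following sketch stores statements as quads (with a statement identifier that other facts can refer to) and keeps one sorted index permutation for prefix lookups; this is a simplification for illustration, not RQ-RDF-3X's index layout or compression scheme.

```python
# Rough sketch of quad indexing: each statement is (statement-id, subject,
# predicate, object), and sorted permutations support lookups with different
# bound positions. Index names and layout are simplified assumptions.
from bisect import bisect_left

quads = [
    ("stmt1", "Obama", "metWith", "Merkel"),
    ("stmt2", "stmt1", "happenedIn", "2013"),   # reified: a fact about a fact
]

def build_index(quads, order):
    """order is a tuple of positions, e.g. (1, 2, 3, 0) for an SPOG-style index."""
    return sorted(tuple(q[i] for i in order) for q in quads)

spog = build_index(quads, (1, 2, 3, 0))

def lookup_prefix(index, prefix):
    """Return all entries in a sorted index that start with the given prefix."""
    lo = bisect_left(index, prefix)
    return [row for row in index[lo:] if row[:len(prefix)] == prefix]

print(lookup_prefix(spog, ("Obama",)))   # all facts with subject Obama
```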

11 citations

Proceedings ArticleDOI
19 Dec 2013
TL;DR: This short position paper outlines two different scenarios that result in slightly different formulations of the problem and pursues the idea of using entity-centric summarization as a way of closing the gap in the interpretability of these results.
Abstract: With the availability of large entity-relationship graphs, finding the best relationship between entities is a problem that has attracted a lot of attention. Given two or more entities, the goal of most algorithms is to produce a graph structure of varying complexity (e.g., a simple path, a minimum-weight tree, or a dense subgraph) as a way of characterizing the relationship between the given entities. However, no attention is paid to the interpretability of these results, i.e., the ability of humans to read them and comprehend the context in which these relationships exist. A key obstacle in this direction is the lack of necessary linguistic context and natural textual result formulations. We pursue the idea of using entity-centric summarization as a way of closing this gap. We aim to turn the resulting graph structures into one or more coherent textual snippets (or summaries) that can be easily read and interpreted. In this short position paper, we first outline two different scenarios that result in slightly different formulations of the problem. Based on preliminary experimental results, we discuss the challenges that are inherent in this setting.
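As a toy example of turning a relationship structure into a readable snippet, the sketch below verbalizes a path of triples with small templates; it is only in the spirit of the position paper, and the predicates and templates are made up for illustration.

```python
# Toy verbalisation sketch: join the edges of a relationship path with
# per-predicate templates to produce one readable sentence.
TEMPLATES = {
    "bornIn": "{s} was born in {o}",
    "capitalOf": "{s} is the capital of {o}",
}

def verbalize_path(path):
    """path: list of (subject, predicate, object) triples forming a chain."""
    clauses = [TEMPLATES.get(p, "{s} is related to {o}").format(s=s, o=o)
               for s, p, o in path]
    return ", and ".join(clauses) + "."

print(verbalize_path([("Angela Merkel", "bornIn", "Hamburg"),
                      ("Hamburg", "locatedIn", "Germany")]))
# -> "Angela Merkel was born in Hamburg, and Hamburg is related to Germany."
```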

10 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: This book covers probability distributions, linear models for regression and classification, and methods for combining models in the context of machine learning.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal ArticleDOI
TL;DR: Recent progress on link prediction algorithms is summarized, emphasizing contributions from physical perspectives and approaches such as random-walk-based methods and maximum-likelihood methods.
Abstract: Link prediction in complex networks has attracted increasing attention from both the physical and computer science communities. The algorithms can be used to extract missing information, identify spurious interactions, evaluate network evolving mechanisms, and so on. This article summarizes recent progress on link prediction algorithms, emphasizing contributions from physical perspectives and approaches, such as random-walk-based methods and maximum likelihood methods. We also introduce three typical applications: reconstruction of networks, evaluation of network evolving mechanisms, and classification of partially labeled networks. Finally, we introduce some further applications and outline future challenges of link prediction algorithms.
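To make the random-walk family of methods mentioned above concrete, here is a minimal random-walk-with-restart scorer for candidate links; it is a generic illustration of the approach, not an algorithm taken from the survey, and the graph and parameter values are hypothetical.

```python
# Minimal illustration of random-walk-based link prediction: score candidate
# links by the approximate visit probability of a random walk with restart
# from the source node, computed by power iteration.
from collections import defaultdict

def rwr_scores(adj, source, restart=0.15, steps=50):
    """Approximate random-walk-with-restart probabilities for every node."""
    prob = {n: 0.0 for n in adj}
    prob[source] = 1.0
    for _ in range(steps):
        nxt = {n: restart if n == source else 0.0 for n in adj}
        for node, p in prob.items():
            if adj[node]:
                share = (1 - restart) * p / len(adj[node])
                for nb in adj[node]:
                    nxt[nb] += share
        prob = nxt
    return prob

adj = defaultdict(set)
for u, v in [("a", "b"), ("b", "c"), ("a", "d"), ("d", "c"), ("c", "e")]:
    adj[u].add(v); adj[v].add(u)
scores = rwr_scores(adj, "a")
# Rank non-neighbours of "a" as candidate links, highest score first:
print(sorted((n for n in adj if n != "a" and n not in adj["a"]),
             key=scores.get, reverse=True))
```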

2,530 citations

Journal ArticleDOI
TL;DR: YAGO2 is an extension of the YAGO knowledge base in which entities, facts, and events are anchored in both time and space; it contains 447 million facts about 9.8 million entities.

1,186 citations

Journal ArticleDOI
TL;DR: YAGO is a large ontology with high coverage and precision, based on a clean logical model whose consistency is decidable and which allows representing n-ary relations in a natural way while maintaining compatibility with RDFS.

912 citations