Author

Srikanta Bedathur

Bio: Srikanta Bedathur is an academic researcher from the Indian Institute of Technology Delhi. The author has contributed to research in topics including Computer science and SPARQL. The author has an h-index of 21 and has co-authored 108 publications receiving 1680 citations. Previous affiliations of Srikanta Bedathur include IBM and the Indraprastha Institute of Information Technology.


Papers
Posted Content
TL;DR: DataVizard, as mentioned in this paper, is a system that automatically recommends the most appropriate visual presentation for structured data, whether it is the result of a structured query such as SQL or a data table with an associated short description (e.g., tables from the Web).
Abstract: Selecting the appropriate visual presentation of the data, such that it preserves the semantics of the underlying data and at the same time provides an intuitive summary, is an important, often final, step of data analytics. Unfortunately, this is also a step involving significant human effort, starting from the selection of groups of columns in the structured results from analytics stages, to the selection of the right visualization by experimenting with various alternatives. In this paper, we describe our DataVizard system aimed at reducing this overhead by automatically recommending the most appropriate visual presentation for the structured result. Specifically, we consider the following two scenarios: first, when one needs to visualize the results of a structured query such as SQL; and second, when one has acquired a data table with an associated short description (e.g., tables from the Web). Using a corpus of real-world database queries (and their results) and a number of statistical tables crawled from the Web, we show that DataVizard is capable of recommending visual presentations with high accuracy. We also present the results of a user survey that we conducted in order to assess user views of the suitability of the presented charts vis-a-vis the plain text captions of the data.
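
To make the recommendation step concrete, here is a minimal, hypothetical Python sketch of the kind of column-type heuristics such a system could start from; the function name, rules, and type labels are illustrative assumptions, not DataVizard's actual (learned) model.

```python
# Illustrative sketch only: DataVizard derives its recommendations from real
# query corpora; this toy rule set merely shows the shape of the problem.

def recommend_chart(columns):
    """columns: list of (name, dtype) pairs, where dtype is 'categorical',
    'numeric', or 'temporal'. Returns a suggested chart type."""
    dtypes = [dtype for _, dtype in columns]
    if dtypes == ['temporal', 'numeric']:
        return 'line chart'      # a measure trending over time
    if dtypes == ['categorical', 'numeric']:
        return 'bar chart'       # per-category comparison
    if dtypes.count('numeric') == 2:
        return 'scatter plot'    # relationship between two measures
    return 'table'               # fall back to plain tabular display

print(recommend_chart([('year', 'temporal'), ('revenue', 'numeric')]))
# -> line chart
```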

1 citation

01 Jan 2010
TL;DR: The aim is to efficiently identify interesting time points in Web archives, under the assumption that a result list for a given query is received in standard relevance order from an existing retrieval system; an early termination technique, shown to be very effective, is also proposed.
Abstract: Large scale text archives are increasingly becoming available on the Web. Exploring their evolving contents along both text and temporal dimensions enables us to realize their full potential. Standard keyword queries facilitate exploration along the text dimension only. Recently proposed time-travel keyword queries enable query processing along both dimensions, but require the user to be aware of the exact time point of interest. This may be impractical if the user does not know the history of the query within the collection or is not familiar with the topic. In this work, our aim is to efficiently identify interesting time points in Web archives with an assumption that we receive a result list for a given query in standard relevance-order from an existing retrieval system. We consider two forms of Web archives: (i) one where documents have a publication time-stamp and never change (such as news archives), and (ii) the archives where documents undergo revisions, and are thus versioned. In both settings, we define interestingness as the change in top-k result set of two consecutive time-points. The key step in our solution is the maintenance of top-k results valid at each time-point of the archive, which can then be used to compute the interestingness scores for the time-points. We propose two techniques to realize efficient identification of interesting time points: (i) For the case when documents once published never change, we have a simple but effective technique. (ii) For the more general case with versioned documents, we develop an extension to the segment tree which makes it rank-aware and dynamic. To further improve efficiency, we propose an early termination technique which is proven to be very effective. Our methods are shown to be effective in efficiently finding interesting time points in a set of experiments using the New York Times news archive and the Wikipedia versioned archive.
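
The interestingness measure lends itself to a short sketch. The following Python is an illustration under stated assumptions: the top-k result lists are already materialized per time point, and "change in the top-k result set" is read as the normalized symmetric difference; the paper's rank-aware segment tree and early-termination machinery are not reproduced.

```python
# Hedged sketch: score each transition between consecutive time points by how
# much the top-k result set changes (one plausible reading of the paper's
# "change in top-k result set" definition of interestingness).

def interestingness(topk_by_time, k=10):
    """topk_by_time: list of ranked document-id lists, one per time point.
    Returns one score per transition between consecutive time points."""
    scores = []
    for prev, curr in zip(topk_by_time, topk_by_time[1:]):
        a, b = set(prev[:k]), set(curr[:k])
        scores.append(len(a ^ b) / (2 * k))  # 0 = identical, 1 = disjoint
    return scores

timeline = [
    ['d1', 'd2', 'd3'],   # t0
    ['d1', 'd2', 'd3'],   # t1: nothing changed
    ['d9', 'd2', 'd7'],   # t2: two results replaced -> interesting
]
print(interestingness(timeline, k=3))  # -> [0.0, 0.666...]
```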

1 citation

Posted Content
TL;DR: RotatE-Box as discussed by the authors is a combination of RotatE and box embeddings for answering regular expression queries (containing disjunction and Kleene plus operators) over incomplete KBs.
Abstract: We propose the novel task of answering regular expression queries (containing disjunction (∨) and Kleene plus (+) operators) over incomplete KBs. The answer set of these queries potentially has a large number of entities, hence previous works for single-hop queries in KBC that model a query as a point in high-dimensional space are not as effective. In response, we develop RotatE-Box, a novel combination of RotatE and box embeddings. It can model more relational inference patterns compared to existing embedding-based models. Furthermore, we define baseline approaches for embedding-based KBC models to handle regex operators. We demonstrate the performance of RotatE-Box on two new regex-query datasets introduced in this paper, including one where the queries are harvested based on actual user query logs. We find that our final RotatE-Box model significantly outperforms models based on just RotatE and just box embeddings.
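
Since RotatE's scoring function is published, a tiny sketch can show the rotation idea the paper builds on; the box-embedding extension and the regex operators are omitted, and all values below are toy data.

```python
# Sketch of the RotatE idea underlying RotatE-Box: a relation is a rotation in
# the complex plane, and a triple (h, r, t) scores highly when rotating h by r
# lands near t. The paper's box extension is not reproduced here.
import numpy as np

def rotate_score(h, theta, t):
    """h, t: complex entity embeddings; theta: relation rotation angles.
    Less negative scores indicate more plausible triples."""
    r = np.exp(1j * theta)               # unit-modulus complex rotation
    return -np.linalg.norm(h * r - t, ord=1)

dim = 4
rng = np.random.default_rng(0)
h = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)
theta = rng.uniform(-np.pi, np.pi, dim)
t_true = h * np.exp(1j * theta)          # tail exactly matching the rotation
t_rand = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)
print(rotate_score(h, theta, t_true))    # ~0.0 (perfect match)
print(rotate_score(h, theta, t_rand))    # clearly more negative
```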
Journal Article
TL;DR: This work introduces a novel hierarchical data structure called BloomSampleTree that helps us design efficient algorithms to extract an almost uniform sample from the set stored in a Bloom filter and also allows us to reconstruct the set efficiently.
Abstract: In this paper, we address the problem of sampling from a set and reconstructing a set stored as a Bloom filter. To the best of our knowledge our work is the first to address this question. We introduce a novel hierarchical data structure called BloomSampleTree that helps us design efficient algorithms to extract an almost uniform sample from the set stored in a Bloom filter and also allows us to reconstruct the set efficiently. In the case where the hash functions used in the Bloom filter implementation are partially invertible, in the sense that it is easy to calculate the set of elements that map to a particular hash value, we propose a second, more space-efficient method called HashInvert for the reconstruction. We study the properties of these two methods both analytically as well as experimentally. We provide bounds on run times for both methods and sample quality for the BloomSampleTree based algorithm, and show through an extensive experimental evaluation that our methods are efficient and effective.
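
For context, here is a sketch of the problem setting only: a plain Bloom filter together with the naive probe-every-candidate reconstruction baseline. The BloomSampleTree itself, which avoids this exhaustive scan with a hierarchy of filters, is not reproduced, and the parameters below are arbitrary.

```python
# Problem-setting sketch: a standard Bloom filter plus the naive baseline that
# reconstructs the stored set by testing every candidate in a known universe.
import hashlib

class BloomFilter:
    def __init__(self, m=1024, k=3):
        self.m, self.k, self.bits = m, k, bytearray(m)

    def _positions(self, item):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def might_contain(self, item):
        return all(self.bits[p] for p in self._positions(item))

def reconstruct(bf, universe):
    """Naive baseline: probe every candidate; false positives may slip in."""
    return [x for x in universe if bf.might_contain(x)]

bf = BloomFilter()
for s in ("alice", "bob", "carol"):
    bf.add(s)
print(reconstruct(bf, ["alice", "bob", "carol", "dave", "erin"]))
```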
Proceedings Article
02 Jul 2018
TL;DR: A graph-based approach called DeepAntara is devised, and its performance is shown on the change-tracking task over multiple sentence pairs extracted from different versions of publicly available financial CRS treaties.
Abstract: Businesses need to adhere to certain regulations to remain compliant. When a business expands or moves to a new geography, it finds itself subject to a slightly different set of regulations. Regulations themselves also change over time and force the business to change its internal workings to remain compliant. When a compliance officer is presented with a new regulatory document, he has to manually compare corresponding sentences between the previous and the new version. While most studies in text mining have focused on measuring textual similarity, textual entailment detection, paraphrase identification, etc., there has been very little focus on the problem of change tracking (CT). Change tracking can be defined as the task of identifying the phrase pair(s) that capture the semantic difference between two given sentences, and it plays an important role in domains such as financial regulatory compliance, where core changes introduced by regulators to existing regulations need to be identified quickly. Naturally, change tracking has to satisfy minimality and comprehensiveness requirements even in the presence of complex language structure, context dependence, and paraphrasing between compared sentences. In this paper, we address these challenges, devise a graph-based approach called DeepAntara, and show its performance on the change tracking task over multiple sentence pairs extracted from different versions of publicly available financial CRS treaties.
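
To make the change-tracking task concrete, the toy baseline below extracts differing phrase pairs with a word-level diff; it is far weaker than the graph-based DeepAntara approach, which must also cope with paraphrasing and context dependence, and the example sentences are invented.

```python
# Toy illustration of change tracking only, not the paper's method: a
# word-level diff that returns the phrase pairs differing between versions.
import difflib

def phrase_changes(old, new):
    """Return (old_phrase, new_phrase) pairs that differ between sentences."""
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=a, b=b)
    changes = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            changes.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes

old = "Institutions must report accounts exceeding 10,000 USD annually."
new = "Institutions must report accounts exceeding 50,000 USD quarterly."
print(phrase_changes(old, new))
# -> [('10,000', '50,000'), ('annually.', 'quarterly.')]
```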

Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: This textbook covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, sequential data, and combining models.
Abstract: Probability Distributions; Linear Models for Regression; Linear Models for Classification; Neural Networks; Kernel Methods; Sparse Kernel Machines; Graphical Models; Mixture Models and EM; Approximate Inference; Sampling Methods; Continuous Latent Variables; Sequential Data; Combining Models.

10,141 citations

Journal Article
TL;DR: Recent progress on link prediction algorithms is summarized, emphasizing the contributions from physical perspectives and approaches, such as random-walk-based methods and maximum likelihood methods.
Abstract: Link prediction in complex networks has attracted increasing attention from both the physical and computer science communities. The algorithms can be used to extract missing information, identify spurious interactions, evaluate network evolving mechanisms, and so on. This article summarizes recent progress on link prediction algorithms, emphasizing the contributions from physical perspectives and approaches, such as the random-walk-based methods and the maximum likelihood methods. We also introduce three typical applications: reconstruction of networks, evaluation of network evolving mechanisms, and classification of partially labeled networks. Finally, we introduce some applications and outline future challenges of link prediction algorithms.
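
As an illustration of the similarity-based family this survey covers, here is a small sketch of the resource-allocation index, one classical local predictor; the graph below is toy data.

```python
# Sketch of the resource-allocation index: score a non-adjacent node pair by
# summing 1/degree over their common neighbors (higher = more likely link).
from collections import defaultdict

def resource_allocation(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    scores = {}
    nodes = sorted(adj)
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            if v in adj[u]:
                continue  # only score currently missing links
            common = adj[u] & adj[v]
            scores[(u, v)] = sum(1.0 / len(adj[w]) for w in common)
    return scores

edges = [("a", "b"), ("b", "c"), ("a", "d"), ("d", "c"), ("c", "e")]
print(resource_allocation(edges))  # ('a', 'c') scores highest here
```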

2,530 citations

Journal Article
TL;DR: YAGO2 as mentioned in this paper is an extension of the YAGO knowledge base, in which entities, facts, and events are anchored in both time and space, and it contains 447 million facts about 9.8 million entities.

1,186 citations

Journal Article
TL;DR: YAGO is a large ontology with high coverage and precision, based on a clean logical model with decidable consistency, which allows representing n-ary relations in a natural way while maintaining compatibility with RDFS.

912 citations