Author

Dmitri V. Kalashnikov

Other affiliations: University of California; University of California, Irvine; NEC
Bio: Dmitri V. Kalashnikov is an academic researcher from AT&T Labs. The author has contributed to research in topics: Web page & Probabilistic logic. The author has an h-index of 27, has co-authored 62 publications, and has received 3,331 citations. Previous affiliations of Dmitri V. Kalashnikov include University of California and University of California, Irvine.


Papers
Proceedings ArticleDOI
09 Jun 2003
TL;DR: This paper addresses the important issue of measuring the quality of answers to queries evaluated over uncertain data, and provides algorithms for efficiently pulling data from relevant sensors or moving objects in order to improve the quality of the executing queries.
Abstract: Many applications employ sensors for monitoring entities such as temperature and wind speed. A centralized database tracks these entities to enable query processing. Due to continuous changes in these values and limited resources (e.g., network bandwidth and battery power), it is often infeasible to store the exact values at all times. A similar situation exists for moving object environments that track the constantly changing locations of objects. In this environment, it is possible for database queries to produce incorrect or invalid results based upon old data. However, if the degree of error (or uncertainty) between the actual value and the database value is controlled, one can place more confidence in the answers to queries. More generally, query answers can be augmented with probabilistic estimates of the validity of the answers. In this paper we study probabilistic query evaluation based upon uncertain data. A classification of queries is made based upon the nature of the result set. For each class, we develop algorithms for computing probabilistic answers. We address the important issue of measuring the quality of the answers to these queries, and provide algorithms for efficiently pulling data from relevant sensors or moving objects in order to improve the quality of the executing queries. Extensive experiments are performed to examine the effectiveness of several data update policies.
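The paper's own algorithms are not reproduced above; as a minimal sketch of the general idea of a probabilistic answer (not the paper's method), the snippet below assumes each stored sensor value carries a uniform uncertainty interval around the last reported value, and attaches to each sensor the probability that its true value falls inside a query range. All names, such as UncertainReading and prob_in_range, are hypothetical.

    from dataclasses import dataclass

    @dataclass
    class UncertainReading:
        sensor_id: str
        last_value: float   # value currently stored in the database
        err: float          # maximum deviation since the last update

    def prob_in_range(r, lo, hi):
        # Probability that the true value lies in [lo, hi] under a uniform model.
        left, right = r.last_value - r.err, r.last_value + r.err
        if right == left:
            return 1.0 if lo <= r.last_value <= hi else 0.0
        overlap = max(0.0, min(hi, right) - max(lo, left))
        return overlap / (right - left)

    def probabilistic_range_query(readings, lo, hi, threshold=0.0):
        # Attach a probability to every sensor and keep those above the threshold.
        return [(r.sensor_id, prob_in_range(r, lo, hi))
                for r in readings if prob_in_range(r, lo, hi) > threshold]

    sensors = [UncertainReading("s1", 21.0, 0.5), UncertainReading("s2", 25.0, 2.0)]
    print(probabilistic_range_query(sensors, 20.0, 22.0))   # -> [('s1', 1.0)]

The paper additionally decides which sensors to probe so as to improve answer quality most; that step is omitted from this toy sketch.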

632 citations

Journal ArticleDOI
05 Mar 2003
TL;DR: Algorithms for computing these queries are presented for a generic object movement model and detailed solutions are discussed for two common models of uncertainty in moving object databases.
Abstract: In moving object environments, it is infeasible for the database tracking the movement of objects to store the exact locations of objects at all times. Typically, the location of an object is known with certainty only at the time of the update. The uncertainty in its location increases until the next update. In this environment, it is possible for queries to produce incorrect results based upon old data. However, if the degree of uncertainty is controlled, then the error of the answers to queries can be reduced. More generally, query answers can be augmented with probabilistic estimates of the validity of the answer. We study the execution of probabilistic range and nearest-neighbor queries. The imprecision in answers to queries is an inherent property of these applications due to uncertainty in data, unlike the techniques for approximate nearest-neighbor processing that trade accuracy for performance. Algorithms for computing these queries are presented for a generic object movement model and detailed solutions are discussed for two common models of uncertainty in moving object databases. We study the performance of these queries through extensive simulations.
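As a rough illustration of a probabilistic range query over an uncertain location (not the paper's algorithm), the sketch below assumes the object's true position is uniformly distributed in a circle of known radius around its last reported position, and estimates the in-range probability by Monte Carlo sampling. Function and parameter names are illustrative only.

    import math
    import random

    def prob_in_rect(cx, cy, r, xmin, ymin, xmax, ymax, samples=10000):
        # Monte Carlo estimate: fraction of sampled positions inside the rectangle.
        hits = 0
        for _ in range(samples):
            ang = random.uniform(0.0, 2.0 * math.pi)
            rad = r * math.sqrt(random.random())   # uniform over the disk's area
            x, y = cx + rad * math.cos(ang), cy + rad * math.sin(ang)
            if xmin <= x <= xmax and ymin <= y <= ymax:
                hits += 1
        return hits / samples

    # An object last reported at (5, 5) may have drifted up to 2 units since then.
    print(prob_in_rect(5, 5, 2, 4, 4, 10, 10))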

441 citations

Journal ArticleDOI
TL;DR: Novel techniques are developed for the efficient and scalable evaluation of multiple continuous queries on moving objects; a combination of Query Indexing and Velocity Constrained Indexing enables scalable insertion and deletion of queries in addition to processing of ongoing queries.
Abstract: Moving object environments are characterized by large numbers of moving objects and numerous concurrent continuous queries over these objects. Efficient evaluation of these queries in response to the movement of the objects is critical for supporting acceptable response times. In such environments, the traditional approach of building an index on the objects (data) suffers from the need for frequent updates and thereby results in poor performance. In fact, a brute force, no-index strategy yields better performance in many cases. Neither the traditional approach nor the brute force strategy achieves reasonable query processing times. This paper develops novel techniques for the efficient and scalable evaluation of multiple continuous queries on moving objects. Our solution leverages two complementary techniques: Query Indexing and Velocity Constrained Indexing (VCI). Query Indexing relies on 1) incremental evaluation, 2) reversing the role of queries and data, and 3) exploiting the relative locations of objects and queries. VCI takes advantage of the maximum possible speed of objects in order to delay the expensive operation of updating an index to reflect the movement of objects. In contrast to an earlier technique that requires exact knowledge about the movement of the objects, VCI does not rely on such information. While Query Indexing outperforms VCI, it does not efficiently handle the arrival of new queries. Velocity Constrained Indexing, on the other hand, is unaffected by changes in queries. We demonstrate that a combination of Query Indexing and Velocity Constrained Indexing enables the scalable execution of insertion and deletion of queries in addition to processing ongoing queries. We also develop several optimizations and present a detailed experimental evaluation of our techniques. The experimental results show that the proposed schemes outperform the traditional approaches by almost two orders of magnitude.
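One way to picture the "reversing the role of queries and data" idea behind Query Indexing is a uniform grid over the space in which each cell lists the range queries overlapping it, so that an object update only probes the queries in its current cell instead of forcing an update to an index over the objects. The sketch below is a simplified illustration under that assumption, not the paper's implementation; the class and method names are made up for the example.

    from collections import defaultdict

    class QueryGrid:
        def __init__(self, cell_size):
            self.cell_size = cell_size
            self.cells = defaultdict(set)   # (i, j) cell -> ids of overlapping queries
            self.queries = {}               # query id -> (xmin, ymin, xmax, ymax)

        def _cell(self, x, y):
            return (int(x // self.cell_size), int(y // self.cell_size))

        def add_query(self, qid, xmin, ymin, xmax, ymax):
            self.queries[qid] = (xmin, ymin, xmax, ymax)
            i0, j0 = self._cell(xmin, ymin)
            i1, j1 = self._cell(xmax, ymax)
            for i in range(i0, i1 + 1):
                for j in range(j0, j1 + 1):
                    self.cells[(i, j)].add(qid)

        def matching_queries(self, x, y):
            # Only the queries registered in the object's current cell are checked.
            hits = []
            for qid in self.cells[self._cell(x, y)]:
                xmin, ymin, xmax, ymax = self.queries[qid]
                if xmin <= x <= xmax and ymin <= y <= ymax:
                    hits.append(qid)
            return hits

    grid = QueryGrid(cell_size=10.0)
    grid.add_query("q1", 0, 0, 15, 15)
    print(grid.matching_queries(12.0, 3.0))   # -> ['q1']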

378 citations

Journal ArticleDOI
TL;DR: The key difference between the approach (called RelDC) and the traditional techniques is that RelDC analyzes not only object features but also inter-object relationships to improve the disambiguation quality.
Abstract: In this article, we address the problem of reference disambiguation. Specifically, we consider a situation where entities in the database are referred to using descriptions (e.g., a set of instantiated attributes). The objective of reference disambiguation is to identify the unique entity to which each description corresponds. The key difference between the approach we propose (called RelDC) and the traditional techniques is that RelDC analyzes not only object features but also inter-object relationships to improve the disambiguation quality. Our extensive experiments over two real datasets and over synthetic datasets show that analysis of relationships significantly improves the quality of the results.
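The abstract only states that RelDC combines feature matching with relationship analysis; as a toy stand-in for the relationship part (not the RelDC algorithm itself), one can score each feature-matching candidate by its graph proximity to entities already known to be related to the reference's context, as in the hypothetical sketch below. The graph, entity names, and scoring rule are all invented for illustration.

    from collections import deque

    def hop_distance(graph, src, dst):
        # Breadth-first search over an adjacency-list graph; returns hop count or None.
        if src == dst:
            return 0
        seen, frontier = {src}, deque([(src, 0)])
        while frontier:
            node, d = frontier.popleft()
            for nbr in graph.get(node, ()):
                if nbr == dst:
                    return d + 1
                if nbr not in seen:
                    seen.add(nbr)
                    frontier.append((nbr, d + 1))
        return None

    def pick_candidate(graph, candidates, context_entities, penalty=1000):
        # Prefer the candidate most closely connected to the reference's context.
        def score(cand):
            total = 0
            for ctx in context_entities:
                d = hop_distance(graph, cand, ctx)
                total += d if d is not None else penalty
            return total
        return min(candidates, key=score)

    # Toy entity graph: one "J. Smith" is connected to the citing paper's venue.
    graph = {"J. Smith (MIT)": ["SIGMOD"], "SIGMOD": ["J. Smith (MIT)"],
             "J. Smith (CMU)": ["ICML"], "ICML": ["J. Smith (CMU)"]}
    print(pick_candidate(graph, ["J. Smith (MIT)", "J. Smith (CMU)"], ["SIGMOD"]))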

188 citations

Proceedings Article
01 Jan 2005
TL;DR: The key difference between the approach (called RelDC) and the traditional techniques is that RelDC analyzes not only object features but also inter-object relationships to improve the disambiguation quality.
Abstract: In this paper, we address the problem of reference disambiguation. Specifically, we consider a situation where entities in the database are referred to using descriptions (e.g., a set of instantiated attributes). The objective of reference disambiguation is to identify the unique entity to which each description corresponds. The key difference between the approach we propose (called RelDC) and the traditional techniques is that RelDC analyzes not only object features but also inter-object relationships to improve the disambiguation quality. Our extensive experiments over two real datasets and over synthetic datasets show that analysis of relationships significantly improves the quality of the results.

140 citations


Cited by
Journal ArticleDOI
TL;DR: The Analysis of Time Series: An Introduction, 4th edn., by C. Chatfield. Chapman and Hall, London, 1989. ISBN 0 412 31820 2.
Abstract: The Analysis of Time Series: An Introduction, 4th edn. By C. Chatfield. ISBN 0 412 31820 2. Chapman and Hall, London, 1989. 242 pp. £13.50.

1,583 citations

01 Mar 1995
TL;DR: This thesis applies neural network feature selection techniques to multivariate time series data to improve prediction of a target time series; results indicate that the Stochastics and RSI indicators yield better predictions than the moving averages.
Abstract: This thesis applies neural network feature selection techniques to multivariate time series data to improve prediction of a target time series. Two approaches to feature selection are used. First, a subset enumeration method is used to determine which financial indicators are most useful for aiding in prediction of the S&P 500 futures daily price. The candidate indicators evaluated include RSI, Stochastics and several moving averages. Results indicate that the Stochastics and RSI indicators result in better prediction results than the moving averages. The second approach to feature selection is the calculation of individual saliency metrics. A new decision-boundary-based individual saliency metric and a classifier-independent saliency metric are developed and tested. Ruck's saliency metric, the decision-boundary-based saliency metric, and the classifier-independent saliency metric are compared for a data set consisting of the RSI and Stochastics indicators as well as delayed closing price values. The decision-boundary-based metric and the Ruck metric give similar results, but the classifier-independent metric agrees with neither of the other metrics. The nine most salient features, determined by the decision-boundary-based metric, are used to train a neural network, and the results are presented and compared to other published results.
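As a rough illustration of a derivative-style saliency measure in the spirit of Ruck's metric (not the thesis' exact formulation), the sketch below ranks input features by the average absolute sensitivity of a model's output to each feature, estimated with finite differences. The model and data are toy placeholders standing in for a trained network and the indicator series.

    def saliency(model, inputs, eps=1e-3):
        # Average absolute finite-difference sensitivity of the output to each feature.
        n_features = len(inputs[0])
        scores = [0.0] * n_features
        for x in inputs:
            for i in range(n_features):
                x_hi, x_lo = list(x), list(x)
                x_hi[i] += eps
                x_lo[i] -= eps
                scores[i] += abs(model(x_hi) - model(x_lo)) / (2 * eps)
        return [s / len(inputs) for s in scores]

    # Toy stand-in for a trained network: feature 0 matters, feature 1 barely does.
    toy_model = lambda x: 3.0 * x[0] + 0.01 * x[1]
    print(saliency(toy_model, [[0.2, 0.5], [0.8, 0.1]]))   # feature 0 scores ~3.0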

1,545 citations

Proceedings ArticleDOI
10 Apr 2010
TL;DR: Analysis of microblog posts generated via Twitter, a popular microblogging service, during two recent, concurrent emergency events in North America aims to inform next steps for extracting useful, relevant information during emergencies using information extraction (IE) techniques.
Abstract: We analyze microblog posts generated during two recent, concurrent emergency events in North America via Twitter, a popular microblogging service. We focus on communications broadcast by people who were "on the ground" during the Oklahoma Grassfires of April 2009 and the Red River Floods that occurred in March and April 2009, and identify information that may contribute to enhancing situational awareness (SA). This work aims to inform next steps for extracting useful, relevant information during emergencies using information extraction (IE) techniques.

1,479 citations

Book
02 Jan 1991

1,377 citations

Journal ArticleDOI
Yu Zheng
TL;DR: A systematic survey of the major research into trajectory data mining is conducted, providing a panorama of the field as well as the scope of its research topics, and introducing the methods that transform trajectories into other data formats, such as graphs, matrices, and tensors.
Abstract: The advances in location-acquisition and mobile computing techniques have generated massive spatial trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles, and animals. Many techniques have been proposed for processing, managing, and mining trajectory data in the past decade, fostering a broad range of applications. In this article, we conduct a systematic survey on the major research into trajectory data mining, providing a panorama of the field as well as the scope of its research topics. Following a road map from the derivation of trajectory data, to trajectory data preprocessing, to trajectory data management, and to a variety of mining tasks (such as trajectory pattern mining, outlier detection, and trajectory classification), the survey explores the connections, correlations, and differences among these existing techniques. This survey also introduces the methods that transform trajectories into other data formats, such as graphs, matrices, and tensors, to which more data mining and machine learning techniques can be applied. Finally, some public trajectory datasets are presented. This survey can help shape the field of trajectory data mining, providing a quick understanding of this field to the community.
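As a minimal illustration of the "trajectories into matrices" transformation the survey mentions (not code from the survey), the sketch below snaps GPS points to grid cells and counts transitions between consecutive cells; the cell size and sample coordinates are arbitrary.

    from collections import defaultdict

    def to_transition_counts(points, cell_size=0.01):
        # points: (lat, lon) pairs ordered by time; snap to grid cells and count moves.
        cells = [(int(lat // cell_size), int(lon // cell_size)) for lat, lon in points]
        counts = defaultdict(int)
        for a, b in zip(cells, cells[1:]):
            if a != b:
                counts[(a, b)] += 1
        return dict(counts)

    trajectory = [(39.984, 116.318), (39.985, 116.319), (39.995, 116.330)]
    print(to_transition_counts(trajectory))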

1,289 citations