Gautam Das

Other affiliations: University of Toronto, Qatar Computing Research Institute, George Washington University ...

Bio: Gautam Das is an academic researcher at the University of Texas at Arlington. The author has contributed to research on the topics Tuple and Ranking. The author has an h-index of 54 and has co-authored 253 publications, which have received 11,363 citations. Previous affiliations of Gautam Das include the University of Toronto and the Qatar Computing Research Institute.

Papers published on a yearly basis (chart omitted; years shown: 1989-1998 and 2000-2023)

Papers

Proceedings Article

DBXplorer: a system for keyword-based search over relational databases

Sanjay Agrawal¹, Surajit Chaudhuri¹, Gautam Das¹

Microsoft¹

07 Aug 2002

TL;DR: This paper discusses DBXplorer, a system that enables keyword-based search in relational databases; it is implemented on a commercial relational database and Web server, and users interact with it via a browser front-end.

Abstract: Internet search engines have popularized the keyword-based search paradigm. While traditional database management systems offer powerful query languages, they do not allow keyword-based search. In this paper, we discuss DBXplorer, a system that enables keyword-based searches in relational databases. DBXplorer has been implemented using a commercial relational database and Web server and allows users to interact via a browser front-end. We outline the challenges and discuss the implementation of our system, including results of extensive experimental evaluation.

818 citations

Proceedings Article

Rule discovery from time series

Gautam Das¹, King-Ip Lin¹, Heikki Mannila², Gopal Renganathan, Padhraic Smyth³

University of Memphis¹, University of Helsinki², University of California, Irvine³

27 Aug 1998

TL;DR: In this article, the problem of finding rules relating patterns in a time series to other patterns in that series, or patterns in one series to patterns in another series, was considered, and adaptive methods for finding rules of the above type from time-series data were described.

Abstract: We consider the problem of finding rules relating patterns in a time series to other patterns in that series, or patterns in one series to patterns in another series. A simple example is a rule such as "a period of low telephone call activity is usually followed by a sharp rise in call volume." Examples of rules relating two or more time series are "if the Microsoft stock price goes up and Intel falls, then IBM goes up the next day," and "if Microsoft goes up strongly for one day, then declines strongly on the next day, and on the same days Intel stays about level, then IBM stays about level." Our emphasis is in the discovery of local patterns in multivariate time series, in contrast to traditional time series analysis which largely focuses on global models. Thus, we search for rules whose conditions refer to patterns in time series. However, we do not want to define beforehand which patterns are to be used; rather, we want the patterns to be formed from the data in the context of rule discovery. We describe adaptive methods for finding rules of the above type from time-series data. The methods are based on discretizing the sequence by methods resembling vector quantization. We first form subsequences by sliding a window through the time series, and then cluster these subsequences by using a suitable measure of time-series similarity. The discretized version of the time series is obtained by taking the cluster identifiers corresponding to the subsequence. Once the time-series is discretized, we use simple rule finding methods to obtain rules from the sequence. We present empirical results on the behavior of the method.

713 citations
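
The pipeline described in the abstract of "Rule discovery from time series" (sliding window, clustering of subsequences, replacement of each subsequence by its cluster identifier, then simple rule finding) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names, the use of Euclidean distance, the fixed `radius`, the greedy "leader" clustering, and the one-step "a is followed by b" rule form.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def sliding_windows(series, w):
    """Return every contiguous subsequence of length w, in time order."""
    return [tuple(series[i:i + w]) for i in range(len(series) - w + 1)]


def discretize(series, w, radius):
    """Replace each window by a cluster id using greedy leader clustering.

    Assumption: Euclidean distance with a fixed radius stands in for the
    "suitable measure of time-series similarity" mentioned in the abstract.
    """
    leaders = []  # one representative window per cluster
    symbols = []  # cluster id of each window, in time order
    for win in sliding_windows(series, w):
        for cid, leader in enumerate(leaders):
            if dist(win, leader) <= radius:
                symbols.append(cid)
                break
        else:
            # No existing cluster is close enough: start a new one.
            leaders.append(win)
            symbols.append(len(leaders) - 1)
    return symbols, leaders


def rule_confidence(symbols, a, b):
    """Fraction of occurrences of symbol a that are immediately followed by b."""
    count_a = sum(1 for s in symbols[:-1] if s == a)
    count_ab = sum(1 for x, y in zip(symbols, symbols[1:]) if x == a and y == b)
    return count_ab / count_a if count_a else 0.0


# Hypothetical toy series: a flat stretch, a rise, a plateau, repeated.
series = [0, 0, 1, 5, 5, 0, 0, 1, 5, 5]
symbols, leaders = discretize(series, w=2, radius=0.5)
```

With these toy inputs, `rule_confidence(symbols, 0, 1)` measures how often the flat pattern is immediately followed by the start of a rise. A real implementation would likely normalize the windows and use a proper clustering method; the sketch only shows how the stages fit together.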

Journal Article

On sparse spanners of weighted graphs

Ingo Althöfer¹, Gautam Das², David P. Dobkin³, Deborah Joseph⁴, José Soares⁵

Bielefeld University¹, University of Memphis², Princeton University³, University of Wisconsin-Madison⁴, University of São Paulo⁵

01 Jan 1993-Discrete and Computational Geometry

TL;DR: This paper gives a simple algorithm for constructing sparse spanners for arbitrary weighted graphs and applies this algorithm to obtain specific results for planar graphs and Euclidean graphs.

Abstract: Given a graph G, a subgraph G' is a t-spanner of G if, for every u, v ∈ V, the distance from u to v in G' is at most t times longer than the distance in G. In this paper we give a simple algorithm for constructing sparse spanners for arbitrary weighted graphs. We then apply this algorithm to obtain specific results for planar graphs and Euclidean graphs. We discuss the optimality of our results and present several nearly matching lower bounds.

654 citations
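
The t-spanner definition in the abstract of "On sparse spanners of weighted graphs" can be made concrete with a short Python sketch. The abstract does not spell out its "simple algorithm", so the greedy construction below (scan edges by increasing weight, keep an edge only if the current spanner distance between its endpoints exceeds t times its weight) is an assumption about what such an algorithm can look like, not a transcription of the paper. The function names and the toy graph are hypothetical, and edge weights are assumed non-negative.

```python
import heapq


def shortest_distance(adj, src, dst):
    """Dijkstra distance from src to dst; returns inf if dst is unreachable."""
    best = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, ()):
            nd = d + w
            if nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


def build_adjacency(edges):
    """Undirected adjacency lists from (u, v, weight) triples."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    return adj


def greedy_spanner(edges, t):
    """Keep an edge only if the spanner built so far stretches it by more than t."""
    adj, kept = {}, []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if shortest_distance(adj, u, v) > t * w:
            kept.append((u, v, w))
            adj.setdefault(u, []).append((v, w))
            adj.setdefault(v, []).append((u, w))
    return kept


def is_t_spanner(edges, kept, t):
    """Check the definition; testing every original edge is sufficient,
    because any shortest path in G is a sequence of such edges."""
    adj = build_adjacency(kept)
    return all(shortest_distance(adj, u, v) <= t * w for u, v, w in edges)


# Hypothetical toy graph: a triangle with one slightly longer edge.
edges = [("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.5)]
kept = greedy_spanner(edges, t=2)
```

On this triangle with t = 2, the edge a-c is dropped because the two-hop path a-b-c has length 2.0, which is within 2 × 1.5. The sketch recomputes a shortest path for every edge, so it favors clarity over efficiency.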

Journal Article

Group recommendation: semantics and efficiency

Sihem Amer-Yahia¹, Senjuti Basu Roy², Ashish Chawla², Gautam Das², Cong Yu¹

Yahoo!¹, University of Texas at Arlington²

01 Aug 2009

TL;DR: This paper proposes a formal semantics for group recommendation that accounts for both item relevance to a group and disagreements among group members, and evaluates the resulting algorithms on the MovieLens data set with 10M ratings.

Abstract: We study the problem of group recommendation. Recommendation is an important information exploration paradigm that retrieves interesting items for users based on their profiles and past activities. Single user recommendation has received significant attention in the past due to its extensive use in Amazon and Netflix. How to recommend to a group of users who may or may not share similar tastes, however, is still an open problem. The need for group recommendation arises in many scenarios: a movie for friends to watch together, a travel destination for a family to spend a holiday break, and a good restaurant for colleagues to have a working lunch. Intuitively, items that are ideal for recommendation to a group may be quite different from those for individual members. In this paper, we analyze the desiderata of group recommendation and propose a formal semantics that accounts for both item relevance to a group and disagreements among group members. We design and implement algorithms for efficiently computing group recommendations. We evaluate our group recommendation method through a comprehensive user study conducted on Amazon Mechanical Turk and demonstrate that incorporating disagreements is critical to the effectiveness of group recommendation. We further evaluate the efficiency and scalability of our algorithms on the MovieLens data set with 10M ratings.

346 citations

Book Chapter

Finding Similar Time Series

Gautam Das¹, Dimitrios Gunopulos², Heikki Mannila³

University of Memphis¹, IBM², University of Helsinki³

24 Jun 1997

TL;DR: This paper presents an intuitive model for measuring the similarity between two time series that takes into account outliers, different scaling functions, and variable sampling rates, and shows the naturalness of this notion of similarity.

Abstract: Similarity of objects is one of the crucial concepts in several applications, including data mining. For complex objects, similarity is nontrivial to define. In this paper we present an intuitive model for measuring the similarity between two time series. The model takes into account outliers, different scaling functions, and variable sampling rates. Using methods from computational geometry, we show that this notion of similarity can be computed in polynomial time. Using statistical approximation techniques, the algorithms can be speeded up considerably. We give preliminary experimental results that show the naturalness of the notion.

336 citations

Cited by

Journal Article

Machine learning

Thomas G. Dietterich¹

Oregon State University¹

01 Dec 1996-ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Data Mining - Concepts and Techniques.

Petra Perner

01 Jan 2002

9,314 citations

Journal Article

The Social Psychology of Organizations.

Warren G. Bennis, Daniel Katz, Robert L. Kahn

01 Oct 1966-American Sociological Review

4,201 citations

Proceedings Article

Models and issues in data stream systems

Brian Babcock¹, Shivnath Babu¹, Mayur Datar¹, Rajeev Motwani¹, Jennifer Widom¹

Stanford University¹

03 Jun 2002

TL;DR: This overview motivates the need for, and the research issues arising from, a new model of data processing in which data does not take the form of persistent relations but arrives in multiple, continuous, rapid, time-varying data streams.

Abstract: In this overview paper we motivate the need for and research issues arising from a new model of data processing. In this model, data does not take the form of persistent relations, but rather arrives in multiple, continuous, rapid, time-varying data streams. In addition to reviewing past work relevant to data stream systems and current projects in the area, the paper explores topics in stream query languages, new requirements and challenges in query processing, and algorithmic issues.

2,933 citations

Journal Article

A guided tour to approximate string matching

Gonzalo Navarro¹

University of Chile¹

01 Mar 2001-ACM Computing Surveys

TL;DR: This work surveys the current techniques to cope with the problem of string matching that allows errors, and focuses on online searching and mostly on edit distance, explaining the problem and its relevance, its statistical behavior, its history and current developments, and the central ideas of the algorithms.

Abstract: We survey the current techniques to cope with the problem of string matching that allows errors. This is becoming a more and more relevant issue for many fast growing areas such as information retrieval and computational biology. We focus on online searching and mostly on edit distance, explaining the problem and its relevance, its statistical behavior, its history and current developments, and the central ideas of the algorithms and their complexities. We present a number of experiments to compare the performance of the different algorithms and show which are the best choices. We conclude with some directions for future work and open problems.

2,723 citations
