Author

Vipin Kumar

Other affiliations: University of Maryland, College Park, United States Department of the Army, United States Geological Survey ...

Bio: Vipin Kumar is an academic researcher from the University of Minnesota. The author has contributed to research in topics: Parallel algorithm & Cluster analysis. The author has an h-index of 95 and has co-authored 614 publications receiving 59,034 citations. Previous affiliations of Vipin Kumar include the University of Maryland, College Park & the United States Department of the Army.

Topics: Parallel algorithm, Cluster analysis, Speedup, Graph partition, Search algorithm ...

Papers published on a yearly basis: chart covering 1981 to 2023 (per-year counts not recoverable)

Papers

Proceedings Article•DOI•

Shyam Boriah¹, Varun Chandola¹, Vipin Kumar

University of Minnesota¹

01 Jan 2008

TL;DR: This paper evaluates the performance of a variety of similarity measures for categorical data in the context of a specific data mining task, outlier detection. No one measure dominates the others for all types of problems, but some measures perform consistently well.

Abstract: Measuring similarity or distance between two entities is a key step for several data mining and knowledge discovery tasks. The notion of similarity for continuous data is relatively well-understood, but for categorical data, the similarity computation is not straightforward. Several data-driven similarity measures have been proposed in the literature to compute the similarity between two categorical data instances but their relative performance has not been evaluated. In this paper we study the performance of a variety of similarity measures in the context of a specific data mining task: outlier detection. Results on a variety of data sets show that while no one measure dominates others for all types of problems, some measures are able to have consistently high performance.
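
To make the notion of a similarity measure for categorical data concrete, here is a minimal sketch of the simple overlap measure, which scores two instances by the fraction of attributes on which they agree. The abstract does not name the measures that were studied, so the choice of overlap, the function name `overlap_similarity`, and the example attributes are all assumptions made for illustration.

```python
from typing import Hashable, Sequence


def overlap_similarity(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """Fraction of attribute positions where two categorical instances match.

    Illustrative only: this is the plain overlap measure, not a measure
    taken from the paper.
    """
    if len(x) != len(y):
        raise ValueError("instances must have the same number of attributes")
    if len(x) == 0:
        raise ValueError("instances must have at least one attribute")
    # Count positions where the categorical values are identical.
    matches = sum(1 for a, b in zip(x, y) if a == b)
    return matches / len(x)


# Hypothetical instances with four categorical attributes each.
a = ("red", "small", "round", "matte")
b = ("red", "large", "round", "matte")
print(overlap_similarity(a, b))
```

Overlap treats every mismatch as equally severe and ignores how frequent each value is in the data set, which is one reason data-driven alternatives have been proposed.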

554 citations

Journal Article•DOI•

Theory-guided Data Science: A New Paradigm for Scientific Discovery from Data

Anuj Karpatne¹, Gowtham Atluri², James H. Faghmous³, Michael Steinbach¹, Arindam Banerjee¹, Auroop R. Ganguly⁴, Shashi Shekhar¹, Nagiza F. Samatova⁵, Vipin Kumar¹

University of Minnesota¹, University of Cincinnati², Icahn School of Medicine at Mount Sinai³, Northeastern University⁴, North Carolina State University⁵

27 Dec 2016 - arXiv: Learning

TL;DR: Theory-guided data science (TGDS) is an emerging paradigm that aims to leverage the wealth of scientific knowledge for improving the effectiveness of data science models in enabling scientific discovery.

Abstract: Data science models, although successful in a number of commercial domains, have had limited applicability in scientific problems involving complex physical phenomena. Theory-guided data science (TGDS) is an emerging paradigm that aims to leverage the wealth of scientific knowledge for improving the effectiveness of data science models in enabling scientific discovery. The overarching vision of TGDS is to introduce scientific consistency as an essential component for learning generalizable models. Further, by producing scientifically interpretable models, TGDS aims to advance our scientific understanding by discovering novel domain insights. Indeed, the paradigm of TGDS has started to gain prominence in a number of scientific disciplines such as turbulence modeling, material discovery, quantum chemistry, bio-medical science, bio-marker discovery, climate science, and hydrology. In this paper, we formally conceptualize the paradigm of TGDS and present a taxonomy of research themes in TGDS. We describe several approaches for integrating domain knowledge in different research themes using illustrative examples from different disciplines. We also highlight some of the promising avenues of novel research for realizing the full potential of theory-guided data science.

532 citations

Proceedings Article•DOI•

Multilevel k-way hypergraph partitioning

George Karypis¹, Vipin Kumar¹

University of Minnesota¹

01 Jun 1999

TL;DR: This paper presents a new multilevel k-way hypergraph partitioning algorithm that substantially outperforms the existing state-of-the-art K-PM/LR algorithm for multi-way partitioning, both for optimizing local as well as global objectives.

Abstract: In this paper, we present a new multilevel k-way hypergraph partitioning algorithm that substantially outperforms the existing state-of-the-art K-PM/LR algorithm for multi-way partitioning, both for optimizing local as well as global objectives. Experiments on the ISPD98 benchmark suite show that the partitionings produced by our scheme are on the average 15% to 23% better than those produced by the K-PM/LR algorithm, both in terms of the hyperedge cut as well as the (K-1) metric. Furthermore, our algorithm is significantly faster, requiring 4 to 5 times less time than that required by K-PM/LR.
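
The two quality measures named in the abstract can be sketched as follows. The abstract does not define them, so this sketch assumes the usual definitions: the hyperedge cut counts hyperedges whose vertices fall in more than one part, and the (K-1) metric charges each hyperedge the number of parts it spans minus one. The function names and the small example hypergraph are hypothetical.

```python
from typing import Dict, Iterable, Sequence


def hyperedge_cut(hyperedges: Iterable[Sequence[int]], part: Dict[int, int]) -> int:
    """Number of hyperedges whose vertices lie in more than one part."""
    return sum(1 for e in hyperedges if len({part[v] for v in e}) > 1)


def k_minus_1_metric(hyperedges: Iterable[Sequence[int]], part: Dict[int, int]) -> int:
    """Sum over non-empty hyperedges of (number of parts spanned - 1)."""
    return sum(len({part[v] for v in e}) - 1 for e in hyperedges if e)


# Hypothetical hypergraph on six vertices, partitioned into three parts.
hyperedges = [(0, 1, 2), (2, 3), (3, 4, 5), (0, 5), (1, 3, 4)]
part = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}

print(hyperedge_cut(hyperedges, part))
print(k_minus_1_metric(hyperedges, part))
```

The two measures coincide for a 2-way partitioning and diverge only when a hyperedge spans three or more parts, as the last hyperedge in the example does.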

521 citations

Journal Article•DOI•

A Parallel Algorithm for Multilevel Graph Partitioning and Sparse Matrix Ordering

George Karypis¹, Vipin Kumar¹

University of Minnesota¹

01 Jan 1998 - Journal of Parallel and Distributed Computing

TL;DR: The quality of the partitions and orderings produced by the parallel algorithm is comparable to that of the serial multilevel algorithm, which has been shown to outperform both spectral partitioning and multiple minimum degree.

496 citations

Proceedings Article•DOI•

Multilevel Algorithms for Multi-Constraint Graph Partitioning

George Karypis¹, Vipin Kumar¹

University of Minnesota¹

07 Nov 1998

TL;DR: This work extends graph partitioning to an arbitrary number of balancing constraints and develops new heuristics for coarsening, initial partitioning, and refinement that are capable of successfully handling multiple constraints.

Abstract: Traditional graph partitioning algorithms compute a k-way partitioning of a graph such that the number of edges that are cut by the partitioning is minimized and each partition has an equal number of vertices. The task of minimizing the edge-cut can be considered as the objective and the requirement that the partitions will be of the same size can be considered as the constraint. In this paper we extend the partitioning problem by incorporating an arbitrary number of balancing constraints. In our formulation, a vector of weights is assigned to each vertex, and the goal is to produce a k-way partitioning such that the partitioning satisfies a balancing constraint associated with each weight, while attempting to minimize the edge-cut. Applications of this multi-constraint graph partitioning problem include parallel solution of multi-physics and multi-phase computations that underlie many existing and emerging large-scale scientific simulations. We present new multi-constraint graph partitioning algorithms that are based on the multilevel graph partitioning paradigm. Our work focuses on developing new types of heuristics for coarsening, initial partitioning, and refinement that are capable of successfully handling multiple constraints. We experimentally evaluate the effectiveness of our multi-constraint partitioners on a variety of synthetically generated problems.
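
The formulation in the abstract (an edge-cut objective plus one balancing constraint per vertex weight) can be illustrated with a small evaluation sketch. It only scores a given partitioning and does not implement the multilevel algorithm. Measuring balance as the heaviest part's load divided by the ideal load is an assumption, since the abstract does not say how balance is quantified; the function names and the toy graph are hypothetical as well.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def edge_cut(edges: Iterable[Tuple[int, int]], part: Dict[int, int]) -> int:
    """Number of edges whose endpoints lie in different parts."""
    return sum(1 for u, v in edges if part[u] != part[v])


def constraint_imbalance(
    weights: Dict[int, Sequence[float]], part: Dict[int, int], k: int
) -> List[float]:
    """For each weight, return (heaviest part load) / (total load / k).

    A value of 1.0 means that weight is perfectly balanced across parts.
    Assumes every vertex has the same number of weights and that each
    weight has a positive total.
    """
    m = len(next(iter(weights.values())))
    loads = [[0.0] * m for _ in range(k)]
    for v, w in weights.items():
        for j in range(m):
            loads[part[v]][j] += w[j]
    totals = [sum(loads[p][j] for p in range(k)) for j in range(m)]
    return [max(loads[p][j] for p in range(k)) / (totals[j] / k) for j in range(m)]


# Hypothetical 4-cycle where each vertex carries two weights.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
weights = {0: (1, 2), 1: (1, 0), 2: (1, 2), 3: (1, 0)}

balanced = {0: 0, 1: 0, 2: 1, 3: 1}
skewed = {0: 0, 2: 0, 1: 1, 3: 1}

print(edge_cut(edges, balanced), constraint_imbalance(weights, balanced, 2))
print(edge_cut(edges, skewed), constraint_imbalance(weights, skewed, 2))
```

In the second partitioning both parts hold two vertices, so the first weight is balanced, but all of the second weight sits in one part. A partitioner that balances only a single weight cannot see that difference.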

484 citations

Cited by

Book•

Deep Learning

Ian Goodfellow¹, Yoshua Bengio², Aaron Courville²

Google¹, Université de Montréal²

18 Nov 2016

TL;DR: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. It is used in applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.

Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

38,208 citations

Journal Article•DOI•

I and i

Kevin Barraclough

08 Dec 2001 - BMJ

TL;DR: A personal reflection on i, the square root of minus one, which the author recalls as an odd beast when first encountered at school: an intruder hovering on the edge of reality.

Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

Journal Article•

Early results of coronary artery bypass surgery using only bilateral internal thoracic arteries with Y-anastomosis in patients with multivessel coronary artery disease

성기익, 이영탁, 박계현, 전태국, 박표원, 한일용, 장윤희

01 Mar 2003 - The Korean Journal of Thoracic and Cardiovascular Surgery

28,685 citations

Book•

Data Mining: Concepts and Techniques

Jiawei Han¹, Micheline Kamber², Jian Pei²

University of Illinois at Urbana–Champaign¹, Simon Fraser University²

08 Sep 2000

TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.

Abstract: The increasing volume of data in modern business and science calls for more complex and sophisticated tools. Although advances in data mining technology have made extensive data collection much easier, it's still always evolving and there is a constant need for new techniques and tools that can help us transform this data into useful information and knowledge. Since the previous edition's publication, great advances have been made in the field of data mining. Not only does the third edition of Data Mining: Concepts and Techniques continue the tradition of equipping you with an understanding and application of the theory and practice of discovering patterns hidden in large data sets, it also focuses on new, important topics in the field: data warehouses and data cube technology, mining stream, mining social networks, and mining spatial, multimedia and other complex data. Each chapter is a stand-alone guide to a critical topic, presenting proven algorithms and sound implementations ready to be used directly or with strategic modification against live data. This is the resource you need if you want to apply today's most powerful data mining techniques to meet real business challenges. * Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects. * Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields. * Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.

23,600 citations

Journal Article•DOI•

A Survey on Transfer Learning

Sinno Jialin Pan¹, Qiang Yang¹

Hong Kong University of Science and Technology¹

01 Oct 2010 - IEEE Transactions on Knowledge and Data Engineering

TL;DR: This survey categorizes and reviews progress on transfer learning for classification, regression, and clustering problems, and discusses its relationship to related machine learning techniques such as domain adaptation, multitask learning, sample selection bias, and covariate shift.

Abstract: A major assumption in many machine learning and data mining algorithms is that the training and future data must be in the same feature space and have the same distribution. However, in many real-world applications, this assumption may not hold. For example, we sometimes have a classification task in one domain of interest, but we only have sufficient training data in another domain of interest, where the latter data may be in a different feature space or follow a different data distribution. In such cases, knowledge transfer, if done successfully, would greatly improve the performance of learning by avoiding much expensive data-labeling efforts. In recent years, transfer learning has emerged as a new learning framework to address this problem. This survey focuses on categorizing and reviewing the current progress on transfer learning for classification, regression, and clustering problems. In this survey, we discuss the relationship between transfer learning and other related machine learning techniques such as domain adaptation, multitask learning and sample selection bias, as well as covariate shift. We also explore some potential future issues in transfer learning research.

18,616 citations
