Home
/
Authors
/
Sameep Mehta

Author

Sameep Mehta

Other affiliations: Lady Hardinge Medical College, All India Institute of Medical Sciences, Ohio State University

Bio: Sameep Mehta is an academic researcher from IBM. The author has contributed to research in topics: Service (business) & Resource (project management). The author has an hindex of 22, co-authored 160 publications receiving 2093 citations. Previous affiliations of Sameep Mehta include Lady Hardinge Medical College & All India Institute of Medical Sciences.

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2003
2002

Papers

PDF

Open Access

More filters

Posted Content•

Increasing Trust in AI Services through Supplier's Declarations of Conformity

[...]

Michael Hind, Sameep Mehta, Aleksandra Mojsilovic, Ravi Nair, Karthikeyan Natesan Ramamurthy, Alexandra Olteanu, Kush R. Varshney - Show less +3 more

22 Aug 2018

TL;DR: In this article, a supplier's declaration of conformity (SDoC) for artificial intelligence (AI) services is proposed to help increase trust in AI services, which is a transparent, standardized, but often not legally required, document used in many industries and sectors to describe the lineage of a product along with safety and performance testing it has undergone.

...read moreread less

Abstract: The accuracy and reliability of machine learning algorithms are an important concern for suppliers of artificial intelligence (AI) services, but considerations beyond accuracy, such as safety, security, and provenance, are also critical elements to engender consumers' trust in a service. In this paper, we propose a supplier's declaration of conformity (SDoC) for AI services to help increase trust in AI services. An SDoC is a transparent, standardized, but often not legally required, document used in many industries and sectors to describe the lineage of a product along with the safety and performance testing it has undergone. We envision an SDoC for AI services to contain purpose, performance, safety, security, and provenance information to be completed and voluntarily released by AI service providers for examination by consumers. Importantly, it conveys product-level rather than component-level functional testing. We suggest a set of declaration items tailored to AI and provide examples for two fictitious AI services.

...read moreread less

94 citations

Proceedings Article•DOI•

ReCon: A tool to Recommend dynamic server Consolidation in multi-cluster data centers

[...]

Sameep Mehta¹, Anindya Neogi¹•Institutions (1)

IBM¹

07 Apr 2008

TL;DR: This paper describes a consolidation recommendation tool, called ReCon, that takes static and dynamic costs of given servers, the costs of VM migration, the historical resource consumption data from the existing environment and provides an optimal dynamic plan of VM to physical server mapping over time.

...read moreread less

Abstract: Renewed focus on virtualization technologies and increased awareness about management and power costs of running under-utilized servers has spurred interest in consolidating existing applications on fewer number of servers in the data center. The ability to migrate virtual machines dynamically between physical servers in real-time has also added a dynamic aspect to consolidation. However, there is a lack of planning tools that can analyze historical data collected from an existing environment and compute the potential benefits of server consolidation especially in the dynamic setting. In this paper we describe such a consolidation recommendation tool, called ReCon. Recon takes static and dynamic costs of given servers, the costs of VM migration, the historical resource consumption data from the existing environment and provides an optimal dynamic plan of VM to physical server mapping over time. We also present the results of applying the tool on historical data obtained from a large production environment.

...read moreread less

77 citations

Proceedings Article•DOI•

A generalized framework for mining spatio-temporal patterns in scientific data

[...]

Hui Yang¹, Srinivasan Parthasarathy¹, Sameep Mehta¹•Institutions (1)

Ohio State University¹

21 Aug 2005

TL;DR: A general framework to discover spatial associations and spatio-temporal episodes for scientific datasets is presented and it is shown that such episodes can be used to reason about critical events.

...read moreread less

Abstract: In this paper, we present a general framework to discover spatial associations and spatio-temporal episodes for scientific datasets. In contrast to previous work in this area, features are modeled as geometric objects rather than points. We define multiple distance metrics that take into account objects' extent and thus are more robust in capturing the influence of an object on other objects in spatial neighborhood. We have developed algorithms to discover four different types of spatial object interaction (association) patterns. We also extend our approach to accommodate temporal information and propose a simple algorithm to derive spatio-temporal episodes. We show that such episodes can be used to reason about critical events. We evaluate our framework on real datasets to demonstrate its efficacy. The datasets originate from two different areas: Computational Molecular Dynamics and Computational Fluid Flow. We present results highlighting the importance of the identified patterns and episodes by using knowledge from the underlying domains. We also show that the proposed algorithms scale linearly with respect to the dataset size.

...read moreread less

75 citations

Proceedings Article•DOI•

Overview and Importance of Data Quality for Machine Learning Tasks

[...]

Abhinav Jain¹, Hima Patel¹, Lokesh Nagalapatti¹, Nitin Gupta¹, Sameep Mehta¹, Shanmukha C. Guttula¹, Shashank Mujumdar¹, Shazia Afzal¹, Ruhi Sharma Mittal¹, Vitobha Munigala¹ - Show less +6 more•Institutions (1)

IBM¹

23 Aug 2020

TL;DR: This tutorial highlights the importance of analysing data quality in terms of its value for machine learning applications and surveys all the important data quality related approaches discussed in literature, focusing on the intuition behind them, highlighting their strengths and similarities, and illustrates their applicability to real-world problems.

...read moreread less

Abstract: It is well understood from literature that the performance of a machine learning (ML) model is upper bounded by the quality of the data. While researchers and practitioners have focused on improving the quality of models (such as neural architecture search and automated feature selection), there are limited efforts towards improving the data quality. One of the crucial requirements before consuming datasets for any application is to understand the dataset at hand and failure to do so can result in inaccurate analytics and unreliable decisions. Assessing the quality of the data across intelligently designed metrics and developing corresponding transformation operations to address the quality gaps helps to reduce the effort of a data scientist for iterative debugging of the ML pipeline to improve model performance. This tutorial highlights the importance of analysing data quality in terms of its value for machine learning applications. This tutorial surveys all the important data quality related approaches discussed in literature, focusing on the intuition behind them, highlighting their strengths and similarities, and illustrates their applicability to real-world problems. Finally we will discuss the interesting work IBM Research is doing in this space.

...read moreread less

71 citations

Proceedings Article•DOI•

Harnessing the crowds for smart city sensing

[...]

Haggai Roitman¹, Jonathan Mamou¹, Sameep Mehta¹, Aharon Satt¹, L. V. Subramaniam¹ - Show less +1 more•Institutions (1)

IBM¹

02 Nov 2012

TL;DR: An high-level overview of a novel crowd sensing system that is developed in IBM for the smart cities domain and some preliminary results using public safety as an example usecase are presented.

...read moreread less

Abstract: In this work we discuss the challenge of harnessing the crowd for smart city sensing. Within a city's context, such reports by citizen or city visitor eye witnesses may provide important information to city officials, additionally to more traditional data gathered by other means (e.g., through the city's control center, emergency services, sensors spread across the city, etc). We present an high-level overview of a novel crowd sensing system that we develop in IBM for the smart cities domain. As a proof of concept, we present some preliminary results using public safety as our example usecase.

...read moreread less

62 citations

1
2
3
4
5
…
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

The spread of true and false news online

[...]

Soroush Vosoughi¹, Deb Roy¹, Sinan Aral¹•Institutions (1)

Massachusetts Institute of Technology¹

09 Mar 2018-Science

TL;DR: A large-scale analysis of tweets reveals that false rumors spread further and faster than the truth, and false news was more novel than true news, which suggests that people were more likely to share novel information.

...read moreread less

Abstract: We investigated the differential diffusion of all of the verified true and false news stories distributed on Twitter from 2006 to 2017. The data comprise ~126,000 stories tweeted by ~3 million people more than 4.5 million times. We classified news as true or false using information from six independent fact-checking organizations that exhibited 95 to 98% agreement on the classifications. Falsehood diffused significantly farther, faster, deeper, and more broadly than the truth in all categories of information, and the effects were more pronounced for false political news than for false news about terrorism, natural disasters, science, urban legends, or financial information. We found that false news was more novel than true news, which suggests that people were more likely to share novel information. Whereas false stories inspired fear, disgust, and surprise in replies, true stories inspired anticipation, sadness, joy, and trust. Contrary to conventional wisdom, robots accelerated the spread of true and false news at the same rate, implying that false news spreads more than the truth because humans, not robots, are more likely to spread it.

...read moreread less

4,241 citations

Journal Article•

“Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告

[...]

杉山拓海

12 Sep 2017-Computers & Graphics

3,940 citations

Social Network Analysis

[...]

Tom A. B. Snijders

01 Jan 2012

3,692 citations

Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification

[...]

Joy Buolamwini, Timnit Gebru

21 Jan 2018

TL;DR: It is shown that the highest error involves images of dark-skinned women, while the most accurate result is for light-skinned men, in commercial API-based classifiers of gender from facial images, including IBM Watson Visual Recognition.

...read moreread less

Abstract: The paper “Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification” by Joy Buolamwini and Timnit Gebru, that will be presented at the Conference on Fairness, Accountability, and Transparency (FAT*) in February 2018, evaluates three commercial API-based classifiers of gender from facial images, including IBM Watson Visual Recognition. The study finds these services to have recognition capabilities that are not balanced over genders and skin tones [1]. In particular, the authors show that the highest error involves images of dark-skinned women, while the most accurate result is for light-skinned men.

...read moreread less

2,528 citations

Posted Content•

A Survey on Bias and Fairness in Machine Learning

[...]

Ninareh Mehrabi¹, Fred Morstatter¹, Nripsuta Saxena¹, Kristina Lerman¹, Aram Galstyan¹ - Show less +1 more•Institutions (1)

Information Sciences Institute¹

23 Aug 2019-arXiv: Learning

TL;DR: This survey investigated different real-world applications that have shown biases in various ways, and created a taxonomy for fairness definitions that machine learning researchers have defined to avoid the existing bias in AI systems.

...read moreread less

Abstract: With the widespread use of AI systems and applications in our everyday lives, it is important to take fairness issues into consideration while designing and engineering these types of systems. Such systems can be used in many sensitive environments to make important and life-changing decisions; thus, it is crucial to ensure that the decisions do not reflect discriminatory behavior toward certain groups or populations. We have recently seen work in machine learning, natural language processing, and deep learning that addresses such challenges in different subdomains. With the commercialization of these systems, researchers are becoming aware of the biases that these applications can contain and have attempted to address them. In this survey we investigated different real-world applications that have shown biases in various ways, and we listed different sources of biases that can affect AI applications. We then created a taxonomy for fairness definitions that machine learning researchers have defined in order to avoid the existing bias in AI systems. In addition to that, we examined different domains and subdomains in AI showing what researchers have observed with regard to unfair outcomes in the state-of-the-art methods and how they have tried to address them. There are still many future directions and solutions that can be taken to mitigate the problem of bias in AI systems. We are hoping that this survey will motivate researchers to tackle these issues in the near future by observing existing work in their respective fields.

...read moreread less

1,571 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse