Author

Sameep Mehta

Bio: Sameep Mehta is an academic researcher from IBM. The author has contributed to research in topics: Service (business) & Resource (project management). The author has an h-index of 22 and has co-authored 160 publications receiving 2093 citations. Previous affiliations of Sameep Mehta include Lady Hardinge Medical College & All India Institute of Medical Sciences.


Papers
Proceedings ArticleDOI
26 Dec 2006
TL;DR: A visual analysis system that interactively discovers spatial and spatio-temporal relationships from the trajectories of derived features and demonstrates how the derived relationships can help in explaining the occurrence of critical events like merging and bifurcation of the vortices.
Abstract: Spatio-temporal relationships among features extracted from temporally-varying scientific datasets can provide useful information about the evolution of an individual feature and its interactions with other features. However, extracting such useful relationships without user guidance is cumbersome and often error-prone. In this paper, we present a visual analysis system that interactively discovers such relationships from the trajectories of derived features. We describe analysis algorithms to derive various spatial and spatio-temporal relationships. We present a visual interface through which the user can interactively select spatial and temporal extents to guide the knowledge discovery process. We show the usefulness of our proposed algorithms on datasets originating from computational fluid dynamics. We also demonstrate how the derived relationships can help in explaining the occurrence of critical events like merging and bifurcation of the vortices.

8 citations

Book ChapterDOI
07 Nov 2016
TL;DR: This work proposes a method, Select-Link-Rank, that exploits semantic information from Wikipedia to generate diversified query expansions, and shows that this method outperforms the state-of-the-art diversified query expansion and diversified entity recommendation techniques.
Abstract: A search query, being a very concise grounding of user intent, could potentially have many possible interpretations. Search engines hedge their bets by diversifying top results to cover multiple such possibilities so that the user is likely to be satisfied, whatever be her intended interpretation. Diversified Query Expansion is the problem of diversifying query expansion suggestions, so that the user can specialize the query to better suit her intent, even before perusing search results. We propose a method, Select-Link-Rank, that exploits semantic information from Wikipedia to generate diversified query expansions. SLR does collective processing of terms and Wikipedia entities in an integrated framework, simultaneously diversifying query expansions and entity recommendations. SLR starts with selecting informative terms from search results of the initial query, links them to Wikipedia entities, performs a diversity-conscious entity scoring and transfers such scoring to the term space to arrive at query expansion suggestions. Through an extensive empirical analysis and user study, we show that our method outperforms the state-of-the-art diversified query expansion and diversified entity recommendation techniques.

8 citations

Posted Content
TL;DR: It is shown that similarity based on word vectors beats the classical approach by a large margin, whereas other vector representations of senses and sentences fail to even match the classical baseline.
Abstract: The machine learning community has recently been exploring the implications of bias and fairness with respect to AI applications. The definition of fairness for such applications varies based on their domain of application. The policies governing the use of such machine learning systems in a given context are defined by the constitutional laws of nations and the regulatory policies enforced by the organizations involved in their usage. Fairness-related laws and policies are often spread across large documents such as constitutions, agreements, and organizational regulations. These legal documents have long, complex sentences in order to achieve rigor and robustness. Automatic extraction of fairness policies, or in general any specific kind of policy, from a large legal corpus can be very useful for the study of bias and fairness in the context of AI applications. We attempted to automatically extract fairness policies from publicly available law documents using two approaches based on semantic relatedness. The experiments reveal how classical WordNet-based similarity and vector-based similarity differ in addressing this task. We have shown that similarity based on word vectors beats the classical approach by a large margin, whereas other vector representations of senses and sentences fail to even match the classical baseline. Further, we have presented a thorough error analysis and reasoning to explain the results, with appropriate examples from the dataset for deeper insights.

8 citations
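
The vector-based relatedness idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy three-dimensional vectors, the seed word "fair", and the helper names `sentence_vector`, `cosine`, and `rank_sentences` are all assumptions made for illustration. It averages word vectors per sentence and ranks sentences by cosine similarity to the seed term.

```python
import math

# Toy 3-dimensional word vectors; the values are invented for illustration
# and do not come from any trained embedding model.
TOY_VECTORS = {
    "equal": [0.9, 0.1, 0.0],
    "fair": [0.8, 0.2, 0.1],
    "rights": [0.7, 0.3, 0.0],
    "tax": [0.0, 0.9, 0.2],
    "revenue": [0.1, 0.8, 0.3],
}


def sentence_vector(sentence):
    """Average the vectors of the known words in a sentence (None if no word is known)."""
    vectors = [TOY_VECTORS[w] for w in sentence.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_sentences(sentences, seed="fair"):
    """Rank sentences by similarity to a hypothetical seed term, highest first."""
    seed_vec = TOY_VECTORS[seed]
    scored = []
    for s in sentences:
        vec = sentence_vector(s)
        score = cosine(vec, seed_vec) if vec is not None else 0.0
        scored.append((score, s))
    return sorted(scored, reverse=True)


ranked = rank_sentences(["tax revenue", "equal rights"])
```

With real embeddings, the toy dictionary would be replaced by a lookup into a trained model, and a similarity threshold would decide which sentences count as fairness policies.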

Patent
30 May 2012
TL;DR: This patent presents a method and associated systems for automatically identifying critical resources in an organization, in which the organization creates a model of the dependencies between pairs of resource instances that describes how the organization's projects and services are affected when a resource instance becomes unavailable.
Abstract: A method and associated systems for automatically identifying critical resources in an organization. An organization creates a model of the dependencies between pairs of resource instances, wherein that model describes how the organization's projects and services are affected when a resource instance becomes unavailable. This model may be represented as a system of directed graphs. This model may be used to automatically identify a resource instance as “critical” when excessive cost is required to resume all projects and services rendered infeasible by the disruption of that resource instance. This model may also be used to automatically identify a resource instance as “critical for a resource type” when disruption of the resource instance forces the capacity of the resource type available to the entire organization to fall below a threshold value.

7 citations
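
The two criticality tests described in the patent abstract above can be sketched as follows. This is an illustrative reading, not the patented method: the adjacency-dictionary graph, the cost and capacity tables, and the function names `affected_nodes`, `is_critical`, and `is_critical_for_type` are assumptions.

```python
from collections import deque


def affected_nodes(dependents, failed):
    """Return every node reachable from `failed` along directed dependency edges."""
    seen = set()
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for nxt in dependents.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_critical(dependents, resume_cost, failed, cost_limit):
    """A resource is 'critical' if resuming everything it disrupts costs more than the limit."""
    total = sum(resume_cost.get(n, 0) for n in affected_nodes(dependents, failed))
    return total > cost_limit


def is_critical_for_type(capacity, resource_type, failed, threshold):
    """A resource is 'critical for a resource type' if losing it drops that type's capacity below the threshold."""
    rtype = resource_type[failed]
    remaining = sum(
        cap for res, cap in capacity.items()
        if resource_type[res] == rtype and res != failed
    )
    return remaining < threshold


# Hypothetical organization: db1 supports svcA, which supports projX.
dependents = {"db1": ["svcA"], "svcA": ["projX"]}
resume_cost = {"svcA": 5, "projX": 10}
capacity = {"db1": 4, "db2": 3}
resource_type = {"db1": "database", "db2": "database"}

critical = is_critical(dependents, resume_cost, "db1", cost_limit=12)
critical_for_type = is_critical_for_type(capacity, resource_type, "db1", threshold=5)
```

Here the breadth-first traversal stands in for whatever propagation rule the dependency model defines; a real model might weight edges or allow partial degradation instead of treating every downstream node as infeasible.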

Patent
30 Nov 2012
TL;DR: A publicly disseminated media transmission is received, and its public influence is measured by identifying one or more media sources used to disseminate the transmission and obtaining one or more predetermined influence values associated with those media sources.
Abstract: Methods and arrangements for measuring and utilizing media topic influence. A publicly disseminated media transmission is received. Public influence of the media transmission is measured via: identifying one or more media sources used to disseminate the media transmission; and obtaining one or more predetermined influence values associated with the one or more media sources.

7 citations


Cited by
Journal ArticleDOI
09 Mar 2018-Science
TL;DR: A large-scale analysis of tweets reveals that false rumors spread further and faster than the truth, and false news was more novel than true news, which suggests that people were more likely to share novel information.
Abstract: We investigated the differential diffusion of all of the verified true and false news stories distributed on Twitter from 2006 to 2017. The data comprise ~126,000 stories tweeted by ~3 million people more than 4.5 million times. We classified news as true or false using information from six independent fact-checking organizations that exhibited 95 to 98% agreement on the classifications. Falsehood diffused significantly farther, faster, deeper, and more broadly than the truth in all categories of information, and the effects were more pronounced for false political news than for false news about terrorism, natural disasters, science, urban legends, or financial information. We found that false news was more novel than true news, which suggests that people were more likely to share novel information. Whereas false stories inspired fear, disgust, and surprise in replies, true stories inspired anticipation, sadness, joy, and trust. Contrary to conventional wisdom, robots accelerated the spread of true and false news at the same rate, implying that false news spreads more than the truth because humans, not robots, are more likely to spread it.

4,241 citations

01 Jan 2012

3,692 citations

21 Jan 2018
TL;DR: It is shown that the highest error involves images of dark-skinned women, while the most accurate result is for light-skinned men, in commercial API-based classifiers of gender from facial images, including IBM Watson Visual Recognition.
Abstract: The paper “Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification” by Joy Buolamwini and Timnit Gebru, which will be presented at the Conference on Fairness, Accountability, and Transparency (FAT*) in February 2018, evaluates three commercial API-based classifiers of gender from facial images, including IBM Watson Visual Recognition. The study finds these services to have recognition capabilities that are not balanced over genders and skin tones [1]. In particular, the authors show that the highest error involves images of dark-skinned women, while the most accurate result is for light-skinned men.

2,528 citations

Posted Content
TL;DR: This survey investigated different real-world applications that have shown biases in various ways, and created a taxonomy for fairness definitions that machine learning researchers have defined to avoid the existing bias in AI systems.
Abstract: With the widespread use of AI systems and applications in our everyday lives, it is important to take fairness issues into consideration while designing and engineering these types of systems. Such systems can be used in many sensitive environments to make important and life-changing decisions; thus, it is crucial to ensure that the decisions do not reflect discriminatory behavior toward certain groups or populations. We have recently seen work in machine learning, natural language processing, and deep learning that addresses such challenges in different subdomains. With the commercialization of these systems, researchers are becoming aware of the biases that these applications can contain and have attempted to address them. In this survey we investigated different real-world applications that have shown biases in various ways, and we listed different sources of biases that can affect AI applications. We then created a taxonomy for fairness definitions that machine learning researchers have defined in order to avoid the existing bias in AI systems. In addition to that, we examined different domains and subdomains in AI showing what researchers have observed with regard to unfair outcomes in the state-of-the-art methods and how they have tried to address them. There are still many future directions and solutions that can be taken to mitigate the problem of bias in AI systems. We are hoping that this survey will motivate researchers to tackle these issues in the near future by observing existing work in their respective fields.

1,571 citations