
Showing papers by "Sameep Mehta" published in 2014


Posted Content
TL;DR: This paper investigates event detection in the context of real-time Twitter streams as observed in real-world crises and presents a novel approach to address the key challenges: the informal nature of text, and the high volume and high velocity characteristics of Twitter streams.
Abstract: The unprecedented use of social media through smartphones and other web-enabled mobile devices has enabled the rapid adoption of platforms like Twitter. Event detection has found many applications on the web, including breaking news identification and summarization. The recent increase in the usage of Twitter during crises has attracted researchers to focus on detecting events in tweets. However, current solutions have focused on static Twitter data. The need to detect events in a streaming environment during fast-paced events such as a crisis presents new opportunities and challenges. In this paper, we investigate event detection in the context of real-time Twitter streams as observed in real-world crises. We highlight the key challenges in this problem: the informal nature of text, and the high volume and high velocity characteristics of Twitter streams. We present a novel approach to address these challenges using single-pass clustering and the compression distance to efficiently detect events in Twitter streams. Through experiments on large Twitter datasets, we demonstrate that the proposed framework is able to detect events in near real-time and can scale to large and noisy Twitter streams.
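The combination of single-pass clustering and a compression distance mentioned in this abstract can be illustrated with a minimal sketch. It assumes the normalized compression distance computed with zlib, uses the first tweet of each cluster as its representative, and picks an arbitrary threshold of 0.6; the function names and all of these choices are illustrative assumptions, not details taken from the paper.

```python
import zlib


def compressed_size(text: str) -> int:
    """Length in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8")))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance: lower values mean more shared content."""
    ca, cb = compressed_size(a), compressed_size(b)
    cab = compressed_size(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def single_pass_cluster(stream, threshold=0.6):
    """Assign each incoming text to the nearest cluster, or open a new one.

    Each text is seen exactly once. A cluster is a list of texts and its
    first element acts as the representative (an assumption of this sketch).
    """
    clusters = []
    for text in stream:
        best_idx, best_dist = None, threshold
        for i, cluster in enumerate(clusters):
            d = ncd(text, cluster[0])
            if d < best_dist:
                best_idx, best_dist = i, d
        if best_idx is None:
            clusters.append([text])  # nothing close enough: new candidate event
        else:
            clusters[best_idx].append(text)
    return clusters
```

Because the distance is computed on raw bytes, the sketch needs no tokenization or spelling normalization, which is one plausible reason a compression distance suits informal text. Comparing every tweet against every cluster would not scale to a real stream; a practical system would presumably bound the number of active clusters.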

19 citations


Proceedings ArticleDOI
21 Mar 2014
TL;DR: This paper proposes to use the Wikipedia topic taxonomy to discover themes in tweets and to use those themes along with a traditional word-based similarity metric for clustering tweets.
Abstract: In this paper, we present an overview of our approach for clustering tweets. Because tweets are short, traditional text clustering mechanisms alone may not produce optimal results. We believe that there is an underlying theme/topic present in the majority of tweets, which is evident in the growing usage of the hashtag feature in the Twitter network. Clustering tweets based on these themes seems a more natural way of grouping them. We propose to use the Wikipedia topic taxonomy to discover the themes in the tweets and to use the themes along with a traditional word-based similarity metric for clustering. We show some of our initial results to demonstrate the effectiveness of our approach.

8 citations


Patent
Kalapriya Kannan1, Sameep Mehta1
24 Dec 2014
TL;DR: In this paper, a cloud service request (CSR) is received from a cloud customer in the cloud computing environment, the CSR comprising at least one parameter of one or more existing cloud services accessed by the cloud customer that are provided by one or multiple existing cloud service providers.
Abstract: Embodiments of the invention provide systems, methods and computer program products for optimizing cloud service delivery within a cloud computing environment. A cloud service request (CSR) is received from a cloud customer in the cloud computing environment, the CSR comprising at least one parameter of one or more existing cloud services accessed by the cloud customer that are provided by one or more existing cloud service providers. At least one parameter of the CSR is monitored in a cloud service registry comprising a plurality of cloud services provided by a plurality of cloud service providers and one or more parameters corresponding to each cloud service of the plurality of cloud services. Based on the monitoring, a new cloud service provider is determined who may provide a better cloud service with respect to the at least one parameter in the CSR being monitored.

8 citations


Patent
14 Jan 2014
TL;DR: In this article, an approach is provided for creating a new document, where keywords specifying a subject matter of the new document are received and metadata of documents is determined to match keyword(s) included in the received keywords and the documents are retrieved.
Abstract: An approach is provided for creating a new document. Keywords specifying a subject matter of the new document are received. Metadata of documents is determined to match keyword(s) included in the received keywords and the documents are retrieved. Based on a section being created in the new document, a ranked list of the retrieved documents is generated. A selection of a document included in the ranked list is received. The selected document is added to the new document. The new document is determined to be not complete. The keywords are refined based on the added document. Based on the subject matter and the refined keywords, the new document is completed by repeating the steps of determining the metadata, retrieving the documents, generating the ranked list, receiving the selection, and adding the selected document.

4 citations


Book ChapterDOI
02 Sep 2014
TL;DR: This paper proposes ActMiner, a three-phase Discover-Filter-Merge solution that infers location-specific, relevant and non-redundant activities from community-authored reviews using Dependency-aware, Category-aware and Sense-aware approaches in three sequential phases.
Abstract: Location-specific community-authored reviews are a useful resource for discovering location-specific activities and developing various location-aware activity recommendation applications. Existing works on activity discovery have mostly utilized body-worn sensors, images or human GPS traces and discovered generalized activities that do not convey any location-specific knowledge. Moreover, many of the discovered activities are irrelevant and redundant and hence significantly affect the performance of a location-aware activity recommender system. In this paper, we propose a three-phase Discover-Filter-Merge solution, namely ActMiner, to infer the location-specific relevant and non-redundant activities from community-authored reviews. The proposed solution uses Dependency-aware, Category-aware and Sense-aware approaches in three sequential phases to accomplish its objective. Experimental results on two real-world data sets show that the accuracy and correctness of ActMiner are better than those of existing approaches.

2 citations


Patent
11 Apr 2014
TL;DR: In this paper, the authors propose a method and associated systems for automatically identifying critical resources in an organization, where an organization creates a model of the dependencies between pairs of resource types, wherein that model describes how the organization's projects and services are affected when a resource type becomes unavailable.
Abstract: A method and associated systems for automatically identifying critical resources in an organization. An organization creates a model of the dependencies between pairs of resource types, wherein that model describes how the organization's projects and services are affected when a resource type becomes unavailable. This model may include a system of directed graphs. This model may be used to automatically identify a resource type as critical if unacceptable cost is incurred by resuming projects and services rendered infeasible when the resource type is disrupted. The model may also be used to automatically identify a first resource type as critical for a second resource type when disruption of the first resource type forces the available capacity of the second resource type to fall below a threshold value.

2 citations


Patent
10 Apr 2014
TL;DR: In this paper, the authors propose a method and associated systems for automatically identifying critical resources in an organization, where an organization creates a model of the dependencies between pairs of resource types, wherein that model describes how the organization's projects and services are affected when a resource type becomes unavailable.
Abstract: A method and associated systems for automatically identifying critical resources in an organization. An organization creates a model of the dependencies between pairs of resource types, wherein that model describes how the organization's projects and services are affected when a resource type becomes unavailable. This model may include a system of directed graphs. This model may be used to automatically identify a resource type as critical if unacceptable cost is incurred by resuming projects and services rendered infeasible when the resource type is disrupted. The model may also be used to automatically identify a first resource type as critical for a second resource type when disruption of the first resource type forces the available capacity of the second resource type to fall below a threshold value.

Journal ArticleDOI
TL;DR: A natural ranking problem that arises in settings in which a community of people are engaged in regular interactions with an end goal of creating value is considered and a novel algorithm for computing the ranking is developed.
Abstract: In this paper, we consider a natural ranking problem that arises in settings in which a community of people (or agents) are engaged in regular interactions with an end goal of creating value. Examples of such scenarios are academic collaboration networks, creative collaborations, and interactions between agents of a service delivery organization. For instance, consider a service delivery organization which essentially resolves a sequence of service requests from its customers by deploying its agents to resolve the requests. Typically, resolving a request requires interaction between multiple agents and results in an outcome (or value). The outcome could be success or failure of problem resolution or an index of customer satisfaction. For this scenario, the ranking of the agents of the network should take into account two aspects: the importance of the agents in the network structure that arises as a result of interactions, and the value generated by the interactions involving the respective agents. Such a ranking can be used for several purposes, such as identifying influential agents of the interaction network and spreading messages through the network effectively and efficiently. In this paper, we formally model the above ranking problem and develop a novel algorithm for computing the ranking. The key aspect of our approach is creating special nodes in the interaction network corresponding to the outcomes and endowing them with independent, external status. The algorithm then iteratively spreads the external status of the outcomes to the agents based on their interactions and the outcome of those interactions. This results in an eigenvector-like formulation, which leads to a method that requires computing the inverse of a matrix rather than an eigenvector. We present several theoretical characterizations of our algorithmic approach. We present experimental results on public-domain real-life datasets from the Internet Movie Database and a dataset constructed by retrieving impact and citation ratings for papers listed in the DBLP database.
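The idea of spreading external status through an interaction network, and of replacing the iteration with a matrix inverse, can be sketched as follows. This is not the paper's algorithm: it assumes a generic Katz/Bonacich-style recurrence x = alpha * W x + b, where W is a hypothetical row-normalized agent interaction matrix, b is the external status each agent receives directly from outcome nodes, and alpha is an assumed damping factor below 1. Under those assumptions the fixed point is x = (I - alpha W)^(-1) b.

```python
def spread_status(W, b, alpha=0.5, iters=200):
    """Iteratively spread status: x <- alpha * W x + b, starting from zero."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        x = [alpha * sum(W[i][j] * x[j] for j in range(n)) + b[i]
             for i in range(n)]
    return x


def solve_closed_form(W, b, alpha=0.5):
    """Solve (I - alpha W) x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    # Augmented matrix [I - alpha W | b].
    A = [[(1.0 if i == j else 0.0) - alpha * W[i][j] for j in range(n)] + [b[i]]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


# Hypothetical example: three agents, row-normalized interaction weights,
# and external status derived from the outcomes each agent took part in.
W = [[0.0, 0.5, 0.5],
     [1.0, 0.0, 0.0],
     [0.5, 0.5, 0.0]]
b = [1.0, 0.0, 0.5]

scores = solve_closed_form(W, b)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

With rows of W summing to 1 and alpha below 1, the iteration converges to the same vector as the direct solve, so the two functions can be used to check each other.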