
Showing papers by "Sameep Mehta published in 2015"


Journal ArticleDOI
TL;DR: The framework supports the navigation problem for the blind by combining the advantages of real-time localization technologies so that the user is made aware of the world, a necessity for independent travel.
Abstract: This paper lays the groundwork for assistive navigation using wearable sensors and social sensors to foster situational awareness for the blind. Our system acquires social media messages to gauge the relevant aspects of an event and to create alerts. We propose social semantics that captures the parameters required for querying and reasoning an event-of-interest, such as what, where, who, when, severity, and action from the Internet of things, using an event summarization algorithm. Our approach integrates wearable sensors in the physical world to estimate user location based on metric and landmark localization. Streaming data from the cyber world are employed to provide awareness by summarizing the events around the user based on the situation awareness factor. It is illustrated using disaster and socialization event scenarios. Discovered local events are fed back using sound localization so that the user can actively participate in a social event or get early warning of any hazardous events. A feasibility evaluation of our proposed algorithm included comparing the output of the algorithm to ground truth, a survey with sighted participants about the algorithm output, and a sound localization user interface study with blind-folded sighted participants. Thus, our framework supports the navigation problem for the blind by combining the advantages of our real-time localization technologies so that the user is made aware of the world, a necessity for independent travel.
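The "social semantics" idea above parameterizes an event-of-interest by what, where, who, when, severity, and action, and raises alerts for hazardous events. A minimal sketch of that record and the alert filter, with entirely hypothetical field values and severity scale (the paper does not specify these):

```python
from dataclasses import dataclass

# Hypothetical event record mirroring the paper's social-semantics
# parameters: what, where, who, when, severity, action.
@dataclass
class SocialEvent:
    what: str
    where: str
    who: str
    when: str
    severity: int  # assumed scale: 0 (info) .. 3 (hazard)
    action: str

def alerts(events, min_severity=2):
    """Keep only events severe enough to warrant an alert to the user."""
    return [e for e in events if e.severity >= min_severity]

events = [
    SocialEvent("flooding", "5th Ave", "city alerts", "2015-06-01", 3, "avoid area"),
    SocialEvent("street fair", "Main St", "@localnews", "2015-06-01", 1, "join"),
]
print([e.what for e in alerts(events)])
```

In the full system, events passing this filter would be fed back to the user via sound localization.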

41 citations


Proceedings ArticleDOI
25 Aug 2015
TL;DR: This paper investigates event detection in the context of real-time Twitter streams as observed in real-world crises, and presents a novel approach to address the key challenges: the informal nature of text, and the high-volume and high-velocity characteristics of Twitter streams.
Abstract: The unprecedented use of social media through smartphones and other web-enabled mobile devices has enabled the rapid adoption of platforms like Twitter. Event detection has found many applications on the web, including breaking news identification and summarization. The recent increase in the usage of Twitter during crises has attracted researchers to focus on detecting events in tweets. However, current solutions have focused on static Twitter data. The necessity to detect events in a streaming environment during fast-paced events such as a crisis presents new opportunities and challenges. In this paper, we investigate event detection in the context of real-time Twitter streams as observed in real-world crises. We highlight the key challenges in this problem: the informal nature of text, and the high-volume and high-velocity characteristics of Twitter streams. We present a novel approach to address these challenges using single-pass clustering and the compression distance to efficiently detect events in Twitter streams. Through experiments on large Twitter datasets, we demonstrate that the proposed framework is able to detect events in near real-time and can scale to large and noisy Twitter streams.
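The combination described above — single-pass clustering driven by a compression distance — can be sketched as follows. This is an illustrative reading, not the paper's implementation: the distance here is the standard normalized compression distance via zlib, and the threshold and tweet texts are assumptions.

```python
import zlib

def ncd(x, y):
    """Normalized compression distance: near 0 for similar strings,
    near 1 for unrelated ones."""
    cx = len(zlib.compress(x.encode()))
    cy = len(zlib.compress(y.encode()))
    cxy = len(zlib.compress((x + " " + y).encode()))
    return (cxy - min(cx, cy)) / max(cx, cy)

def single_pass_cluster(tweets, threshold=0.6):
    """Single-pass clustering: each incoming tweet joins the closest
    existing cluster (NCD to the cluster's first tweet) or starts a new
    one. No revisits, so the stream is processed in one pass."""
    clusters = []
    for t in tweets:
        best, best_d = None, threshold
        for c in clusters:
            d = ncd(c[0], t)
            if d < best_d:
                best, best_d = c, d
        if best is not None:
            best.append(t)
        else:
            clusters.append([t])
    return clusters

stream = [
    "breaking earthquake magnitude 6 hits city center buildings damaged",
    "earthquake magnitude 6 hits city center many buildings damaged tonight",
    "new cafe opens downtown serving artisanal coffee and fresh pastries",
]
print(len(single_pass_cluster(stream)))
```

Because compression handles misspellings and word reordering gracefully, NCD copes with the informal text; the single pass addresses the volume and velocity constraints.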

21 citations


Proceedings ArticleDOI
18 May 2015
TL;DR: This paper proposes a framework that uses location data from LBSNs, combining it with map data to associate a set of venue categories with these locations, and shows that this approach improves on the state-of-the-art methods for location prediction.
Abstract: Predicting the next location of a user based on their previous visiting pattern is one of the primary tasks over data from location based social networks (LBSNs) such as Foursquare. Many different aspects of these so-called "check-in" profiles of a user have been used for this task, including spatial and temporal information of check-ins as well as the social network information of the user. Building more sophisticated prediction models by enriching these check-in data with information from other sources is challenging due to the limited data that LBSNs expose because of privacy concerns. In this paper, we propose a framework that uses the location data from LBSNs and combines it with map data to associate a set of venue categories with these locations. For example, if the user is found to be checking in at a mall that has cafes, cinemas and restaurants according to the map, all this information is associated with the check-in. This category information is then leveraged to predict the next check-in location of the user. Our experiments with a publicly available check-in dataset show that this approach improves on the state-of-the-art methods for location prediction.
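The mall example above can be made concrete with a small sketch: map data assigns each venue a category set, a user's history induces a category profile, and candidate next venues are ranked by overlap with that profile. The venue names, categories, and scoring rule are illustrative assumptions, not the paper's model:

```python
from collections import Counter

# Hypothetical map data: venue -> categories found at that location
# (e.g. a mall containing a cafe, a cinema, and restaurants).
VENUE_CATEGORIES = {
    "central_mall": {"cafe", "cinema", "restaurant"},
    "river_cafe": {"cafe"},
    "grand_cinema": {"cinema"},
    "corner_cafe": {"cafe"},
    "city_museum": {"museum"},
}

def category_profile(checkins):
    """Count how often each venue category appears in the user's history."""
    prof = Counter()
    for venue in checkins:
        prof.update(VENUE_CATEGORIES.get(venue, ()))
    return prof

def predict_next(checkins, candidates):
    """Rank candidate venues by overlap with the user's category profile."""
    prof = category_profile(checkins)
    return max(candidates,
               key=lambda v: sum(prof[c] for c in VENUE_CATEGORIES.get(v, ())))

history = ["central_mall", "river_cafe", "river_cafe"]
print(predict_next(history, ["grand_cinema", "corner_cafe", "city_museum"]))
```

A cafe-heavy history steers the prediction toward another cafe even for a venue the user has never visited — the category layer is what generalizes across venues.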

19 citations


Proceedings Article
25 Jul 2015
TL;DR: Using data from the 2012 US presidential elections and the 2013 Philippines General elections, this work provides detailed experiments on methods that use Granger causality to identify topics that were most "causal" for public opinion and which, in turn, give an interpretable insight into the "election topics" that were most important.
Abstract: In recent times, social media has become a popular medium for many election campaigns. It not only allows candidates to reach out to a large section of the electorate, it is also a potent medium for people to express their opinion on the proposed policies and promises of candidates. Analyzing social media data is challenging as the text can be noisy, sparse and even multilingual. In addition, the information may not be completely trustworthy, particularly in the presence of propaganda, promotions and rumors. In this paper we describe our work for analyzing election campaigns using social media data. Using data from the 2012 US presidential elections and the 2013 Philippines General elections, we provide detailed experiments on our methods that use Granger causality to identify topics that were most "causal" for public opinion and which, in turn, give an interpretable insight into the "election topics" that were most important. Our system was deployed by the largest media organization in the Philippines during the 2013 General elections and, using our work, the media house was able to identify and report news stories much faster than competitors and reported higher TRP ratings during the election.
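The Granger-causality test underlying this analysis asks whether past values of a topic's time series improve the prediction of an opinion series beyond what the opinion's own past explains. A self-contained sketch of that idea (a one-lag model with a plain least-squares fit; the series values are made up, and the paper's actual lag structure and significance testing are not reproduced here):

```python
def ols_rss(X, y):
    """Least-squares fit via the normal equations (Gaussian elimination
    with partial pivoting); returns the residual sum of squares."""
    n, k = len(X), len(X[0])
    A = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)] for p in range(k)]
    c = [sum(X[i][p] * y[i] for i in range(n)) for p in range(k)]
    for p in range(k):
        piv = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[piv] = A[piv], A[p]
        c[p], c[piv] = c[piv], c[p]
        for r in range(p + 1, k):
            f = A[r][p] / A[p][p]
            for q in range(p, k):
                A[r][q] -= f * A[p][q]
            c[r] -= f * c[p]
    b = [0.0] * k
    for p in range(k - 1, -1, -1):
        b[p] = (c[p] - sum(A[p][q] * b[q] for q in range(p + 1, k))) / A[p][p]
    return sum((y[i] - sum(X[i][q] * b[q] for q in range(k))) ** 2 for i in range(n))

def granger_gain(topic, opinion):
    """Fraction of residual error removed when the lagged topic series is
    added to a one-lag autoregression of the opinion series; larger means
    the topic is more 'causal' for opinion in the Granger sense."""
    y = opinion[1:]
    restricted = [[1.0, opinion[t]] for t in range(len(opinion) - 1)]
    full = [[1.0, opinion[t], topic[t]] for t in range(len(opinion) - 1)]
    rss_r, rss_f = ols_rss(restricted, y), ols_rss(full, y)
    return (rss_r - rss_f) / rss_r

topic = [1, 0, 2, 0, 3, 1, 0, 2, 1, 3, 0, 2]          # topic volume per day
opinion = [0] + topic[:-1]                             # opinion tracks topic with a lag
print(round(granger_gain(topic, opinion), 2))
```

In a real deployment the gain would be converted into an F-statistic and tested for significance; topics with the strongest significant gains are the "election topics" reported above.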

14 citations


Book ChapterDOI
29 Mar 2015
TL;DR: This work proposes a three-phase method for linking web search queries to Wikipedia entities, using an IR-style scoring of entities against the search query to narrow down to a subset of entities that is expanded using hyperlink information in the second phase to a larger set.
Abstract: We consider the problem of linking web search queries to entities from a knowledge base such as Wikipedia. Such linking enables converting a user’s web search session to a footprint in the knowledge base that could be used to enrich the user profile. Traditional methods for entity linking have been directed towards finding entity mentions in text documents such as news reports, each of which is possibly linked to multiple entities, enabling the usage of measures like entity set coherence. Since web search queries are very small text fragments, such criteria that rely on the existence of a multitude of mentions do not work well on them. We propose a three-phase method for linking web search queries to Wikipedia entities. The first phase does IR-style scoring of entities against the search query to narrow down to a subset of entities, which is expanded using hyperlink information in the second phase to a larger set. Lastly, we use a graph traversal approach to identify the top entities to link the query to. Through an empirical evaluation on real-world web search queries, we illustrate that our methods significantly enhance the linking accuracy over state-of-the-art methods.
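The three phases can be sketched end-to-end on toy data. The entity names, term bags, hyperlink structure, and in-link scoring rule below are all illustrative assumptions — in particular, the third phase here is a crude stand-in for the paper's graph traversal:

```python
LINKS = {  # hypothetical entity -> entities it hyperlinks to
    "Apple_Inc.": ["Steve_Jobs", "IPhone"],
    "Apple": ["Fruit"],
    "Steve_Jobs": ["Apple_Inc."],
    "IPhone": ["Apple_Inc."],
    "Fruit": [],
}
TERMS = {  # hypothetical entity -> bag of description terms
    "Apple_Inc.": {"apple", "technology", "company"},
    "Apple": {"apple", "fruit", "tree"},
    "Steve_Jobs": {"apple", "founder"},
    "IPhone": {"apple", "phone"},
    "Fruit": {"fruit", "food"},
}

def phase1_ir(query, k=2):
    """IR-style scoring: rank entities by term overlap with the query,
    keep the top-k as the seed set."""
    q = set(query.lower().split())
    return sorted(TERMS, key=lambda e: -len(q & TERMS[e]))[:k]

def phase2_expand(seeds):
    """Grow the seed set along hyperlinks to recover related entities."""
    return set(seeds) | {t for s in seeds for t in LINKS.get(s, ())}

def phase3_traverse(candidates, top=1):
    """Score each candidate by in-links received from other candidates
    and return the top entities to link the query to."""
    score = {c: sum(c in LINKS.get(o, ()) for o in candidates) for c in candidates}
    return sorted(candidates, key=lambda c: -score[c])[:top]

print(phase3_traverse(phase2_expand(phase1_ir("apple phone company"))))
```

The expansion phase is what compensates for the query's brevity: even if the correct entity scores poorly against the two or three query terms, it can re-enter through hyperlinks from the seeds and win in the graph phase.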

5 citations


Proceedings ArticleDOI
17 Oct 2015
TL;DR: A framework, NCFinder, discovers top-k consistent news-casters directly from Twitter, using news headlines published in online news sources to periodically collect authentic news-tweets, and employs the HITS algorithm on a tripartite graph to score the news-casters on a daily basis.
Abstract: News-casters are Twitter users who periodically pick up interesting news from online news media and spread it to their followers' network. Existing works on Twitter user analysis have only analysed a pre-defined set of users for user modeling, influence analysis and news recommendation. The problem of identifying prominent, trustworthy and consistent news-casters is unaddressed so far. In this paper, we present a framework, NCFinder, to discover top-k consistent news-casters directly from Twitter. NCFinder uses news headlines published in online news sources to periodically collect authentic news-tweets and processes them to discover news-casters, news sources and news concepts. Next, NCFinder builds a tripartite graph among news-casters, news sources and news concepts and employs the HITS algorithm on it to score the news-casters on a daily basis. The daily score profiles of the news-casters collected over a time period are then used to infer the top-k consistent news-casters. We ran NCFinder from 11 Nov. to 24 Nov. 2014 and discovered the top-100 consistent news-casters and their profile information.
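The daily scoring step can be sketched with a plain HITS iteration: news-casters act as hubs pointing at the news sources and news concepts (the authorities) appearing in their tweets. The edges below are invented for illustration, and this flattens the tripartite structure into a single hub/authority split, which is a simplification of NCFinder's graph:

```python
def hits_scores(edges, iters=20):
    """HITS hub scores: news-casters are hubs, the news sources and
    concepts they tweet about are authorities. Scores are L2-normalized
    each iteration until they stabilize."""
    hubs = {u: 1.0 for u in edges}
    auth = {v: 1.0 for vs in edges.values() for v in vs}
    for _ in range(iters):
        auth = {v: sum(hubs[u] for u in edges if v in edges[u]) for v in auth}
        norm = sum(a * a for a in auth.values()) ** 0.5
        auth = {v: a / norm for v, a in auth.items()}
        hubs = {u: sum(auth[v] for v in edges[u]) for u in edges}
        norm = sum(h * h for h in hubs.values()) ** 0.5
        hubs = {u: h / norm for u, h in hubs.items()}
    return hubs

day_edges = {  # hypothetical caster -> {sources, concepts} for one day
    "@caster_a": {"bbc.com", "election"},
    "@caster_b": {"bbc.com", "cnn.com", "election"},
    "@caster_c": {"obscure-blog"},
}
scores = hits_scores(day_edges)
print(max(scores, key=scores.get))
```

Repeating this per day yields the daily score profiles; consistency is then a matter of which casters stay highly ranked across the whole time period rather than spiking once.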

2 citations


Patent
Sameep Mehta, Deepak Padmanabhan
24 Sep 2015
TL;DR: In this article, a computer-implemented method for updating annotator collections using run traces is described, which includes generating one or more alternate versions of annotators selected from a set of multiple document annotators; and outputting an instruction to modify, based on the generated log information for each annotator in the set and each alternate version, at least one document annotator from the set.
Abstract: Methods, systems, and computer program products for updating annotator collections using run traces are provided herein. A computer-implemented method includes generating one or more alternate versions of one or more document annotators selected from a set of multiple document annotators; executing, on one or more document data sets, (i) one or more document annotators from the set of multiple document annotators and (ii) the one or more alternate versions to generate log information for each document annotator in the set and each alternate version of the one or more alternate versions; and outputting an instruction to modify, based on the generated log information for each document annotator in the set and each alternate version, at least one document annotator from the set with at least one alternate version from the one or more alternate versions.
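The patented flow — run a base annotator and its alternate versions over document sets, compare the collected run traces, and emit an instruction to modify the collection — can be sketched as below. This is an interpretive toy: the patent compares log information, whereas this sketch scores traces against a reference output, and every annotator, document, and label here is invented:

```python
def run(annotator, docs):
    """Execute an annotator over the documents, collecting a simple run
    trace (here: the annotation it produces per document)."""
    return [annotator(d) for d in docs]

def pick_replacement(base, alternates, docs, reference):
    """Compare the run traces of the base annotator and its alternate
    versions; output an instruction to swap in an alternate if its trace
    matches the reference output more often."""
    def hits(trace):
        return sum(t == r for t, r in zip(trace, reference))
    base_hits = hits(run(base, docs))
    best = max(alternates, key=lambda a: hits(run(a, docs)))
    if hits(run(best, docs)) > base_hits:
        return f"replace base annotator with {best.__name__}"
    return "keep base annotator"

# Hypothetical annotators: tag whether a document mentions a date.
def base(d):
    return "DATE" if "2015" in d else "NONE"

def alt_broader(d):
    return "DATE" if any(ch.isdigit() for ch in d) else "NONE"

docs = ["filed 24 Sep 2015", "filed 24/09/15", "no dates here"]
reference = ["DATE", "DATE", "NONE"]
print(pick_replacement(base, [alt_broader], docs, reference))
```

The broader alternate catches the `24/09/15` form the base misses, so the traces justify the swap instruction.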

1 citation


Proceedings ArticleDOI
23 Apr 2015
TL;DR: This paper describes a system that integrates data about movies from various sources across the web and populates the TITAN graph database, showing that complex information can be retrieved with simple queries in Gremlin, a graph query language.
Abstract: The development of the Internet in recent years has made it possible to access different information systems anywhere in the world. Information Integration is the merging of information from heterogeneous sources with differing conceptual, contextual and typographical representations. In this paper, we exploit Information Integration techniques for movie data from different sources over the web. Graphs are used to model many complex data objects and their relationships in the real world. In recent years, graphs have become increasingly popular in a variety of domains ranging from Biology, Chemistry, Healthcare systems and computer vision to Business Intelligence and Social Media Analytics. We have developed a system that integrates data about movies from various sources across the web and populates the TITAN graph database. This enables us to show that complex information can be retrieved using simple queries in Gremlin, a graph query language.
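The integration step — merging per-movie records from heterogeneous sources into one graph-like structure that simple traversals can query — can be sketched in plain Python. The source records are invented, and the dictionary stands in for the TITAN property graph the paper actually populates:

```python
# Hypothetical per-title records from two heterogeneous web sources.
source_a = {"Inception": {"director": "Christopher Nolan", "year": 2010}}
source_b = {"Inception": {"actors": ["Leonardo DiCaprio"]}}

def integrate(*sources):
    """Merge per-title records from heterogeneous sources into a single
    property-graph-like structure: movie vertex -> labelled properties."""
    graph = {}
    for src in sources:
        for title, props in src.items():
            graph.setdefault(title, {}).update(props)
    return graph

graph = integrate(source_a, source_b)
# A Gremlin-style traversal such as
#   g.V().has("title", "Inception").values("director")
# reduces here to a lookup on the merged structure:
print(graph["Inception"]["director"])
```

The point of the paper's graph model is exactly this: once sources are merged under shared vertices, queries that would need multi-source joins become short traversals.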

1 citation

