Open Access, Journal Article

Processing and visualizing the data in tweets

TLDR
This paper presents two systems: TweeQL, which provides a streaming SQL-like interface to the Twitter API and simplifies common tweet processing tasks, and TwitInfo, which shows how end-users can interact with and understand aggregated data from the tweet stream while showcasing the power of the TweeQL language.
Abstract
Microblogs such as Twitter provide a valuable stream of diverse user-generated data. While the data extracted from Twitter is generally timely and accurate, the process by which developers extract structured data from the tweet stream is ad-hoc and requires reimplementation of common data manipulation primitives. In this paper, we present two systems for querying and extracting structure from Twitter-embedded data. The first, TweeQL, provides a streaming SQL-like interface to the Twitter API, making common tweet processing tasks simpler. The second, TwitInfo, shows how end-users can interact with and understand aggregated data from the tweet stream, in addition to showcasing the power of the TweeQL language. Together these systems show the richness of content that can be extracted from Twitter.
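To make the idea of a streaming, SQL-like pipeline over tweets concrete, here is a minimal Python sketch. It is not TweeQL syntax and it does not call the Twitter API: the tweet dictionaries, the field names `created_at` and `text`, and the helper names `where_contains` and `count_per_window` are all assumptions made for illustration. It shows a filter step (roughly a WHERE clause) feeding a tumbling-window count (roughly a GROUP BY over time), both written as generators so tweets are processed one at a time.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical tweet shape: {"created_at": <seconds as int>, "text": <str>}
Tweet = Dict[str, object]


def where_contains(stream: Iterable[Tweet], keyword: str) -> Iterator[Tweet]:
    """Filter step: keep tweets whose text mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    for tweet in stream:
        if needle in str(tweet["text"]).lower():
            yield tweet


def count_per_window(
    stream: Iterable[Tweet], window_seconds: int
) -> Iterator[Tuple[int, int]]:
    """Aggregate step: emit (window_start, count) each time a tumbling window closes.

    Assumes tweets arrive in non-decreasing timestamp order.
    """
    current_start = None
    count = 0
    for tweet in stream:
        ts = int(tweet["created_at"])
        start = ts - ts % window_seconds
        if current_start is None:
            current_start = start
        if start != current_start:
            # The previous window is complete, so emit it and start a new one.
            yield current_start, count
            current_start, count = start, 0
        count += 1
    if current_start is not None:
        # Flush the final, possibly partial, window.
        yield current_start, count


# Made-up sample data standing in for a live tweet stream.
sample = [
    {"created_at": 0, "text": "Goal! What a match"},
    {"created_at": 30, "text": "lunch time"},
    {"created_at": 45, "text": "Another GOAL"},
    {"created_at": 70, "text": "goal again"},
]

for window_start, n in count_per_window(where_contains(sample, "goal"), 60):
    print(window_start, n)
```

Because both steps are generators, the same pipeline could in principle be attached to any iterable source of tweets, including one that never ends. Counts per time window are the kind of aggregated data the abstract describes end-users exploring.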


Citations
Journal Article

Open challenges for data stream mining research

TL;DR: This article presents a discussion on eight open challenges for data stream mining, which cover the full cycle of knowledge discovery and involve such problems as protecting data privacy, dealing with legacy systems, handling incomplete and delayed information, analysis of complex data, and evaluation of stream mining algorithms.
Proceedings Article

STREAMCUBE: Hierarchical spatio-temporal hashtag clustering for event exploration over the Twitter stream

TL;DR: This paper focuses on hierarchical spatio-temporal hashtag clustering techniques and proposes a data structure called STREAMCUBE, which is an extension of the data cube structure from the database community with spatial and temporal hierarchy.
Journal Article

An algorithm for local geoparsing of microtext

TL;DR: The geo-parser is a method for geo-parsing the short, informal messages known as microtext. It uses Natural Language Processing methods to identify references to streets and addresses, buildings and urban spaces, toponyms, and place acronyms and abbreviations.
Journal Article

Geosocial gauge: a system prototype for knowledge discovery from social media

TL;DR: This article presents a system prototype for harvesting, processing, modeling, and integrating heterogeneous social media feeds towards the generation of geosocial knowledge, and primarily addresses two key components of this prototype: a novel data model for heterogeneous social media feeds and a corresponding general system architecture.
Proceedings Article

Mercury: A memory-constrained spatio-temporal real-time search on microblogs

TL;DR: Mercury is a system for real-time support of top-k spatio-temporal queries on microblogs, where users are able to browse recent microblogs near their locations, and employs a scalable dynamic in-memory index structure that is capable of digesting all incoming microblogs.
References
Proceedings Article

Pig latin: a not-so-foreign language for data processing

TL;DR: A new language called Pig Latin is described, designed to fit in a sweet spot between the declarative style of SQL and the low-level, procedural style of map-reduce. The accompanying system, Pig, is an open-source, Apache-incubator project available for general use.
Journal Article

Aurora: a new model and architecture for data stream management

TL;DR: This paper describes the basic processing model and architecture of Aurora, a new system to manage data streams for monitoring applications, along with a stream-oriented set of operators.
Journal Article

The CQL continuous query language: semantic foundations and query execution

TL;DR: This paper presents the structure of CQL's query execution plans as well as details of the most important components: operators, interoperator queues, synopses, and sharing of components among multiple operators and queries.
Journal Article

Eddies: continuously adaptive query processing

TL;DR: This paper introduces a query processing mechanism called an eddy, which continuously reorders operators in a query plan as it runs, and describes the moments of symmetry during which pipelined joins can be easily reordered, and the synchronization barriers that require inputs from different sources to be coordinated.