Author

Rami Alghamdi

Bio: Rami Alghamdi is an academic researcher from the University of Minnesota. The author has contributed to research in topics: Spatial analysis & Spatial data infrastructure. The author has an h-index of 5 and has co-authored 6 publications receiving 83 citations.

Papers
Proceedings ArticleDOI
18 Jun 2014
TL;DR: TAREEG, a web service that makes real spatial data from anywhere in the world available at the fingertips of every researcher or individual by leveraging the richness of the OpenStreetMap dataset, the most comprehensive spatial dataset available for the world.
Abstract: Real spatial data, e.g., detailed road networks, rivers, buildings, and parks, is not easily available for most of the world. This hinders the practicality of many research ideas that need real spatial data for testing and experiments. Such data is often available for governmental use or at major software companies, but it is prohibitively expensive to build or buy for academia or individual researchers. This demo presents TAREEG, a web service that makes real spatial data, from anywhere in the world, available at the fingertips of every researcher or individual. TAREEG gets all its data by leveraging the richness of the OpenStreetMap dataset, the most comprehensive spatial dataset available for the world. Yet, it is still challenging to obtain OpenStreetMap data due to size limitations, the special data format, and the noisy nature of spatial data. TAREEG employs MapReduce-based techniques to make it efficient and easy to extract OpenStreetMap data in a standard form with minimal effort. TAREEG is accessible via http://www.tareeg.org/
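
The extraction step that TAREEG automates can be made concrete with a small, hedged sketch. The following is not TAREEG's MapReduce pipeline; it is a minimal single-machine illustration of pulling a road-network layer out of a raw OpenStreetMap XML extract, assuming a small local file named map.osm.

```python
# Minimal single-machine sketch of the extraction TAREEG automates at scale:
# pull the road network (highway ways) out of a raw OpenStreetMap XML extract.
# Assumes a local file "map.osm"; TAREEG itself runs MapReduce jobs over
# planet-scale data, which is not reproduced here.
import xml.etree.ElementTree as ET

def extract_roads(osm_path):
    root = ET.parse(osm_path).getroot()

    # Index node coordinates by node id.
    nodes = {n.get("id"): (float(n.get("lat")), float(n.get("lon")))
             for n in root.iter("node")}

    roads = []
    for way in root.iter("way"):
        tags = {t.get("k"): t.get("v") for t in way.iter("tag")}
        if "highway" not in tags:          # keep only road segments
            continue
        coords = [nodes[nd.get("ref")] for nd in way.iter("nd")
                  if nd.get("ref") in nodes]
        roads.append({"name": tags.get("name", ""), "points": coords})
    return roads

if __name__ == "__main__":
    for road in extract_roads("map.osm")[:5]:
        print(road["name"], len(road["points"]), "points")
```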

31 citations

Proceedings ArticleDOI
04 Nov 2014
TL;DR: TAREEG, a web service that makes real spatial data from anywhere in the world available at the fingertips of every researcher or individual, using MapReduce-based techniques to make it efficient and easy to extract OpenStreetMap data in a standard form with minimal effort.
Abstract: Real spatial data, e.g., detailed road networks, rivers, buildings, and parks, is not easily available for most of the world. This hinders the practicality of many research ideas that need real spatial data for testing and experiments. Such data is often available for governmental use or at major software companies, but it is prohibitively expensive to build or buy for academia or individual researchers. This paper presents TAREEG, a web service that makes real spatial data, from anywhere in the world, available at the fingertips of every researcher or individual. TAREEG gets all its data by leveraging the richness of the OpenStreetMap dataset, the most comprehensive spatial dataset available for the world. Yet, it is still challenging to obtain OpenStreetMap data due to size limitations, the special data format, and the noisy nature of spatial data. TAREEG employs MapReduce-based techniques to make it efficient and easy to extract OpenStreetMap data in a standard form with minimal effort. Experimental results show that TAREEG is highly accurate and efficient.
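
For the MapReduce side specifically, a hedged sketch of how such an extraction might be split into map and reduce phases is shown below, written in the style of Hadoop Streaming scripts. The flattened input format and the layer mapping are illustrative assumptions, not the paper's actual job design.

```python
# Hedged sketch of the MapReduce shape of a TAREEG-style extraction, written as
# Hadoop Streaming mapper/reducer functions (the paper's actual jobs run inside
# a Hadoop cluster and are not reproduced here). Assumes the input has already
# been flattened to one "way_id<TAB>tag_key=tag_value" record per line.
import sys

LAYERS = {"highway": "roads", "waterway": "rivers", "building": "buildings"}

def mapper(lines=sys.stdin):
    # Emit (layer, way_id) for every way that belongs to a layer of interest.
    for line in lines:
        way_id, tag = line.rstrip("\n").split("\t", 1)
        key = tag.split("=", 1)[0]
        if key in LAYERS:
            print(f"{LAYERS[key]}\t{way_id}")

def reducer(lines=sys.stdin):
    # Count how many ways fell into each layer (keys arrive sorted/grouped).
    current, count = None, 0
    for line in lines:
        layer, _ = line.rstrip("\n").split("\t", 1)
        if layer != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = layer, 0
        count += 1
    if current is not None:
        print(f"{current}\t{count}")
```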

27 citations

Proceedings ArticleDOI
16 May 2016
TL;DR: In this paper, the authors propose the kFlushing policy, which exploits the popularity of top-k queries in microblogs to smartly select a subset of microblogs to flush from memory, increasing the memory hit ratio.
Abstract: Searching microblogs, e.g., tweets and comments, is practically supported through main-memory indexing for scalable data digestion and efficient query evaluation. With the continuity and excessive numbers of microblogs, it is infeasible to keep data in main memory for long periods. Thus, once the allocated memory budget is filled, a portion of the data is flushed from memory to disk to continuously accommodate newly incoming data. Existing techniques come with either a low memory hit ratio, due to flushing items regardless of their relevance to incoming queries, or the significant overhead of tracking individual data items, which limits the scalability of microblog systems in either case. In this paper, we propose the kFlushing policy, which exploits the popularity of top-k queries in microblogs to smartly select a subset of microblogs to flush. kFlushing is mainly designed to increase the memory hit ratio. To this end, it identifies and flushes in-memory data that does not contribute to incoming queries. The freed memory space is utilized to accumulate more useful data that is used to answer more queries from memory contents. When all memory is utilized for useful data, kFlushing flushes the data that is less likely to degrade the memory hit ratio. In addition, kFlushing comes with little overhead, which keeps system scalability high in terms of digestion rates of incoming fast data. Extensive experimental evaluation shows the effectiveness and scalability of kFlushing, improving the main-memory hit ratio by 26–330% while coping with fast microblog streams of up to 100K microblogs/second.
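
The core idea of query-aware flushing can be illustrated with a small sketch. The code below is not the paper's kFlushing implementation; it is a minimal illustration, under the assumption that per-query relevance scores are already available for each in-memory microblog: items that do not contribute to any tracked top-k answer are evicted first.

```python
# Minimal sketch of the idea behind a kFlushing-style policy (the paper's exact
# data structures and ranking function are not reproduced): when memory is
# full, prefer to evict microblogs that do not appear in the top-k answer of
# any tracked query, so memory keeps items that can still serve queries.
from dataclasses import dataclass, field
import heapq

@dataclass
class Microblog:
    mid: int
    timestamp: int
    relevance: dict = field(default_factory=dict)  # query_id -> score (assumed given)

def topk_contributors(buffer, query_ids, k):
    """Return ids of microblogs that are in the top-k of at least one query."""
    keep = set()
    for q in query_ids:
        ranked = heapq.nlargest(
            k, (m for m in buffer if q in m.relevance),
            key=lambda m: m.relevance[q])
        keep.update(m.mid for m in ranked)
    return keep

def flush(buffer, query_ids, k, flush_count):
    """Pick flush_count victims: non-contributors first (oldest first),
    then, only if still short, the oldest contributors."""
    keep = topk_contributors(buffer, query_ids, k)
    non_contrib = sorted((m for m in buffer if m.mid not in keep),
                         key=lambda m: m.timestamp)
    contrib = sorted((m for m in buffer if m.mid in keep),
                     key=lambda m: m.timestamp)
    victims = (non_contrib + contrib)[:flush_count]
    victim_ids = {m.mid for m in victims}
    remaining = [m for m in buffer if m.mid not in victim_ids]
    return remaining, victims
```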

11 citations

Proceedings ArticleDOI
16 May 2016
TL;DR: A vision is proposed for a GPU-accelerated end-to-end system for performing spatial computations that supports a plethora of spatial operations, ranging from basic operations and computational geometry operations to Open Geospatial Consortium (OGC) compliant operations.
Abstract: The ease of availability of spatial data has increased interest in the domain of spatial computing. Various services, such as Uber, Google Maps, and the Blue Brain Project, have been developed that consume and process such spatial data. Spatial data processing is not only data intensive but also compute intensive. A lot of effort has been made by the spatial computing community to tackle the problems caused by huge volumes of data. Unfortunately, however, not enough attention has been given to addressing the compute-intensive nature of the problem. In parallel to the advancements in the spatial domain, Graphics Processing Units (GPUs) have emerged as compelling computing units. A lot of work has been done in the spatial domain to leverage the computing power of GPUs. However, to the best of our knowledge, none of this work presents a holistic system. In this paper, we propose a vision for a GPU-accelerated end-to-end system for performing spatial computations. Our envisioned system supports a plethora of spatial operations, ranging from basic operations and computational geometry operations to Open Geospatial Consortium (OGC) compliant operations. Our system exploits the power of CPU-GPU co-processing by scheduling the execution of spatial operators on either the CPU or the GPU based on a cost model. Within the framework of our system, we discuss the challenges and open research problems in building such a system. We also provide some preliminary results to show the computational gain achieved by performing spatial operations on GPUs.
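
The cost-model-based scheduling mentioned above can be sketched as follows. The cost constants, the operator name, and the linear cost formula are illustrative placeholders, not the paper's calibrated model.

```python
# Hedged sketch of the CPU/GPU co-processing idea: route each spatial operator
# to the CPU or the GPU based on an estimated cost. All constants below are
# illustrative assumptions, not measured values from the paper.
from dataclasses import dataclass

@dataclass
class CostModel:
    gpu_transfer_per_item: float = 2e-7   # PCIe transfer cost per item (assumed)
    gpu_compute_per_item: float = 1e-8    # per-item GPU compute cost (assumed)
    cpu_compute_per_item: float = 4e-7    # per-item CPU compute cost (assumed)
    gpu_fixed_overhead: float = 1e-3      # kernel launch / setup cost (assumed)

    def gpu_cost(self, n_items):
        return (self.gpu_fixed_overhead
                + n_items * (self.gpu_transfer_per_item + self.gpu_compute_per_item))

    def cpu_cost(self, n_items):
        return n_items * self.cpu_compute_per_item

def schedule(operator, n_items, model=CostModel()):
    """Return 'gpu' or 'cpu' for the given operator and input size."""
    target = "gpu" if model.gpu_cost(n_items) < model.cpu_cost(n_items) else "cpu"
    print(f"{operator}({n_items} items) -> {target}")
    return target

if __name__ == "__main__":
    schedule("point_in_polygon", 1_000)       # small input: fixed GPU overhead dominates, CPU wins
    schedule("point_in_polygon", 10_000_000)  # large input: GPU wins under these constants
```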

9 citations

Journal ArticleDOI
01 Aug 2019
TL;DR: This demonstration focuses on the zonal statistics problem, which computes statistics over a raster layer for each polygon in a vector layer, and demonstrates three approaches: vector-based, raster-based, and raptor-based.
Abstract: With the increase in the amount of remote sensing data, there have been efforts to process it efficiently to help ecologists and geographers answer queries. However, they often need to process this data in combination with vector data, for example, city boundaries. Existing efforts require one dataset to be converted to the other representation, which is extremely inefficient for large datasets. In this demonstration, we focus on the zonal statistics problem, which computes statistics over a raster layer for each polygon in a vector layer. We demonstrate three approaches: vector-based, raster-based, and raptor-based. The latter is a recent effort to combine raster and vector data without the need for any conversion. This demo will allow users to run their own queries with any of the three methods and observe the differences in their performance depending on different raster and vector dataset sizes.
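
To make the problem statement concrete, here is a deliberately naive sketch of the vector-based approach: for each polygon, test every raster cell center for containment and aggregate the covered values. The grid-unit coordinates and brute-force containment test are simplifying assumptions; the raster-based and raptor-based approaches from the demo are not shown.

```python
# Naive "vector-based" zonal statistics sketch: average the raster cells whose
# centers fall inside each polygon. Real systems rasterize the polygons or use
# a raptor-style scanline approach; this brute-force version only illustrates
# the problem. Polygon coordinates are assumed to be in raster grid units.
import numpy as np

def point_in_polygon(x, y, polygon):
    """Ray-casting point-in-polygon test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def zonal_mean(raster, polygons):
    """Mean raster value per polygon; raster[row, col] has center (col+0.5, row+0.5)."""
    rows, cols = raster.shape
    means = []
    for poly in polygons:
        values = [raster[r, c]
                  for r in range(rows) for c in range(cols)
                  if point_in_polygon(c + 0.5, r + 0.5, poly)]
        means.append(float(np.mean(values)) if values else float("nan"))
    return means

if __name__ == "__main__":
    raster = np.arange(25, dtype=float).reshape(5, 5)
    square = [(0, 0), (3, 0), (3, 3), (0, 3)]   # covers a 3x3 block of cells
    print(zonal_mean(raster, [square]))          # -> [6.0]
```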

8 citations


Cited by
Journal ArticleDOI
TL;DR: IOTSim is designed and implemented to support and enable simulation of IoT big data processing using the MapReduce model in a cloud computing environment; experiments validate the efficacy of the simulator.

152 citations

Proceedings ArticleDOI
18 Jun 2014
TL;DR: SpatialHadoop is a comprehensive extension to Hadoop that injects spatial data awareness into each Hadoop layer, namely, the language, storage, MapReduce, and operations layers, which allows more spatial operations to be implemented efficiently using MapReduce.
Abstract: Recently, MapReduce frameworks, e.g., Hadoop, have been used extensively in different applications that include terabyte sorting, machine learning, and graph processing. With the huge volumes of spatial data coming from different sources, there is an increasing demand to exploit the efficiency of Hadoop, coupled with the flexibility of the MapReduce framework, in spatial data processing. However, Hadoop falls short in supporting spatial data efficiently, as its core is unaware of spatial data properties. This paper describes SpatialHadoop, a full-fledged MapReduce framework with native support for spatial data. SpatialHadoop is a comprehensive extension to Hadoop that injects spatial data awareness into each Hadoop layer, namely, the language, storage, MapReduce, and operations layers. In the language layer, SpatialHadoop adds a simple and expressive high-level language for spatial data types and operations. In the storage layer, SpatialHadoop adapts traditional spatial index structures, Grid, R-tree, and R+-tree, to form a two-level spatial index. SpatialHadoop enriches the MapReduce layer with two new components, SpatialFileSplitter and SpatialRecordReader, for efficient and scalable spatial data processing. In the operations layer, SpatialHadoop is already equipped with a dozen operations, including range query, kNN, and spatial join. The flexibility and open-source nature of SpatialHadoop allow more spatial operations to be implemented efficiently using MapReduce. Extensive experiments on a real system prototype and real datasets show that SpatialHadoop achieves orders of magnitude better performance than Hadoop for spatial data processing.
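
SpatialHadoop itself is a Java system built on HDFS; the sketch below only illustrates the grid-partitioning idea behind its storage layer and how a range query can prune partitions, using in-memory Python structures as stand-ins for HDFS blocks.

```python
# Illustrative sketch of grid partitioning plus a filter-and-refine range
# query, the basic idea behind SpatialHadoop's indexed storage layer. This is
# not the system's actual implementation; partitions are plain dict entries.
from collections import defaultdict

def build_grid(points, cell_size):
    """Partition (x, y) points into square grid cells keyed by cell index."""
    grid = defaultdict(list)
    for x, y in points:
        grid[(int(x // cell_size), int(y // cell_size))].append((x, y))
    return grid

def range_query(grid, cell_size, xmin, ymin, xmax, ymax):
    """Visit only cells overlapping the query rectangle, then refine per point."""
    result = []
    for cx in range(int(xmin // cell_size), int(xmax // cell_size) + 1):
        for cy in range(int(ymin // cell_size), int(ymax // cell_size) + 1):
            for x, y in grid.get((cx, cy), []):
                if xmin <= x <= xmax and ymin <= y <= ymax:
                    result.append((x, y))
    return result

if __name__ == "__main__":
    pts = [(1.0, 1.0), (5.5, 2.0), (9.9, 9.9), (3.3, 7.1)]
    grid = build_grid(pts, cell_size=2.0)
    print(range_query(grid, 2.0, 0.0, 0.0, 6.0, 3.0))  # -> [(1.0, 1.0), (5.5, 2.0)]
```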

69 citations

Proceedings ArticleDOI
29 Oct 2015
TL;DR: A new platform called Physical Analytics Integrated Repository and Services (PAIRS) is presented that enables rapid data discovery by automatically updating, joining, and homogenizing data layers in space and time.
Abstract: Geospatial data volume exceeds hundreds of Petabytes and is increasing exponentially mainly driven by images/videos/data generated by mobile devices and high resolution imaging systems. Fast data discovery on historical archives and/or real time datasets is currently limited by various data formats that have different projections and spatial resolution, requiring extensive data processing before analytics can be carried out. A new platform called Physical Analytics Integrated Repository and Services (PAIRS) is presented that enables rapid data discovery by automatically updating, joining, and homogenizing data layers in space and time. Built on top of open source big data software, PAIRS manages automatic data download, data curation, and scalable storage while being simultaneously a computational platform for running physical and statistical models on the curated datasets. By addressing data curation before data being uploaded to the platform, multi-layer queries and filtering can be performed in real time. In addition, PAIRS offers a foundation for developing custom analytics. Towards that end we present two examples with models which are running operationally: (1) high resolution evapo-transpiration and vegetation monitoring for agriculture and (2) hyperlocal weather forecasting driven by machine learning for renewable energy forecasting.

62 citations

Journal ArticleDOI
TL;DR: Considering the spatio-temporal features of messages and the social relationships among users, an overall social network search framework from the perspective of semantics, based on existing research, is summarized.

59 citations

Book
28 Dec 2016
TL;DR: This survey summarizes the state-of-the-art work in the area of big spatial data according to approach, architecture, language, indexing, querying, and visualization, and gives case studies of real application systems that make use of these systems to provide services for end users.
Abstract: The recent explosion in the amount of spatial data calls for specialized systems to handle big spatial data. In this survey, we summarize the state-of-the-art work in the area of big spatial data. We categorize the existing work in this area according to six different angles, namely, approach, architecture, language, indexing, querying, and visualization. (1) The approaches used to implement spatial query processing can be categorized as on-top, from-scratch, and built-in approaches. (2) The existing works follow different architectures based on the underlying system they extend, such as MapReduce, key-value stores, or parallel DBMS. (3) The high-level language of the system is the main interface that hides the complexity of the system and makes it usable for non-technical users. (4) The spatial indexing is the key feature of many systems, which allows them to achieve orders of magnitude performance speedup by carefully laying out data in the distributed storage. (5) The query processing is at the heart of all the surveyed systems, as it defines the types of queries supported by the system and how efficiently they are implemented. (6) The visualization of big spatial data is how the system is capable of generating images that describe terabytes of data to help users explore them. This survey describes each of these components in detail and gives examples of how they are implemented in existing systems. At the end, we give case studies of real applications that make use of these systems to provide services for end users.

52 citations