scispace - formally typeset
Search or ask a question
Author

Fusheng Wang

Other affiliations: Siemens, Emory University, State University of New York System  ...read more
Bio: Fusheng Wang is an academic researcher from Stony Brook University. The author has contributed to research in topics: XML database & Spatial query. The author has an hindex of 34, co-authored 196 publications receiving 4572 citations. Previous affiliations of Fusheng Wang include Siemens & Emory University.


Papers
More filters
Journal ArticleDOI
01 Aug 2013
TL;DR: Hadoop-GIS - a scalable and high performance spatial data warehousing system for running large scale spatial queries on Hadoop and integrated into Hive to support declarative spatial queries with an integrated architecture is presented.
Abstract: Support of high performance queries on large volumes of spatial data becomes increasingly important in many application domains, including geospatial problems in numerous fields, location based services, and emerging scientific applications that are increasingly data- and compute-intensive. The emergence of massive scale spatial data is due to the proliferation of cost effective and ubiquitous positioning technologies, development of high resolution imaging technologies, and contribution from a large number of community users. There are two major challenges for managing and querying massive spatial data to support spatial queries: the explosion of spatial data, and the high computational complexity of spatial queries. In this paper, we present Hadoop-GIS - a scalable and high performance spatial data warehousing system for running large scale spatial queries on Hadoop. Hadoop-GIS supports multiple types of spatial queries on MapReduce through spatial partitioning, customizable spatial query engine RESQUE, implicit parallel spatial query execution on MapReduce, and effective methods for amending query results through handling boundary objects. Hadoop-GIS utilizes global partition indexing and customizable on demand local spatial indexing to achieve efficient query processing. Hadoop-GIS is integrated into Hive to support declarative spatial queries with an integrated architecture. Our experiments have demonstrated the high efficiency of Hadoop-GIS on query response and high scalability to run on commodity clusters. Our comparative experiments have showed that performance of Hadoop-GIS is on par with parallel SDBMS and outperforms SDBMS for compute-intensive queries. Hadoop-GIS is available as a set of library for processing spatial queries, and as an integrated software package in Hive.

571 citations

Proceedings Article
30 Aug 2005
TL;DR: This system enables semantic RFID data filtering and automatic data transformation based on declarative rules, provides powerful query support of RFID object tracking and monitoring, and can be adapted to different RFID-enabled applications.
Abstract: RFID technology can be used to significantly improve the efficiency of business processes by providing the capability of automatic identification and data capture. This technology poses many new challenges on current data management systems. RFID data are time-dependent, dynamically changing, in large volumes, and carry implicit semantics. RFID data management systems need to effectively support such large scale temporal data created by RFID applications. These systems need to have an explicit temporal data model for RFID data to support tracking and monitoring queries. In addition, they need to have an automatic method to transform the primitive observations from RFID readers into derived data used in RFID-enabled applications. In this paper, we present an integrated RFID data management system -- Siemens RFID Middleware -- based on an expressive temporal data model for RFID data. Our system enables semantic RFID data filtering and automatic data transformation based on declarative rules, provides powerful query support of RFID object tracking and monitoring, and can be adapted to different RFID-enabled applications.

352 citations

Proceedings Article
01 Jan 2017
TL;DR: A framework on managing and sharing EMR data for cancer patient care using blockchain to significantly reduce the turnaround time for EMR sharing, improve decision making for medical care, and reduce the overall cost is proposed.
Abstract: Electronic medical records (EMRs) are critical, highly sensitive private information in healthcare, and need to be frequently shared among peers. Blockchain provides a shared, immutable and transparent history of all the transactions to build applications with trust, accountability and transparency. This provides a unique opportunity to develop a secure and trustable EMR data management and sharing system using blockchain. In this paper, we present our perspectives on blockchain based healthcare data management, in particular, for EMR data sharing between healthcare providers and for research studies. We propose a framework on managing and sharing EMR data for cancer patient care. In collaboration with Stony Brook University Hospital, we implemented our framework in a prototype that ensures privacy, security, availability, and fine-grained access control over EMR data. The proposed work can significantly reduce the turnaround time for EMR sharing, improve decision making for medical care, and reduce the overall cost.

247 citations

Book ChapterDOI
26 Mar 2006
TL;DR: This paper develops an RFID event detection engine that can effectively process complex RFID events, and takes an event-oriented approach to process RFID data, by devising RFID application logic into complex events.
Abstract: Advances of sensor and RFID technology provide significant new power for humans to sense, understand and manage the world. RFID provides fast data collection with precise identification of objects with unique IDs without line of sight, thus it can be used for identifying, locating, tracking and monitoring physical objects. Despite these benefits, RFID poses many challenges for data processing and management: i) RFID observations contain duplicates, which have to be filtered; ii) RFID observations have implicit meanings, which have to be transformed and aggregated into semantic data represented in their data models; and iii) RFID data are temporal, streaming, and in high volume, and have to be processed on the fly. Thus, a general RFID data processing framework is needed to automate the transformation of physical RFID observations into the virtual counterparts in the virtual world linked to business applications. In this paper, we take an event-oriented approach to process RFID data, by devising RFID application logic into complex events. We then formalize the specification and semantics of RFID events and rules. We demonstrate that traditional ECA event engine cannot be used to support highly temporally constrained RFID events, and develop an RFID event detection engine that can effectively process complex RFID events. The declarative event-based approach greatly simplifies the work of RFID data processing, and significantly reduces the cost of RFID data integration.

229 citations

Posted Content
TL;DR: In this paper, a framework for managing and sharing electronic medical records (EMRs) for cancer patient care is proposed, which can significantly reduce the turnaround time for EMR sharing, improve decision making for medical care, and reduce the overall cost.
Abstract: Electronic medical records (EMRs) are critical, highly sensitive private information in healthcare, and need to be frequently shared among peers. Blockchain provides a shared, immutable and transparent history of all the transactions to build applications with trust, accountability and transparency. This provides a unique opportunity to develop a secure and trustable EMR data management and sharing system using blockchain. In this paper, we present our perspectives on blockchain based healthcare data management, in particular, for EMR data sharing between healthcare providers and for research studies. We propose a framework on managing and sharing EMR data for cancer patient care. In collaboration with Stony Brook University Hospital, we implemented our framework in a prototype that ensures privacy, security, availability, and fine-grained access control over EMR data. The proposed work can significantly reduce the turnaround time for EMR sharing, improve decision making for medical care, and reduce the overall cost

208 citations


Cited by
More filters
Journal ArticleDOI

6,278 citations

Journal ArticleDOI
TL;DR: The current status of TCGA Research Network structure, purpose, and achievements are discussed, to provide publicly available datasets to help improve diagnostic methods, treatment standards, and finally to prevent cancer.
Abstract: The Cancer Genome Atlas (TCGA) is a public funded project that aims to catalogue and discover major cancer-causing genomic alterations to create a comprehensive "atlas" of cancer genomic profiles. So far, TCGA researchers have analysed large cohorts of over 30 human tumours through large-scale genome sequencing and integrated multi-dimensional analyses. Studies of individual cancer types, as well as comprehensive pan-cancer analyses have extended current knowledge of tumorigenesis. A major goal of the project was to provide publicly available datasets to help improve diagnostic methods, treatment standards, and finally to prevent cancer. This review discusses the current status of TCGA Research Network structure, purpose, and achievements.

2,530 citations

Journal ArticleDOI
TL;DR: A review of recent advances in medical imaging using the adversarial training scheme with the hope of benefiting researchers interested in this technique.

1,053 citations

Journal ArticleDOI
TL;DR: A general, unifying model is proposed to capture the different aspects of an IFP system and use it to provide a complete and precise classification of the systems and mechanisms proposed so far.
Abstract: A large number of distributed applications requires continuous and timely processing of information as it flows from the periphery to the center of the system. Examples include intrusion detection systems which analyze network traffic in real-time to identify possible attacks; environmental monitoring applications which process raw data coming from sensor networks to identify critical situations; or applications performing online analysis of stock prices to identify trends and forecast future values.Traditional DBMSs, which need to store and index data before processing it, can hardly fulfill the requirements of timeliness coming from such domains. Accordingly, during the last decade, different research communities developed a number of tools, which we collectively call Information flow processing (IFP) systems, to support these scenarios. They differ in their system architecture, data model, rule model, and rule language. In this article, we survey these systems to help researchers, who often come from different backgrounds, in understanding how the various approaches they adopt may complement each other.In particular, we propose a general, unifying model to capture the different aspects of an IFP system and use it to provide a complete and precise classification of the systems and mechanisms proposed so far.

918 citations

Proceedings ArticleDOI
27 Jun 2006
TL;DR: This paper proposes a complex event language that significantly extends existing event languages to meet the needs of a range of RFID-enabled monitoring applications and describes a query plan-based approach to efficiently implementing this language.
Abstract: In this paper, we present the design, implementation, and evaluation of a system that executes complex event queries over real-time streams of RFID readings encoded as events. These complex event queries filter and correlate events to match specific patterns, and transform the relevant events into new composite events for the use of external monitoring applications. Stream-based execution of these queries enables time-critical actions to be taken in environments such as supply chain management, surveillance and facility management, healthcare, etc. We first propose a complex event language that significantly extends existing event languages to meet the needs of a range of RFID-enabled monitoring applications. We then describe a query plan-based approach to efficiently implementing this language. Our approach uses native operators to efficiently handle query-defined sequences, which are a key component of complex event processing, and pipeline such sequences to subsequent operators that are built by leveraging relational techniques. We also develop a large suite of optimization techniques to address challenges such as large sliding windows and intermediate result sizes. We demonstrate the effectiveness of our approach through a detailed performance analysis of our prototype implementation under a range of data and query workloads as well as through a comparison to a state-of-the-art stream processor.

902 citations