
Showing papers on "Data management published in 2013"


Journal ArticleDOI
03 Apr 2013-JAMA
TL;DR: This Viewpoint discusses the application of big data to health care, using an economic framework to highlight the opportunities it will offer and the roadblocks to implementation, and suggests that leveraging the collection of patient and practitioner data could be an important way to improve the quality and efficiency of health care delivery.
Abstract: The amount of data being digitally collected and stored is vast and expanding rapidly. As a result, the science of data management and analysis is also advancing to enable organizations to convert this vast resource into information and knowledge that helps them achieve their objectives. Computer scientists have invented the term big data to describe this evolving technology. Big data has been successfully used in astronomy (eg, the Sloan Digital Sky Survey of telescopic information), retail sales (eg, Walmart’s expansive number of transactions), search engines (eg, Google’s customization of individual searches based on previous web data), and politics (eg, a campaign’s focus of political advertisements on people most likely to support their candidate based on web searches). In this Viewpoint, we discuss the application of big data to health care, using an economic framework to highlight the opportunities it will offer and the roadblocks to implementation. We suggest that leveraging the collection of patient and practitioner data could be an important way to improve quality and efficiency of health care delivery. Widespread uptake of electronic health records (EHRs) has generated massive data sets. A survey by the American Hospital Association showed that adoption of EHRs has doubled from 2009 to 2011, partly a result of funding provided by the Health Information Technology for Economic and Clinical Health Act of 2009. Most EHRs now contain quantitative data (eg, laboratory values), qualitative data (eg, text-based documents and demographics), and transactional data (eg, a record of medication delivery). However, much of this rich data set is currently perceived as a byproduct of health care delivery, rather than a central asset to improve its efficiency. The transition of data from refuse to riches has been key in the big data revolution of other industries. Advances in analytic techniques in the computer sciences, especially in machine learning, have been a major catalyst for dealing with these large information sets. These analytic techniques are in contrast to traditional statistical methods (derived from the social and physical sciences), which are largely not useful for analysis of unstructured data such as text-based documents that do not fit into relational tables. One estimate suggests that 80% of business-related data exist in an unstructured format. The same could probably be said for health care data, a large proportion of which is text-based. In contrast to most consumer service industries, medicine adopted a practice of generating evidence from experimental (randomized trials) and quasi-experimental studies to inform patients and clinicians. The evidence-based movement is founded on the belief that scientific inquiry is superior to expert opinion and testimonials. In this way, medicine was ahead of many other industries in terms of recognizing the value of data and information guiding rational decision making. However, health care has lagged in uptake of newer techniques to leverage the rich information contained in EHRs. There are 4 ways big data may advance the economic mission of health care delivery by improving quality and efficiency. First, big data may greatly expand the capacity to generate new knowledge. The cost of answering many clinical questions prospectively, and even retrospectively, by collecting structured data is prohibitive. Analyzing the unstructured data contained within EHRs using computational techniques (eg, natural language processing to extract medical concepts from free-text documents) permits finer data acquisition in an automated fashion. For instance, automated identification within EHRs using natural language processing was superior in detecting postoperative complications compared with patient safety indicators based on discharge coding. Big data offers the potential to create an observational evidence base for clinical questions that would otherwise not be possible and may be especially helpful with issues of generalizability. The latter issue limits the application of conclusions derived from randomized trials performed on a narrow spectrum of participants to patients who exhibit very different characteristics. Second, big data may help with knowledge dissemination. Most physicians struggle to stay current with the latest evidence guiding clinical practice. The digitization of medical literature has greatly improved access; however, the sheer
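The natural language processing step mentioned above can be sketched in miniature. The following is an illustrative toy, not the method the paper evaluated: the lexicon, note text, and negation cues are invented for demonstration, and production systems use dedicated tools such as cTAKES or MetaMap to map free text to standard medical concepts.

```python
import re

# Hypothetical miniature lexicon mapping surface forms to concepts.
LEXICON = {
    "deep vein thrombosis": "postoperative complication: DVT",
    "surgical site infection": "postoperative complication: SSI",
    "pneumonia": "postoperative complication: pneumonia",
}

def extract_concepts(note):
    """Return (concept, negated) pairs found in a free-text clinical note."""
    found = []
    for sentence in note.split("."):
        lowered = sentence.lower()
        for term, concept in LEXICON.items():
            if term in lowered:
                # Crude negation check: a cue word earlier in the sentence.
                prefix = lowered.split(term)[0]
                negated = bool(re.search(r"\b(no|denies|without|negative for)\b", prefix))
                found.append((concept, negated))
    return found

note = "Patient denies fever. Duplex ultrasound positive for deep vein thrombosis."
print(extract_concepts(note))   # [('postoperative complication: DVT', False)]
```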

1,396 citations


Book
14 May 2013
TL;DR: This book covers mathematical foundations, research design, data collection, data management, multivariate techniques used in network analysis, visualization, hypothesis testing, characterizing whole networks, centrality, subgroups, equivalence, analyzing two-mode data, large networks, and ego networks.
Abstract: Contents: Preface; Introduction; Mathematical Foundations; Research Design; Data Collection; Data Management; Multivariate Techniques Used in Network Analysis; Visualization; Testing Hypotheses; Characterizing Whole Networks; Centrality; Subgroups; Equivalence; Analyzing Two-Mode Data; Large Networks; Ego Networks.

1,074 citations


Journal Article
TL;DR: Emerging Internet of Things architecture, large scale sensor network applications, federating sensor networks, sensor data and related context capturing techniques, challenges in cloud-based management, storing, archiving and processing of sensor data are discussed.
Abstract: Internet of Things (IoT) will comprise billions of devices that can sense, communicate, compute and potentially actuate. Data streams coming from these devices will challenge the traditional approaches to data management and contribute to the emerging paradigm of big data. This paper discusses emerging Internet of Things (IoT) architecture, large scale sensor network applications, federating sensor networks, sensor data and related context capturing techniques, and challenges in cloud-based management, storing, archiving and processing of sensor data.
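The shift from store-then-query to stream-oriented processing that such device data forces can be illustrated with a small sketch. Everything here (sensor IDs, window size, in-memory buffering) is an assumption for demonstration; real deployments push this logic into distributed stream processors.

```python
from collections import defaultdict
from statistics import mean

class TumblingWindowAggregator:
    """Aggregate per-sensor readings over fixed-size (tumbling) time windows."""

    def __init__(self, window_seconds=60):
        self.window_seconds = window_seconds
        self.buffers = defaultdict(list)      # sensor_id -> readings in window
        self.window_start = {}                # sensor_id -> window start time

    def ingest(self, sensor_id, timestamp, value):
        start = self.window_start.get(sensor_id)
        if start is None:
            self.window_start[sensor_id] = timestamp
        elif timestamp - start >= self.window_seconds:
            # Window closed: emit the aggregate, then start a new window.
            readings = self.buffers.pop(sensor_id)
            print(f"{sensor_id}: mean={mean(readings):.2f} n={len(readings)}")
            self.window_start[sensor_id] = timestamp
        self.buffers[sensor_id].append(value)

agg = TumblingWindowAggregator(window_seconds=60)
for t, v in [(0, 20.1), (30, 20.4), (65, 21.0)]:   # one simulated sensor
    agg.ingest("temp-001", t, v)                   # emits the first window at t=65
```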

459 citations


Proceedings ArticleDOI
20 May 2013
TL;DR: The paper proposes the SDI generic architecture model that provides a basis for building interoperable data- or project-centric SDI using modern technologies and best practices, and introduces the Scientific Data Lifecycle Management (SDLM) model.
Abstract: Big Data are becoming a new technology focus both in science and in industry. This paper discusses the challenges that are imposed by Big Data on the modern and future Scientific Data Infrastructure (SDI). The paper discusses the nature and definition of Big Data, including such features as Volume, Velocity, Variety, Value and Veracity. The paper refers to different scientific communities to define requirements on data management, access control and security. The paper introduces the Scientific Data Lifecycle Management (SDLM) model that includes all the major stages and reflects specifics of data management in modern e-Science. The paper proposes the SDI generic architecture model that provides a basis for building interoperable data- or project-centric SDI using modern technologies and best practices. The paper explains how the proposed SDLM and SDI models can be naturally implemented using a modern cloud-based infrastructure services provisioning model and suggests the major infrastructure components for Big Data.

412 citations


Posted Content
TL;DR: This report is intended to help users, especially organizations, obtain an independent understanding of the strengths and weaknesses of various NoSQL database approaches to supporting applications that process huge volumes of data.
Abstract: The digital world is growing very fast and becoming more complex in volume (terabyte to petabyte), variety (structured, un-structured and hybrid) and velocity (high speed in growth). This is referred to as ‘Big Data’, a global phenomenon. Big Data is typically considered to be a data collection that has grown so large it can’t be effectively managed or exploited using conventional data management tools: e.g., classic relational database management systems (RDBMS) or conventional search engines. To handle this problem, traditional RDBMS are complemented by a rich set of specifically designed alternative DBMS, such as NoSQL, NewSQL and Search-based systems. The motivation of this paper is to provide a classification, characteristics and evaluation of NoSQL databases in Big Data Analytics. This report is intended to help users, especially organizations, obtain an independent understanding of the strengths and weaknesses of various NoSQL database approaches to supporting applications that process huge volumes of data.
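As a concrete taste of the document-store category of NoSQL systems the report classifies, here is a minimal sketch using pymongo. It assumes a MongoDB instance on the default local port; the database, collection, and field names are invented for illustration.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
orders = client["shop"]["orders"]

# Schemaless insert: each document carries its own structure.
orders.insert_one({
    "customer": "c-42",
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
    "total": 31.50,
})

# Query on a nested field without a predeclared schema or JOIN.
for doc in orders.find({"items.sku": "A1"}):
    print(doc["customer"], doc["total"])
```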

374 citations


Journal ArticleDOI
01 Dec 2013
TL;DR: This study has identified challenges in the field, including the immense diversity and inconsistency of terminologies, limited documentation, sparse comparison and benchmarking criteria, and nonexistence of standardized query languages.
Abstract: Advances in Web technology and the proliferation of mobile devices and sensors connected to the Internet have resulted in immense processing and storage requirements. Cloud computing has emerged as a paradigm that promises to meet these requirements. This work focuses on the storage aspect of cloud computing, specifically on data management in cloud environments. Traditional relational databases were designed in a different hardware and software era and are facing challenges in meeting the performance and scale requirements of Big Data. NoSQL and NewSQL data stores present themselves as alternatives that can handle huge volumes of data. Because of the large number and diversity of existing NoSQL and NewSQL solutions, it is difficult to comprehend the domain and even more challenging to choose an appropriate solution for a specific task. Therefore, this paper reviews NoSQL and NewSQL solutions with the objective of: (1) providing a perspective on the field, (2) providing guidance to practitioners and researchers to choose the appropriate data store, and (3) identifying challenges and opportunities in the field. Specifically, the most prominent solutions are compared focusing on data models, querying, scaling, and security related capabilities. Features driving the ability to scale read requests and write requests, or scaling data storage are investigated, in particular partitioning, replication, consistency, and concurrency control. Furthermore, use cases and scenarios in which NoSQL and NewSQL data stores have been used are discussed and the suitability of various solutions for different sets of applications is examined. Consequently, this study has identified challenges in the field, including the immense diversity and inconsistency of terminologies, limited documentation, sparse comparison and benchmarking criteria, and nonexistence of standardized query languages.
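Among the scaling features the survey compares, partitioning is commonly implemented with consistent hashing. The sketch below is a minimal illustration under assumed node names and virtual-node count; production stores layer replication and failure handling on top.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Map keys to nodes so that adding or removing a node moves few keys."""

    def __init__(self, nodes, vnodes=100):
        self._ring = []                       # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):           # virtual nodes smooth the load
                h = self._hash(f"{node}#{i}")
                bisect.insort(self._ring, (h, node))

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key):
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h, ""))
        return self._ring[idx % len(self._ring)][1]   # wrap around the ring

ring = ConsistentHashRing(["db-a", "db-b", "db-c"])
print(ring.node_for("user:1001"))   # same key -> same node on every call
```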

304 citations


Journal ArticleDOI
TL;DR: This paper reviews big data challenges from a data management perspective, and discusses big data diversity, big data reduction, big data integration and cleaning, big data indexing and query, and finally big data analysis and mining.
Abstract: There is a trend that virtually everyone, ranging from big Web companies to traditional enterprises to physical science researchers to social scientists, is either already experiencing or anticipating unprecedented growth in the amount of data available in their world, as well as new opportunities and great untapped value. This paper reviews big data challenges from a data management perspective. In particular, we discuss big data diversity, big data reduction, big data integration and cleaning, big data indexing and query, and finally big data analysis and mining. Our survey gives a brief overview of big-data-oriented research and problems.
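Of the challenges listed, indexing and query is the most mechanical; a toy inverted index shows the core structure. Document contents and IDs are illustrative.

```python
from collections import defaultdict

docs = {
    1: "big data integration and cleaning",
    2: "indexing big data for fast query",
    3: "data analysis and mining",
}

index = defaultdict(set)                  # term -> ids of docs containing it
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

def query_and(*terms):
    """Docs containing every term (conjunctive boolean query)."""
    sets = [index[t] for t in terms]
    return set.intersection(*sets) if sets else set()

print(sorted(query_and("big", "data")))   # -> [1, 2]
```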

278 citations


Journal ArticleDOI
TL;DR: NGSUtils is a suite of software tools for manipulating data common to next-generation sequencing experiments, such as FASTQ, BED and BAM format files, that provide a stable and modular platform for data management and analysis.
Abstract: Summary: NGSUtils is a suite of software tools for manipulating data common to next-generation sequencing experiments, such as FASTQ, BED and BAM format files. These tools provide a stable and modular platform for data management and analysis. Availability and implementation: NGSUtils is available under a BSD license and works on Mac OS X and Linux systems. Python 2.6+ and virtualenv are required. More information and source code may be obtained from the website: http://ngsutils.org. Contact: yunliu@iupui.edu Supplemental information: Supplementary data are available at Bioinformatics online.
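For readers unfamiliar with the formats named above, FASTQ is the simplest: four text lines per sequencing read. The parser below is an illustrative sketch (file name invented), not part of NGSUtils itself.

```python
def read_fastq(path):
    """Yield (read_id, sequence, quality_string) per FASTQ record."""
    with open(path) as fh:
        while True:
            header = fh.readline().rstrip()
            if not header:
                return                       # end of file
            seq = fh.readline().rstrip()     # the base calls
            fh.readline()                    # '+' separator line
            qual = fh.readline().rstrip()    # per-base quality scores
            yield header.lstrip("@"), seq, qual

# Example: count reads and compute mean read length.
n = total = 0
for _read_id, seq, _qual in read_fastq("sample.fastq"):
    n += 1
    total += len(seq)
print(n, total / n if n else 0)
```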

253 citations


Posted Content
TL;DR: This paper focuses on solving the k-nearest neighbor (kNN) query problem over an encrypted database outsourced to a cloud: a user issues an encrypted query record to the cloud, and the cloud returns the k closest records to the user.
Abstract: For the past decade, query processing on relational data has been studied extensively, and many theoretical and practical solutions to query processing have been proposed under various scenarios. With the recent popularity of cloud computing, users now have the opportunity to outsource their data as well as the data management tasks to the cloud. However, due to the rise of various privacy issues, sensitive data (e.g., medical records) need to be encrypted before outsourcing to the cloud. In addition, query processing tasks should be handled by the cloud; otherwise, there would be no point in outsourcing the data in the first place. Processing queries over encrypted data without the cloud ever decrypting the data is a very challenging task. In this paper, we focus on solving the k-nearest neighbor (kNN) query problem over an encrypted database outsourced to a cloud: a user issues an encrypted query record to the cloud, and the cloud returns the k closest records to the user. We first present a basic scheme and demonstrate that such a naive solution is not secure. To provide better security, we propose a secure kNN protocol that protects the confidentiality of the data, the user's input query, and data access patterns. Also, we empirically analyze the efficiency of our protocols through various experiments. These results indicate that our secure protocol is very efficient on the user end, and this lightweight scheme allows a user to use any mobile device to perform the kNN query.
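For orientation, this is the plaintext computation the paper's protocol protects; the secure version performs the distance comparisons over encrypted records. The sketch below only fixes the functional behavior, with illustrative coordinates.

```python
import heapq
import math

def knn(records, query, k):
    """Return the k records closest to `query` by Euclidean distance."""
    return heapq.nsmallest(k, records, key=lambda r: math.dist(r, query))

db = [(41.2, 2.3), (40.9, 2.1), (39.5, 5.0), (42.0, 2.2)]
print(knn(db, query=(41.0, 2.2), k=2))
# -> [(40.9, 2.1), (41.2, 2.3)]
```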

250 citations


Book
01 Jan 2013
TL;DR: Big Data For Dummies cuts through the confusion and helps you take charge of big data solutions for your organization.
Abstract: Find the right big data solution for your business or organization. Big data management is one of the major challenges facing business, industry, and not-for-profit organizations. Data sets such as customer transactions for a mega-retailer, weather patterns monitored by meteorologists, or social network activity can quickly outpace the capacity of traditional data management tools. If you need to develop or manage big data solutions, you'll appreciate how these four experts define, explain, and guide you through this new and often confusing concept. You'll learn what it is, why it matters, and how to choose and implement solutions that work. Effectively managing big data is an issue of growing importance to businesses, not-for-profit organizations, government, and IT professionals. The authors are experts in information management, big data, and a variety of solutions. The book explains big data in detail and discusses how to select and implement a solution, security concerns to consider, data storage and presentation issues, analytics, and much more. It provides essential information in a no-nonsense, easy-to-understand style that is empowering. Big Data For Dummies cuts through the confusion and helps you take charge of big data solutions for your organization.

230 citations


Journal ArticleDOI
TL;DR: Gaps in knowledge, skills, and confidence were significant constraints, with near-universal support for including bibliometrics and particularly data management in professional education and continuing development programs; the study also found that librarians need a multilayered understanding of the research environment.
Abstract: Developments in network technologies, scholarly communication, and national policy are challenging academic libraries to find new ways to engage with research communities in the economic downturn. Librarians are responding with service innovations in areas such as bibliometrics and research data management. Previous surveys have investigated research data support within North America and other research services globally with small samples. An online multiple-choice questionnaire was used to survey bibliometric and data support activities of 140 libraries in Australia, New Zealand, Ireland, and the United Kingdom, including current and planned services, target audiences, service constraints, and staff training needs. A majority of respondents offered or planned bibliometrics training, citation reports, and impact calculations but with significant differences between countries. Current levels of engagement in data management were lower than for bibliometrics, but a majority anticipated future involvement, especially in technology assistance, data deposit, and policy development. Initiatives were aimed at multiple constituencies, with university administrators being important clients and partners for bibliometric services. Gaps in knowledge, skills, and confidence were significant constraints, with near-universal support for including bibliometrics and particularly data management in professional education and continuing development programs. The study also found that librarians need a multilayered understanding of the research environment.

Journal ArticleDOI
14 Nov 2013-Sensors
TL;DR: This paper surveys the data management solutions proposed for IoT, and proposes a data management framework for IoT that takes into consideration the discussed design elements and acts as a seed to a comprehensive IoT data management solution.
Abstract: The Internet of Things (IoT) is a networking paradigm where interconnected, smart objects continuously generate data and transmit it over the Internet. Many of the IoT initiatives are geared towards manufacturing low-cost and energy-efficient hardware for these objects, as well as the communication technologies that provide object interconnectivity. However, the solutions to manage and utilize the massive volume of data produced by these objects are yet to mature. Traditional database management solutions fall short in satisfying the sophisticated application needs of an IoT network that has a truly global scale. Current solutions for IoT data management address partial aspects of the IoT environment with special focus on sensor networks. In this paper, we survey the data management solutions that are proposed for IoT or subsystems of the IoT. We highlight the distinctive design primitives that we believe should be addressed in an IoT data management solution, and discuss how they are approached by the proposed solutions. We finally propose a data management framework for IoT that takes into consideration the discussed design elements and acts as a seed to a comprehensive IoT data management solution. The framework we propose adopts a federated, data- and sources-centric approach to link the diverse Things with their abundance of data to the potential applications and services that are envisioned for IoT.

Journal ArticleDOI
11 Jul 2013-Water
TL;DR: The paper addresses the role of real-time data in customer engagement and demand management; data ownership, sharing and privacy; technical data management and infrastructure security; utility workforce skills; and costs and benefits of implementation.
Abstract: This paper reviews the drivers, development and global deployment of intelligent water metering in the urban context. Recognising that intelligent metering (or smart metering) has the potential to revolutionise customer engagement and management of urban water by utilities, this paper provides a summary of the knowledge-base for researchers and industry practitioners to ensure that the technology fosters sustainable urban water management. To date, roll-outs of intelligent metering have been driven by the desire for increased data regarding time of use and end-use (such as use by shower, toilet, garden, etc.) as well as by the ability of the technology to reduce labour costs for meter reading. Technology development in the water sector generally lags that seen in the electricity sector. In the coming decade, the deployment of intelligent water metering will transition from being predominantly "pilot or demonstration scale" with the occasional city-wide roll-out, to broader mainstream implementation. This means that issues which have hitherto received little focus must now be addressed, namely: the role of real-time data in customer engagement and demand management; data ownership, sharing and privacy; technical data management and infrastructure security; utility workforce skills; and costs and benefits of implementation.

Journal ArticleDOI
TL;DR: Integrative meta-analysis of expression data (INMEX) is introduced, a user-friendly web-based tool designed to support meta- analysis of multiple gene-expression data sets, as well as to enable integration of data sets from gene expression and metabolomics experiments.
Abstract: The widespread applications of various ‘omics’ technologies in biomedical research together with the emergence of public data repositories have resulted in a plethora of data sets for almost any given physiological state or disease condition. Properly combining or integrating these data sets with similar basic hypotheses can help reduce study bias, increase statistical power and improve overall biological understanding. However, the difficulties in data management and the complexities of analytical approaches have significantly limited data integration to enable meta-analysis. Here, we introduce integrative meta-analysis of expression data (INMEX), a user-friendly web-based tool designed to support meta-analysis of multiple gene-expression data sets, as well as to enable integration of data sets from gene expression and metabolomics experiments. INMEX contains three functional modules. The data preparation module supports flexible data processing, annotation and visualization of individual data sets. The statistical analysis module allows researchers to combine multiple data sets based on P values, effect sizes, rank orders and other features. The significant genes can be examined in the functional analysis module for enriched Gene Ontology terms or Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, or expression profile visualization. INMEX has built-in support for common gene/metabolite identifiers (IDs), as well as 45 popular microarray platforms for human, mouse and rat. Complex operations are performed through a user-friendly web interface in a step-by-step manner. INMEX is freely available at http://www.inmex.ca.
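One of the P value combination rules such tools support is Fisher's method. The sketch below shows the textbook statistic; the study P values are invented for illustration, and INMEX's exact implementation may differ.

```python
import math
from scipy.stats import chi2

def fisher_combine(pvalues):
    """Fisher's method: X^2 = -2 * sum(ln p_i), chi-squared with 2k df."""
    statistic = -2.0 * sum(math.log(p) for p in pvalues)
    return chi2.sf(statistic, df=2 * len(pvalues))   # upper-tail probability

# The same gene tested in three independent expression data sets:
print(fisher_combine([0.04, 0.01, 0.20]))   # ~0.004, significant overall
```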

Patent
16 May 2013
TL;DR: In this article, the authors present a method for generating a database query for stored information and then generating an Online Analytical Processing (OLAP) element to represent the information received from the query.
Abstract: A method is provided in one example and includes generating a query for a database for information stored in the database. The information relates to data discovered through a capture system. The method further includes generating an Online Analytical Processing (OLAP) element to represent information received from the query. A rule based on the OLAP element is generated and the rule affects data management for one or more documents that satisfy the rule. In more specific embodiments, the method further includes generating a capture rule that defines items the capture system should capture. The method also includes generating a discovery rule that defines objects the capture system should register. In still other embodiments, the method includes developing a policy based on the rule, where the policy identifies how one or more documents are permitted to traverse a network.

Journal ArticleDOI
TL;DR: This review assesses diversity in how the term is used and highlights ambiguities; it argues that real-world assessments of the value of experimentation within a management framework, as well as of identified challenges and pathologies, are needed.
Abstract: Adaptive management (AM) emerged in the literature in the mid-1970s in response both to a realization of the extent of uncertainty involved in management, and a frustration with attempts to use modelling to integrate knowledge and make predictions. The term has since become increasingly widely used in scientific articles, policy documents and management plans, but both understanding and application of the concept is mixed. This paper reviews recent literature from conservation and natural resource management journals to assess diversity in how the term is used, highlight ambiguities and consider how the concept might be further assessed. AM is currently being used to describe many different management contexts, scales and locations. Few authors define the term explicitly or describe how it offers a means to improve management outcomes in their specific management context. Many do not adhere to the idea as it was originally conceived, despite citing seminal work. Significant confusion exists over the distinction between active and passive approaches. Over half of the studies reporting to implement AM claimed to have done so successfully, yet none quantified specific benefits, or costs, in relation to possible alternatives. Similarly, those studies reporting to assess the approach did so only in relation to specific models and their parameterizations; none assessed the benefits or costs of AM in the field. AM is regarded by some as an effective and well-established framework to support the management of natural resources, yet by others as a concept difficult to realize and fraught with implementation challenges; neither of these observations is wholly accurate. From a scientific and technical perspective many practical questions remain; in particular, real-world assessments of the value of experimentation within a management framework, as well as of identified challenges and pathologies, are needed. Further discussion and systematic assessment of the approach is required, together with greater attention to its definition and description, enabling the assessment of new approaches to managing uncertainty, and AM itself.

Journal ArticleDOI
Yi Jiao, Yinghui Wang, Shaohua Zhang, Yin Li, Baoming Yang, Lei Yuan
TL;DR: This study presents a novel cloud approach that, focusing on China's special construction requirements, proposes a series of as-built BIM tools and a self-organised application model that correlates project engineering data and project management data through a seamless BIM and BSNS (business social networking services) federation.

Journal ArticleDOI
TL;DR: Serious consideration of both the similarities and dissimilarities among disciplines will help guide academic librarians and other data curation professionals in developing a range of data-management services that can be tailored to the unique needs of different scholarly researchers.
Abstract: Academic librarians are increasingly engaging in data curation by providing infrastructure (e.g., institutional repositories) and offering services (e.g., data management plan consultations) to support the management of research data on their campuses. Efforts to develop these resources may benefit from a greater understanding of disciplinary differences in research data management needs. After conducting a survey of data management practices and perspectives at our research university, we categorized faculty members into four research domains—arts and humanities, social sciences, medical sciences, and basic sciences—and analyzed variations in their patterns of survey responses. We found statistically significant differences among the four research domains for nearly every survey item, revealing important disciplinary distinctions in data management actions, attitudes, and interest in support services. Serious consideration of both the similarities and dissimilarities among disciplines will help guide academic librarians and other data curation professionals in developing a range of data-management services that can be tailored to the unique needs of different scholarly researchers.

Journal ArticleDOI
04 Nov 2013-PLOS ONE
TL;DR: The re3data.org project provides an overview of the heterogeneous research data repository (RDR) landscape and presents a typology of institutional, disciplinary, multidisciplinary and project-specific RDRs.
Abstract: Researchers require infrastructures that ensure a maximum of accessibility, stability and reliability to facilitate working with and sharing of research data. Such infrastructures are being increasingly summarized under the term Research Data Repositories (RDR). The project re3data.org (Registry of Research Data Repositories) began to index research data repositories in 2012 and offers researchers, funding organizations, libraries and publishers an overview of the heterogeneous research data repository landscape. As of July 2013, re3data.org lists 400 research data repositories and counting. 288 of these are described in detail using the re3data.org vocabulary. Information icons help researchers to easily identify an adequate repository for the storage and reuse of their data. This article describes the heterogeneous RDR landscape and presents a typology of institutional, disciplinary, multidisciplinary and project-specific RDRs. Further, the article outlines the features of re3data.org and shows how this registry helps to identify appropriate repositories for storage and search of research data.

Journal ArticleDOI
TL;DR: A systematic review of papers pertaining to the application of knowledge-driven systems in support of emergency management that have been published in the last two decades concludes that only limited work has been done in three EMIS-knowledge management system (KMS) subdomains.

Journal ArticleDOI
TL;DR: The results of a survey conducted by the working groups of the DataONE project are used to present a new understanding of challenges to the development of global data collections and preservation by systematically examining the determinants of the researchers' likelihood to openly publish research data.

Journal ArticleDOI
TL;DR: A framework of business analytics for supply chain analytics (SCA) as IT-enabled, analytical dynamic capabilities composed of data management capability, analytical supply chain process capability, and supply chain performance management capability is proposed.
Abstract: Supply chain management has become more important as an academic topic due to trends in globalization leading to massive reallocation of production related advantages. Because of the massive amount of data that is generated in the global economy, new tools need to be developed in order to manage and analyze the data, as well as to monitor organizational performance worldwide. This paper proposes a framework of business analytics for supply chain analytics (SCA) as IT-enabled, analytical dynamic capabilities composed of data management capability, analytical supply chain process capability, and supply chain performance management capability. This paper also presents a dynamic-capabilities view of SCA and extensively describes a set of its three capabilities: data management capability, analytical supply chain process capability, and supply chain performance management capability. Next, using the SCM best practice, sales & operations planning (S&OP), the paper demonstrates opportunities to apply SCA in an integrated way. In discussing the implications of the proposed framework, finally, the paper examines several propositions predicting the positive impact of SCA and its individual capability on SCM performance.

Patent
15 Mar 2013
TL;DR: In this paper, the authors present a system for enabling read/write operations between near field communication (NFC) devices at multiple levels of access authorization, including access control, authentication, and authorization.
Abstract: Systems, devices, methods, and programming products for enabling read/write operations between near field communication (NFC) devices at multiple levels of access authorization.

Journal ArticleDOI
TL;DR: The paper describes soil sampling methods and technology applications; field and yield mapping with GPS and GIS; harvesters and future research in robotic harvesting; food processing and packaging technology such as traceability and the status of RFID networking research; applications of sensor networks; data management and execution systems; and automation and control standards such as fieldbus systems and OMAC guidelines.

01 Jan 2013
TL;DR: The Neuroscience Gateway hides or eliminates, from the point of view of the users, all the administrative and technical barriers; it makes parallel neuronal simulation tools easily available and accessible on complex high performance computing machines and handles the running of jobs and data management and retrieval.
Abstract: The last few decades have seen the emergence of computational neuroscience as a mature field in which researchers are interested in modeling complex and large neuronal systems and require access to high performance computing machines and associated cyberinfrastructure to manage computational workflows and data. The neuronal simulation tools used in this research field are also implemented for parallel computers and are suitable for high performance computing machines. But using these tools on complex high performance computing machines remains a challenge, due to issues with acquiring computer time on machines located at national supercomputer centers, dealing with the complex user interfaces of these machines, and handling data management and retrieval. The Neuroscience Gateway is being developed to alleviate all of these barriers to entry for computational neuroscientists. It hides or eliminates, from the point of view of the users, all the administrative and technical barriers, makes parallel neuronal simulation tools easily available and accessible on complex high performance computing machines, and handles the running of jobs and data management and retrieval. This paper describes the architecture it is based on, how it is implemented, and how users can use it for computational neuroscience research using high performance computing at the back end.

Journal ArticleDOI
TL;DR: This paper describes the design of e-SC, its API and its use in three different case studies: spectral data visualization, medical data capture and analysis, and chemical property prediction.
Abstract: This paper describes the e-Science Central (e-SC) cloud data processing system and its application to a number of e-Science projects. e-SC provides both software as a service (SaaS) and platform as a service for scientific data management, analysis and collaboration. It is a portable system and can be deployed on both private (e.g. Eucalyptus) and public clouds (Amazon AWS and Microsoft Windows Azure). The SaaS application allows scientists to upload data, edit and run workflows and share results in the cloud, using only a Web browser. It is underpinned by a scalable cloud platform consisting of a set of components designed to support the needs of scientists. The platform is exposed to developers so that they can easily upload their own analysis services into the system and make these available to other users. A representational state transfer-based application programming interface (API) is also provided so that external applications can leverage the platform's functionality, making it easier to build scalable, secure cloud-based applications. This paper describes the design of e-SC, its API and its use in three different case studies: spectral data visualization, medical data capture and analysis, and chemical property prediction.
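To make the REST API concrete, here is a hypothetical client sketch. The base URL, endpoint paths, credentials, and JSON fields are all invented for illustration; the real e-SC API defines its own resource names.

```python
import requests

BASE = "https://esc.example.org/api"      # assumed deployment URL
AUTH = ("alice", "secret")                # assumed basic-auth credentials

# Upload a data file, then launch a workflow over it.
with open("spectra.csv", "rb") as fh:
    doc = requests.post(f"{BASE}/documents", files={"file": fh},
                        auth=AUTH, timeout=30).json()

run = requests.post(f"{BASE}/workflows/spectral-viz/invocations",
                    json={"inputs": {"data": doc["id"]}},
                    auth=AUTH, timeout=30).json()
print("invocation id:", run["id"])
```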


Proceedings ArticleDOI
22 Jul 2013
TL;DR: Repositories need to address the missing dimensions of context that data reusers need, especially those related to research design, in order to better support data reuse in archaeology.
Abstract: Field archaeology only recently developed centralized systems for data curation, management, and reuse. Data documentation guidelines, standards, and ontologies have yet to see wide adoption in this discipline. Moreover, repository practices have focused on supporting data collection, deposit, discovery, and access more than data reuse. In this paper we examine the needs of archaeological data reusers, particularly the context they need to understand, verify, and trust data others collect during field studies. We then apply our findings to the existing work on standards development. We find that archaeologists place the most importance on data collection procedures, but the reputation and scholarly affiliation of the archaeologists who conducted the original field studies, the wording and structure of the documentation created during field work, and the repository where the data are housed also inform reuse. While guidelines, standards, and ontologies address some aspects of the context data reusers need, they provide less guidance on others, especially those related to research design. We argue repositories need to address these missing dimensions of context to better support data reuse in archaeology.

Journal ArticleDOI
TL;DR: In this article, the authors present a new paradigm for adaptive management that shows that there are no categorical limitations to its appropriate use, the boundaries of application being defined by problem conception and the resources available to managers.
Abstract: Uncertainty is a pervasive feature in natural resource management. Adaptive management (AM), an approach that focuses on identifying critical uncertainties to be reduced via diagnostic management experiments, is one favored approach for tackling this reality. While adaptive management is identified as a key method in the environmental management toolbox, there remains a lack of clarity over when its use is appropriate or feasible. Its implementation is often viewed as suitable only in a limited set of circumstances. Here we restructure some of the ideas supporting this view, and show why much of the pessimism around AM may be unwarranted. We present a new framework for deciding when AM is appropriate, feasible, and subsequently successful. We thus present a new paradigm for adaptive management that shows that there are no categorical limitations to its appropriate use, the boundaries of application being defined by problem conception and the resources available to managers. In doing so we also separate adaptive management as a management tool, from the burden of failures that result from the complex policy, social, and institutional environment within which management occurs.

Proceedings ArticleDOI
20 Aug 2013
TL;DR: A layered reference model for IoT data management is presented, and the related research topics and solutions in each layer are elaborated to identify research challenges and opportunities for future work.
Abstract: Internet of Things (IoT) is an important part of the new generation of information technology. Data management for IoT plays a crucial role in its effective operation and has become a key research topic of IoT. Much work has been done to enable effective and intelligent data processing and analysis as IoT evolves from Radio Frequency Identification (RFID), Wireless Sensor Networks (WSN) and other related technologies. In this paper, we start from the core definition and architecture of IoT, aiming at examining current research efforts to derive a holistic view of existing literature. We present a layered reference model for IoT data management and elaborate the related research topics and solutions in each layer. Based on our analysis, we identify research challenges and opportunities for future work.