
Showing papers on "Data management published in 2007"


Journal ArticleDOI
TL;DR: Giovanni, the Goddard Earth Sciences Data and Information Services Center (GES DISC) Interactive Online Visualization and Analysis Infrastructure, has provided researchers with advanced capabilities to perform data exploration and analysis with observational data from NASA Earth observation satellites.
Abstract: Giovanni, the Goddard Earth Sciences Data and Information Services Center (GES DISC) Interactive Online Visualization and Analysis Infrastructure, has provided researchers with advanced capabilities to perform data exploration and analysis with observational data from NASA Earth observation satellites. In the past 5-10 years, examining geophysical events and processes with remote-sensing data required a multistep process of data discovery, data acquisition, data management, and ultimately data analysis. Giovanni accelerates this process by enabling basic visualization and analysis directly on the World Wide Web. In the last two years, Giovanni has added new data acquisition functions and expanded analysis options to increase its usefulness to the Earth science research community.

767 citations


Proceedings Article
23 Sep 2007
TL;DR: The results show that a vertically partitioned schema achieves similar performance to the property table technique while being much simpler to design, and that if a column-oriented DBMS is used instead of a row-oriented DBMS, another order of magnitude performance improvement is observed, with query times dropping from minutes to several seconds.
Abstract: Efficient management of RDF data is an important factor in realizing the Semantic Web vision. Performance and scalability issues are becoming increasingly pressing as Semantic Web technology is applied to real-world applications. In this paper, we examine the reasons why current data management solutions for RDF data scale poorly, and explore the fundamental scalability limitations of these approaches. We review the state of the art for improving performance for RDF databases and consider a recent suggestion, "property tables." We then discuss practically and empirically why this solution has undesirable features. As an improvement, we propose an alternative solution: vertically partitioning the RDF data. We compare the performance of vertical partitioning with prior art on queries generated by a Web-based RDF browser over a large-scale (more than 50 million triples) catalog of library data. Our results show that a vertically partitioned schema achieves similar performance to the property table technique while being much simpler to design. Further, if a column-oriented DBMS (a database architected specially for the vertically partitioned case) is used instead of a row-oriented DBMS, another order of magnitude performance improvement is observed, with query times dropping from minutes to several seconds.

716 citations
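
To make the vertical-partitioning idea concrete, here is a minimal sketch using Python's built-in sqlite3 with a toy dataset and hypothetical table and property names. A row store such as SQLite only illustrates the schema transformation, not the column-store speedups the paper reports.

```python
import sqlite3

# Toy illustration of vertically partitioning RDF triples: one two-column
# (subject, object) table per property instead of one wide triples table.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Conventional triple-store layout.
cur.execute("CREATE TABLE triples (subject TEXT, property TEXT, object TEXT)")
triples = [
    ("book1", "title", "Database Systems"),
    ("book1", "author", "Smith"),
    ("book2", "title", "Semantic Web"),
]
cur.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)

# Vertical partitioning: one table per property, indexed on subject so that
# multi-property queries become joins of small, narrow tables.
for prop in {p for _, p, _ in triples}:
    cur.execute(f"CREATE TABLE prop_{prop} (subject TEXT, object TEXT)")
    cur.execute(
        f"INSERT INTO prop_{prop} SELECT subject, object FROM triples WHERE property = ?",
        (prop,),
    )
    cur.execute(f"CREATE INDEX idx_{prop} ON prop_{prop}(subject)")

# "Subjects with both a title and an author" touches two narrow tables
# rather than self-joining the wide triples table.
cur.execute(
    "SELECT t.subject, t.object, a.object "
    "FROM prop_title t JOIN prop_author a ON t.subject = a.subject"
)
print(cur.fetchall())  # [('book1', 'Database Systems', 'Smith')]
```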


Journal ArticleDOI
TL;DR: In this article, the authors highlight the need to fully take into account the complexity of the systems to be managed and to give more attention to uncertainties in the management of water resources.
Abstract: The management of water resources is currently undergoing a paradigm shift toward a more integrated and participatory management style. This paper highlights the need to fully take into account the complexity of the systems to be managed and to give more attention to uncertainties. Achieving this requires adaptive management approaches that can more generally be defined as systematic strategies for improving management policies and practices by learning from the outcomes of previous management actions. This paper describes how the principles of adaptive water management might improve the conceptual and methodological base for sustainable and integrated water management in an uncertain and complex world. Critical debate is structured around four questions: (1) What types of uncertainty need to be taken into account in water management? (2) How does adaptive management account for uncertainty? (3) What are the characteristics of adaptive management regimes? (4) What is the role of social learning in managing change? Major transformation processes are needed because, in many cases, the structural requirements, e.g., adaptive institutions and a flexible technical infrastructure, for adaptive management are not available. In conclusion, we itemize a number of research needs and summarize practical recommendations based on the current state of knowledge.

691 citations


Proceedings Article
01 Jan 2007
TL;DR: This paper proposes a new data integration architecture, PAYGO, which is inspired by the concept of dataspaces and emphasizes pay-as-you-go data management as a means of achieving web-scale data integration.
Abstract: The World Wide Web is witnessing an increase in the amount of structured content – vast heterogeneous collections of structured data are on the rise due to the Deep Web, annotation schemes like Flickr, and sites like Google Base. While this phenomenon is creating an opportunity for structured data management, dealing with heterogeneity at web scale presents many new challenges. In this paper, we highlight these challenges in two scenarios – the Deep Web and Google Base. We contend that traditional data integration techniques are no longer valid in the face of such heterogeneity and scale. We propose a new data integration architecture, PAYGO, which is inspired by the concept of dataspaces and emphasizes pay-as-you-go data management as a means of achieving web-scale data integration.

357 citations


01 Jan 2007
TL;DR: This track is new and is growing very fast at the moment, assisted by new developments in IT.
Abstract: IT-Track KM = Management of Information. Researchers and practitioners in this field tend to have their education in computer and/or information science. They are involved in the construction of information management systems, AI, reengineering, groupware, etc. To them, Knowledge = Objects that can be identified and handled in information systems. This track is new and is growing very fast at the moment, assisted by new developments in IT.

299 citations


Journal ArticleDOI
TL;DR: A mobiscope is a federation of distributed mobile sensors into a taskable sensing system that achieves high-density sampling coverage over a wide area through mobility; mobiscopes extend the traditional sensor network model and introduce challenges in data management and integrity, privacy, and network system design.
Abstract: Mobiscopes extend the traditional sensor network model, introducing challenges in data management and integrity, privacy, and network system design. Researchers need an architecture and general methodology for designing future mobiscopes. A mobiscope is a federation of distributed mobile sensors into a taskable sensing system that achieves high-density sampling coverage over a wide area through mobility.

277 citations


Journal ArticleDOI
TL;DR: The HUPO-PSI has further developed the PSI-MI XML schema to enable the description of interactions between a wider range of molecular types, for example nucleic acids, chemical entities, and molecular complexes.
Abstract: Molecular interaction information is a key resource in modern biomedical research. Publicly available data have previously been provided in a broad array of diverse formats, making access to them very difficult. The publication and wide implementation of the Human Proteome Organisation Proteomics Standards Initiative Molecular Interactions (HUPO PSI-MI) format in 2004 was a major step towards the establishment of a single, unified format by which molecular interactions should be presented, but it focused purely on protein-protein interactions. The HUPO-PSI has further developed the PSI-MI XML schema to enable the description of interactions between a wider range of molecular types, for example nucleic acids, chemical entities, and molecular complexes. Extensive details about each supported molecular interaction can now be captured, including the biological role of each molecule within that interaction, detailed description of interacting domains, and the kinetic parameters of the interaction. The format is supported by data management and analysis tools and has been adopted by major interaction data providers. Additionally, a simpler, tab-delimited format, MITAB2.5, has been developed for the benefit of users who require only minimal information in an easy-to-access configuration. The PSI-MI XML2.5 and MITAB2.5 formats have been jointly developed by interaction data producers and providers from both the academic and commercial sectors, and are already widely implemented and well supported by an active development community. PSI-MI XML2.5 enables the description of highly detailed molecular interaction data and facilitates data exchange between databases and users without loss of information. MITAB2.5 is a simpler format appropriate for fast Perl parsing or loading into Microsoft Excel.

274 citations
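
Because MITAB2.5 is tab-delimited, minimal parsing needs no special tooling. The sketch below assumes the standard 15-column MITAB2.5 layout with informal column names of my own choosing, and the example record is invented; this is an illustration, not an official PSI parser.

```python
# Minimal MITAB2.5 reader, assuming the standard 15 tab-separated columns.
# Column names here are informal labels, not official PSI identifiers.
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_id_a", "alt_id_b", "alias_a", "alias_b",
    "detection_method", "first_author", "publication_id",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_id", "confidence",
]

def parse_mitab25_line(line: str) -> dict:
    """Split one MITAB2.5 record into a column-name -> value dict."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(MITAB25_COLUMNS):
        raise ValueError(f"expected {len(MITAB25_COLUMNS)} columns, got {len(fields)}")
    return dict(zip(MITAB25_COLUMNS, fields))

# Invented example record for illustration only.
example = "\t".join([
    "uniprotkb:P12345", "uniprotkb:Q67890", "-", "-", "-", "-",
    'psi-mi:"MI:0018"(two hybrid)', "Doe J. (2007)", "pubmed:12345678",
    "taxid:9606", "taxid:9606", 'psi-mi:"MI:0915"(physical association)',
    'psi-mi:"MI:0469"(intact)', "intact:EBI-0000001", "-",
])
record = parse_mitab25_line(example)
print(record["id_a"], "--", record["interaction_type"])
```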


Patent
10 Oct 2007
TL;DR: A diabetes data management system selects variable parameters and one or more devices whose data are utilized in a report, analyzes data during a selected period, and generates reports highlighting carbohydrate, insulin, and glucose data around and during meal events and other user-defined events.
Abstract: A diabetes data management system selects variable parameters and one or more devices whose data are utilized in a report. The diabetes data management system analyzes data during a selected period. The system generates reports which highlight data from one or more devices during the selected period, including carbohydrate, insulin, and glucose data; reports which highlight data around and during meal events and other user-defined events; reports which overlay multiple data based on time of day and other factors; and automatically prepared logbook reports.

249 citations
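
One of the report types described above, highlighting data around and during meal events, can be sketched with plain Python. The two-hour window, field layout, and sample readings below are illustrative assumptions, not details taken from the patent.

```python
from datetime import datetime, timedelta

# Illustrative sketch: collect glucose readings within a window around each
# meal event. Window size, field layout, and sample data are assumptions.
WINDOW = timedelta(hours=2)

glucose_readings = [  # (timestamp, mg/dL)
    (datetime(2007, 1, 1, 7, 30), 110),
    (datetime(2007, 1, 1, 8, 45), 165),
    (datetime(2007, 1, 1, 12, 10), 95),
]
meal_events = [datetime(2007, 1, 1, 8, 0), datetime(2007, 1, 1, 12, 30)]

def readings_around(meal, readings, window=WINDOW):
    """Return readings taken within +/- window of the meal event."""
    return [(ts, mgdl) for ts, mgdl in readings if abs(ts - meal) <= window]

for meal in meal_events:
    rows = readings_around(meal, glucose_readings)
    print(meal.strftime("%H:%M"), "->", [(ts.strftime("%H:%M"), v) for ts, v in rows])
```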


Journal ArticleDOI
TL;DR: The authors hypothesize and empirically test the belief that more integration is needed between technologies intended to support knowledge and those supporting business operations, which leads to a revised approach to developing organizational knowledge management systems.

232 citations



Journal ArticleDOI
TL;DR: In this paper, the authors present a resource dependence framework for network management that can encompass the existing models and their new data on the environment in which network management occurs, as well as a series of propositions that flow from their reconsideration of network management.
Abstract: Although policy and collaborative networks have been studied since the 1970s and 1980s, only recently has the management of these entities come under greater scrutiny. Studies of “network management” are designed to better understand the unique challenges of operating in a context where bureaucracy no longer provides the primary tool for “social steering.” These studies typically make three assumptions about networks, public managers, and the tasks of network management that empirical evidence from our casework in “Newstatia” suggests are suspect at best. If so, then network management theory needs to be reconsidered. The second half of this article begins this process. We have organized this article into six sections. The first defines policy and collaborative networks and discusses why analyzing them and their management independently is probably flawed. The second presents our data and justifications for believing the assumptions outlined above are oversimplifications. The third section reviews three perspectives and two partial models of network management and points out how the perspectives and models need integration. The fourth section develops a resource dependence framework for network management that can encompass the existing models and our new data on the environment in which network management occurs. The final section outlines a series of propositions that flow from our reconsideration of network management.

Journal ArticleDOI
TL;DR: The vision of a worldwide sensor Web is close to becoming a reality with the rapidly increasing number of large-scale sensor network deployments.
Abstract: Harvesting the benefits of a sensor-rich world presents many data management challenges. Recent advances in research and industry aim to address these challenges. With the rapidly increasing number of large-scale sensor network deployments, the vision of a worldwide sensor Web is close to becoming a reality.

Proceedings ArticleDOI
26 Mar 2007
TL;DR: A brief overview of RFID technology is provided and a few of the data management challenges that are suitable topics for exploratory research are highlighted.
Abstract: Radio-frequency identification (RFID) technology promises to revolutionize the way we track items in supply chain, retail store, and asset management applications. The size and different characteristics of RFID data pose many interesting challenges for current data management systems. In this paper, we provide a brief overview of RFID technology and highlight a few of the data management challenges that we believe are suitable topics for exploratory research.
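
As one concrete illustration of the data characteristics alluded to above (not an example taken from the paper), raw RFID streams typically contain many redundant reads of the same tag, and a common preprocessing step is to collapse reads of the same tag at the same reader that arrive within a short window. The window length and record layout below are assumptions.

```python
from datetime import datetime, timedelta

# Illustrative sketch: suppress duplicate RFID reads of the same (tag, reader)
# pair that arrive within a short window of the previous read.
DEDUP_WINDOW = timedelta(seconds=5)  # assumed smoothing window

def deduplicate(reads):
    """reads: time-ordered iterable of (timestamp, tag_id, reader_id)."""
    last_seen = {}  # (tag_id, reader_id) -> timestamp of the most recent read
    cleaned = []
    for ts, tag, reader in reads:
        key = (tag, reader)
        if key not in last_seen or ts - last_seen[key] > DEDUP_WINDOW:
            cleaned.append((ts, tag, reader))
        last_seen[key] = ts
    return cleaned

raw = [
    (datetime(2007, 3, 26, 10, 0, 0), "EPC-001", "dock-door-1"),
    (datetime(2007, 3, 26, 10, 0, 2), "EPC-001", "dock-door-1"),  # duplicate read
    (datetime(2007, 3, 26, 10, 0, 9), "EPC-001", "dock-door-1"),  # new sighting
]
print(deduplicate(raw))  # keeps the first and third reads
```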

Journal ArticleDOI
TL;DR: This survey analyzes multiresolution terrain-rendering approaches that exploit a certain semi-regularity of the data and discusses LOD error metrics and system-level data management aspects, including dynamic scene management, out-of-core data organization and compression, and numerical accuracy.
Abstract: Rendering high quality digital terrains at interactive rates requires carefully crafted algorithms and data structures able to balance the competing requirements of realism and frame rates, while taking into account the memory and speed limitations of the underlying graphics platform. In this survey, we analyze multiresolution approaches that exploit a certain semi-regularity of the data. These approaches have produced some of the most efficient systems to date. After providing a short background and motivation for the methods, we focus on illustrating models based on tiled blocks and nested regular grids, quadtrees and triangle bin-trees triangulations, as well as cluster-based approaches. We then discuss LOD error metrics and system-level data management aspects of interactive terrain visualization, including dynamic scene management, out-of-core data organization and compression, as well as numerical accuracy.
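
The quadtree and tiled-block approaches surveyed above share a common selection loop: refine a tile only while its view-dependent error exceeds a tolerance. The sketch below is a generic illustration of that pattern with a made-up error model and constants; it does not reproduce any particular system from the survey.

```python
from dataclasses import dataclass

# Generic quadtree LOD selection: split a tile while its distance-scaled
# error exceeds a tolerance. Error model and constants are assumptions.
@dataclass
class Tile:
    x: float
    y: float
    size: float
    geometric_error: float

def view_error(tile, viewer, scale=100.0):
    """Crude view-dependent error: geometric error shrinks with distance."""
    dx = tile.x + tile.size / 2 - viewer[0]
    dy = tile.y + tile.size / 2 - viewer[1]
    distance = max((dx * dx + dy * dy) ** 0.5, 1e-6)
    return scale * tile.geometric_error / distance

def select_tiles(tile, viewer, tolerance=1.0, max_depth=6, depth=0):
    """Return the tiles to render, refined more finely near the viewer."""
    if depth == max_depth or view_error(tile, viewer) <= tolerance:
        return [tile]
    half = tile.size / 2
    children = [
        Tile(tile.x + dx * half, tile.y + dy * half, half, tile.geometric_error / 2)
        for dx in (0, 1) for dy in (0, 1)
    ]
    selected = []
    for child in children:
        selected.extend(select_tiles(child, viewer, tolerance, max_depth, depth + 1))
    return selected

tiles = select_tiles(Tile(0, 0, 1024, geometric_error=32.0), viewer=(100.0, 100.0))
print(len(tiles), "tiles selected; finer tiles cluster near the viewer")
```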

Journal ArticleDOI
TL;DR: Encyclopedia of Communities of Practice in Information and Knowledge Management.

Proceedings Article
05 Jan 2007
TL;DR: This paper characterize the performance of a commercial database server running on emerging chip multiprocessor technologies and finds that the major bottleneck of current software is data cache stalls, with L2 hit stalls rising from oblivion to become the dominant execution time component in some cases.
Abstract: Prior research shows that database system performance is dominated by off-chip data stalls, resulting in a concerted effort to bring data into on-chip caches. At the same time, high levels of integration have enabled the advent of chip multiprocessors and increasingly large (and slow) on-chip caches. These two trends pose the imminent technical and research challenge of adapting high-performance data management software to a shifting hardware landscape. In this paper we characterize the performance of a commercial database server running on emerging chip multiprocessor technologies. We find that the major bottleneck of current software is data cache stalls, with L2 hit stalls rising from oblivion to become the dominant execution time component in some cases. We analyze the source of this shift and derive a list of features for future database designs to attain maximum performance.

Proceedings Article
01 Jan 2007
TL;DR: This work articulates a vision of a storage-centric sensor network where sensor nodes will be equipped with high-capacity and energy-efficient local flash storage, and describes how StonesDB enables this vision through a number of innovative features including energy-efficient use of flash memory, multi-resolution storage and aging, query processing, and intelligent caching.
Abstract: Data management in wireless sensor networks has been an area of significant research in recent years. Many existing sensor data management systems view sensor data as a continuous stream that is sensed, filtered, processed, and aggregated as it “flows” from sensors to users. We argue that technology trends in flash memories and embedded platforms call for re-thinking this architecture. We articulate a vision of a storage-centric sensor network where sensor nodes will be equipped with high-capacity and energy-efficient local flash storage. We argue that the data management infrastructure will need substantial redesign to fully exploit the presence of local storage and processing capability in order to reduce expensive communication. We then describe how StonesDB enables this vision through a number of innovative features including energy-efficient use of flash memory, multi-resolution storage and aging, query processing, and intelligent caching.
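
The multi-resolution storage and aging feature can be illustrated generically: keep recent samples at full resolution and progressively coarser summaries for older data. The tier boundaries, downsampling factors, and averaging scheme below are assumptions for illustration, not StonesDB internals.

```python
# Illustrative multi-resolution aging: recent data at full resolution,
# older data downsampled by averaging. Tier sizes and factors are assumed.
def downsample(samples, factor):
    """Average consecutive groups of `factor` samples."""
    return [
        sum(samples[i:i + factor]) / len(samples[i:i + factor])
        for i in range(0, len(samples), factor)
    ]

def age_store(samples, recent=100, mid_factor=4, old_factor=16):
    """Split a time-ordered sample list into three resolution tiers."""
    recent_tier = samples[-recent:]
    middle_tier = downsample(samples[-3 * recent:-recent], mid_factor)
    old_tier = downsample(samples[:-3 * recent], old_factor)
    return {"old": old_tier, "middle": middle_tier, "recent": recent_tier}

readings = [float(i % 50) for i in range(1000)]  # fake sensor readings
tiers = age_store(readings)
print({name: len(values) for name, values in tiers.items()})
# {'old': 44, 'middle': 50, 'recent': 100}
```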

Journal ArticleDOI
TL;DR: The existing user interface frameworks and component technologies used in presentation integration are discussed, their strengths and weaknesses are illustrated, and some opportunities for future work are presented.
Abstract: Creating composite applications from reusable components is an important technique in software engineering and data management. Although a large body of research and development covers integration at the data and application levels, little work has been done to facilitate it at the presentation level. This article discusses the existing user interface frameworks and component technologies used in presentation integration, illustrates their strengths and weaknesses, and presents some opportunities for future work.

Patent
Hagit Perry1, Uri Ron1
21 Nov 2007
TL;DR: In this article, an electronic messager with a predictive text editor is described, including a storage unit that stores a data structure associating, for each of a user's contacts, usage data for the user's history of word usage in communications with that contact, and a data manager coupled with the storage unit for updating the data structure as additional communications with each contact are performed and additional usage data is obtained therefrom.
Abstract: An electronic messager with a predictive text editor, including a storage unit for storing a data structure associating, for each one of a plurality of a user's contacts, usage data for the user's history of usage of words in communications with the user contact; a data manager coupled with the storage unit for generating the data structure in the storage unit, and for updating the data structure as additional communications with each user contact are performed and additional usage data is obtained therefrom; and a text predictor coupled with the storage unit, for receiving as input a character string and a designated user contact, and for generating as output an ordered list of predicted words. A method is also described and claimed.
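
The claimed combination of a per-contact word-usage store and a prefix-based predictor can be sketched in a few lines. The class name, ranking rule, and sample messages below are illustrative assumptions, not the patent's wording.

```python
from collections import defaultdict

# Illustrative per-contact predictive text store: word usage counts per
# contact, plus a predictor that ranks completions for a given prefix.
class ContactTextPredictor:
    def __init__(self):
        # contact -> word -> usage count
        self.usage = defaultdict(lambda: defaultdict(int))

    def record_message(self, contact, text):
        """Update usage data as additional communications are performed."""
        for word in text.lower().split():
            self.usage[contact][word] += 1

    def predict(self, contact, prefix, limit=5):
        """Return an ordered list of predicted words for this contact."""
        matches = [
            (count, word)
            for word, count in self.usage[contact].items()
            if word.startswith(prefix.lower())
        ]
        ranked = sorted(matches, key=lambda cw: (-cw[0], cw[1]))
        return [word for _, word in ranked][:limit]

p = ContactTextPredictor()
p.record_message("alice", "meeting moved to monday morning")
p.record_message("alice", "monday works for the meeting")
p.record_message("bob", "movie tonight")
print(p.predict("alice", "m"))  # ['meeting', 'monday', 'morning', 'moved']
```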

Journal ArticleDOI
TL;DR: Themes identified in this study suggest that at least some common data management needs will best be served by improving access to basic-level tools so that researchers can solve their own problems.

Journal Article
TL;DR: It is argued that new methods for collecting social network structure, and the shift in scale of these networks, introduce a greater degree of imprecision that requires rethinking how SNA techniques can be applied.
Abstract: Social network analysis (SNA) has become a mature scientific field over the last 50 years and is now an area with massive commercial appeal and renewed research interest. In this paper, we argue that new methods for collecting social network structure, and the shift in scale of these networks, introduce a greater degree of imprecision that requires rethinking how SNA techniques can be applied. We discuss a new area in data management, probabilistic databases, whose main research goal is to provide tools to manage and manipulate imprecise or uncertain data. We outline the application building blocks necessary to build a large-scale social networking application and the extent to which current research in probabilistic databases addresses these challenges.
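
The flavor of imprecision discussed above can be illustrated with a tuple-independent model in which each social tie exists with some probability; an expected degree is then just a sum of edge probabilities. This is a generic sketch of the probabilistic-database idea, with made-up people and probabilities, not the specific system the paper describes.

```python
# Generic tuple-independent sketch of uncertain social-network edges.
uncertain_edges = [
    # (person_a, person_b, probability that the tie really exists)
    ("ann", "bob", 0.9),    # observed direct communication
    ("ann", "carol", 0.4),  # inferred from co-occurrence, less certain
    ("bob", "carol", 0.7),
]

def expected_degree(node, edges):
    """Expected number of ties incident to `node` under independence."""
    return sum(p for a, b, p in edges if node in (a, b))

def prob_any_tie(node, edges):
    """Probability that `node` has at least one existing tie."""
    prob_none = 1.0
    for a, b, p in edges:
        if node in (a, b):
            prob_none *= 1.0 - p
    return 1.0 - prob_none

print(expected_degree("ann", uncertain_edges))          # 1.3
print(round(prob_any_tie("ann", uncertain_edges), 3))   # 1 - 0.1 * 0.6 = 0.94
```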

Journal ArticleDOI
TL;DR: This study argues for the need to revise data quality metrics and measurement techniques to incorporate and better reflect contextual assessment, and develops new metrics for assessing data quality along commonly used dimensions - completeness, validity, accuracy, and currency.
Abstract: Data consumers assess quality within specific business contexts or decision tasks. The same data resource may have an acceptable level of quality for some contexts but this quality may be unacceptable for other contexts. However, existing data quality metrics are mostly derived impartially, disconnected from the specific contextual characteristics. This study argues for the need to revise data quality metrics and measurement techniques to incorporate and better reflect contextual assessment. It contributes to that end by developing new metrics for assessing data quality along commonly used dimensions - completeness, validity, accuracy, and currency. The metrics are driven by data utility, a conceptual measure of the business value that is associated with the data within a specific usage context. The suggested data quality measurement framework uses utility as a scaling factor for calculating quality measurements at different levels of data hierarchy. Examples are used to demonstrate the use of utility-driven assessment in real-world data management scenarios, and the broader implications for data management are discussed.
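
The utility-as-scaling-factor idea can be made concrete with a small completeness example: instead of the plain fraction of complete records, weight each record's contribution by its utility in the usage context. The weighting below is one straightforward reading of the abstract, with invented records and utilities, not the paper's exact formulation.

```python
# Utility-weighted completeness versus the plain completeness ratio.
# Records and utility weights are invented for illustration.
records = [
    # (customer_id, email_present, utility weight in an email-campaign context)
    ("c1", True, 0.9),   # high-value customer
    ("c2", False, 0.8),
    ("c3", True, 0.1),   # low-value customer
    ("c4", False, 0.1),
]

def plain_completeness(records):
    return sum(1 for _, present, _ in records if present) / len(records)

def utility_weighted_completeness(records):
    total_utility = sum(u for _, _, u in records)
    return sum(u for _, present, u in records if present) / total_utility

print(plain_completeness(records))                        # 0.5
print(round(utility_weighted_completeness(records), 3))   # 1.0 / 1.9 ≈ 0.526
```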

Journal ArticleDOI
TL;DR: In this article, the authors present the development of a Virtual Research Environment dedicated to the exploitation of intra-site Cultural Heritage data. It is built from open-source, Internet-oriented software modules, so users are not tied to particular desktop software and can register and consult data from different computers.

Journal ArticleDOI
TL;DR: The Purdue Ionomics Information Management System (PiiMS) provides integrated workflow control, data storage, and analysis to facilitate high-throughput data acquisition, along with integrated tools for data search, retrieval, and visualization for hypothesis development.
Abstract: The advent of high-throughput phenotyping technologies has created a deluge of information that is difficult to deal with without the appropriate data management tools. These data management tools should integrate defined workflow controls for genomic-scale data acquisition and validation, data storage and retrieval, and data analysis, indexed around the genomic information of the organism of interest. To maximize the impact of these large datasets, it is critical that they are rapidly disseminated to the broader research community, allowing open access for data mining and discovery. We describe here a system that incorporates such functionalities developed around the Purdue University high-throughput ionomics phenotyping platform. The Purdue Ionomics Information Management System (PiiMS) provides integrated workflow control, data storage, and analysis to facilitate high-throughput data acquisition, along with integrated tools for data search, retrieval, and visualization for hypothesis development. PiiMS is deployed as a World Wide Web-enabled system, allowing for integration of distributed workflow processes and open access to raw data for analysis by numerous laboratories. PiiMS currently contains data on shoot concentrations of P, Ca, K, Mg, Cu, Fe, Zn, Mn, Co, Ni, B, Se, Mo, Na, As, and Cd in over 60,000 shoot tissue samples of Arabidopsis (Arabidopsis thaliana), including ethyl methanesulfonate, fast-neutron and defined T-DNA mutants, and natural accession and populations of recombinant inbred lines from over 800 separate experiments, representing over 1,000,000 fully quantitative elemental concentrations. PiiMS is accessible at www.purdue.edu/dp/ionomics.

Patent
23 Mar 2007
TL;DR: In this article, a system for disease management is presented that employs diagnostic testing devices and medication delivery devices for providing data to a repository in real time and automatically; repository data can be analyzed to determine such information as actual test strip use, patient health parameters that fall outside prescribed ranges, testing and medication compliance, patient profiles or stakeholders to receive promotional items or incentives, and so on.
Abstract: Methods, devices and a system for disease management are provided that employ diagnostic testing devices (e.g., blood glucose meters) and medication delivery devices (e.g., insulin delivery devices) for providing data to a repository in real time and automatically. Repository data can be analyzed to determine such information as actual test strip use, patient health parameters that fall outside prescribed ranges, testing and medication delivery compliance, patient profiles or stakeholders to receive promotional items or incentives, and so on. Connected meters and medication delivery devices and repository data analysis are also employed to associate a diagnostic test with a mealtime based on the timing of a therapeutic intervention performed by an individual.

Journal ArticleDOI
TL;DR: An overview of generic knowledge management critical success factors is provided, in conjunction with an overview of the factors that have been found to be critical in implementation journeys in selected South African companies.
Abstract: Purpose – The purpose of this article is to provide an overview of generic knowledge management critical success factors, in conjunction with an overview of the factors that have been found to be critical in implementation journeys in selected South African companies. Design/methodology/approach – Literature research was used. Findings – Most of these factors are very specific to the organizational context and have had a significant impact on the success of implementations. These unique factors include the creation of a shared understanding of the concept of knowledge management, identifying the value of co‐creation of the knowledge management strategy, and positioning of knowledge management as strategic focus area in the organization. Originality/value – Knowledge management is a complex discipline with many factors contributing to successful implementation. The factors that contribute to successful implementation of knowledge management are highly dependent on the environment and specific context, and can ...

Journal ArticleDOI
TL;DR: A vision is provided of intelligent components that know their identities, locations, and history and communicate this information to their environments, and streamlining information flow through supply chains by utilizing radio frequency identification (RFID) technology is proposed.

Patent
06 Nov 2007
TL;DR: In this article, the authors present a system and method for data management whereby a data management application manages data across a managed service environment, a mail server environment, and a client environment, allowing a customer to optimize data management functions such as archiving, recovering, monitoring, authenticating, synchronizing, transferring, copying, stubbing, chunking, harvesting and securing.
Abstract: The present invention discloses a system and method for data management whereby a data management application manages data across a managed service environment, a mail server environment, and a client environment. The present invention allows a customer to optimize data management functions such as archiving, recovering, monitoring, authenticating, synchronizing, transferring, copying, stubbing, chunking, harvesting, and securing.
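
Of the functions listed above, "chunking" is the most mechanical, and a generic version is easy to sketch: split data into fixed-size chunks identified by content hashes so that duplicate chunks are stored once and the original can be rebuilt from a manifest of hashes. This is a common archiving technique shown for illustration, not the specific method claimed in the patent.

```python
import hashlib

# Generic content-addressed chunking sketch (illustration only).
CHUNK_SIZE = 64 * 1024  # assumed chunk size

def chunk(data: bytes, size: int = CHUNK_SIZE):
    """Yield (sha256-hex, chunk bytes) pairs for fixed-size chunks of data."""
    for offset in range(0, len(data), size):
        piece = data[offset:offset + size]
        yield hashlib.sha256(piece).hexdigest(), piece

store = {}      # hash -> chunk bytes (deduplicated chunk store)
manifest = []   # ordered chunk hashes standing in for the original data
message = b"attachment bytes " * 10_000

for digest, piece in chunk(message):
    store.setdefault(digest, piece)
    manifest.append(digest)

restored = b"".join(store[d] for d in manifest)
assert restored == message
print(len(manifest), "chunks,", len(store), "unique chunks stored")
```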

Journal ArticleDOI
TL;DR: The opportunities and challenges to the global science system associated with establishing an open data policy are reviewed.
Abstract: The digital revolution has transformed the accumulation of properly curated public research data into an essential upstream resource whose value increases with use. The potential contributions of such data to the creation of new knowledge and downstream economic and social goods can in many cases be multiplied exponentially when the data are made openly available on digital networks. Most developed countries spend large amounts of public resources on research and related scientific facilities and instruments that generate massive amounts of data. Yet precious little of that investment is devoted to promoting the value of the resulting data by preserving and making them broadly available. The largely ad hoc approach to managing such data, however, is now beginning to be understood as inadequate to meet the exigencies of the national and international research enterprise. The time has thus come for the research community to establish explicit responsibilities for these digital resources. This article reviews the opportunities and challenges to the global science system associated with establishing an open data policy.

Journal ArticleDOI
TL;DR: A conceptual framework that can be used in studying the changing nature of management control in organizations is developed based on four components of the management control system, namely: organizational structure and strategy; corporate culture; management information systems; and core control package.
Abstract: Purpose – This paper aims to develop a conceptual framework that can be used in studying the changing nature of management control in organizations. It is based on four components of the management control system, namely: organizational structure and strategy; corporate culture; management information systems; and core control package.Design/methodology/approach – A range of published works is reviewed to explore the nature of management control.Findings – The conceptual framework developed in the paper can be used in studying the changing nature of management control in organizations.Research limitations/implications – This is not an empirical investigation of management control.Originality/value – The framework presented in this article is useful to both practitioners and researchers of management control.