Topic

Data management

About: Data management is a research topic. Over its lifetime, 31,574 publications have been published within this topic, receiving 424,326 citations.


Papers
Patent
24 Aug 2004
TL;DR: Systems and methods for efficient data storage, management, and backup, in particular devices, software, and processes for efficient replication of data.
Abstract: The present invention provides systems and methods for efficient data storage, management, and back up. In particular, the present invention provides devices, software, and processes for efficient replication of data.

234 citations

Journal Article
TL;DR: The background and state of the art of scholarly data management and relevant technologies are examined, and data analysis methods, such as statistical analysis, social network analysis, and content analysis for dealing with big scholarly data are reviewed.
Abstract: With the rapid growth of digital publishing, harvesting, managing, and analyzing scholarly information have become increasingly challenging. The term Big Scholarly Data is coined for the rapidly growing scholarly data, which contains information including millions of authors, papers, citations, figures, tables, as well as scholarly networks and digital libraries. Nowadays, various scholarly data can be easily accessed and powerful data analysis technologies are being developed, which enable us to look into science itself with a new perspective. In this paper, we examine the background and state of the art of big scholarly data. We first introduce the background of scholarly data management and relevant technologies. Second, we review data analysis methods, such as statistical analysis, social network analysis, and content analysis for dealing with big scholarly data. Finally, we look into representative research issues in this area, including scientific impact evaluation, academic recommendation, and expert finding. For each issue, the background, main challenges, and latest research are covered. These discussions aim to provide a comprehensive review of this emerging area. This survey paper concludes with a discussion of open issues and promising future directions.

234 citations

Patent
02 Dec 1997
TL;DR: A data management system with a plurality of data managers arranged in one or more layers of a layered architecture, which, together with user input via an API, performs processes on data residing in heterogeneous data repositories.
Abstract: A Data Management System has a plurality of data managers and is provided with a plurality of data managers in one or more layers of a layered architecture. The system performs with a data manager and with a user input via an API a plurality of processes on data residing in heterogeneous data repositories of the computer system including promotion, check-in, check-out, locking, library searching, setting and viewing process results, tracking aggregations, and managing parts, releases and problem fix data under management control of a virtual control repository having one or more physical heterogeneous repositories. The system provides for storing, accessing, tracking data residing in the one or more data repositories managed by the virtual control repository. User Interfaces provide a combination of command line, scripts, GUI, Menu, WebBrowser maps of the user's view to a PFVL paradigm. Configurable Managers include a query control repository for existence of peer managers and provide logic switches to dynamically interact with peers. A control repository layer provides a common process interface across all managers data view maps to a relational table paradigm and maps control repository layer (CRL) calls to sequences of SQL queries. A command translator for a relations data base provides pass through of SQL queries. Table files map SQL Queries into a set of FILE I/O's with appropriate inter I/O processing, and meta data maps SQL Queries into Meta data API calls with appropriate inter I/O processing. PFVL paradigm calls are mapped into DataManager(s)/Control Repository calls.

232 citations

Journal Article
01 Aug 2008
TL;DR: This paper reports on the results of an independent evaluation of the techniques presented in the VLDB 2007 paper "Scalable Semantic Web Data Management Using Vertical Partitioning", as well as a complementary analysis of state-of-the-art RDF storage solutions.
Abstract: This paper reports on the results of an independent evaluation of the techniques presented in the VLDB 2007 paper "Scalable Semantic Web Data Management Using Vertical Partitioning", authored by D. Abadi, A. Marcus, S. R. Madden, and K. Hollenbach [1]. We revisit the proposed benchmark and examine both the data and query space coverage. The benchmark is extended to cover a larger portion of the query space in a canonical way. Repeatability of the experiments is assessed using the code base obtained from the authors. Inspired by the proposed vertically-partitioned storage solution for RDF data and the performance figures using a column-store, we conduct a complementary analysis of state-of-the-art RDF storage solutions. To this end, we employ MonetDB/SQL, a fully-functional open source column-store, and a well-known -- for its performance -- commercial row-store DBMS. We implement two relational RDF storage solutions -- triple-store and vertically-partitioned -- in both systems. This allows us to expand the scope of [1] with the performance characterization along both dimensions -- triple-store vs. vertically-partitioned and row-store vs. column-store -- individually, before analyzing their combined effects. A detailed report of the experimental test-bed, as well as an in-depth analysis of the parameters involved, clarify the scope of the solution originally presented and position the results in a broader context by covering more systems.

232 citations
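
The two relational RDF layouts compared in the abstract above (triple-store and vertically-partitioned) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the sample triples, the table names (`triples`, `vp_title`, `vp_year`), and the use of SQLite, which is a row-store and therefore says nothing about the row-store vs. column-store dimension or about the systems evaluated in the paper.

```python
import sqlite3

# Hypothetical sample data: (subject, property, object) triples.
TRIPLES = [
    ("paper1", "title", "Scalable RDF"),
    ("paper1", "year", "2007"),
    ("paper2", "title", "Column Stores"),
    ("paper2", "year", "2008"),
]


def build_triple_store(conn):
    """Triple-store layout: one wide table holding every triple."""
    conn.execute("CREATE TABLE triples (subj TEXT, prop TEXT, obj TEXT)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", TRIPLES)


def build_vertically_partitioned(conn):
    """Vertically-partitioned layout: one two-column table per property."""
    for prop in sorted({p for _, p, _ in TRIPLES}):
        # Table names come from the fixed sample data above, not user input.
        conn.execute(f'CREATE TABLE "vp_{prop}" (subj TEXT, obj TEXT)')
    for subj, prop, obj in TRIPLES:
        conn.execute(f'INSERT INTO "vp_{prop}" VALUES (?, ?)', (subj, obj))


def titles_by_year_triple_store(conn, year):
    """Self-join on the single triples table, filtering by property name."""
    rows = conn.execute(
        """SELECT t.obj FROM triples AS t
           JOIN triples AS y ON t.subj = y.subj
           WHERE t.prop = 'title' AND y.prop = 'year' AND y.obj = ?
           ORDER BY t.obj""",
        (year,),
    )
    return [r[0] for r in rows]


def titles_by_year_vp(conn, year):
    """Join of two property tables; no property filter is needed."""
    rows = conn.execute(
        """SELECT t.obj FROM vp_title AS t
           JOIN vp_year AS y ON t.subj = y.subj
           WHERE y.obj = ?
           ORDER BY t.obj""",
        (year,),
    )
    return [r[0] for r in rows]


conn = sqlite3.connect(":memory:")
build_triple_store(conn)
build_vertically_partitioned(conn)
```

Both query functions return the same titles for a given year; the difference is that the triple-store query scans one large table twice with property filters, while the vertically-partitioned query touches only the two property tables it needs.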

Patent
19 Dec 2008
TL;DR: Methods, systems, and apparatuses for generating and delivering analytic results for any simple or highly complex problem for which data exists that software or similar automated means can analyze.
Abstract: The present invention comprises methods, systems, and apparatuses for generating and delivering analytic results for any simple or highly complex problem for which data exists that software or similar automated means can analyze. The present invention thus contemplates methods, systems, apparatuses, software, software processes, computer-readable medium, and/or data structures to enable performance of these and other features. In one embodiment, a method of the present invention comprises extracting and converting data using a data management component into a form usable by a data mining component, performing data mining to develop a model in response to a question or problem posed by a user.

232 citations


Network Information
Related Topics (5)
Information system
107.5K papers, 1.8M citations
90% related
Software
130.5K papers, 2M citations
88% related
Cluster analysis
146.5K papers, 2.9M citations
83% related
The Internet
213.2K papers, 3.8M citations
82% related
Cloud computing
156.4K papers, 1.9M citations
81% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    218
2022    485
2021    959
2020    1,435
2019    1,745
2018    1,719