scispace - formally typeset
Search or ask a question
Topic

Data mart

About: Data mart is a research topic. Over the lifetime, 559 publications have been published within this topic receiving 8550 citations.


Papers
More filters
Journal ArticleDOI
TL;DR: A novel tool for accessing and combining large-scale genomic databases of single nucleotide polymorphisms (SNP) in widespread use in human population genetics: SPSmart (SNPs for Population Studies).
Abstract: In the last five years large online resources of human variability have appeared, notably HapMap, Perlegen and the CEPH foundation. These databases of genotypes with population information act as catalogues of human diversity, and are widely used as reference sources for population genetics studies. Although many useful conclusions may be extracted by querying databases individually, the lack of flexibility for combining data from within and between each database does not allow the calculation of key population variability statistics. We have developed a novel tool for accessing and combining large-scale genomic databases of single nucleotide polymorphisms (SNPs) in widespread use in human population genetics: SPSmart (SNPs for Population Studies). A fast pipeline creates and maintains a data mart from the most commonly accessed databases of genotypes containing population information: data is mined, summarized into the standard statistical reference indices, and stored into a relational database that currently handles as many as 4 × 109 genotypes and that can be easily extended to new database initiatives. We have also built a web interface to the data mart that allows the browsing of underlying data indexed by population and the combining of populations, allowing intuitive and straightforward comparison of population groups. All the information served is optimized for web display, and most of the computations are already pre-processed in the data mart to speed up the data browsing and any computational treatment requested. In practice, SPSmart allows populations to be combined into user-defined groups, while multiple databases can be accessed and compared in a few simple steps from a single query. It performs the queries rapidly and gives straightforward graphical summaries of SNP population variability through visual inspection of allele frequencies outlined in standard pie-chart format. In addition, full numerical description of the data is output in statistical results panels that include common population genetics metrics such as heterozygosity, Fst and In.

116 citations

Patent
27 Sep 2001
TL;DR: In this paper, a data migration, data integration, data warehousing, and business intelligence system including a data storage model is provided that allows a business to effectively utilize its data to make business decisions.
Abstract: A data migration, data integration, data warehousing, and business intelligence system including a data storage model is provided that allows a business to effectively utilize its data to make business decisions. The system can be designed to include a number of data storage units including a data dock, a staging area, a data vault, a data mart, a data collection area, a metrics repository, and a metadata repository. Data received from a number of source systems moves through the data storage units and is processed along the way by a number of process areas including a profiling process area, a cleansing process area, a data loading process area, a business rules and integration process area, a propagation, aggregation and subject area breakout process area, and a business intelligence and decision support systems process area. Movement of the data is performed by metagates. The processed data is then received by corporate portals for use in making business decisions. The system may be implemented by an implementation team made up of members including a project manager, a business analyst, a system architect, a data modeler/data architect, a data migration expert, a DSS/OLAP expert, a data profiler/cleanser, and a trainer.

106 citations

Journal ArticleDOI
TL;DR: The main concepts and terminology of temporal databases are introduced and the open research issues also in connection with their implementation on commercial tools are discussed.
Abstract: Data warehouses are information repositories specialized in supporting decision making. Since the decisional process typically requires an analysis of historical trends, time and its management acquire a huge importance. In this paper we consider the variety of issues, often grouped under term temporal data warehousing, implied by the need for accurately describing how information changes over time in data warehousing systems. We recognize that, with reference to a three-levels architecture, these issues can be classified into some topics, namely: handling data/schema changes in the data warehouse, handling data/schema changes in the data mart, querying temporal data, and designing temporal data warehouses. After introducing the main concepts and terminology of temporal databases, we separately survey these topics. Finally, we discuss the open research issues also in connection with their implementation on commercial tools.

94 citations

Patent
07 Jun 2001
TL;DR: In this article, the authors present a method and apparatus for transporting data for a data warehousing application, where data is extracted from one or more source containing data having a standard data structure and is translated into data that contains meaningful business terms.
Abstract: A method and apparatus for transporting data for a data warehousing application. Data is extracted from one or more source containing data having a standard data structure and is translated into data that contains meaningful business terms. The translated data is then stored. In the present embodiment, an analytic business component is operable for extracting data from the source, translating the extracted data and for storing the translated data into a staging area. The translated data is then processed to obtain data having a common structure. In the present embodiment, a source adapter processes the translated data to obtain data having a common structure. The data having a common structure is then transformed into a format suitable for loading into a data mart. In the present embodiment, an analytic data interface receives the data having a common structure and transforms the data for loading into a data warehouse. The data is then stored in a data warehouse.

94 citations


Network Information
Related Topics (5)
Information system
107.5K papers, 1.8M citations
77% related
The Internet
213.2K papers, 3.8M citations
72% related
Scheduling (computing)
78.6K papers, 1.3M citations
72% related
Cloud computing
156.4K papers, 1.9M citations
71% related
Software
130.5K papers, 2M citations
70% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
202113
202020
201926
201823
201726
201627