
Showing papers on "Data mart published in 2008"


Journal ArticleDOI
TL;DR: A novel tool for accessing and combining large-scale genomic databases of single nucleotide polymorphisms (SNP) in widespread use in human population genetics: SPSmart (SNPs for Population Studies).
Abstract: In the last five years large online resources of human variability have appeared, notably HapMap, Perlegen and the CEPH foundation. These databases of genotypes with population information act as catalogues of human diversity, and are widely used as reference sources for population genetics studies. Although many useful conclusions may be extracted by querying databases individually, the lack of flexibility for combining data from within and between each database does not allow the calculation of key population variability statistics. We have developed a novel tool for accessing and combining large-scale genomic databases of single nucleotide polymorphisms (SNPs) in widespread use in human population genetics: SPSmart (SNPs for Population Studies). A fast pipeline creates and maintains a data mart from the most commonly accessed databases of genotypes containing population information: data is mined, summarized into the standard statistical reference indices, and stored into a relational database that currently handles as many as 4 × 10⁹ genotypes and that can be easily extended to new database initiatives. We have also built a web interface to the data mart that allows the browsing of underlying data indexed by population and the combining of populations, allowing intuitive and straightforward comparison of population groups. All the information served is optimized for web display, and most of the computations are already pre-processed in the data mart to speed up the data browsing and any computational treatment requested. In practice, SPSmart allows populations to be combined into user-defined groups, while multiple databases can be accessed and compared in a few simple steps from a single query. It performs the queries rapidly and gives straightforward graphical summaries of SNP population variability through visual inspection of allele frequencies outlined in standard pie-chart format. In addition, full numerical description of the data is output in statistical results panels that include common population genetics metrics such as heterozygosity, Fst and In.
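
A minimal sketch of the kind of pre-computed summary statistics such a data mart stores, assuming biallelic SNPs reduced to per-population allele counts; the function names, input layout, and counts are illustrative, not SPSmart's actual schema (Python):

# Hedged sketch: per-population indices for one biallelic SNP, of the kind
# a data mart like SPSmart pre-computes. The input layout
# (population -> (count of allele A, count of allele B)) is a hypothetical
# simplification of the real genotype tables.

def allele_freq(count_a, count_b):
    """Frequency of allele A from raw allele counts."""
    total = count_a + count_b
    return count_a / total if total else 0.0

def expected_het(p):
    """Expected heterozygosity for a biallelic locus, 2p(1 - p)."""
    return 2.0 * p * (1.0 - p)

def fst(pop_counts):
    """Unweighted Wright's Fst, (Ht - mean Hs) / Ht, across populations."""
    freqs = [allele_freq(a, b) for a, b in pop_counts.values()]
    p_bar = sum(freqs) / len(freqs)          # mean allele frequency
    h_t = expected_het(p_bar)                # total expected heterozygosity
    h_s = sum(expected_het(p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t if h_t else 0.0

pops = {"CEU": (120, 80), "YRI": (60, 140), "CHB": (90, 110)}
print({p: round(allele_freq(a, b), 2) for p, (a, b) in pops.items()})
print("Fst:", round(fst(pops), 4))   # 0.0606 for these made-up counts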

116 citations


Journal ArticleDOI
TL;DR: This paper starts by tackling the basic issue of matching heterogeneous dimensions and provides a number of general properties that a dimension matching should fulfill, and proposes two different approaches to the problem of integration that try to enforce matchings satisfying these properties.
Abstract: In this paper we address the problem of integrating independent and possibly heterogeneous data warehouses, a problem that has received little attention so far, but that arises very often in practice. We start by tackling the basic issue of matching heterogeneous dimensions and provide a number of general properties that a dimension matching should fulfill. We then propose two different approaches to the problem of integration that try to enforce matchings satisfying these properties. The first approach refers to a scenario of loosely coupled integration, in which we just need to identify the common information between data sources and perform join operations over the original sources. The goal of the second approach is the derivation of a materialized view built by merging the sources, and refers to a scenario of tightly coupled integration in which queries are performed against the view. We also illustrate architecture and functionality of a practical system that we have developed to demonstrate the effectiveness of our integration strategies.
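
One of the paper's central ideas is that a matching between heterogeneous dimensions should respect their roll-up hierarchies. Below is a toy sketch of such a coherence check, under the assumption that each dimension is given as a member-to-parent map; the data structures and example members are invented and far simpler than the paper's formalism (Python):

# Hedged sketch of one property a dimension matching should satisfy:
# coherence with roll-up. If two dimension members are matched, rolling
# both up one level should yield members that are also matched.

def rolls_up_coherently(matching, rollup_a, rollup_b):
    """Check that matched members stay matched after one roll-up step."""
    for a, b in matching:
        parent_a, parent_b = rollup_a.get(a), rollup_b.get(b)
        if parent_a and parent_b and (parent_a, parent_b) not in matching:
            return False
    return True

# Two 'Location' dimensions: city -> country roll-up in each warehouse.
rollup_a = {"Rome": "Italy", "Milan": "Italy", "Lyon": "France"}
rollup_b = {"Roma": "Italia", "Milano": "Italia", "Lione": "Francia"}
matching = {("Rome", "Roma"), ("Milan", "Milano"),
            ("Italy", "Italia"), ("France", "Francia")}
print(rolls_up_coherently(matching, rollup_a, rollup_b))  # True: coherent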

56 citations


Journal ArticleDOI
TL;DR: The figures reported in the paper should support the notion that BGIS-related systems' applications are potentially a good investment and worthy of considerable research in the knowledge management fields.
Abstract: The purpose of this paper is to acquaint practitioners of information exploration with the need for Business and e-Government Intelligence Systems (BGIS) and the role such intelligence plays in competitive market research and industry, through a comparison of vendors, advantages and disadvantages, costs and benefits, and some future insights. A review of the applied literature focuses on utilising Business Intelligence (BI) as a competitive tool in an online retrieval environment. While the growth of BI systems may be dramatic [actual (2003, $5.3 billion; 2004, $5.6 billion) and predicted growth (2005, $6 billion; 2006, $6.5 billion; 2007, $7 billion; 2008, $7.3 billion)], the associated costs may be equally stunning, especially in end-user query, reporting, analysis, and data-mining applications and in packaged data mart and/or warehousing applications. However, the figures reported in the paper should support the notion that BGIS-related systems' applications are potentially a good investment and worthy of considerable research in the knowledge management fields.

23 citations


Journal ArticleDOI
TL;DR: This article provides guidelines for the design and development of similar complex data marts in the agricultural sector, particularly in the field of livestock management, with special reference to animal resource management.

20 citations


Book ChapterDOI
20 Oct 2008
TL;DR: The approach consists of linking information requirements, elicited using goal-oriented requirement engineering, to specific data marts, which are automatically translated into the implementation of the corresponding data repositories by means of model-driven engineering techniques.
Abstract: A corporate data warehouse is a repository that provides decision makers with a large amount of historical data concerning the overall enterprise strategy. In order to customize the data warehouse, many organizations develop concrete data marts focused on a particular department or business process. However, their integrated development is still an open problem for many organizations, due to the technical and organizational challenges involved in designing these repositories as a complete solution. Therefore, we present here a design approach for building both the corporate data warehouse and the data marts from users' requirements in an integrated way. Our approach consists of linking information requirements, elicited using goal-oriented requirement engineering, to specific data marts, which are automatically translated into the implementation of the corresponding data repositories by means of model-driven engineering techniques. Its great advantage is that users' requirements are captured from the very early development stages of a data-warehousing project and automatically translated into the entire data-warehousing platform.
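
A toy illustration of the model-driven step, assuming a requirement model that names one fact with its measures and dimensions and is translated into star-schema DDL; the model layout and generated SQL are invented stand-ins for the MDA-style transformations the approach actually relies on (Python):

# Hedged sketch of the model-driven idea: a toy requirement model for one
# data mart (a fact with measures and dimensions) translated into SQL DDL.

requirement = {
    "fact": "sales",
    "measures": ["amount", "quantity"],
    "dimensions": ["date", "product", "store"],
}

def to_star_schema_ddl(req):
    """Generate CREATE TABLE statements for a star schema from the model."""
    stmts = [f"CREATE TABLE dim_{d} ({d}_id INTEGER PRIMARY KEY, name TEXT);"
             for d in req["dimensions"]]
    cols = [f"{d}_id INTEGER REFERENCES dim_{d}" for d in req["dimensions"]]
    cols += [f"{m} NUMERIC" for m in req["measures"]]
    stmts.append(f"CREATE TABLE fact_{req['fact']} ({', '.join(cols)});")
    return "\n".join(stmts)

print(to_star_schema_ddl(requirement))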

11 citations


Proceedings Article
20 Feb 2008
TL;DR: The paper examines some common platforms supporting Business Intelligence activities in order to establish evaluation criteria for system selection, and reports experimental results showing the advantages and drawbacks of each considered system.
Abstract: The paper examines some common platforms supporting Business Intelligence activities in order to establish evaluation criteria for system selection. The evaluation applies a software measurement method based on analysis of the functional complexity of the platforms. The study was performed on an academic data warehouse that uses historical data available in legacy databases. Experimental results are reported which show the advantages and drawbacks of each considered system.

11 citations


Journal Article
TL;DR: A set of evaluation criteria is described and used to compare some popular OLAP systems that support Business Intelligence; the criteria involve critical aspects such as information delivery, system and user administration, and OLAP queries.
Abstract: A set of evaluation criteria is described and used to compare some popular OLAP systems that support Business Intelligence. These criteria involve critical aspects such as information delivery, system and user administration, and OLAP queries. The measurement method is based on functional complexity analysis. Experiments were carried out using a data warehouse in an academic environment, and the results highlight the weaknesses and strengths of each compared system.

10 citations


Posted Content
01 Jan 2008
TL;DR: The objectives are to understand what a data warehouse is, examine the reasons for building one, appreciate the implications of the convergence of Web technologies with those of the data warehouse, and examine the steps for building a Web-enabled data warehouse.
Abstract: In this paper, our objectives are to understand what a data warehouse is, examine the reasons for building one, appreciate the implications of the convergence of Web technologies with those of the data warehouse, and examine the steps for building a Web-enabled data warehouse. The web revolution has propelled the data warehouse onto the main stage, because in many situations the data warehouse must be the engine that controls or analyzes the web experience. In order to step up to this new responsibility, the data warehouse must adjust. The nature of the data warehouse needs to be somewhat different. As a result, our data warehouses are becoming data webhouses. The data warehouse is becoming the infrastructure that supports customer relationship management (CRM), and the data warehouse is being asked to make the customer clickstream available for analysis. This rebirth of data warehousing architecture is called the data webhouse.

8 citations


Book ChapterDOI
01 Jan 2008
TL;DR: Data warehousing has been increasingly recognized as an effective tool for organizations to transform data into useful information for strategic decision-making and to achieve competitive advantages via data warehousing, data warehouse management is crucial.
Abstract: As internal and external demands on information from managers are increasing rapidly, especially the information that is processed to serve managers’ specific needs, regular databases and decision support systems (DSS) cannot provide the information needed. Data warehouses came into existence to meet these needs, consolidating and integrating information from many internal and external sources and arranging it in a meaningful format for making accurate business decisions (Martin, 1997). In the past five years, there has been a significant growth in data warehousing (Hoffer, Prescott, & McFadden, 2005). Correspondingly, this occurrence has brought up the issue of data warehouse administration and management. Data warehousing has been increasingly recognized as an effective tool for organizations to transform data into useful information for strategic decision-making. To achieve competitive advantages via data warehousing, data warehouse management is crucial (Ma, Chou, & Yen, 2000).

7 citations


Proceedings Article
02 May 2008
TL;DR: The authors describe a proposal of the architecture of a Business Intelligence system and the flow of data processing for a university.
Abstract: The traditional users of data warehouses were banks, financial services, or chains of supermarkets. Institutional organizations (e.g. academies), by contrast, did not in the past use their large amounts of transactional data for strategic decision making. The optimal management of a university can now be considered as critical as the management of a big enterprise; in fact, the factors affecting the management of a university are the same as those involved in business processes. The paper describes a proposal of the architecture of a Business Intelligence system and the flow of data processing for our University.

4 citations


Journal Article
TL;DR: The authors present a geoscience spatial data warehouse architecture that conforms to China's national conditions and has five levels, i.e. the data source, spatial ETL, spatial data storage, application service based on SOA, and client application, together with a three-level physical deployment scheme.
Abstract: The authors took the geoscience spatial data warehouse as a scheme of data integration in order to integrate the multi-source, heterogeneous and dispersed geological data of China and provide effective data for resource assessment. They for the first time present a geoscience spatial data warehouse architecture that conforms to China's national conditions and has five levels, i.e. the data source, spatial ETL, spatial data storage, application service based on SOA, and client application. The authors designed a three-level (state, administrative regions and provinces) physical deployment scheme for the geoscience spatial data warehouse system according to the administrative regions of China's geological work and the distribution of data. It can realize the objectives of geoscience data integration. Research results show that this is a complete and feasible geoscience data integration scheme that conforms to the actual situation of geoscience in China.

Book
07 Aug 2008
TL;DR: The proposed data mart based information system has proven to be useful and effective in the particular application domain of clinical research in heart surgery and integrates the current and historical data from all relevant data sources without imposing any considerable operational or liability contract risk for the existing hospital information systems.
Abstract: The proposed data mart based information system has proven to be useful and effective in the particular application domain of clinical research in heart surgery. In contrast to common data warehouse systems, which are focused primarily on administrative, managerial, and executive decision making, the primary objective of the designed and implemented data mart was to provide an ongoing, consolidated and stable research basis. Besides detail-oriented patient data, aggregated data are also incorporated in order to fulfill multiple purposes. Owing to the chosen concept, this technique integrates the current and historical data from all relevant data sources without imposing any considerable operational or liability contract risk for the existing hospital information systems (HIS). In this way, possible resistance from the persons in charge can be minimized and the project-specific goals effectively met. The challenges of isolated data sources, securing a high data quality, data with partial redundancy and consistency, valuable legacy data in special file formats, and privacy protection regulations are met with the proposed data mart architecture. The applicability was demonstrated in several fields, including (i) permitting easy comprehensive medical research, (ii) assessing preoperative risks of adverse surgical outcomes, (iii) gaining insights into historical performance changes, (iv) monitoring surgical results, (v) improving risk estimation, and (vi) generating new knowledge from observational studies. The data mart approach makes it possible to turn redundant data from the electronically available hospital data sources into valuable information. On the one hand, redundancies are used to detect inconsistencies within and across HIS. On the other hand, redundancies are used to derive attributes from several data sources which originally did not contain the desired semantic meaning. Appropriate verification tools help to inspect the extraction and transformation processes in order to ensure a high data quality. Based on the verification data stored during data mart assembly, various aspects can be inspected on the basis of an individual case, a group, or a specific rule. Invalid values or inconsistencies must be corrected in the primary source databases by the health professionals. Because all modifications are automatically transferred to the data mart system in a subsequent cycle, a consolidated and stable research database is achieved throughout the system in a persistent manner. In the past, performing comprehensive observational studies at the Heart Institute Lahr had been extremely time consuming and therefore limited. Several attempts had already been made to extract and combine data from the electronically available data sources. Depending on the desired scientific task, the processes to extract and connect the data were often rebuilt and modified. Consequently, the semantics and the definitions of the research data changed from one study to the next. Additionally, it was very difficult to maintain an overview of all data variants and derived research data sets. With the implementation of the presented data mart system, the most time- and effort-consuming processes in conducting observational studies could be replaced, so the research basis remains stable and leads to reliable results.
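
A small sketch of the verification idea described above: attributes stored redundantly in more than one hospital source are compared case by case, and conflicts are reported for correction in the primary systems. Source names, record layout, and values are hypothetical (Python):

# Hedged sketch: redundant attributes occurring in two hospital sources
# are compared per case; mismatches are flagged so health professionals
# can correct them in the primary databases.

def find_inconsistencies(source_a, source_b, attrs):
    """Yield (case_id, attribute, value_a, value_b) for conflicting values."""
    for case_id, rec_a in source_a.items():
        rec_b = source_b.get(case_id)
        if rec_b is None:
            continue
        for attr in attrs:
            if attr in rec_a and attr in rec_b and rec_a[attr] != rec_b[attr]:
                yield case_id, attr, rec_a[attr], rec_b[attr]

his = {"P001": {"birth_year": 1948, "sex": "M"},
       "P002": {"birth_year": 1955, "sex": "F"}}
surgical_db = {"P001": {"birth_year": 1948, "sex": "M"},
               "P002": {"birth_year": 1959, "sex": "F"}}  # conflicting year

for conflict in find_inconsistencies(his, surgical_db, ["birth_year", "sex"]):
    print("inconsistent:", conflict)   # P002 birth_year 1955 vs 1959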

Book ChapterDOI
01 Jan 2008
TL;DR: This chapter provides an overview of the history of data warehousing with a focus on the first-generation data warehouse, which evolved to include disciplined data ETL from legacy applications in a granular, historical, integrated data warehouse.
Abstract: This chapter provides an overview of the history of data warehousing. Data warehousing has come a long way since the frustrating days when user data was limited to operational application data that was accessible only through an IT department intermediary. Data warehousing has evolved to meet the needs of end users who require integrated, historical, granular, flexible, and accurate information. The first-generation data warehouse evolved to include disciplined data ETL (extract/transform/load) from legacy applications in a granular, historical, integrated data warehouse. With the growing popularity of data warehousing came numerous changes—volumes of data, a spiral development approach, heuristic processing, and more. As the evolution of data warehousing continued, some mutant forms emerged like active data warehousing, federated data warehousing, star schema data warehouses, and data mart data warehouses. While each of these mutant forms of data warehousing has some advantages, they also have introduced a host of new and significant disadvantages. Therefore, the time for the next generation of data warehousing has come.

01 Jan 2008
TL;DR: A suitable use of multidimensional data analysis (MDA) is proposed to investigate the associations characterizing the indicators/attributes of the system, in order to assess the impact of an adopted policy by measuring system performance.
Abstract: The present paper focuses on ex post analysis to assess the impact of an adopted policy by measuring system performance. Since accurate impact assessment requires in-depth knowledge of the structure underlying the system, this contribution proposes a suitable use of multidimensional data analysis (MDA) to investigate the associations characterizing the indicators/attributes of the system. The general aim is to identify homogeneous subsets of objects that are described by subsets of attributes. This approach was planned to study student performance in Italian universities, with a focus on student careers. The example data set is a data mart selected from the University of Macerata database and refers to the students of the Economics Faculty from 2001 to 2007.

17 Oct 2008
TL;DR: Time Histograms with Interactive Selection of Time Unit and Dimension (THISTUD) is an improved time-histogram technique that can be used to improve the usability and efficiency of sales data analysis methods.
Abstract: Many researchers are working on improving the usability and efficiency of the sales data analysis methods required by users – analytical staff and managers. In this paper, we present an improved technique of time histograms. We have named the proposed method Time Histograms with Interactive Selection of Time Unit and Dimension (THISTUD). The modifications performed and the interactive user interface developed are described. The system is tested on a data warehouse that includes ten years of data. The sales data mart created and the visualization results are also presented in the paper.
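
A minimal sketch of the core operation behind such interactive time histograms: re-binning the same timestamped sales records at a user-selected time unit. The record format and the set of units are illustrative assumptions, not the authors' implementation (Python):

# Hedged sketch: counting sales per bucket of a user-chosen time unit,
# the re-binning step an interactive time-histogram tool would perform.
from collections import Counter
from datetime import datetime

def time_histogram(records, unit):
    """Count sales per bucket of the chosen time unit."""
    key = {
        "year":    lambda t: t.strftime("%Y"),
        "month":   lambda t: t.strftime("%Y-%m"),
        "weekday": lambda t: t.strftime("%A"),
        "hour":    lambda t: t.strftime("%H"),
    }[unit]
    return Counter(key(t) for t, _amount in records)

sales = [(datetime(2008, 3, 14, 10), 19.9), (datetime(2008, 3, 15, 11), 5.0),
         (datetime(2008, 4, 2, 10), 7.5)]
print(time_histogram(sales, "month"))  # Counter({'2008-03': 2, '2008-04': 1})
print(time_histogram(sales, "hour"))   # Counter({'10': 2, '11': 1})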

Proceedings Article
06 Nov 2008
TL;DR: Initial design of a "patient-specific" hybrid system (physiological-causal probabilistic) of adaptive diabetes models and insulin treatment algorithms will be presented.
Abstract: Constantly changing diabetes care standards make it challenging to deliver care adapted to the unique condition of the individual patient. The availability of large amounts of data from patients' electronic medical records makes it possible to individualize diabetes management. An initial design of a "patient-specific" hybrid system (physiological-causal probabilistic) of adaptive diabetes models and insulin treatment algorithms will be presented. The system is uniquely derived and tested using a diabetes data mart of about 33,000 patients.

Journal Article
TL;DR: Building a data mart of product design cases for corporations, a knowledge processing method for product design is presented; a new discretization method is put forward, and the accuracy and efficiency of case retrieval are improved.
Abstract: A data mart is a cheap way to provide management analysis for product design knowledge processing. Building a data mart of product design cases for corporations, a knowledge processing method for product design is presented. Designers input queries into a Case-Based Reasoning (CBR) system; On-Line Analytical Processing (OLAP) then drills down and finds similar cases. Knowledge reduction techniques are adopted to reduce the similar cases retrieved by OLAP, which improves CBR. Rough set theory is applied to calculate the importance degree of each feature attribute and to remove redundant ones. To deal with quantitative features, a new discretization method is put forward, which improves the accuracy and efficiency of case retrieval. Finally, an example is presented.
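
A simplified sketch of the rough-set step: each feature attribute is ranked by how much its removal reduces the ability of the remaining attributes to discriminate design cases (a basic dependency-degree idea). The case data and attribute names are invented, and this is much cruder than the paper's method (Python):

# Hedged sketch: rough-set-style attribute significance. An attribute is
# important if dropping it makes previously distinguishable cases collide.

def partitions(cases, attrs):
    """Group case ids by their values on the given attributes."""
    groups = {}
    for cid, (features, _label) in cases.items():
        key = tuple(features[a] for a in attrs)
        groups.setdefault(key, []).append(cid)
    return groups.values()

def dependency(cases, attrs):
    """Fraction of cases whose attribute values determine the case label."""
    consistent = 0
    for group in partitions(cases, attrs):
        labels = {cases[cid][1] for cid in group}
        if len(labels) == 1:
            consistent += len(group)
    return consistent / len(cases)

cases = {  # case id -> (feature values, design-case label)
    1: ({"material": "steel", "size": "L"}, "caseA"),
    2: ({"material": "steel", "size": "S"}, "caseB"),
    3: ({"material": "alloy", "size": "L"}, "caseC"),
    4: ({"material": "alloy", "size": "L"}, "caseC"),
}
all_attrs = ["material", "size"]
for a in all_attrs:
    rest = [x for x in all_attrs if x != a]
    print(a, "significance:",
          dependency(cases, all_attrs) - dependency(cases, rest))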

Patent
07 Feb 2008
TL;DR: In this paper, the authors present methods and systems for assembling, managing, and using a continuous translational data system. But they do not provide a detailed description of the system itself.
Abstract: Provided are methods and systems for assembling, managing, and using a continuous translational data system.

01 Jan 2008
TL;DR: A design research project was undertaken to demonstrate that an Access-based data mart could successfully streamline this report generating process and demonstrate the need to eliminate excessive detail and deliver highly summarized reports.
Abstract: Hospitals and medical centers participate in a physician profiling process. This process is important to ensure that physicians are providing safe care and to comply with regulations. One medical center was struggling with the ongoing generation of physician performance reports that were an important part of the profiling process. A design research project was undertaken to demonstrate that an Access-based data mart could successfully streamline this report generating process. The research also demonstrated the need to eliminate excessive detail and deliver highly summarized reports. In addition, the research provided thorough documentation of the entire data mart development approach. This documentation can serve as a resource for future research and/or for other medical centers that might be struggling to manage the profiling report requirements.
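
A small sketch of the reporting pattern the study argues for: delivering a highly summarized per-physician view rather than detailed rows. SQLite stands in for the Access data mart here, and the table, columns, and figures are invented (Python):

# Hedged sketch: a summarized physician-profiling report built with one
# aggregation query, in the spirit of "eliminate excessive detail".
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cases (physician TEXT, los_days REAL, complication INTEGER);
INSERT INTO cases VALUES
  ('Dr. A', 3.0, 0), ('Dr. A', 5.5, 1), ('Dr. B', 2.0, 0), ('Dr. B', 2.5, 0);
""")
summary = conn.execute("""
    SELECT physician,
           COUNT(*)                    AS n_cases,
           ROUND(AVG(los_days), 1)     AS avg_los,
           ROUND(AVG(complication), 2) AS complication_rate
    FROM cases GROUP BY physician
""").fetchall()
for row in summary:
    print(row)   # e.g. ('Dr. A', 2, 4.3, 0.5)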

Proceedings ArticleDOI
30 Oct 2008
TL;DR: This article describes the viability of implementing scripts for handling extensive datasets of SNP genotypes with low computational costs, and shows that the updating of these data marts is straightforward, permitting easy implementation of new external data and the computation of new statistical indices.
Abstract: Databases containing very large amounts of SNP (Single Nucleotide Polymorphism) data are now freely available for researchers interested in medical or population genetics applications. While many of these SNP repositories have implemented data retrieval tools for general purpose mining, these alone cannot cover the broad spectrum of needs of most medical and population genetics studies. To address this limitation, we propose building in-house customized data marts from the raw data provided by the largest public databases. In particular, for population genetics analysis based on genotypes we propose building a set of data processing scripts that would deal with raw data coming from the major SNP variation databases (e.g. HapMap, Perlegen) that can be stripped into single genotypes and then grouped into populations. This allows not only in-house standardization and normalization of the genotyping data retrieved from different repositories, but also the calculation of statistical indices from simple allele frequency estimates up to elaborate genetic differentiation tests within populations, together with the ability to combine population samples from different databases. This article describes the viability of implementing scripts for handling extensive datasets of SNP genotypes with low computational costs, and shows that the updating of these data marts is straightforward, permitting easy implementation of new external data and the computation of new statistical indices.
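
A minimal sketch of the stripping-and-grouping step described above: raw genotype strings are split into single alleles and tallied per population before any statistics are computed. The input rows imitate a HapMap-style dump but are invented, as are the sample IDs (Python):

# Hedged sketch: split genotype strings into alleles, group by population,
# then derive a simple allele frequency from the tallies.
from collections import Counter, defaultdict

raw = [  # (sample id, population, genotype at one SNP) - illustrative only
    ("NA12891", "CEU", "AG"), ("NA12892", "CEU", "AA"),
    ("NA18501", "YRI", "GG"), ("NA18502", "YRI", "AG"),
]

def allele_counts_by_population(rows):
    """Split each genotype into alleles and tally them per population."""
    counts = defaultdict(Counter)
    for _sample, pop, genotype in rows:
        counts[pop].update(genotype)   # 'AG' -> alleles 'A' and 'G'
    return counts

counts = allele_counts_by_population(raw)
for pop, tally in counts.items():
    freq_a = tally["A"] / sum(tally.values())
    print(pop, dict(tally), "freq(A) =", round(freq_a, 2))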


Book ChapterDOI
01 Jan 2008
TL;DR: This chapter focuses on topics like data marts, monitoring the DW 2.0 environment, moving data from one data mart to another, what to do about bad data, the speed of the movement of data within DW 2,0, and data warehouse utilities.
Abstract: This chapter focuses on topics like data marts, monitoring the DW 2.0 environment, moving data from one data mart to another, what to do about bad data, the speed of the movement of data within DW 2.0, and data warehouse utilities. DW 2.0 is presented as a representation of the base data that resides at the core of the DW 2.0 enterprise data warehouse. However, there are independent structures that use that data for analytical purposes. The exploration facility is one such structure. Another structure that takes data from DW 2.0 is the data mart. Data marts contain departmental data for the purpose of decision making. There are many reasons for creating a data mart, including that the cost of machine cycles is low, that the end user has control, and that the performance of the DW 2.0 environment is enhanced. When bad data enters the DW 2.0 environment, the source of the bad data should be identified and corrected; a balancing entry can be created, a value may be reset, or actual corrections can be made to the data.