Topic

Data warehouse

About: Data warehouse is a research topic. Over its lifetime, 15,903 publications have been published within this topic, receiving 304,655 citations. The topic is also known as DWH and data warehousing.

Papers published on a yearly basis

Papers

Proceedings Article•DOI•

Sergey Melnik¹, Hector Garcia-Molina¹, Erhard Rahm²•Institutions (2)

Stanford University¹, Leipzig University²

26 Feb 2002

TL;DR: This paper presents a matching algorithm based on a fixpoint computation that is usable across different scenarios and conducts a user study, in which the accuracy metric was used to estimate the labor savings that the users could obtain by utilizing the algorithm to obtain an initial matching.

Abstract: Matching elements of two data schemas or two data instances plays a key role in data warehousing, e-business, or even biochemical applications. In this paper we present a matching algorithm based on a fixpoint computation that is usable across different scenarios. The algorithm takes two graphs (schemas, catalogs, or other data structures) as input, and produces as output a mapping between corresponding nodes of the graphs. Depending on the matching goal, a subset of the mapping is chosen using filters. After our algorithm runs, we expect a human to check and if necessary adjust the results. As a matter of fact, we evaluate the 'accuracy' of the algorithm by counting the number of needed adjustments. We conducted a user study, in which our accuracy metric was used to estimate the labor savings that the users could obtain by utilizing our algorithm to obtain an initial matching. Finally, we illustrate how our matching algorithm is deployed as one of several high-level operators in an implemented testbed for managing information models and mappings.

1,613 citations
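
The fixpoint idea in the abstract above can be illustrated with a minimal sketch. It is a simplified stand-in, not the authors' algorithm: it assumes unlabeled edges, uniform propagation weights, and a max-normalization step, and the names `propagate_similarity`, `best_matches`, and the sample schemas are hypothetical.

```python
from itertools import product


def propagate_similarity(edges_a, edges_b, initial, max_iters=100, eps=1e-6):
    """Iterate pairwise similarity scores until they stop changing (a fixpoint)."""
    nodes_a = sorted({n for edge in edges_a for n in edge})
    nodes_b = sorted({n for edge in edges_b for n in edge})
    pairs = list(product(nodes_a, nodes_b))

    # Two node pairs are neighbors when both graphs have an edge between them.
    # Assumption: edges are unlabeled and every neighbor has weight 1.
    neighbors = {p: set() for p in pairs}
    for (a1, a2), (b1, b2) in product(edges_a, edges_b):
        neighbors[(a1, b1)].add((a2, b2))
        neighbors[(a2, b2)].add((a1, b1))

    seed = {p: initial.get(p, 0.0) for p in pairs}
    sigma = dict(seed)
    for _ in range(max_iters):
        # Each pair keeps its seed and current score and absorbs its neighbors' scores.
        raw = {
            p: seed[p] + sigma[p] + sum(sigma[q] for q in neighbors[p])
            for p in pairs
        }
        top = max(raw.values()) or 1.0
        new = {p: value / top for p, value in raw.items()}  # scale into [0, 1]
        delta = max(abs(new[p] - sigma[p]) for p in pairs)
        sigma = new
        if delta < eps:  # scores no longer change: fixpoint reached
            break
    return sigma


def best_matches(sigma):
    """Filter step: keep the top-scoring partner for each node of the first graph."""
    best = {}
    for (a, b), score in sigma.items():
        if a not in best or score > sigma[(a, best[a])]:
            best[a] = b
    return best


# Hypothetical schemas: each edge points from a table to one of its columns.
schema_a = [("Order", "id"), ("Order", "customer")]
schema_b = [("Purchase", "key"), ("Purchase", "buyer")]
seed_scores = {("id", "key"): 1.0, ("customer", "buyer"): 1.0}

scores = propagate_similarity(schema_a, schema_b, seed_scores)
print(best_matches(scores))
```

In this sketch the two seeded column pairs pull their parent tables together, so `Order` is paired with `Purchase` even though that pair starts with a score of zero. As the abstract notes, such output is only an initial matching that a human is expected to check and adjust.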

Journal Article•DOI•

An empirical investigation of the factors affecting data warehousing success

Barbara H. Wixom¹, Hugh J. Watson²•Institutions (2)

University of Virginia¹, University of Georgia²

01 Mar 2001-Management Information Systems Quarterly

TL;DR: It was found that management support and resources help to address organizational issues that arise during warehouse implementations; resources, user participation, and highly-skilled project team members increase the likelihood that warehousing projects will finish on-time, on-budget, with the right functionality; and diverse, unstandardized source systems and poor development technology will increase the technical issues that project teams must overcome.

Abstract: The IT implementation literature suggests that various implementation factors play critical roles in the success of an information system; however, there is little empirical research about the implementation of data warehousing projects. Data warehousing has unique characteristics that may impact the importance of factors that apply to it. In this study, a cross-sectional survey investigated a model of data warehousing success. Data warehousing managers and data suppliers from 111 organizations completed paired mail questionnaires on implementation factors and the success of the warehouse. The results from a Partial Least Squares analysis of the data identified significant relationships between the system quality and data quality factors and perceived net benefits. It was found that management support and resources help to address organizational issues that arise during warehouse implementations; resources, user participation, and highly-skilled project team members increase the likelihood that warehousing projects will finish on-time, on-budget, with the right functionality; and diverse, unstandardized source systems and poor development technology will increase the technical issues that project teams must overcome. The implementation's success with organizational and project issues, in turn, influences the system quality of the data warehouse; however, data quality is best explained by factors not included in the research model.

1,579 citations

Proceedings Article•

Generic Schema Matching with Cupid

Jayant Madhavan¹, Philip A. Bernstein², Erhard Rahm³•Institutions (3)

University of Washington¹, Microsoft², Leipzig University³

11 Sep 2001

TL;DR: This paper proposes a new algorithm, Cupid, that discovers mappings between schema elements based on their names, data types, constraints, and schema structure, using a broader set of techniques than past approaches.

Abstract: Schema matching is a critical step in many applications, such as XML message mapping, data warehouse loading, and schema integration. In this paper, we investigate algorithms for generic schema matching, outside of any particular data model or application. We first present a taxonomy for past solutions, showing that a rich range of techniques is available. We then propose a new algorithm, Cupid, that discovers mappings between schema elements based on their names, data types, constraints, and schema structure, using a broader set of techniques than past approaches. Some of our innovations are the integrated use of linguistic and structural matching, context-dependent matching of shared types, and a bias toward leaf structure where much of the schema content resides. After describing our algorithm, we present experimental results that compare Cupid to two other schema matching systems.

1,533 citations

Journal Article•DOI•

Scalable SQL and NoSQL data stores

Rick Cattell

06 May 2011

TL;DR: This paper examines a number of SQL and so-called "NoSQL" data stores designed to scale simple OLTP-style application loads over many servers, and contrasts the new systems on their data model, consistency mechanisms, storage mechanisms, durability guarantees, availability, query support, and other dimensions.

Abstract: In this paper, we examine a number of SQL and so-called "NoSQL" data stores designed to scale simple OLTP-style application loads over many servers. Originally motivated by Web 2.0 applications, these systems are designed to scale to thousands or millions of users doing updates as well as reads, in contrast to traditional DBMSs and data warehouses. We contrast the new systems on their data model, consistency mechanisms, storage mechanisms, durability guarantees, availability, query support, and other dimensions. These systems typically sacrifice some of these dimensions, e.g. database-wide transaction consistency, in order to achieve others, e.g. higher availability and scalability.

1,412 citations

Journal Article•DOI•

Building the data warehouse

Stephen R. Gardner¹•Institutions (1)

NCR Corporation¹

01 Sep 1998-Communications of The ACM

TL;DR: Your company decides to build a data warehouse and designates you the project manager; you have specific questions that need specific answers, and building a data warehouse is an extremely complex process.

Abstract: Your company decides to build a data warehouse and you are designated the project manager. What are your first steps? You’ve read the books, attended the conferences, and perused the trade publications. Now you have to act. There are numerous vendors, all touting the wonders of their products, but you have specific questions that need specific answers, and building a data warehouse is an extremely complex process. Questions you have to weigh fall into the following general categories:

1,272 citations

Metrics

Papers: 16,502
Citations: 319,557

No. of papers in the topic in previous years
Year	Papers
2023	169
2022	432
2021	295
2020	426
2019	558
2018	595
