Topic

Data mart

About: Data mart is a research topic. Over its lifetime, 559 publications have been published within this topic, receiving 8,550 citations.


Papers
Patent
05 Apr 2004
TL;DR: A data analysis workbench enables a user to define a data analysis process comprising an extract sub-process that obtains transactional data from a source system, a load sub-process that provides the extracted data to a data warehouse or data mart, a data mining analysis sub-process that operates on the extracted transactional data, and a deployment sub-process that makes the data mining results accessible to another computer program.
Abstract: A data analysis workbench enables a user to define a data analysis process that includes an extract sub-process to obtain transactional data from a source system, a load sub-process for providing the extracted data to a data warehouse or data mart, a data mining analysis sub-process to use the obtained transactional data, and a deployment sub-process to make the data mining results accessible by another computer program. Common settings used by each of the sub-processes are defined, as are specialized settings relevant to each of the sub-processes. The invention also enables a user to define an order in which the defined sub-processes are to be executed.
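A minimal sketch of the sub-process pattern the abstract describes, assuming hypothetical class and method names (the patent does not specify an API): common settings shared across sub-processes, specialized settings per sub-process, and a user-defined execution order.

```python
# Hypothetical sketch of the workbench's sub-process pattern;
# all names are illustrative, not taken from the patent.

class SubProcess:
    """Base class: merges common settings with specialized ones."""
    def __init__(self, common, specialized=None):
        self.settings = {**common, **(specialized or {})}

    def run(self, data=None):
        raise NotImplementedError


class Extract(SubProcess):
    def run(self, data=None):
        # Obtain transactional data from the source system.
        return [{"order_id": 1, "amount": 9.99}]  # placeholder rows


class Load(SubProcess):
    def run(self, data):
        # Provide the extracted data to a data warehouse or data mart.
        print(f"loading {len(data)} rows into {self.settings['target']}")
        return data


class Mine(SubProcess):
    def run(self, data):
        # Data mining analysis over the loaded transactional data.
        return {"row_count": len(data)}


class Deploy(SubProcess):
    def run(self, results):
        # Make the mining results accessible to another program.
        print(f"publishing results: {results}")
        return results


common = {"source": "erp_db", "target": "sales_mart"}
# The user defines the order in which the sub-processes execute.
pipeline = [Extract(common), Load(common), Mine(common), Deploy(common)]

data = None
for step in pipeline:
    data = step.run(data)
```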

38 citations

Book ChapterDOI
24 Jun 2004
TL;DR: Comparison against a physician's manual results indicates that the proposed system can be used for non-critical case-finding applications, such as finding appropriate patients for clinical trials.
Abstract: This paper presents a novel information retrieval system designed specifically for medical case-finding applications. The proposed system begins by extracting medical information from free-text narrative reports and storing it in a predefined relational clinical data mart. The extraction is performed using a medical thesaurus and regular expression pattern matching. Following the extraction phase, inclusion/exclusion criteria are provided to the system through a physician-friendly user interface. The system converts the entered criteria into a single SQL command, which can then be executed on the relational data mart. To achieve the response time required for on-line analysis, the system implements several caching mechanisms. The proposed system has been evaluated on a real-world database, and its performance has been compared to the results obtained manually by a physician. The comparison indicates that the proposed system can be used for non-critical case-finding applications such as finding appropriate patients for clinical trials.
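A hedged sketch of the two steps the abstract names: extraction by regular-expression pattern match against a thesaurus, and conversion of inclusion/exclusion criteria into a single SQL query. The table and column names are assumptions for illustration, not taken from the paper.

```python
import re

# Toy extraction: pull diagnosis terms from a free-text narrative
# report using a tiny "thesaurus" of known terms plus a regex match.
THESAURUS = {"diabetes mellitus": "DM", "hypertension": "HTN"}
PATTERN = re.compile("|".join(map(re.escape, THESAURUS)), re.IGNORECASE)

def extract_codes(report: str) -> list[str]:
    return [THESAURUS[m.group(0).lower()] for m in PATTERN.finditer(report)]

# Toy criteria-to-SQL conversion: inclusion terms become IN (...),
# exclusion terms become NOT IN (...), over an assumed clinical data
# mart table named `diagnoses`.
def criteria_to_sql(include: list[str], exclude: list[str]) -> str:
    inc = ", ".join(f"'{c}'" for c in include)
    exc = ", ".join(f"'{c}'" for c in exclude)
    sql = f"SELECT patient_id FROM diagnoses WHERE code IN ({inc})"
    if exclude:
        sql += (" AND patient_id NOT IN (SELECT patient_id FROM diagnoses"
                f" WHERE code IN ({exc}))")
    return sql

print(extract_codes("History of Diabetes Mellitus, no hypertension."))
print(criteria_to_sql(include=["DM"], exclude=["HTN"]))
```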

38 citations

Book
15 Sep 2015
TL;DR: "Building a Scalable Data Warehouse" covers everything one needs to know to create a scalable data warehouse end to end, including a presentation of the Data Vault modeling technique, which provides the foundations to creating a technical data warehouse layer.
Abstract: The Data Vault was invented by Dan Linstedt at the U.S. Department of Defense, and the standard has been successfully applied to data warehousing projects at organizations of different sizes, from small to large corporations. Due to its simplified design, which is adapted from nature, the Data Vault 2.0 standard helps prevent typical data warehousing failures. "Building a Scalable Data Warehouse" covers everything one needs to know to create a scalable data warehouse end to end, including a presentation of the Data Vault modeling technique, which provides the foundations for creating a technical data warehouse layer. The book discusses how to build the data warehouse incrementally using the agile Data Vault 2.0 methodology. In addition, readers will learn how to create the input layer (the stage layer) and the presentation layer (the data mart) of the Data Vault 2.0 architecture, including implementation best practices. Drawing upon years of practical experience, and using numerous examples and an easy-to-understand framework, Dan Linstedt and Michael Olschimke discuss how to load each layer using SQL Server Integration Services (SSIS), including automation of the Data Vault loading processes; important data warehouse technologies and practices; and Data Quality Services (DQS) and Master Data Services (MDS) in the context of the Data Vault architecture. The book provides a complete introduction to data warehousing, applications, and the business context so readers can get up and running fast; explains theoretical concepts and provides hands-on instruction on how to build and implement a data warehouse; demystifies Data Vault modeling with beginning, intermediate, and advanced techniques; and discusses the advantages of the Data Vault approach over other techniques, including the latest updates to Data Vault 2.0 and multiple improvements to Data Vault 1.0.
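For readers unfamiliar with the Data Vault structures the book covers, here is a minimal illustrative sketch of the hub/satellite split: a hub holds the business key, a satellite holds the descriptive attributes. The table layouts, field names, and hashing choice are invented for illustration, not taken from the book.

```python
import hashlib
from datetime import datetime, timezone

def hash_key(business_key: str) -> str:
    # Deterministic surrogate key derived from the business key.
    return hashlib.md5(business_key.encode()).hexdigest()

def to_hub_and_satellite(record: dict, business_key_field: str):
    """Split a staged record into a hub row and a satellite row."""
    now = datetime.now(timezone.utc).isoformat()
    bk = record[business_key_field]
    hub_row = {"hash_key": hash_key(bk), business_key_field: bk,
               "load_date": now, "record_source": "stage"}
    sat_row = {"hash_key": hub_row["hash_key"], "load_date": now,
               **{k: v for k, v in record.items()
                  if k != business_key_field}}
    return hub_row, sat_row

hub, sat = to_hub_and_satellite(
    {"customer_no": "C-1001", "name": "Acme", "city": "Berlin"},
    business_key_field="customer_no",
)
print(hub)
print(sat)
```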

37 citations

Patent
25 Jun 2001
TL;DR: A method and system for performing real-time transformations of dynamically growing databases is described, in which a session identified as a real-time session is initialized and repeatedly executes a persistent data transport pipeline.
Abstract: A method and system for performing real-time transformations of dynamically increasing databases are described. A session, identified as a real-time session, is initialized. The real-time session repeatedly executes a persistent (e.g., continually running) data transport pipeline of the analytic application. The data transport pipeline extracts data from a changing database, transforms the data, and writes the transformed data to storage (e.g., a data warehouse or data mart). The data transport pipeline is executed at the end of each time interval in a plurality of contiguous time intervals occurring during the real-time session. The data transport pipeline remains running after it is executed, until the real-time session is completed. Accordingly, new data are transformed in a timely manner, and processing resources are not consumed by repeatedly re-establishing (re-initializing) the data transport pipeline.
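A minimal sketch of the execution model the patent describes: the pipeline object is built once (so initialization cost is paid once), then re-executed at the end of each contiguous time interval until the session completes. All names here are hypothetical.

```python
import time

class DataTransportPipeline:
    """Persistent pipeline: initialized once, re-executed per interval."""
    def __init__(self):
        self.last_seen = 0  # high-water mark into the changing source

    def execute(self, source: list[dict]) -> list[dict]:
        # Extract only the new rows, transform them, "write" to the mart.
        new_rows = source[self.last_seen:]
        self.last_seen = len(source)
        transformed = [{**row, "loaded": True} for row in new_rows]
        print(f"wrote {len(transformed)} new rows to the data mart")
        return transformed

source_table = []                   # stands in for the growing database
pipeline = DataTransportPipeline()  # initialized once, stays running

for interval in range(3):           # three contiguous session intervals
    source_table.append({"id": interval})  # new data arrives
    time.sleep(0.1)                 # end of the time interval
    pipeline.execute(source_table)  # pipeline runs but is not re-built
```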

36 citations

Journal ArticleDOI
01 Mar 2012
TL;DR: This work argues for considering spatiality as a personalization feature within a formal design process, so that each decision maker can access their own personalized SMD schema with the required spatial structures and instances, ready to be analyzed at a glance.
Abstract: Spatial data warehouses (SDW) rely on extended multidimensional (MD) models in order to provide decision makers with appropriate structures to intuitively explore spatial data using different analysis techniques such as OLAP (On-Line Analytical Processing) or data mining. Current development approaches focus on defining a unique and static spatial multidimensional (SMD) schema at the conceptual level, over which all decision makers fulfill their spatial information needs. However, accounting for the spatiality required by each decision maker is likely to result in a potentially misleading SMD schema (even if a departmental DW or data mart is being defined). Furthermore, the spatial needs of each decision maker may change over time or depending on the context, requiring the SMD schema to be continuously updated with changes that can hamper decision making. Therefore, if a unique and static SMD schema is designed, acquiring the required spatial information is more costly than expected for decision makers, and they may become frustrated during the analysis. To overcome these drawbacks, we argue for considering spatiality as a personalization feature within a formal design process. In this way, each decision maker is able to access their own personalized SMD schema with the required spatial structures and instances, ready to be analyzed at a glance. Our approach considers several novel artifacts: (i) a UML profile for spatial multidimensional modeling at the conceptual level; (ii) a spatial-aware user model to define decision-maker profiles; and (iii) a spatial personalization language to express the spatial needs of decision makers as personalization rules. The definition of personalized SMD schemas by means of these artifacts is formally specified using the Software Process Engineering Metamodel (SPEM) standard. Finally, the applicability of our approach is shown through a running example based on our Eclipse-based tool for SDW development.
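A toy sketch of the personalization idea: per-decision-maker rules filter the spatial elements of a shared SMD schema into a personalized one. The schema and rule representations below are invented for illustration; the paper defines them with a UML profile and SPEM, not Python.

```python
# Shared SMD schema: measures plus dimensions, some of them spatial.
SMD_SCHEMA = {
    "measures": ["sales_amount"],
    "dimensions": {
        "store":  {"spatial": True,  "geometry": "point"},
        "region": {"spatial": True,  "geometry": "polygon"},
        "time":   {"spatial": False, "geometry": None},
    },
}

# One rule set per decision maker: which spatial geometries they analyze.
USER_RULES = {
    "regional_manager":  {"keep_geometries": {"polygon"}},
    "logistics_planner": {"keep_geometries": {"point", "polygon"}},
}

def personalize(schema: dict, user: str) -> dict:
    """Derive a personalized SMD schema from the user's rules."""
    keep = USER_RULES[user]["keep_geometries"]
    dims = {name: d for name, d in schema["dimensions"].items()
            if not d["spatial"] or d["geometry"] in keep}
    return {"measures": schema["measures"], "dimensions": dims}

# The regional manager's schema keeps only polygon-based spatial dims.
print(personalize(SMD_SCHEMA, "regional_manager"))
```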

35 citations


Network Information
Related Topics (5)

Topic                     Papers     Citations    Related
Information system        107.5K     1.8M         77%
The Internet              213.2K     3.8M         72%
Scheduling (computing)    78.6K      1.3M         72%
Cloud computing           156.4K     1.9M         71%
Software                  130.5K     2M           70%
Performance Metrics
Number of papers in the topic in previous years:

Year    Papers
2021    13
2020    20
2019    26
2018    23
2017    26
2016    27