
Topic

Data mart

About: Data mart is a research topic. Over its lifetime, 559 publications have been published within this topic, receiving 8,550 citations.


Papers
Book
01 Jan 1992
TL;DR: This Second Edition of Building the Data Warehouse is revised and expanded to include new techniques and applications of data warehouse technology and update existing topics to reflect the latest thinking.
Abstract: From the Publisher: The data warehouse solves the problem of getting information out of legacy systems quickly and efficiently. If designed and built right, data warehouses can provide significant freedom of access to data, thereby delivering enormous benefits to any organization. In this unique handbook, W. H. Inmon, "the father of the data warehouse," provides detailed discussion and analysis of all major issues related to the design and construction of the data warehouse, including granularity of data, partitioning data, metadata, lack of creditability of decision support systems (DSS) data, the system of record, migration and more. This Second Edition of Building the Data Warehouse is revised and expanded to include new techniques and applications of data warehouse technology and update existing topics to reflect the latest thinking. It includes a useful review checklist to help evaluate the effectiveness of the design.

2,820 citations

Patent
26 Sep 2001
Abstract: Processed payment transaction records of consumer and business payers are received into a multi-dimensional networked data mart from databases originating from a multitude of financial institutions and payment processors. A post-processor linked to the data mart assigns to all such transaction records the universal consumer and business expenditure categories used for payer financial management. Post-processed payment transaction records are indexed in the data mart by time, geography, and the universal consumer and business expenditure categories. Mathematical and analytical tools are applied to aggregated payment transaction records according to geographic, topographical, meteorological, chronological, demographic and other parameters. End users interact electronically with the data mart to view, create, synthesize and receive post-processed payment data for economic, investment, business, and marketing analysis.

273 citations

01 Jan 2000
TL;DR: This paper describes a method for developing dimensional models from traditional Entity Relationship models, which can be used to design data warehouses and data marts based on enterprise data models.
Abstract: This paper describes a method for developing dimensional models from traditional Entity Relationship models. This can be used to design data warehouses and data marts based on enterprise data models. The first step of the method involves classifying entities in the data model into a number of categories. The second step involves identifying hierarchies that exist in the model. The final step involves collapsing these hierarchies and aggregating transaction data to form dimensional models. A number of design alternatives are presented, including a flat schema, a terraced schema, a star schema and a snowflake schema. We also define a new type of schema called a star cluster schema. This is a restricted form of snowflake schema, which minimises the number of tables while avoiding overlap between different dimensional hierarchies. Individual schemas can be collected together to form constellations or galaxies. We illustrate the method using a simple example.

264 citations
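
The last step named in the abstract above (collapsing hierarchies and aggregating transaction data) can be illustrated with a minimal sketch. This is not the paper's own procedure or notation: the tables, column names, and values below (a hypothetical product, category, and department hierarchy plus a few sales rows) are assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical normalized (ER-style) tables; all names and values are illustrative.
departments = {1: {"department_name": "Grocery"}}
categories = {10: {"category_name": "Dairy", "department_id": 1}}
products = {
    100: {"product_name": "Milk 1L", "category_id": 10},
    101: {"product_name": "Butter", "category_id": 10},
}
sales = [
    {"product_id": 100, "date": "2000-01-01", "amount": 1.5},
    {"product_id": 100, "date": "2000-01-01", "amount": 2.5},
    {"product_id": 101, "date": "2000-01-02", "amount": 3.0},
]


def collapse_product_hierarchy(products, categories, departments):
    """Flatten product -> category -> department into one dimension row per product."""
    dimension = {}
    for product_id, product in products.items():
        category = categories[product["category_id"]]
        department = departments[category["department_id"]]
        dimension[product_id] = {
            "product_name": product["product_name"],
            "category_name": category["category_name"],
            "department_name": department["department_name"],
        }
    return dimension


def aggregate_sales(sales):
    """Sum transaction amounts into one fact row per (product_id, date)."""
    facts = defaultdict(float)
    for row in sales:
        facts[(row["product_id"], row["date"])] += row["amount"]
    return dict(facts)


product_dim = collapse_product_hierarchy(products, categories, departments)
sales_fact = aggregate_sales(sales)
```

Here `product_dim` plays the role of a denormalized dimension table and `sales_fact` the role of a fact table keyed by product and date, which is roughly the shape of a star schema. Keeping the category and department in separate tables instead would correspond to a snowflake-style design.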

Journal Article
TL;DR: The result shows that the K-Nearest Neighbor classifier is transparent, consistent, straightforward, simple to understand, tends to possess desirable qualities, and is easier to implement than most other machine learning techniques, specifically when there is little or no prior knowledge about the data distribution.
Abstract: The major problem of many online websites is that they present many choices to the client at once; this usually makes finding the right product or information on the site a strenuous and time-consuming task. In this work, we present a study of an automatic web usage data mining and recommendation system based on a user's current behavior, captured through his/her click stream data on a newly developed Really Simple Syndication (RSS) reader website, in order to provide relevant information to the individual without explicitly asking for it. The K-Nearest-Neighbor (KNN) classification method has been trained to be used online and in real time to identify clients'/visitors' click stream data, match it to a particular user group, and recommend a tailored browsing option that meets the needs of the specific user at a particular time. To achieve this, the web users' RSS address file was extracted, cleansed, formatted and grouped into meaningful sessions, and a data mart was developed. Our result shows that the K-Nearest Neighbor classifier is transparent, consistent, straightforward, simple to understand, tends to possess desirable qualities, and is easier to implement than most other machine learning techniques, specifically when there is little or no prior knowledge about the data distribution.

229 citations
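
The K-Nearest-Neighbor step mentioned in the abstract above can be sketched in a few lines. This is a generic KNN majority vote, not the authors' implementation: the feature vectors (hypothetical click counts per content category), the group labels, and the choice of Euclidean distance are all assumptions for illustration.

```python
import math
from collections import Counter


def knn_classify(query, training, k=3):
    """Return the majority label among the k training points closest to query."""
    # Sort labeled points by Euclidean distance to the query and keep the k nearest.
    neighbors = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical session features: clicks on (news, sports, tech) feeds.
training = [
    ((9, 1, 0), "news"),
    ((8, 2, 1), "news"),
    ((1, 9, 0), "sports"),
    ((0, 8, 2), "sports"),
    ((1, 1, 9), "tech"),
]

# Assign a new visitor session to the user group of its nearest neighbors.
group = knn_classify((7, 2, 1), training, k=3)
```

Because KNN has no separate training phase, new sessions can be classified as soon as they are added to the labeled set, which fits the online use described in the abstract. The cost is that every query scans all stored sessions.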

Patent
16 Jul 2001
Abstract: A method for logical view visualization of user behavior in a networked computer environment that includes sites that a user may visit and wherein the sites comprise pages that the user may view and/or resources that the user may request includes the step of collecting raw data representing user behavior which can include requesting resources, viewing pages and visiting sites by the user. This raw data is then refined or pre-processed into page views and visit data and stored in a data mart. Pages are clustered into super pages, and page to super page mappings reflecting the relationship between pages and super pages are stored in the data mart. An automated clustering means is applied to the page view, visit and super page data in the data mart to discover clusters of visits to define super visits having visit behavior characteristics. The visit data stored in the data mart is then scored against the super visit clusters to classify visits into super visits according to visit behavior characteristics. A system is also provided.

188 citations


Network Information
Related Topics (5)
Information system: 107.5K papers, 1.8M citations, 77% related
The Internet: 213.2K papers, 3.8M citations, 72% related
Scheduling (computing): 78.6K papers, 1.3M citations, 72% related
Cloud computing: 156.4K papers, 1.9M citations, 71% related
Software: 130.5K papers, 2M citations, 70% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2021    13
2020    20
2019    26
2018    23
2017    26
2016    27