
Showing papers on "Data mart" published in 2016


Journal ArticleDOI
TL;DR: The result shows that the K-Nearest Neighbor classifier is transparent, consistent, straightforward, simple to understand, tends to possess desirable qualities, and is easier to implement than most other machine learning techniques, specifically when there is little or no prior knowledge about the data distribution.

312 citations
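
The K-Nearest Neighbor idea summarized in the entry above can be illustrated with a minimal sketch. This is not the cited paper's code: the function name `knn_predict`, the Euclidean distance metric, the default `k=3` and the toy data are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    # Sort training points by distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    # Majority vote over the labels of those neighbors.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters labeled "a" and "b".
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]
```

No model is fitted and no distribution is assumed; prediction is just a sort and a vote, which is the simplicity the summary refers to.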


Journal ArticleDOI
TL;DR: The authors have built a robust and scalable data mart based on their implementation of EPIC containing data from across the perioperative period and greatly simplified the code required to extract data, making the data accessible to individuals who lacked a strong coding background.
Abstract: Extraction of data from the electronic medical record is becoming increasingly important for quality improvement initiatives such as the American Society of Anesthesiologists Perioperative Surgical Home. To meet this need, the authors have built a robust and scalable data mart based on their implementation of EPIC containing data from across the perioperative period. The data mart is structured in such a way so as to first simplify the overall EPIC reporting structure into a series of Base Tables and then create several Reporting Schemas each around a specific concept (operating room cases, obstetrics, hospital admission, etc.), which contain all of the data required for reporting on various metrics. This structure allows centralized definitions with simplified reporting by a large number of individuals who access only the Reporting Schemas. In creating the database, the authors were able to significantly reduce the number of required table identifiers from >10 to 3, as well as to correct errors in linkages affecting up to 18.4% of cases. In addition, the data mart greatly simplified the code required to extract data, making the data accessible to individuals who lacked a strong coding background. Overall, this infrastructure represents a scalable way to successfully report on perioperative EPIC data while standardizing the definitions and improving access for end users.

48 citations
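
The two-tier structure described in the abstract above (Base Tables feeding concept-specific Reporting Schemas) might be sketched as follows. The table names, view name and columns are hypothetical and do not reflect the actual EPIC structure or the authors' schema; SQLite is used only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical base tables: a simplified copy of the source reporting structure.
    CREATE TABLE base_patient (patient_id INTEGER PRIMARY KEY, birth_year INTEGER);
    CREATE TABLE base_case (
        case_id INTEGER PRIMARY KEY,
        patient_id INTEGER REFERENCES base_patient(patient_id),
        procedure_name TEXT,
        anesthesia_minutes INTEGER
    );

    -- Hypothetical reporting view built around one concept (operating room cases).
    -- The join logic is defined once, centrally, instead of in every report.
    CREATE VIEW rpt_or_cases AS
    SELECT c.case_id, c.procedure_name, c.anesthesia_minutes, p.birth_year
    FROM base_case AS c
    JOIN base_patient AS p ON p.patient_id = c.patient_id;

    INSERT INTO base_patient VALUES (1, 1960), (2, 1985);
    INSERT INTO base_case VALUES
        (10, 1, 'hip replacement', 140),
        (11, 2, 'appendectomy', 55),
        (12, 2, 'hernia repair', 70);
""")


def mean_anesthesia_minutes(connection):
    """End-user query: touches only the reporting view, never the base tables."""
    row = connection.execute(
        "SELECT AVG(anesthesia_minutes) FROM rpt_or_cases"
    ).fetchone()
    return row[0]
```

Because end users query only `rpt_or_cases`, a correction to the linkage between cases and patients would be made once in the view definition rather than in each report.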


Patent
04 Aug 2016
TL;DR: The analytical data mart (ADM) associated data structure as discussed by the authors is designed to allow data from disparate sources to be integrated, enabling streamlined business intelligence, reporting and ad hoc analysis.
Abstract: A device and method are described for a universal analytical data mart and data structure for same. The analytical data mart (ADM) associated data structure is designed to allow data from disparate sources to be integrated, enabling streamlined business intelligence, reporting and ad hoc analysis. Conceptually, the ADM enables analytics and business intelligence from multiple frames of reference including people, such as parties and actors including individuals and organizations, places, such as addresses with geographic information at various levels of view, objects, such as insured properties, automobiles and machinery, and events, milestones which happen at points in time and provide analytical/business value.

12 citations


Patent
03 Feb 2016
TL;DR: In this paper, an electric power material dispatching platform system is described, which comprises an ERP data layer (1) and a data platform. The data platform is provided with an ERP interface layer that is connected with the ERP data layer for two-way communication, and comprises an ODS layer (2), a data warehouse layer (3) and a data mart layer (4), each connected with the adjacent layer for two-way communication.
Abstract: The invention relates to an electric power material dispatching platform system, which comprises an ERP data layer (1) and a data platform, and is characterized in that the data platform is provided with an ERP interface layer, the ERP data layer (1) is connected with the ERP interface layer on the data platform so as to carry out two-way communication, the data platform comprises an ODS layer (2), a data warehouse layer (3) and a data mart layer (4), the data mart layer (4) is connected with the data warehouse layer (3) so as to carry out two-way communication, the data warehouse layer (3) is connected with the ODS layer (2) so as to carry out two-way communication, and the ODS layer (2) is connected with the ERP interface layer on the data platform so as to carry out two-way communication. The electric power material dispatching platform system can uniformly collect material demand, contract signing and performing, warehousing distribution and supplier management information, comprehensively analyze and monitor implementation conditions of supply chain key nodes, timely discover and process supplier contract performing risks, and realize active monitoring and early warning for the whole process of material supply.

10 citations


Patent
07 Dec 2016
TL;DR: In this article, a data loading cleaning engine, dispatching and storage system, which comprises a data source, a data warehouse and a user display module, is described, wherein the data warehouse is connected with an ETL management module.
Abstract: The invention discloses a data loading cleaning engine, dispatching and storage system, which comprises a data source, a data warehouse and a user display module, wherein the data warehouse is connected with an ETL management module; the ETL management module comprises an ETL dispatching module, an ETL monitoring module, a data quality module and an ETL task module; the data warehouse comprises an interface file region, a detail data temporary storage region SSA, a detail data SOR, a data mart, a data summarizing module, a feedback module and a metadata storage MDR. The system provided by the invention has the advantages that the practicability is high; the data management is convenient and fast; the flexibility is high; the popularization is easy; the high-efficiency data processing is realized; the throughput is great; the dealing with the addition of more data sources can be realized; more analysis requirements are supported.

9 citations


Journal ArticleDOI
TL;DR: A natural language-based method for the design of data mart schemas that facilitates requirements specification through a template covering all concepts of the decision-making process while providing for the acquisition of analytical requirements written in a structured natural language format.
Abstract: Data warehousing projects still face challenges in the various phases of the development life cycle. In particular, the success of the design phase, the focus of this paper, is hindered by the cross-disciplinary competences it requires. This paper presents a natural language-based method for the design of data mart schemas. Compared to existing approaches, our method has three main advantages: first, it facilitates requirements specification through a template covering all concepts of the decision-making process while providing for the acquisition of analytical requirements written in a structured natural language format. Second, it supports requirement validation with respect to a data source used in the ETL process. Third, it provides for a semi-automatic generation of conceptual data mart schemas that are directly mapped onto the data source; this mapping assists the definition of ETL procedures. The performance of the proposed method is illustrated through a software prototype used in an empir...

6 citations


Patent
10 Aug 2016
TL;DR: In this article, a coal mine multi-dimensional data warehousing system based on multiple data marts, comprising four layers of architecture: a data source, an ETL module, a data warehousing/data mart and a client, is presented.
Abstract: The invention provides a coal mine multi-dimensional data warehousing system based on multiple data marts, comprising four layers of architecture: a data source, an ETL module, a data warehousing/data mart and a client. The system covers the whole process from obtaining the data source to integrating data for an enterprise policymaker to use. The idea of common dimensions is introduced among the multiple data marts, which realizes a cross-link function among the data marts, and the ETL module adopts an ODS incremental data ETL architecture, so that the enterprise's need to obtain data quickly and accurately in real time is satisfied. The system can manage enterprise data comprehensively, provides high-quality data for the enterprise, and provides the needed data to the enterprise policymaker in a more targeted way.

5 citations


Patent
03 Aug 2016
TL;DR: In this paper, the authors propose a technique for synchronizing and processing data by a data pool, which comprises four steps of data extracting, data processing, data storage, and data mart.
Abstract: The invention aims to provide a technique for synchronizing and processing data by a data pool. The technique comprises four steps of data extracting, data processing, data storage, and data mart. According to the method, a conventional manner of storing, treating, querying and applying measurement data is changed, flexible storage and real-time treatment of massive measurement data are realized, and the real-time requirements of other service systems for data is conveniently met; besides, a high-efficient data query function is also realized, and a flexible data dissemination manner is provided.

5 citations


Patent
02 Nov 2016
TL;DR: In this article, an enterprise performance assessment system is described, comprising a data warehouse subsystem which is connected with at least one business system and performs structured processing on the source data so as to obtain structured data.
Abstract: The invention discloses an enterprise performance assessment system and method. The enterprise performance assessment system comprises a data warehouse subsystem which is connected with at least one business system and used for acquiring source data of at least one business system and performing structured processing on the source data so as to obtain structured data; a data mart subsystem which is connected with the data warehouse subsystem and used for acquiring the structured data of the data warehouse subsystem and classifying the structured data according to the classification rules so as to obtain the classified basic performance data, wherein the classification rules include that classification is performed according to business classes, organization levels and personnel; and a performance management subsystem which is connected with the data mart subsystem and used for processing the basic performance data according to the assessment rules so as to obtain the performance assessment result. According to the enterprise performance assessment system, the performance assessment requirements under various assessment rules can be met for various business systems so that the enterprise performance assessment system has high flexibility.

5 citations


Patent
11 May 2016
TL;DR: In this paper, a photovoltaic power generation prediction method based on data mining is proposed, which relates to the technical field of new energy.
Abstract: The invention provides a photovoltaic power generation prediction method based on data mining, and relates to the technical field of new energy. The operation steps are as follows: collecting a large quantity of data to establish a data source; extracting data from the data source to establish a data warehouse for storing and managing the extracted data; carrying out ETL processing on the data in the data warehouse, establishing a data mart, and classifying and storing the processed data; and processing and analyzing all kinds of data in the data mart and predicted weather environment data by adopting cluster analysis and an association rule, and comprehensively analyzing the results of the two methods to obtain a photovoltaic power generation prediction result. The prediction method is simple, the accuracy of the prediction result is high, and the photovoltaic power generation prediction precision is higher.

4 citations


Patent
17 Feb 2016
TL;DR: In this paper, a data warehouse index management method, apparatus and system is presented, in which an index output by a data mart is received through a first interface, wherein the first interface is a unified interface between the data mart and an intermediate layer, and the intermediate layer is a preset interface layer positioned between the data mart and a business system.
Abstract: The application provides a data warehouse index management method, apparatus and system. The data warehouse index management method comprises: receiving an index output by a data mart through a first interface, wherein the first interface is a unified interface between the data mart and an intermediate layer, and the intermediate layer is a preset interface layer positioned between the data mart and a business system; storing the index in a corresponding index pool; and outputting the index to the corresponding business system through a second interface, wherein the second interface is a unified interface between the intermediate layer and the business system. The method can solve the problem caused by table-level management in a conventional mode, so that the problems of non-unified size, repeated development and difficult management are avoided.
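
The three steps in the abstract above (receive an index through a first interface, store it in an index pool, output it through a second interface) can be sketched as a small class. The class name, method names and pool keys are hypothetical; the patent text does not specify an implementation.

```python
from collections import defaultdict


class IndexMiddleLayer:
    """Hypothetical intermediate layer between data marts and business systems."""

    def __init__(self):
        # One pool per index category; each pool maps index name -> latest value.
        self._pools = defaultdict(dict)

    def receive_index(self, pool, name, value):
        """First interface: the single entry point used by every data mart."""
        self._pools[pool][name] = value

    def output_index(self, pool, name):
        """Second interface: the single exit point used by every business system."""
        return self._pools[pool][name]


# Usage: a data mart publishes an index once; any business system reads it by name.
layer = IndexMiddleLayer()
layer.receive_index("sales", "monthly_revenue", 1200)
```

Managing named indexes rather than whole tables is what lets several business systems share one definition of `monthly_revenue` instead of each re-deriving it.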

Patent
25 May 2016
TL;DR: In this paper, the authors proposed a method and equipment for realizing clinical information sharing, which comprises the following steps: acquiring basic clinical data of a plurality of data systems of a hospital, then establishing a corresponding clinical data warehouse according to the basic clinical data, and establishing a clinical data mart with different dimensionalities based on the clinical data warehouse.
Abstract: The application aims to provide a method and equipment for realizing clinical information sharing. Compared with the prior art, the method provided by the application comprises the following steps: acquiring basic clinical data of a plurality of data systems of a hospital, then establishing a corresponding clinical data warehouse according to the basic clinical data, and establishing a corresponding clinical data mart with different dimensionalities based on the clinical data warehouse. According to the method, integration and sorting of various data in the plurality of data systems of the hospital can be realized, the requirements of clinic, management, scientific research and the like on analysis and utilization of the data in the data systems of the hospital are met, and the overall competitive advantages of the hospital are improved.

Journal ArticleDOI
19 May 2016
TL;DR: This research discusses the design, construction and implementation of a business intelligence dashboard solution, starting from the business intelligence architecture, a data warehouse in the form of a data mart, and the ETL (extraction, transformation, and loading) process, to represent the key business processes of PT XYZ.
Abstract: Information and data are important assets for a company. Many companies have difficulty processing data into useful information and turning it into knowledge for decision making at the management level, including PT XYZ, which was engaged in the production of agricultural materials. PT XYZ had difficulties in the monitoring and decision-making processes related to its main business, sales and accounts receivable. A business intelligence dashboard could be the best solution to overcome these problems. This research discusses the design, construction and implementation of a business intelligence dashboard solution, starting from the business intelligence architecture, a data warehouse in the form of a data mart, an ETL (extraction, transformation, and loading) process that is useful for generating quality data, and visualization in the form of a dashboard that represents the key business processes of PT XYZ. The method used in developing the business intelligence dashboard follows the six-step executive information system method (justification, planning, business analysis, design, construction and deployment), which produces a digital dashboard for management. The result is a website application consisting of three digital dashboards that serve as visualization tools to display the information and knowledge needed in the monitoring process and as material for decisions related to sales and receivables. The BI solutions developed are in line with expectations and can meet the business needs of PT XYZ.

Patent
21 Sep 2016
TL;DR: In this paper, a data transporter assembles and normalizes data from different sources and stores it in a data mart with curated and metadata stores, to improve access to and analysis of data and the generation and presentation of results for a remote user.
Abstract: A system, computer-implemented method, and computer program for enhancing business intelligence and peer analysis by improving access to and analysis of data and generation and presentation of results for a remote user. A data transporter assembles and normalizes data from different sources. The data is stored in a data mart having multiple data stores, including curated and metadata stores. An API receives a request for a time series and/or geospatial dataset with attributes indicating levels of aggregation and granularity, interacts with the metadata to generate a corresponding SQL query, executes the SQL against the curated data, and communicates the resulting dataset to a BI software application in an open standard data format. The data mart may host the curated data in a star schema having a plurality of dimensions, the data transporter component may include a key management controller, and the data transporter may also interact with the metadata.
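
The request flow in the abstract above (request attributes, metadata lookup, generated SQL, execution against curated star-schema data) can be sketched roughly as below. Everything named here is an assumption for illustration: the `METADATA` dictionary, the `fact_sales` and `dim_date` tables, and the functions `build_query` and `run_request` are hypothetical and not taken from the patent.

```python
import sqlite3

# Hypothetical metadata store: maps request attributes to whitelisted SQL fragments.
METADATA = {
    "measures": {"revenue": "f.revenue"},
    "aggregations": {"sum": "SUM", "avg": "AVG"},
    "granularities": {"year": "d.year", "month": "d.month"},
}


def build_query(measure, aggregation, granularity):
    """Translate request attributes into SQL using only fragments from METADATA."""
    column = METADATA["measures"][measure]
    func = METADATA["aggregations"][aggregation]
    bucket = METADATA["granularities"][granularity]
    return (
        f"SELECT {bucket} AS period, {func}({column}) AS value "
        "FROM fact_sales AS f JOIN dim_date AS d ON d.date_id = f.date_id "
        f"GROUP BY {bucket} ORDER BY {bucket}"
    )


def run_request(connection, measure, aggregation, granularity):
    """Execute the generated SQL against the curated data; return a time series."""
    sql = build_query(measure, aggregation, granularity)
    return connection.execute(sql).fetchall()


# Hypothetical curated data in a tiny star schema: one fact table, one date dimension.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month TEXT);
    CREATE TABLE fact_sales (date_id INTEGER REFERENCES dim_date(date_id), revenue REAL);
    INSERT INTO dim_date VALUES (1, 2015, '2015-12'), (2, 2016, '2016-01'), (3, 2016, '2016-02');
    INSERT INTO fact_sales VALUES (1, 100.0), (2, 50.0), (2, 25.0), (3, 10.0);
""")
```

Calling `run_request(conn, "revenue", "sum", "year")` yields one row per year. Because every SQL fragment comes from the metadata dictionary, an unknown attribute raises `KeyError` instead of reaching the database.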

Posted Content
TL;DR: In the last decade, the amount of data that an organization processes and stores has grown exponentially and one needs a computer system able to store complex and very large quantities of data to be able to support the business process.
Abstract: In the last decade, the amount of data that an organization processes and stores has grown exponentially. In most cases, the data stored is used to support the business process through accurate and up-to-date information about the business environment and activity of the company. In order for a company's managers to be capable of generating the reports they need to make decisions, one needs a computer system able to store complex and very large quantities of data. At the same time, for the development of such an information system, one must take into account the cost of it.

Book ChapterDOI
01 Jan 2016
TL;DR: This chapter describes different kinds of data repositories: operational data stores (ODSs), clinical data warehouses, clinical data marts, and clinical registries.
Abstract: This chapter describes different kinds of data repositories: operational data stores (ODSs), clinical data warehouses, clinical data marts, and clinical registries. The purpose of the ODS is to serve as a location for the processes of extraction, transfer, and load prior to creating a warehouse or data marts, both of which are physically integrated databases optimized for rapid query. Virtual data integration or federation is not recommended unless political realities make physical integration infeasible. Registries are repositories that limit their contents to patients with specific disease conditions: their formats are often archaic, reflecting their long existence. The act of creating warehouses or marts may paradoxically make reporting and data extraction needs increase. It is important to set end-users’ expectations by identifying limitations in the quality of the data sources. It is important to map data elements during warehousing if only to increase the usability of the resulting system.

Patent
Jan Wenda1
14 Jan 2016
TL;DR: In this paper, the authors present a data mart assembly method in which a data mart is graphically assembled from a plurality of slices of structured data for a first project, and the first user is provided with an option to utilize the data mart for a second project that is different from the first project.
Abstract: Included are embodiments for data mart assembly. Some embodiments of the method include receiving a plurality of slices of structured data, providing an option to graphically assemble a data mart from at least a portion of the plurality of slices of structured data for a first project, and providing an option to access a previously created data mart from a data mart repository for inclusion in the first project. Some embodiments include associating the data mart with a first user, such that the first user is provided with an option to utilize the data mart for a second project, where the first project is different than the second project.

Posted Content
TL;DR: The results demonstrated that the recommendation system powered by Bayesian classification model can produce accurate, faster and efficient Real-Time recommendation to the client consistently.
Abstract: In recent times, the volume of information available in electronic forms and databases has grown steadily, which makes locating relevant information tedious and time consuming. In this paper we have used a unique toolset to exploit web usage data mining techniques to identify a client/visitor's navigation pattern on a particular website, specifically the Really Simple Syndication (RSS) reader's web site, based on the user's current behavior by acting upon the user click stream data, in order to provide tailored information to individuals and ease navigation on the site without too many choices at a time. The Bayesian classification has been trained to be used online and in real time to identify active user click stream data, match it to a particular user group and recommend tailored browsing options that satisfy the need of the user at a given period. To achieve this, a data mart of users' RSS address URL data extracted from the server database was developed. Experimenting with our work shows that the scalability problem peculiar to this type of system can be overcome through our approach, and our results demonstrated that the recommendation system powered by the Bayesian classification model can consistently produce accurate, faster and efficient real-time recommendations for the client.


Journal ArticleDOI
TL;DR: This study uses a generalization method to establish a data mart; the result is a collection of predetermined or selected Subclasses that together form a Superclass able to accommodate the resources of the Subclasses.
Abstract: Today's technology increases the need of agencies and companies to process and analyze data quickly and in ever greater volume. Companies and institutions want the data analysis process to save as much time as possible. The data warehouse is a data analysis technology that is useful for resolving this issue. The data warehouse is a repository of data that accommodates all the historical data held by agencies or companies. Data marts are a small part of the data warehouse. A data mart is focused on a single subject. This study uses a generalization method to perform the process of establishing a data mart. Generalization is a useful method to reduce or narrow the differences in the data based on Subclass. Subclasses are integrated into a Superclass that collects some data from the Subclasses. A Subclass is data that is more descriptive. A Superclass is data that is more general in nature. The result obtained is a collection of predetermined or selected Subclasses that together form a Superclass able to accommodate the resources of the Subclasses.
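
One way to read the generalization step described above is as keeping the attributes that all Subclasses share and collecting their records into a single Superclass. The sketch below follows that reading; it is an interpretation, not the study's procedure, and the function name `generalize`, the `source_subclass` tag and the sample records are hypothetical.

```python
def generalize(subclasses):
    """Build a superclass from the attributes shared by every subclass.

    `subclasses` maps a subclass name to a non-empty list of record dicts.
    Returns (shared_attributes, superclass_rows).
    """
    # The superclass keeps only the attributes common to all subclasses.
    attribute_sets = [set(rows[0]) for rows in subclasses.values()]
    shared = sorted(set.intersection(*attribute_sets))

    superclass = []
    for name, rows in subclasses.items():
        for row in rows:
            record = {attr: row[attr] for attr in shared}
            # Remember which (more descriptive) subclass the record came from.
            record["source_subclass"] = name
            superclass.append(record)
    return shared, superclass


# Hypothetical subclasses: each has its own descriptive attribute.
subclasses = {
    "lecturer": [{"id": 1, "name": "Ana", "faculty": "CS"}],
    "student": [{"id": 2, "name": "Budi", "gpa": 3.4}],
}
shared, person = generalize(subclasses)
```

Here `person` plays the role of the Superclass: it holds the general attributes `id` and `name`, while `faculty` and `gpa` stay with their Subclasses.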

Patent
08 Feb 2016
TL;DR: In this article, an electronic medical chart system consisting of an input unit, a central processing unit and an output section is proposed to confirm progress information of a duty of an on-duty medical worker.
Abstract: PROBLEM TO BE SOLVED: To provide an electronic medical chart system capable of allowing an off-duty medical worker to confirm progress information of a duty of an on-duty medical worker. SOLUTION: The electronic medical chart system is constituted of a computer system which includes an input unit, a central processing unit and an output section. The electronic medical chart system includes: a base data mart 1201 that stores an ID or name of a duty and an ID or name of medical workers while associating both with each other; and an individual data mart 1301 that selects an ID or name of off-duty medical workers who are available to the duty, and stores the ID or name of off-duty medical workers and the ID or name of the duty while associating both with each other. The input unit inputs an ID or name of a specific user in the off-duty medical workers. The central processing unit picks up a piece of progress information of a duty of an on-duty medical worker in the individual data mart of the specific user. The output section displays the progress information on a terminal of the specific user. SELECTED DRAWING: Figure 1

01 Jan 2016
TL;DR: In this paper, the principal novelty is raw Data Vault (DV) loads from source systems, and experiments with effects of allowing certain kinds of permissible errors to be kept in the Data Vault until correct values are supplied.
Abstract: The principal novelty in this work is raw Data Vault (DV) loads from source systems, and experiments with effects of allowing certain kinds of permissible errors to be kept in the Data Vault until correct values are supplied.

Patent
20 Jul 2016
TL;DR: In this paper, a tax data subject architecture based on cloud monitoring is proposed, and four main tax business subject domains are determined according to tax business and are a party subject domain, an agreement domain, a product domain and an event subject domain respectively.
Abstract: The invention discloses a tax data subject architecture based on cloud monitoring. The tax data subject architecture is based on cloud monitoring, and four main tax business subject domains are determined according to tax business and are a party subject domain, an agreement subject domain, a product subject domain and an event subject domain respectively. The parties refer to all stakeholders involved in tax affairs. Agreements refer to rules suitable for tax payment activities between the parties and tax authorities and between the parties at stipulated certain moments or within periods and are the base for all tax related activities of taxpayers. Products refer to entities or services which can cause payments of the parties and bring the positive effects to the parties in the tax activities. The event subject domain comprises an order subject domain and a transaction subject domain. The tax data subject architecture can support business management and decision making of departments of the State Administration of Taxation, performs statistics analysis, data mining and the like by establishing an integrated library, a summarized library and a data mart and meets the business departments of the departments.

Book ChapterDOI
01 Jan 2016
TL;DR: This paper details the business process analysis and modeling stage of designing an automated solution for banking, focusing on the management of the lending activities business process, and presents the resulting business process model as well as the entities model.
Abstract: In this paper we will detail the business process analysis and modeling stage of designing an automated solution for banking, focusing on the management of the lending activities business process. We will present the resulting business process model as well as the entities model. The latter represents the base upon which we design a Data Mart needed for implementing a scoring algorithm and data mining algorithms for profiling clients. The data mart is part of a departmental data warehouse to ensure better flexibility in the analysis of current activity.

Patent
09 Nov 2016
TL;DR: In this paper, a cutting parameter mining method based on a data warehouse is proposed, which comprises the following steps: determining a parameter which affects cutting, and searching a data source; establishing the data warehouse, and establishing a data mart by aiming at different parameters; carrying out extraction processing on data in the data warehouse, and screening the data which conforms to a target object; establishing a mathematic model for the screened data; carrying out multiple linear regression modeling; and, according to a degree of deviation between the selected parameters and the constructed model, judging the reliability of the selected parameter.
Abstract: The invention discloses a cutting parameter mining method based on a data warehouse. With the advent of the industrialization and information age, the accumulation of data sizes increasingly grows while enterprises enjoy convenience brought by informationization, and how to quickly obtain effective data from mass data and provide support for decisions becomes urgent affairs. The method comprises the following steps: determining a parameter which affects cutting, and searching a data source; establishing the data warehouse, and establishing a data mart by aiming at different parameters; carrying out extraction processing on data in the data warehouse, and screening the data which conforms to a target object; establishing a mathematic model for the screened data; carrying out multiple linear regression modeling; according to a degree of deviation between the selected parameters and the constructed model, judging the reliability of the selected parameter; and determining an optimal cutting parameter. The invention provides the cutting parameter mining method based on the data warehouse.

Dissertation
06 Oct 2016
TL;DR: This work focuses on the introduction of a data virtualization layer to read and consolidate data from heterogeneous sources (Hadoop system, a data mart and a data warehouse) and provide a single point of data access to all data consumers.
Abstract: This work focuses on the introduction of a data virtualization layer to read and consolidate data from heterogeneous sources (Hadoop system, a data mart and a data warehouse) and provide a single point of data access to all data consumers.

Book ChapterDOI
01 Jan 2016
TL;DR: This work focuses on the design of trajectory data warehouse schema and proposes automating this task to reduce human intervention since it is done manually and requires good knowledge of the domain.
Abstract: A mobile object is a spatial object that continually changes its form and location over time. Each displacement creates a trajectory that reflects the evolution of its position in space during a given time interval. It thus generates a huge amount of trajectory data, which is stored in a trajectory data warehouse because it is the only tool that can analyze historical trajectory data. In this work, we focus on the design of the trajectory data warehouse schema, and we propose automating this task to reduce human intervention, since it is done manually and requires good knowledge of the domain. To achieve this goal, we first automate the extraction of trajectory data mart schemas from a moving object database. Then, we merge them to get the trajectory data warehouse schema using a new schema integration methodology that is composed of schema matching and schema mapping.