scispace - formally typeset

Showing papers on "Data mart published in 2019"



Book ChapterDOI
06 Nov 2019
TL;DR: The database and the migration from the database to the data warehouse are discussed, and design-modeling techniques with respect to data mining and query optimization are presented to save time and resources in the analysis of data.
Abstract: In the digital age, data is the most important source of acquiring knowledge, collected from various sources such as websites, blogs, webpages and, most importantly, databases. Databases and relational databases both support future decision making, but these approaches have become time- and resource-consuming; hence the newer concept of the data warehouse, which can analyze many databases at a time on a common platform in a very efficient way. In this paper, we discuss the database and the migration from the database to the data warehouse. A Data Warehouse (DW) is a special type of database that stores a large amount of data. DW schemas organize data in two ways, the star schema and the snowflake schema, in which fact and dimension tables are organized and which are distinguished by the normalization of tables. The nature of the data leads the designer to choose between the DW schemas on the basis of data, time, and resource factors. We compare both design-modeling techniques experimentally on the same data by applying the same queries to each. After the performance evaluation, bitmap indexing is used to improve schema performance. We also present the design-modeling techniques with respect to data mining and an improved query-optimization technique to save time and resources in the analysis of data.

9 citations
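The star-versus-snowflake comparison above can be sketched concretely. Below is a minimal illustration (using SQLite and invented product-sales tables, not the paper's dataset) of the same aggregate query against a star schema and its snowflake counterpart, showing the extra join that normalization costs:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

cur.executescript("""
-- Star schema: one denormalized dimension table joined directly to the fact table.
CREATE TABLE dim_product_star (product_id INTEGER PRIMARY KEY,
                               name TEXT, category TEXT);
CREATE TABLE fact_sales (product_id INTEGER, amount REAL);

-- Snowflake schema: the same dimension normalized into two tables.
CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_product_snow (product_id INTEGER PRIMARY KEY,
                               name TEXT, category_id INTEGER);
""")

cur.executemany("INSERT INTO dim_product_star VALUES (?,?,?)",
                [(1, "laptop", "electronics"), (2, "desk", "furniture")])
cur.executemany("INSERT INTO dim_category VALUES (?,?)",
                [(10, "electronics"), (20, "furniture")])
cur.executemany("INSERT INTO dim_product_snow VALUES (?,?,?)",
                [(1, "laptop", 10), (2, "desk", 20)])
cur.executemany("INSERT INTO fact_sales VALUES (?,?)",
                [(1, 999.0), (1, 999.0), (2, 150.0)])

# Star: a single join answers "total sales per category".
star = cur.execute("""
    SELECT p.category, SUM(f.amount) FROM fact_sales f
    JOIN dim_product_star p USING (product_id)
    GROUP BY p.category ORDER BY p.category""").fetchall()

# Snowflake: the normalized design needs one extra join for the same answer.
snow = cur.execute("""
    SELECT c.category, SUM(f.amount) FROM fact_sales f
    JOIN dim_product_snow p USING (product_id)
    JOIN dim_category c USING (category_id)
    GROUP BY c.category ORDER BY c.category""").fetchall()

print(star == snow)  # both schemas yield identical results
```

The snowflake variant saves dimension storage through normalization, while the star variant answers the same query with one fewer join, which is the trade-off the paper evaluates.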


Proceedings ArticleDOI
28 Mar 2019
TL;DR: A detailed analysis is presented that compares Data Lake and Data Warehouse key concepts and emphasizes the complementarity of the two technologies by showing the most appropriate use case for each of them.
Abstract: Since data is at the heart of information systems, new technologies and approaches for storing, processing and analyzing data have proliferated. Data Warehouses are among the best-known approaches to data storage and processing. However, they have reached their limits in dealing with quantities of data as large as those of Big Data. Consequently, a new concept known as the "Data Lake", an evolution of the Data Warehouse, is emerging. This paper presents a detailed analysis that compares Data Lake and Data Warehouse key concepts. It sheds light on their aspects and characteristics in order to reveal similarities and differences. It also emphasizes the complementarity of the two technologies by showing the most appropriate use case for each of them.

9 citations


Journal ArticleDOI
TL;DR: In this article, the authors present an approach to automating the integration of data sources in an Agri environment, where new sources are examined before an attempt to merge them with existing data marts.
Abstract: The global food and agricultural industry had a total market value of USD 8 trillion in 2016, and decision makers in the Agri sector require appropriate tools and up-to-date information to make predictions across a range of products and areas. Traditionally, these requirements are met with information processed into a data warehouse and data marts constructed for analyses. Increasingly however, data are coming from outside the enterprise and often in unprocessed forms. As these sources are outside the control of companies, they are prone to change and new sources may appear. In these cases, the process of accommodating these sources can be costly and very time consuming. To automate this process, what is required is a sufficiently robust extract–transform–load process in which external sources are mapped to some form of ontology, together with an integration process to merge the specific data sources. In this paper, we present an approach to automating the integration of data sources in an Agri environment, where new sources are examined before an attempt to merge them with existing data marts. Our validation uses a case study of real world Agri data to demonstrate the robustness of our approach and the efficiency of materializing data marts.

7 citations
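The ontology-mapping step described above (examine a new source before merging it) can be sketched as follows; the Agri field names and the `ONTOLOGY` synonym dictionary are invented for illustration and are not taken from the paper:

```python
# Hypothetical ontology: canonical field names with accepted synonyms.
ONTOLOGY = {
    "herd_size":  {"herd_size", "num_cattle", "cattle_count"},
    "milk_yield": {"milk_yield", "yield_l", "litres_per_day"},
    "farm_id":    {"farm_id", "farm", "holding_id"},
}

def map_to_ontology(record):
    """Rename a raw record's fields to canonical ontology terms.

    Unknown fields are dropped; a source is rejected (None) when it
    cannot supply every canonical field -- the 'examine before merge' step."""
    mapped = {}
    for canonical, synonyms in ONTOLOGY.items():
        for key, value in record.items():
            if key.lower() in synonyms:
                mapped[canonical] = value
                break
        else:
            return None            # new source is incompatible with the mart
    return mapped

# An existing mart and a new external source with unfamiliar column names.
mart = [{"farm_id": "F1", "herd_size": 120, "milk_yield": 22.5}]
new_source = [{"holding_id": "F2", "num_cattle": 80, "litres_per_day": 19.0}]

for raw in new_source:
    row = map_to_ontology(raw)
    if row is not None:            # merge only sources that passed the check
        mart.append(row)

print(len(mart))  # 2
```

The point of the design is that when an external source changes its column names, only the synonym sets need updating, not the mart itself.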


Journal ArticleDOI
01 Jan 2019
TL;DR: The article represents the experience of practical application of visual analytics tools that allow managers to make deliberate and well-reasoned decisions based on the visualization of large volumes of information using Business Intelligence tools.
Abstract: The article represents the experience of practical application of visual analytics tools that allow managers to make deliberate and well-reasoned decisions based on the visualization of large volumes of information using Business Intelligence tools. The key feature of the proposed approach is the development of visual models of key performance indicators (KPI) at the strategic level of enterprise management. The developed models are considered a tool for justifying managerial decisions, building business models of high-tech companies, measuring the efficiency of functioning, and evaluating the effectiveness of the development of the selected activities. The Business Intelligence tool was used to build a number of visual models for a high-tech enterprise in telecommunications. The novelty of the proposed approach lies in the application of performance-management and managerial decision-making technologies, intensively implemented in both Russian and Western companies, which are aimed at improving competitiveness, import substitution, cost reduction, and the optimization of business processes. The requirements of a dynamically changing environment, shortening product life cycles, and global competition necessarily lead to the creation of specialized visual situational business models built on powerful automated systems for business planning and graphics, data analysis, and the processing of information arrays.

6 citations


Patent
27 Dec 2019
TL;DR: In this paper, an industrial big data multidimensional analysis and visualization method based on a JSON document structure is proposed, which comprises the following steps: by taking JSON as the basic carrier of data, constructing an industrial data mart in parallel by utilizing Spark and ElasticSearch, through configuring relational database and file system data sources and defining data conversion and data cleaning operations; configuring the overall process of data analysis in a graphical mode to construct an analysis data set with a multi-dimensional structure; and customizing each dimension calculation index of the data analysis report in a visual dragging mode.
Abstract: The invention belongs to the technical field of industrial big data applications, and particularly relates to an industrial big data multidimensional analysis and visualization method based on a JSON document structure. The method comprises the following steps: by taking JSON as the basic carrier of data, constructing an industrial data mart in parallel by utilizing Spark and ElasticSearch, through configuring relational database and file system data sources and defining data conversion and data cleaning operations; configuring the overall process of data analysis in a graphical mode to construct an analysis data set with a multi-dimensional structure, thereby avoiding repeated association operations on massive data; and, for a specific data analysis scene, customizing each dimension calculation index of the data analysis report in a visual dragging mode based on the pre-constructed multi-dimensional analysis data set, and generating an interactive graphic analysis report. In this method, the JSON document format is used as the carrier of basic data, and its advantages in storage and analysis are exploited, so that multi-dimensional analysis structure modeling and user-defined interactive analysis become more convenient and efficient.

5 citations
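The idea of a pre-constructed multi-dimensional analysis data set that avoids repeated association operations can be sketched with plain JSON records and stdlib Python, standing in for the patent's Spark/ElasticSearch pipeline; the plant/line readings are invented:

```python
import json
from collections import defaultdict

# Hypothetical sensor readings, with JSON as the basic data carrier.
raw = json.loads("""[
  {"plant": "A", "line": 1, "metric": "output", "value": 50},
  {"plant": "A", "line": 1, "metric": "output", "value": 70},
  {"plant": "A", "line": 2, "metric": "output", "value": 40},
  {"plant": "B", "line": 1, "metric": "output", "value": 90}
]""")

def build_cube(records, dimensions):
    """Pre-aggregate records along the given dimensions once, so later
    queries read the cube instead of re-scanning and re-joining raw data."""
    cube = defaultdict(float)
    for r in records:
        key = tuple(r[d] for d in dimensions)
        cube[key] += r["value"]
    return dict(cube)

cube = build_cube(raw, ["plant", "line"])

# Slicing the cube is a dictionary lookup, not another pass over the source.
print(cube[("A", 1)])                                            # 120.0
print(sum(v for (plant, _), v in cube.items() if plant == "A"))  # 160.0
```

The cube is built once per load; every subsequent drill-down or roll-up reads the aggregated structure, which is the repeated-association saving the abstract describes.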


Proceedings ArticleDOI
01 Jun 2019
TL;DR: The design of a Business Intelligence (BI) solution aimed at obtaining a unified view of operational information from the Navy is described, based on an analytical processing system whose data is transferred to a data warehouse and then to aData mart.
Abstract: The volume of data related to operational activity, produced and exchanged among the several Portuguese Navy naval units, is high and diversified. The way each naval unit currently collects and processes data makes it difficult to consolidate information in the process of analysis and decision making. This work describes the design of a Business Intelligence (BI) solution aimed at obtaining a unified view of operational information from the Navy. For this purpose, it was necessary to conceive, design and implement a system that would allow the collection, treatment, integration, consolidation, analysis and visualization of the data contained in the operational data shared by naval units, in order to generate information relevant to more effective and efficient decision making. The Business Intelligence system presented is based on an analytical processing system whose data is transferred to a data warehouse and then to a data mart. The solution was validated through a survey applied to target users. The survey responses allowed important information to be gathered for improving the practicability and usefulness of future iterations of the BI solution developed.

4 citations


Patent
22 Feb 2019
TL;DR: In this paper, the authors proposed a big data tag data updating method and device, a medium and an electronic device, comprising the following steps: acquiring massive user data and establishing a data mart according to the user data; establishing a user tag through the user attributes of the data mart; and receiving an update mode and a tag combination, and updating the tag combination according to the update mode.
Abstract: The embodiment of the invention relates to the technical field of big data, and provides a big data tag data updating method and device, a medium and an electronic device. The tag data updating method comprises the following steps: acquiring massive user data, and establishing a data mart according to the user data, wherein the data mart comprises a plurality of user attributes for each user; establishing a user tag through the user attributes of the data mart; and receiving an update mode and a tag combination, and updating the tag combination according to the update mode. The technical proposal of the embodiment is based on massive user data, which benefits the flexibility of updating tag data and at the same time meets the personalized requirements of the tag consumption layer, thereby improving the practicability of the tag data.

3 citations



Proceedings ArticleDOI
02 Dec 2019
TL;DR: This paper shows how to materialize spatial relationship information between crime activities and other task-relevant features into data marts, and how to discover interesting crime patterns from the data mart using a spatial association rule mining technique.
Abstract: The structure of crime activity logs stored in police databases is not designed for decision support systems, and hence not for complex crime analysis. This paper shows how crime log data can be converted into significantly useful information using data warehousing and data mining techniques. Crime incident data and other relevant data are organized by applying data warehousing concepts. Spatial association rule mining is used to find interesting local relationship patterns between crime incidents and other spatial features. This paper shows how to materialize spatial relationship information between crime activities and other task-relevant features into data marts, and how to discover interesting crime patterns from the data mart using a spatial association rule mining technique. A proof of concept is carried out with real crime data and points of interest in a study area to illustrate and evaluate the proposed approach. The case study results show the usefulness of such data warehousing and spatial association rule mining for crime data analysis.

3 citations
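The materialize-then-mine pipeline above can be sketched in a few lines; the crime and point-of-interest coordinates, the `NEAR` threshold, and the example rule are all invented for illustration, not taken from the paper's case study:

```python
from math import hypot

# Hypothetical point data: (x, y) coordinates in arbitrary map units.
crimes = [("theft",   (1.0, 1.0)), ("theft",   (1.2, 0.8)),
          ("assault", (5.0, 5.0)), ("theft",   (5.1, 4.9))]
pois   = [("bar",     (1.1, 1.0)), ("station", (5.0, 5.0))]

NEAR = 0.5  # distance threshold for the spatial predicate "near"

# Materialization step: store the crime/POI "near" relationship in the mart,
# so the expensive distance computation is done once, not per query.
mart = []
for crime_type, (cx, cy) in crimes:
    near_types = {p for p, (px, py) in pois if hypot(cx - px, cy - py) <= NEAR}
    mart.append({"crime": crime_type, "near": near_types})

# Mining step: support and confidence of the rule  near(bar) -> theft.
with_bar   = [r for r in mart if "bar" in r["near"]]
rule_hits  = [r for r in with_bar if r["crime"] == "theft"]
support    = len(rule_hits) / len(mart)
confidence = len(rule_hits) / len(with_bar)
print(support, confidence)  # 0.5 1.0
```

Once the spatial predicates are materialized as ordinary columns, any standard association-rule miner can work on the data mart without touching the geometry again.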


Journal ArticleDOI
TL;DR: The HF data mart will be used to enhance care, assist in clinical decision-making, and improve overall quality of care and holds the potential to be scaled and generalized beyond the initial focus and setting.
Abstract: Objective: The purpose of this project was to build and formatively evaluate a near-real time heart failure (HF) data mart. Heart Failure (HF) is a leading cause of hospital readmissions. Increased efforts to use data meaningfully may enable healthcare organizations to better evaluate effectiveness of care pathways and quality improvements, and to prospectively identify risk among HF patients. Methods and procedures: We followed a modified version of the Systems Development Life Cycle: 1) Conceptualization, 2) Requirements Analysis, 3) Iterative Development, and 4) Application Release. This foundational work reflects the first of a two-phase project. Phase two (in process) involves the implementation and evaluation of predictive analytics for clinical decision support. Results: We engaged stakeholders to build working definitions and established automated processes for creating an HF data mart containing actionable information for diverse audiences. As of December 2017, the data mart contains info...

Patent
12 Mar 2019
TL;DR: Wang et al. as discussed by the authors proposed a method and a system for identifying abnormal transactions based on a fund transaction network; the abnormal transaction identification method comprises, as a first step, constructing a data mart from multi-subject data.
Abstract: The invention discloses a method and a system for identifying abnormal transactions based on a fund transaction network. The abnormal transaction identification method comprises the following steps: constructing a data mart from multi-subject data; determining the basic index data; generating the first risk characteristic data; forming a plurality of fund transaction networks based on the transaction data; generating suspicious cases of abnormal transactions; and generating suspicious-screening reports. The abnormal transaction identification system includes a data mart building module, a basic index mining module, a risk characteristics generation module, a transaction network building module, a suspicious case generation module and a screening report generation module. Based on the established fund transaction network and transaction risk model, the invention can completely restore the money laundering process and scene, and provides strong support and assistance for money laundering investigation. The invention also has the prominent advantages of high precision, good comprehensiveness and strong objectivity.

Journal ArticleDOI
TL;DR: A natural-language-based and goal-oriented template for requirements specification is proposed that includes all concepts of the decision-making process; the automatic requirements elicitation helps analysts to overcome their lack of domain knowledge, which avoids producing erroneous requirements.
Abstract: The design phase of a data warehousing project remains difficult for both decision makers and requirements analysts. In this paper, we tackle this difficulty through two contributions. First, we propose a natural-language-based and goal-oriented template for requirements specification that includes all concepts of the decision-making process. The use of familiar concepts and natural language makes our template more accessible and helps decision makers in validating the specified requirements, which avoids producing a data mart that does not meet their needs. Secondly, we propose a decision-making ontology that provides for a systematic decomposition of decision-making goals, which allows new requirements to emerge. This automatic requirements elicitation helps analysts to overcome their lack of domain knowledge, which avoids producing erroneous requirements.

Dissertation
01 Jan 2019
TL;DR: Kimball methodology is used on this project due to its design focused on small and low-cost business intelligence projects and is used to identify the procedures needed to achieve the final objective.
Abstract: This degree work aims to solve a problem at the ARCOTEL Company: the current processes in the homologation area are performed manually, which slows down the indicators. The Kimball methodology is used in this project because of its design focus on small, low-cost business intelligence projects. This methodology is oriented to the creation of a data warehouse and a data mart, and is used to identify the procedures needed to achieve the final objective. The Extraction, Transformation and Load (ETL) process must be analyzed for each table in order to avoid data-loading inconsistencies. The final tables must store only clean information for the indicator calculations. The ETL process is implemented with the Microsoft SQL Server Integration Services tool. Finally, a control dashboard is obtained that shows an estimate of each indicator. This dashboard is built on the desktop version of the Power BI tool.
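The per-table validation described above (only clean rows reach the indicator tables) can be sketched as follows; the case records and field names are invented, and this stdlib sketch stands in for, rather than reproduces, an SSIS data flow:

```python
# Hypothetical raw rows extracted from the operational system: some are
# incomplete or malformed and must not reach the indicator tables.
raw_rows = [
    {"case_id": "C1", "status": "approved", "days_open": "4"},
    {"case_id": "C2", "status": "",         "days_open": "7"},   # missing status
    {"case_id": "C3", "status": "rejected", "days_open": "n/a"}, # bad number
    {"case_id": "C4", "status": "approved", "days_open": "12"},
]

def transform(row):
    """Transform step: return a typed, validated row, or None to reject it."""
    if not row["case_id"] or not row["status"]:
        return None
    try:
        days = int(row["days_open"])
    except ValueError:
        return None
    return {"case_id": row["case_id"], "status": row["status"], "days_open": days}

# Load step: only clean rows enter the table used for indicator calculations.
clean = [t for t in map(transform, raw_rows) if t is not None]
rejected = len(raw_rows) - len(clean)

avg_days = sum(r["days_open"] for r in clean) / len(clean)  # the indicator
print(len(clean), rejected, avg_days)  # 2 2 8.0
```

Rejected rows would typically be routed to an error table for review rather than silently dropped; the essential point is that indicator math never sees an untyped or incomplete record.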

Patent
14 Jun 2019
TL;DR: In this paper, a student-tailored AI STEM education platform based on big data and machine learning is presented, which is capable of processing mass data mart, which processes correlated data with high speed using a knowledge system map based on evaluation result data of students, a student profile, and unit knowledge and question bank DB for evaluation.
Abstract: The present invention relates to a student-tailored AI STEM education platform based on big data and machine learning and, more specifically, to a platform capable of processing a massive data mart of correlated data at high speed using a knowledge system map based on students' evaluation result data, a student profile, and a unit-knowledge and question-bank DB for evaluation.

Proceedings ArticleDOI
25 Oct 2019
TL;DR: A method of metadata repository developing in terms of metadata responsible for describing business objects and the relationships between them is discussed, which allows organizing data storage within the data warehouse using a metadata repository based on the multidimensional organization principle.
Abstract: When organizing automated data collection in a data warehouse under conditions of increasing data volume and an increasingly complex enterprise business model, control of the information system's data model becomes one of the priority tasks. The article discusses a method of metadata repository development, in terms of the metadata responsible for describing business objects and the relationships between them. The choice of "Data Vault" determines the construction of a data warehouse within the framework of an information system based on the classical design approach with a three-level data presentation architecture, which includes a data preparation area (or online data warehouse), the data warehouse, and thematic data marts. The proposed approach allows data storage within the data warehouse to be organized using a metadata repository based on the multidimensional organization principle. The metadata repository is responsible for the data collection process, the data storage process, and the presentation of data for analysis. It is presented in the form of a metamodel that is semantically related to the system's domain, is easily reconstructed when the business model of the domain changes, and allows data marts to be created with the structure of a multidimensional data model based on the Star relational scheme. This makes it possible to organize human-computer interaction when describing a metamodel, using mainly knowledge about the structure of the subject area. When describing a metamodel, the first-order predicate calculus language is used, which makes it possible to control the metamodel using a declarative programming style, namely the "Prolog" language. The key point in the structure of the information system is the way of transitioning from the "Data Vault" model to a multidimensional data representation model based on associative rules of dependence between information objects.

Journal ArticleDOI
TL;DR: The purpose of this article is to provide an approach that assists non-expert users in the data warehouse design process and integrates their contextual data, and consists of a context model and a comprehensive Data Warehouse construction method.
Abstract: Data warehouses are now widely used for analysis and decision-support purposes. The availability of software solutions that are increasingly user-friendly and easy to manipulate has made it possible to extend their use to end users who are not specialists in the field of business intelligence. The purpose of this article is to provide an approach that assists non-expert users in the data warehouse design process while incorporating their contextual data. Our proposal consists of a context model and a comprehensive data warehouse construction method that attaches the context to data warehouses and uses it to produce customized data marts adapted to the decision maker's context.

Journal ArticleDOI
TL;DR: This work aimed to develop a prototype called Spatial On-Line Analytic Processing (SOLAP) to carry out multidimensional analysis and to anticipate the extension of the coverage area of radio antennas, giving a clear picture of the future working strategy with respect to urban planning and the digital terrain model.
Abstract: Mobile network carriers gather and accumulate in their database systems a considerable volume of data that carries geographic information crucial for the growth of the company. This work aimed to develop a prototype called Spatial On-Line Analytic Processing (SOLAP) to carry out multidimensional analysis and to anticipate the extension of the coverage area of radio antennas. To this end, the researcher started by creating a data warehouse that stores the Big Data received from the radio antennas, then performed OLAP (online analytical processing) for multidimensional analysis, used through GIS to represent the data at different scales on a satellite image as a topographic background. As a result, this prototype enables carriers to receive continuous reports at different scales (town, city, country) and to identify the BTS that work and perform well, the rate at which they work, and their pitfalls. In the end, it gives a clear picture of the future working strategy with respect to urban planning and the digital terrain model (DTM).

Proceedings ArticleDOI
17 Mar 2019
TL;DR: An analytical data mart is proposed to support the accreditation process to obtain suitable information for its fast understanding and management, and avoid the dispersion of the data required for the university accreditation.
Abstract: The CEAACES (Consejo de Evaluacion, Acreditacion y Aseguramiento de la Calidad de la Educacion Superior) develops evaluation processes to accredit the Higher Education Institutions of Ecuador. In order to support the accreditation process, we propose an analytical data mart to a) obtain suitable information for its fast understanding and management, and b) avoid the dispersion of the data required for university accreditation. The implementation was initially developed with the indicators of the students' criterion according to the Institutional Evaluation Model of Universities and Polytechnic Schools adopted by CEAACES; in the future, other indicators can be added for other accreditation criteria. The project was developed using the Kimball methodology and an open-source BI (Business Intelligence) tool, the Pentaho suite. The BI solution allows the efficient monitoring of indicators prior to university accreditation, reducing the response time and resources in the report generation process.

Patent
Kawasaki Shunsuke1, Sengoku Koji
29 Aug 2019
TL;DR: In this paper, the authors proposed a data registration system that can efficiently register data related to vehicles while improving confidentiality by using an integration processing server and an integrated database server, where the integration server creates a data mart associating weather data, vehicle state data and vehicle movement state data with each other and encrypts the vehicle ID and user ID in the data mart.
Abstract: To provide a data registration system that can efficiently register data related to vehicles while improving confidentiality. SOLUTION: A data registration system 1 includes an integration processing server 3 and an integrated database server 4. The integration processing server 3 creates a data mart associating weather data, vehicle state data, vehicle movement state data, fuel consumption data, navigation data, vehicle ID, and user ID with each other (STEP 22-27), and encrypts the vehicle ID and user ID in the data mart to create an encrypted data mart (STEP 50-61). The integrated database server 4 stores the encrypted data mart as registration data in a storage area (STEP 80-90). SELECTED DRAWING: Figure 3
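A minimal sketch of protecting the identifying fields of a data-mart record before storage. The patent encrypts the vehicle ID and user ID; here a keyed hash (HMAC) stands in as a deterministic pseudonymization rather than reversible encryption, and the record fields and key are invented:

```python
import hmac
import hashlib

SECRET_KEY = b"demo-key"  # in practice a managed secret, not a literal

def pseudonymize(value, key=SECRET_KEY):
    """Replace an identifier with a keyed hash: deterministic (the same ID
    always maps to the same token, so records still join on it) but not
    reversible without the key. A stand-in for the patent's encryption step."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:16]

record = {"vehicle_id": "V-1234", "user_id": "U-42",
          "weather": "rain", "fuel_l_per_100km": 6.4}

# Build the protected data-mart record: sensitive IDs are tokenized,
# measure and context fields are stored as-is for analysis.
protected = {k: (pseudonymize(v) if k in ("vehicle_id", "user_id") else v)
             for k, v in record.items()}

print(protected["weather"], protected["fuel_l_per_100km"])  # rain 6.4
```

Determinism is the key property here: the analytical joins between weather, vehicle-state, and fuel data still work on the tokenized IDs, while the stored mart never contains the raw identifiers.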

Journal ArticleDOI
19 Mar 2019
TL;DR: The results obtained from this study are data marts that simplify and accelerate the provision of data and information to support decision making, so as to provide a basis for developing DSS and EIS applications.
Abstract: The role of information technology in each company is very influential in providing complete and accurate information. With a data warehouse or data mart, a company can use its important data assets to present the information needed. Shipping firms nowadays operate in an extremely competitive and challenging environment. Vast volumes of data are generated by various operational systems and are used to resolve several business issues that need urgent handling. The results obtained from this study are data marts that simplify and accelerate the provision of data and information to support decision making, so as to provide a basis for developing DSS and EIS applications. The conclusion is that the data warehouse or data mart can provide complete, accurate and integrated information as a basis for consideration by executives in making decisions, so that the decisions taken are based on the real facts owned by the company.

Patent
01 Apr 2019
TL;DR: In this article, a method and system for the automatic generation of program code for an enterprise data warehouse is presented, in which metadata are obtained that describe the configuration of the data transformation mechanisms for loading data to the detailed data storage layer and calculating the data marts.
Abstract: FIELD: information technology. SUBSTANCE: the invention relates to a method and system for the automatic generation of program code for an enterprise data warehouse. In the method, metadata are obtained that describe the configuration of the data transformation mechanisms for loading data to the detailed data storage layer and calculating the data marts; data update templates for the detailed data storage layer and the data marts are obtained; program code for loading data into the detailed data storage layer and for calculating the data marts is generated on the basis of the received metadata and the generated data update template; the program code generated in the previous step is installed on the data storage medium to perform the loads; and the metadata are reused to update the detailed data storage layer and the data marts, either by making delta changes to the data update template when the problem statement changes, or by switching data update templates when the relational database type changes, so that the received data update template is used during the subsequent generation of program code. EFFECT: the technical result is to manage the updating of a data warehouse. 10 cl, 5 dwg, 1 tbl
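The metadata-plus-template generation scheme can be sketched as follows; the metadata fields, the template, and the table names are invented for illustration and are not taken from the patent:

```python
# Hypothetical metadata describing one data-mart load, and an update template
# with placeholders; the generator fills the template to produce the load SQL.
metadata = {
    "target": "mart_sales_daily",
    "source": "dds_sales",          # detailed data storage layer table
    "keys": ["sale_date", "region"],
    "measures": {"revenue": "SUM(amount)", "orders": "COUNT(*)"},
}

TEMPLATE = (
    "INSERT INTO {target} ({key_cols}, {measure_cols})\n"
    "SELECT {key_cols}, {measure_exprs}\n"
    "FROM {source}\nGROUP BY {key_cols};"
)

def generate_load_sql(meta, template=TEMPLATE):
    """Generate loader code from metadata: changing the metadata (a delta
    change) or swapping the template (a new database dialect) regenerates
    the code instead of requiring hand edits."""
    return template.format(
        target=meta["target"],
        source=meta["source"],
        key_cols=", ".join(meta["keys"]),
        measure_cols=", ".join(meta["measures"]),
        measure_exprs=", ".join(meta["measures"].values()),
    )

sql = generate_load_sql(metadata)
print(sql)
```

Keeping the dialect-specific text in the template and the business content in the metadata is what makes the two kinds of reuse in the abstract possible: delta changes touch only the metadata, database switches touch only the template.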

Patent
04 Jul 2019
TL;DR: In this article, the authors present a method for the automated generation of software code for a corporate data warehouse, which includes obtaining metadata describing a setting for data transformation mechanisms for loading data to a detailed layer level and calculating the data marts of a warehouse; generating at least one template for updating the data of the detailed layer and a data mart of the data warehouse.
Abstract: This technical solution relates generally to the field of computer technology, and more particularly to systems and methods for the automated generation of software code for a corporate data warehouse. A method for the automated generation of software code for a corporate data warehouse includes: obtaining metadata describing a setting for data transformation mechanisms for loading data to a detailed layer level and calculating the data marts of a warehouse; generating at least one template for updating the data of the detailed layer and a data mart of the data warehouse; generating software code for loading data to the detailed layer of the data warehouse and calculating data marts on the basis of the metadata obtained and the data update template generated; installing the software code generated in the previous step in the data warehouse environment for performing loading; reusing the detailed layer and data mart update metadata. The technical result is an increase in the stability of detailed layer and data mart algorithms and a decrease in the number of incidents in the data warehouse.

Proceedings ArticleDOI
01 Dec 2019
TL;DR: The main emphasis is on the data mining approach, including one of the tools used by SB, which employs Decision Trees and Artificial Intelligence techniques along with other techniques.
Abstract: This paper presents the main concepts of smart business (SB) and the application of smart business to provide solutions to the problems faced by a corporation. In this context, the main emphasis is on the data mining approach, including one of the tools used by SB, which employs Decision Trees and Artificial Intelligence techniques along with other techniques. Moreover, algorithms such as Genetic Algorithms and Neural Networks are considered in its implementation. The concept of Smart Business (SB) clarifies these problems and comprises a broad category of technologies and application programs used to extract, store, analyze and transform high-dimensional data. The Smart Business approach is presented according to the concepts of the Data Warehouse, the Data Mart and Data Mining.

23 Oct 2019
TL;DR: A detailed description is proposed of the creation of a data mart dedicated to the sales of a fashion company, through an optimal solution following best practices of an ETL process resulting in the Snowflake schema and the Star schema, well suited to data visualization.
Abstract: The evolution of business intelligence began decades ago with the first mainframe reports, called system output. They were mainly printed on paper and then distributed periodically to managers. The first query tools sped up the process and made it possible for managers with technical expertise to create customized ad hoc reports, but few managers had the time and the skills to do so. The emergence of the data warehouse gave a great impetus to BI by aggregating all the data in one place, where it could be queried interactively, without online queries and reports impacting applications, through increasingly easy-to-use graphical interfaces. The advent of the data warehouse, data marts and analytic tools made BI accessible to more operators and allowed managers to obtain information and answers to critical questions efficiently and quickly. The proposed project is dedicated to a detailed description of the creation of a data mart for the sales of a fashion company, through an optimal solution following best practices of an ETL process resulting in the Snowflake schema and the Star schema, well suited to data visualization. In addition, using a classification process that includes corporate open data, it was possible to locate the most effective area in which to open a new store and to offer an explanation of why some shops were closed in the recent past. The conclusions are displayed in Power BI, Microsoft's software for data visualization.

25 Sep 2019
TL;DR: In this article, the authors describe the process of developing a business intelligence model using Pentaho, to help improve the analysis of information in the Academic Management of the National University Micaela Bastidas de Apurimac.
Abstract: The article describes the process of developing a Business Intelligence model using Pentaho to help improve the analysis of information in the Academic Management of the National University Micaela Bastidas de Apurimac. To this end, a Data Mart was implemented to obtain a series of academic indicators and display the information in analytical reports and statistical graphs. The guidelines of the Hephaestus methodology were followed to design the multidimensional database; as the analytical solution matching the information needs, OLAP (online analytical processing) tools were used. Pentaho Community tools such as Pentaho Data Integration, Mondrian OLAP Server, Schema Workbench and Pentaho Business Analytics 7.0 were used to achieve the implementation. The implementation of the Business Intelligence model using Pentaho allowed users immersed in academic management to visualize and analyze the behavior of academic data, so that they can make informed decisions based on the knowledge extracted from the historical information stored by the institution.