Showing papers on "Apriori algorithm" published in 2022


Journal ArticleDOI
TL;DR: In this article, the Apriori algorithm was used in conjunction with other machine learning tools to cluster the potential backers and provide more accurate recommendations for crowdfunding projects. The proposed solution outperforms the other five benchmark methods and offers improved matchmaking by connecting the listed crowdfunding projects to the right backers.
Abstract: Traditional clustering methods fail to accurately cluster the feature vectors of backers and match the potential backers to compatible crowdfunding projects, mainly due to their sensitivity to the setting of the initial value. In this paper, we use the Apriori algorithm in conjunction with other machine learning tools to cluster the potential backers and provide more accurate recommendations for crowdfunding projects. Focusing on potential projects listed in a major reward-based crowdfunding platform, we first train the model on the data obtained from the available list of backers. Using the Apriori algorithm, the degree of association between different project backers is then obtained, and weight calculation of the backers is carried out according to the association degree of the backers. The degree of association is used as a key index to cluster similar backers. Finally, we test the model and determine whether clustering can correctly classify the data in the test set based on the Apriori algorithm. Our experimental results show that the model achieves 90% accuracy, precision, and recall. The proposed solution outperforms the other five benchmark methods and offers improved matchmaking by connecting the listed crowdfunding projects to the right backers.

26 citations


Journal ArticleDOI
TL;DR: Wang et al. developed a general modeling and analysis procedure for risk interactions based on association rule mining and weighted network theory, and then took China as an example to investigate the interactions among DFPP safety risks.

20 citations


Journal ArticleDOI
TL;DR: A novel data science life-cycle and process model that combines the Recency, Frequency, and Monetary (RFM) analysis method with various analytics algorithms is used in this study for sales prediction and product recommendation through user behavior analytics.
Abstract: COVID-19 has brought unprecedented difficulties, and thousands of companies have closed down. The general public has responded to the government's call to stay at home, and offline retail stores have been severely affected. Therefore, in order to transform a traditional offline sales model into the B2C model and to improve the shopping experience, this study aims to use historical sales data to explore and build sales prediction and recommendation models. A novel data science life-cycle and process model that combines the Recency, Frequency, and Monetary (RFM) analysis method with various analytics algorithms is used for sales prediction and product recommendation through user behavior analytics. The RFM analysis method is used to segment customer levels in the company and to identify the importance of each level. For the purchase prediction model, the XGBoost and Random Forest machine learning algorithms are used to build prediction models, and 5-fold cross-validation is used to evaluate them. For the product recommendation model, association rule theory and the Apriori algorithm are used to complete basket analysis and recommend products according to the outcomes. Moreover, some suggestions are proposed for the marketing department according to the outcomes. Overall, the XGBoost model achieved better performance and accuracy, with an F1-score of around 0.789. The proposed recommendation model provides good recommendation results and sales combinations for improving sales and market responsiveness. Furthermore, it recommends specific products to new customers. This study offers a practical and useful business transformation case that can assist companies in similar situations in transforming their business models.

11 citations
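
The RFM segmentation step mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the function name `rfm_table`, the tuple layout of `purchases`, and the sample data are all hypothetical, chosen only to show how recency, frequency, and monetary values might be derived from a sales log.

```python
from datetime import date


def rfm_table(purchases, today):
    """Summarise a sales log per customer (hypothetical helper).

    purchases: iterable of (customer_id, purchase_date, amount).
    Returns {customer_id: (recency_days, frequency, monetary)}.
    """
    table = {}
    for customer, day, amount in purchases:
        last, count, total = table.get(customer, (None, 0, 0.0))
        # Keep the most recent purchase date seen so far.
        if last is None or day > last:
            last = day
        table[customer] = (last, count + 1, total + amount)
    # Recency is measured in days between the last purchase and `today`.
    return {
        customer: ((today - last).days, count, total)
        for customer, (last, count, total) in table.items()
    }


# Made-up sales log for illustration only.
purchases = [
    ("c1", date(2021, 1, 5), 20.0),
    ("c1", date(2021, 3, 1), 35.0),
    ("c2", date(2021, 2, 10), 15.0),
]
summary = rfm_table(purchases, today=date(2021, 3, 31))
```

The three values per customer could then be binned into scores or passed to a clustering step; how the study itself segments customers is not specified here.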


Journal ArticleDOI
TL;DR: In this article, a new IoT-based smart product-recommender system based on an apriori algorithm and fuzzy logic is presented, which employs association rules to display the interdependencies and linkages among many data objects.
Abstract: The Internet of Things (IoT) has recently become important in accelerating various functions, from manufacturing and business to healthcare and retail. A recommender system can handle the problem of information and data buildup in IoT-based smart commerce systems. These technologies are designed to determine users' preferences and filter out irrelevant information. Identifying items and services that customers might be interested in and then convincing them to buy is one of the essential parts of effective IoT-based smart shopping systems. Due to the relevance of product-recommender systems from both the consumer and shop perspectives, this article presents a new IoT-based smart product-recommender system based on an apriori algorithm and fuzzy logic. The suggested technique employs association rules to display the interdependencies and linkages among many data objects. The most common use of association rule discovery is “shopping cart analysis.” Customers' buying habits and behavior are studied based on the numerous goods they place in their shopping carts. As a result, the association rules are generated using a fuzzy system. The apriori algorithm then selects the product based on the provided fuzzy association rules. The results revealed that the suggested technique had achieved acceptable results in terms of mean absolute error, root-mean-square error, precision, recall, diversity, novelty, and catalog coverage when compared to cutting-edge methods. Finally, the method helps increase recommender systems' diversity in IoT-based smart shopping.

11 citations


Journal ArticleDOI
TL;DR: Wang et al. improved the traditional association rule mining (ARM) method by adding fuzzy set theory, and they extended ARM by considering not only items sold but also sales amounts.
Abstract: Online stores assist customers in buying the desired products online. Great competition in the e-commerce sector necessitates technology development. Many e-commerce systems not only present products but also offer similar products to increase online customer interest. Due to high product variety, analyzing products sold together similar to a recommendation system is a must. This study methodologically improves the traditional association rule mining (ARM) method by adding fuzzy set theory. Besides, it extends the ARM by considering not only items sold but also sales amounts. Fuzzy association rule mining (FARM) with the Apriori algorithm can catch the customers’ choice from historical transaction data. It discovers fuzzy association rules from an e-commerce company to display similar products to customers according to their needs in amount. The experimental result shows that the proposed FARM approach produces much information about e-commerce sales for decision-makers. Furthermore, the FARM method eliminates some traditional rules considering their sales amount and can produce some rules different from ARM.

10 citations
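
To make the idea of mining on sales amounts concrete, the following is a minimal sketch of a fuzzy support computation. It rests on several assumptions that do not come from the paper: the linear membership functions `high` and `low`, their breakpoints, the use of the minimum as the combination operator, and the sample orders are all illustrative.

```python
def high(quantity, lo=1.0, hi=5.0):
    """Membership in the fuzzy set 'high amount': 0 at lo or below, 1 at hi or above."""
    return min(1.0, max(0.0, (quantity - lo) / (hi - lo)))


def low(quantity, lo=1.0, hi=5.0):
    """Membership in the fuzzy set 'low amount' (complement of 'high')."""
    return 1.0 - high(quantity, lo, hi)


def fuzzy_support(orders, fuzzy_items):
    """Average, over all orders, of the weakest membership among the fuzzy items.

    orders: list of {product: quantity} dicts.
    fuzzy_items: list of (product, membership_function) pairs.
    A product missing from an order contributes a membership of 0.
    """
    total = 0.0
    for order in orders:
        degrees = [mu(order[p]) if p in order else 0.0 for p, mu in fuzzy_items]
        total += min(degrees)
    return total / len(orders)


# Made-up orders for illustration only.
orders = [
    {"pen": 5, "notebook": 5},
    {"pen": 3, "notebook": 1},
    {"pen": 1},
    {"notebook": 5},
]
both_high = fuzzy_support(orders, [("pen", high), ("notebook", high)])
pen_high_notebook_low = fuzzy_support(orders, [("pen", high), ("notebook", low)])
```

In a crisp setting, pen and notebook simply co-occur in two of the four orders; the fuzzy version separates "many pens with many notebooks" from "many pens with few notebooks", which is the kind of amount-sensitive distinction the abstract describes.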


Journal ArticleDOI
TL;DR: Wang et al. improved the traditional association rule mining (ARM) method by adding fuzzy set theory, and they extended ARM by considering not only items sold but also sales amounts.
Abstract: Online stores assist customers in buying the desired products online. Great competition in the e-commerce sector necessitates technology development. Many e-commerce systems not only present products but also offer similar products to increase online customer interest. Due to high product variety, analyzing products sold together similar to a recommendation system is a must. This study methodologically improves the traditional association rule mining (ARM) method by adding fuzzy set theory. Besides, it extends the ARM by considering not only items sold but also sales amounts. Fuzzy association rule mining (FARM) with the Apriori algorithm can catch the customers’ choice from historical transaction data. It discovers fuzzy association rules from an e-commerce company to display similar products to customers according to their needs in amount. The experimental result shows that the proposed FARM approach produces much information about e-commerce sales for decision-makers. Furthermore, the FARM method eliminates some traditional rules considering their sales amount and can produce some rules different from ARM.

9 citations


Journal ArticleDOI
TL;DR: In this paper, a process model is proposed that combines the recency, frequency, and monetary (RFM) analysis method with the k-means clustering algorithm.
Abstract: The COVID-19 pandemic instigated thousands of companies' closures and affected offline retail shops. Thus, online B2C business models enable traditional offline stores to boost their sales. This study aims to explore the use of historical sales and behavioral data analytics to construct a recommendation model. A process model is proposed, which is the combination of recency, frequency, and monetary (RFM) analysis method and the k-means clustering algorithm. RFM analysis is used to segment customer levels in the company while the association rule theory and the apriori algorithm are utilized for completing the shopping basket analysis and recommending products based on the results. The proposed recommendation model provides a good marketing mix to improve sales and market responsiveness. In addition, it recommends specific products to new customers as well as specific groups of target customers. This study offered a practical business transformation case that can assist companies in a similar situation to transform their business model and improve their profits.

7 citations


Journal ArticleDOI
TL;DR: Wang et al. applied Hadoop-based data mining concepts and technology to the construction of a data resource management platform for biomass energy engineering, which solved the difficult problems of collecting, storing, processing, and analyzing massive biomass data.

7 citations


Journal ArticleDOI
TL;DR: From the results of the trials in this study, it was found that the greater the minimum support and minimum confidence, the less time it takes to produce recommendations and the fewer recommendations are given, but the recommendations given come from transactions that often appear.
Abstract: Food is the ingredient that enables people to grow, develop, and achieve. For this reason, food quality and types of food must be considered so that they are safe for consumption and well managed. Some plant-based foodstuffs are often processed and consumed by the community and are among the most needed in food processing. In this case, the research was carried out using data mining with market basket analysis algorithms to obtain valuable information for deciding the inventory of the types of material needed. The Market Basket Analysis method is used to analyze all data and create patterns for each data. The Market Basket Analysis method in question is association rule mining with the Apriori algorithm. This algorithm produces sales transactions with strong associations between items in the transaction, which are used as sales recommendations that help users (owners) get recommendations when users see details of the itemset purchased. From the results of the trials in this study, it was found that the greater the minimum support (minsup) and minimum confidence (minconf), the less time it takes to produce recommendations and the fewer recommendations are given, but the recommendations given come from transactions that often appear.

7 citations
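
The effect of the minsup and minconf thresholds reported in the abstract above can be tried out with a compact, textbook-style Apriori sketch. It is not the study's code: the function names `apriori` and `rules` and the sample `baskets` are hypothetical, and no attempt is made at efficiency.

```python
from itertools import combinations


def apriori(transactions, minsup):
    """Return {frozenset(itemset): support} for itemsets with support >= minsup."""
    n = len(transactions)
    sets = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item that occurs in the data.
    candidates = {frozenset([item]) for t in sets for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count each candidate with one pass over the transactions.
        level = {}
        for c in candidates:
            support = sum(1 for t in sets if c <= t) / n
            if support >= minsup:
                level[c] = support
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def rules(frequent, minconf):
    """Return (antecedent, consequent, confidence) with confidence >= minconf."""
    found = []
    for itemset, support in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, size)):
                confidence = support / frequent[antecedent]
                if confidence >= minconf:
                    found.append((antecedent, itemset - antecedent, confidence))
    return found


# Made-up baskets for illustration only.
baskets = [
    {"rice", "tofu", "chili"},
    {"rice", "tofu"},
    {"rice", "chili"},
    {"tofu", "chili"},
    {"rice", "tofu", "chili"},
]
loose = apriori(baskets, minsup=0.4)
strict = apriori(baskets, minsup=0.8)
strong_rules = rules(loose, minconf=0.7)
```

On this toy data, raising minsup leaves fewer frequent itemsets and therefore fewer candidate levels to scan, and raising minconf discards the weaker rules, which matches the qualitative trend the abstract reports.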


Journal ArticleDOI
TL;DR: The proposed method combines text classification, association rules, and the Sankey diagrams and provides a novel approach for mining semi-structured text and is useful and efficient for exploring near-miss distribution laws in hydropower engineering construction.
Abstract: Accidents of various types in the construction of hydropower engineering projects occur frequently, which leads to significant numbers of casualties and economic losses. Identifying and eliminating near misses are a significant means of preventing accidents. Mining near-miss data can provide valuable information on how to mitigate and control hazards. However, most of the data generated in the construction of hydropower engineering projects are semi-structured text data without unified standard expression, so data association analysis is time-consuming and labor-intensive. Thus, an artificial intelligence (AI) automatic classification method based on a convolutional neural network (CNN) is adopted to obtain structured data on near-miss locations and near-miss types from safety records. The apriori algorithm is used to further mine the associations between “locations” and “types” by scanning structured data. The association results are visualized using a network diagram. A Sankey diagram is used to reveal the information flow of near-miss specific objects using the “location ⟶ type” strong association rule. The proposed method combines text classification, association rules, and the Sankey diagrams and provides a novel approach for mining semi-structured text. Moreover, the method is proven to be useful and efficient for exploring near-miss distribution laws in hydropower engineering construction to reduce the possibility of accidents and efficiently improve the safety level of hydropower engineering construction sites.

6 citations


Journal ArticleDOI
TL;DR: The patterns of associations between texts in problem-solving records are extracted to generate appropriate solutions automatically and the Apriori algorithm is used to identify pattern associations among document clusters that represent the problems, causes, and solutions.
Abstract: Prompt responses to problems/faults arising in an assembly workshop are crucial in terms of production reliability and efficiency. However, human-dependent tasks are time-consuming and prone to error. In this paper, we propose a knowledge discovery approach. We extract the patterns of associations between texts in problem-solving records to generate appropriate solutions automatically. First, we use an enhanced latent Dirichlet allocation (EnLDA) technique to explore the document-topic and topic-word distributions of a text corpus recording assembly problems, causes, and solutions. To increase accuracy, we adjust the elements of the document-term matrix, and we assign term frequency-inverse document frequencies. Second, we use the Refining Density-based Spatial Clustering of Application with Noise (Rf-DBSCAN) algorithm for text clustering. This refines the distances among topic distribution vectors and incorporates noise objects into clustering. This clusters textual documents with similar semantic information, maximizing information retention. Third, we use the Apriori algorithm to identify pattern associations among document clusters that represent the problems, causes, and solutions. We perform a case study using field data from an automobile assembly workshop. The results show that the method retrieves hidden but valuable information from textual records. The decision support knowledge facilitates assembly problem-solving.

Journal ArticleDOI
TL;DR: Association rule mining was used to explore the network association pattern between disorders that occurred in the same individual; the mined combinations will be helpful in improving prevention strategies, early identification of high-risk populations, and reducing mortality.
Abstract: Objective: Short-term or long-term connections between different diseases have not been fully acknowledged. This study was aimed at exploring the network association pattern between disorders that occurred in the same individual by using the association rule mining technique. Methods: Raw data were extracted from the large-scale electronic medical record database of the affiliated hospital of Xuzhou Medical University. 1551732 pieces of diagnosis information from 144207 patients were collected from 2015 to 2020. Clinic diagnoses were categorized according to “International Classification of Diseases, 10th revision”. The Apriori algorithm was used to explore the association patterns among those diagnoses. Results: 12889 rules were initially generated by running the algorithm. After threshold filtering and manual examination, 110 disease combinations (support ≥ 0.001, confidence ≥ 60%, lift > 1) with strong association strength were eventually obtained. Association rules about the circulatory system and metabolic diseases accounted for a significant part of the results. Conclusion: This research elucidated the network associations between disorders from different body systems in the same individual and demonstrated the usefulness of the Apriori algorithm in comorbidity or multimorbidity studies. The mined combinations will be helpful in improving prevention strategies, early identification of high-risk populations, and reducing mortality.
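
The three thresholds quoted in the abstract above (support, confidence, lift) can be computed for a single rule as in the sketch below. The helper names `rule_metrics` and `is_strong` and the six toy patient records are hypothetical; only the threshold values are taken from the abstract, and the study's manual examination step is not modelled.

```python
def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence, lift) of the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n = len(records)
    n_a = sum(1 for r in records if a <= r)          # records with the antecedent
    n_c = sum(1 for r in records if c <= r)          # records with the consequent
    n_ac = sum(1 for r in records if (a | c) <= r)   # records with both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


def is_strong(metrics, min_support=0.001, min_confidence=0.6):
    """Apply the thresholds quoted in the abstract; lift must exceed 1."""
    support, confidence, lift = metrics
    return support >= min_support and confidence >= min_confidence and lift > 1


# Made-up diagnosis sets per patient, using ICD-10 categories as labels.
patients = [
    {"I10", "E11"},
    {"I10", "E11"},
    {"I10"},
    {"E11", "I10"},
    {"J45"},
    {"J45", "I10"},
]
diabetes_to_hypertension = rule_metrics(patients, {"E11"}, {"I10"})
asthma_to_hypertension = rule_metrics(patients, {"J45"}, {"I10"})
```

A lift above 1 means the consequent is more common among records containing the antecedent than in the data overall, which is why it is used alongside confidence to screen out rules that only reflect a very frequent diagnosis.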

Journal ArticleDOI
TL;DR: The Apriori algorithm is optimized by Amazon Web Services (AWS) and graphics processing unit (GPU) to improve its data mining speed, and a cloud follow-up platform-based intelligent medical communication system is used to analyze patients’ compliance, quality of life before and after nursing, function limitation of affected limb, and nursing satisfaction under different nursing methods.
Abstract: This study aimed to explore the application value of the intelligent medical communication system based on the Apriori algorithm and cloud follow-up platform in out-of-hospital continuous nursing of breast cancer patients. In this study, the Apriori algorithm is optimized by Amazon Web Services (AWS) and graphics processing unit (GPU) to improve its data mining speed. At the same time, a cloud follow-up platform-based intelligent mobile medical communication system is established, which includes the log-in, my workstation, patient records, follow-up center, satisfaction management, propaganda and education center, SMS platform, and appointment management module. The subjects are divided into the control group (routine telephone follow-up, 163) and the intervention group (continuous nursing intervention, 216) according to different nursing methods. The cloud follow-up platform-based intelligent medical communication system is used to analyze patients’ compliance, quality of life before and after nursing, function limitation of affected limb, and nursing satisfaction under different nursing methods. The running time of the Apriori algorithm is proportional to the data amount and inversely proportional to the number of nodes in the cluster. Compared with the control group, there are statistical differences in the proportion of complete compliance data, the proportion of poor compliance data, and the proportion of total compliance in the intervention group (P < 0.05). After the intervention, the scores of the quality of life in the two groups are statistically different from those before treatment (P < 0.05), and the scores of the quality of life in the intervention group were higher than those in the control group (P < 0.05). The proportion of patients with limited and severely limited functional activity of the affected limb in the intervention group is significantly lower than that in the control group (P < 0.05). The satisfaction rate of postoperative nursing in the intervention group is significantly higher than that in the control group (P < 0.001), and the proportion of basically satisfied and dissatisfied patients in the control group was higher than that in the intervention group (P < 0.05).

Journal ArticleDOI
TL;DR: Compared with the conditions before treatment, the sign scores of children with allergic rhinitis were remarkably decreased after treatment with traditional Chinese medicine compounds and the mining performance of the Apriori algorithm was improved by introducing an interest-based model.
Abstract: The data mining analysis of the medication rule and the curative effect of traditional Chinese medicine in treating allergic rhinitis in children was performed by using the association rule Apriori algorithm. The model of interest degree was introduced to improve the Apriori algorithm, and the performance difference of the algorithm before and after improvement was analyzed. Traditional Chinese medicine prescriptions for the treatment of allergic rhinitis in children were selected from the dictionary of Chinese medicine formulations. The frequency, frequent itemsets, and the improved Apriori algorithm of each prescription were analyzed comprehensively. The results showed that both the execution time of the improved Apriori algorithm and the number of mined association rules were significantly lower. 102 Chinese herbal compounds were selected, in which the occurrence frequency of Flos magnoliae was the highest (67 times, 5.33%). The occurrence frequency of diaphoretic drugs was the highest (412 times, 32.78%) in drug types. The occurrence frequency of Yu Ping Feng powder was the highest (21 times, 20.59%) in the Chinese herbal compound. After the association rule analysis of the improved Apriori algorithm, Perilla frutescens, Saposhnikovia divaricata, ginseng, Notopterygium root, and Astragalus propinquus Schischkin were often mixed with liquorice, and Flos magnoliae was usually mixed with Fructus xanthii and black plum. Compared with the conditions before treatment, the sign scores of children with allergic rhinitis were remarkably decreased after treatment with traditional Chinese medicine compounds (P < 0.05). The mining performance of the Apriori algorithm was improved by introducing an interest-based model. The treatment of allergic rhinitis in children with traditional Chinese medicine took account of children's physiological and pathological characteristics and used mild medicines.

Journal ArticleDOI
14 Jan 2022-Data
TL;DR: SHFIM (spark-based hybrid frequent itemset mining) is a three-phase algorithm that utilizes both horizontal and vertical layouts and uses diffset instead of tidset to keep track of the differences between transaction ids rather than the intersections.
Abstract: Frequent itemset mining (FIM) is a common approach for discovering hidden frequent patterns from transactional databases used in prediction, association rules, classification, etc. Apriori is an elementary FIM algorithm with an iterative nature used to find the frequent itemsets. Apriori scans the dataset multiple times to generate big frequent itemsets with different cardinalities. Apriori performance degrades when data gets bigger due to the multiple dataset scans needed to extract the frequent itemsets. Eclat is a scalable version of the Apriori algorithm that utilizes a vertical layout. The vertical layout has many advantages; it helps to solve the problem of multiple dataset scans and has information that helps to find each itemset support. In a vertical layout, itemset support can be obtained by intersecting transaction ids (tidset/tids) and pruning irrelevant itemsets. However, when tids become too big for memory, the algorithm's efficiency suffers. In this paper, we introduce SHFIM (spark-based hybrid frequent itemset mining), which is a three-phase algorithm that utilizes both horizontal and vertical layouts and uses diffset instead of tidset to keep track of the differences between transaction ids rather than the intersections. Moreover, some improvements are developed to decrease the number of candidate itemsets. SHFIM is implemented and tested over the Spark framework, which utilizes the RDD (resilient distributed datasets) concept and in-memory processing that tackles the MapReduce framework problem. We compared the SHFIM performance with Spark-based Eclat and dEclat algorithms for the four benchmark datasets. Experimental results proved that SHFIM outperforms Eclat and dEclat Spark-based algorithms in both dense and sparse datasets in terms of execution time.
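
The difference between tidsets and diffsets described in the abstract above can be seen in a few lines. The sketch below is a single-machine illustration of the general Eclat and dEclat bookkeeping, not of SHFIM itself; the function names and the five sample transactions are hypothetical.

```python
def vertical_layout(transactions):
    """Map each item to its tidset: the ids of the transactions containing it."""
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def pair_support_by_tidset(tidsets, x, y):
    """Eclat style: support of {x, y} is the size of the tidset intersection."""
    return len(tidsets[x] & tidsets[y])


def pair_support_by_diffset(tidsets, x, y):
    """dEclat style: store only the tids that contain x but not y."""
    diffset = tidsets[x] - tidsets[y]
    return len(tidsets[x]) - len(diffset)


def triple_support_by_diffset(tidsets, x, y, z):
    """Extend the prefix x: the diffset of {x, y, z} is derived from two pair diffsets."""
    d_xy = tidsets[x] - tidsets[y]
    d_xz = tidsets[x] - tidsets[z]
    d_xyz = d_xz - d_xy  # tids that contain x and y but not z
    return len(tidsets[x]) - len(d_xy) - len(d_xyz)


# Made-up transactions for illustration only.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
tidsets = vertical_layout(transactions)
```

Both routes give the same support counts; the diffset route only has to keep the (often much smaller) sets of missing transaction ids in memory, which is the motivation the abstract gives for preferring it on large data.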

Proceedings ArticleDOI
19 Oct 2022
TL;DR: The system adopts the Apriori algorithm of the Weka framework as the model and Java as the development language, and generates the association rules related to the students' academic performance, which provides auxiliary decision support for the improvement of school teaching.
Abstract: With the improvement of education informatization, a large amount of data related to education and teaching has been accumulated, and colleges and universities have begun to use data mining to analyze and study student achievement. In order to excavate the internal relationship between different courses and find out the relationship between students' achievement and their own attributes and external factors, this paper designs and implements a student's achievement analysis system with Apriori as the core algorithm. The system adopts the Apriori algorithm of the Weka framework as the model and Java as the development language, and finally generates the association rules related to the students' academic performance, which provides auxiliary decision support for the improvement of school teaching.

Journal ArticleDOI
TL;DR: In this paper, the problem of dental image identification was investigated by developing a novel dental identification scheme (DIS) utilizing a fractional wavelet feature extraction technique and rule mining with an Apriori procedure.
Abstract: Several identification approaches have recently been employed in human identification systems for forensic purposes to decrease human efforts and to boost the accuracy of identification. Dental identification systems provide automated matching by searching photographic dental features to retrieve similar models. In this study, the problem of dental image identification was investigated by developing a novel dental identification scheme (DIS) utilizing a fractional wavelet feature extraction technique and rule mining with an Apriori procedure. The proposed approach extracts the most discriminating image features during the mining process to obtain strong association rules (ARs). The proposed approach is divided into two steps. The first stage is feature extraction using a wavelet transform based on a k-symbol fractional Haar filter (k-symbol FHF), while the second stage is the Apriori algorithm of AR mining, which is applied to find the frequent patterns in dental images. Each dental image’s created ARs are saved alongside the image in the rules database for use in the dental identification system’s recognition. The DIS method suggested in this study primarily enhances the Apriori-based dental identification system, which aims to address the drawbacks of dental rule mining.

Journal ArticleDOI
TL;DR: In this article, a study was conducted to reprocess sales transaction data for 2018-2019 using data mining techniques with association methods and the Apriori algorithm, and the results showed that the goods in the Coat (imported), Pants, and Skirt categories are often bought together.
Abstract: The company does not yet know the pattern of consumer purchases because, so far, the sales transaction data has not been used correctly and the company does not have a dedicated method to determine consumer buying patterns. To overcome the company's problems, this research reprocessed sales transaction data for 2018-2019 using data mining techniques with association methods and the Apriori algorithm. RapidMiner is a supporting application used to find association rules derived from transaction data. Transaction data were processed using the Knowledge Discovery in Database (KDD) approach. Thus, the company can determine consumer habits in buying goods derived from sales transaction data for 2018-2019. The results of this study are that in 2018, nine association rules were obtained, of which the best were CT G-246 ⇒ CT G-250 and CT G-250 ⇒ CT G-246. In 2019, nineteen association rules were obtained, of which the best were PN 0441, SK 0175 ⇒ SK 0530 and SK 0175, SK 0283 ⇒ SK 0530. From the best association rules, the goods in the Coat (imported), Pants, and Skirt categories are categories that are often bought together.

Journal ArticleDOI
TL;DR: An Apriori algorithm is proposed to enhance the privacy of encrypted data; unlike data privacy association rule mining, it does not require fake transactions, and it shows a 3% to 5% improvement in performance when compared to other existing algorithms.
Abstract: Cloud computing provides advantages such as flexibility of space, security, cost optimization, and accessibility from any remote location. Because of these factors, cloud computing is emerging as the primary data storage for individuals as well as organisations. At the same time, privacy preservation is also a significant aspect of cloud computing. With regard to privacy preservation, association rule mining was proposed by previous researchers to protect the privacy of users. However, that algorithm involves the creation of fake transactions and also fails to maintain the privacy of data frequency. In this research, an Apriori algorithm is proposed to enhance the privacy of encrypted data. The proposed algorithm is integrated with ElGamal cryptography and does not require fake transactions. In this way, the proposed algorithm improves data protection as well as query privacy, and it hides data frequency. Result analysis shows that the proposed algorithm improves privacy compared to the previously proposed association rule mining, and the algorithm also shows a 3% to 5% improvement in performance when compared to other existing algorithms. This performance analysis with varying numbers of data and fake transactions shows that the proposed algorithm doesn’t require fake transactions, unlike data privacy association rule mining.

Journal ArticleDOI
TL;DR: In this article, a study of linked road crash data aimed to identify co-occurring injuries in multiple injured road users by using a novel application of a data mining technique commonly used in Market Basket Analysis.

Journal ArticleDOI
TL;DR: In this paper, a study of linked road crash data aimed to identify co-occurring injuries in multiple injured road users by using a novel application of a data mining technique commonly used in Market Basket Analysis.

Journal ArticleDOI
TL;DR: This research proposes a parallelization-based approach to improve the performance of the Apriori algorithm in repetitive mining patterns on network topologies and concludes that this approach provides acceptable efficiency in terms of evaluation criteria such as energy consumption, network lifetime, and runtime compared to other methods.
Abstract: Recently, the discovery of association rules and the consequent mining of frequent patterns have attracted the attention of many researchers seeking to discover unknown relationships in big data, especially in networking and distributed environments. In this research, a parallelization-based approach is proposed to improve the performance of the Apriori algorithm in mining repetitive patterns on network topologies. The proposed approach includes two main features: (1) combining centrality criteria of the node and the Apriori algorithm to identify repetitive patterns and (2) using the mapping/reduction method to create parallel processing and achieve optimal values in the shortest time. This approach also pursues three main objectives: reducing the temporal and spatial complexity of the Apriori algorithm, improving the association rules mining process and identifying repetitive patterns, and comparing the proposed approach’s performance on different network topologies to determine the advantages and disadvantages of each topology. Comparing our proposed method with the basic Apriori algorithm, we conclude that our approach provides acceptable efficiency in terms of evaluation criteria such as energy consumption, network lifetime, and runtime compared to other methods. Experimental results also show that when using our proposed method compared to the basic Apriori algorithm, network life is increased by 7.1%, the runtime is reduced by 43.2%, and the energy consumption is saved by about 41.2%.
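
The mapping/reduction idea mentioned in the abstract above can be sketched as a support-counting pass split over data partitions. The sketch runs sequentially and leaves out the node-centrality component entirely; `map_count`, `reduce_counts`, `frequent_k_itemsets`, and the sample partitions are hypothetical names and data, not taken from the paper.

```python
from collections import Counter
from functools import reduce
from itertools import combinations


def map_count(partition, k):
    """Map step: count every k-itemset occurring in one partition."""
    counts = Counter()
    for transaction in partition:
        for itemset in combinations(sorted(transaction), k):
            counts[itemset] += 1
    return counts


def reduce_counts(partials):
    """Reduce step: sum the partial counts from all partitions."""
    return reduce(lambda left, right: left + right, partials, Counter())


def frequent_k_itemsets(partitions, k, min_count):
    """Keep the k-itemsets whose global count reaches min_count."""
    totals = reduce_counts(map_count(p, k) for p in partitions)
    return {itemset: n for itemset, n in totals.items() if n >= min_count}


# Two made-up partitions standing in for data held on two nodes.
partitions = [
    [{"a", "b"}, {"a", "c"}],
    [{"a", "b", "c"}, {"a", "b"}],
]
frequent_pairs = frequent_k_itemsets(partitions, k=2, min_count=2)
```

Because each `map_count` call touches only its own partition, the calls could in principle be dispatched to separate workers, with only the small count tables travelling to the reducer.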

Journal ArticleDOI
TL;DR: In this article, the authors identify the processes and actors which compose the regulation level of energy systems, establish the basic relationships between these actors and processes, outlining the guidelines for the establishment and/or modification of policies, laws, and regulations related to the transition of energy management systems to the EC.

Journal ArticleDOI
TL;DR: The quality of student performance is strongly related to the quality of daily homework, and if it is related to the teacher's gender, professional title, etc., it is recommended that schools should pay more attention to homework during the teaching process.
Abstract: This paper uses data mining technology to analyze students' English scores. In view of the influence of many factors on students' English performance, the analysis is realized by using the association rule algorithm. The thesis analyzes and applies students' English scores based on association rules and mainly does the following work: (1) at present, the problem of the CARMA algorithm is low operating efficiency. The combination of the genetic algorithm's crossover, mutation, and the CARMA algorithm realizes the fast search of the algorithm. The simulation results show that the operation performance of the algorithm is greatly improved after the crossover and mutation operations in the genetic algorithm are applied to the CARMA algorithm. The simulation results show that the mining accuracy of the improved algorithm is 97.985%, and the mining accuracy before the improvement is 92.221%, indicating that the improved algorithm can improve the accuracy of mining. (2) By comparing the mining time of the improved CARMA algorithm, the traditional CARMA algorithm, the FP-Growth algorithm, and the Apriori algorithm, the results show that when the number is 6,500, the mining efficiency of the improved CARMA algorithm is twice that of the other three algorithms. As the amount of data increases, the effect of improving mining efficiency gradually increases. (3) By using the improved CARMA algorithm to analyze students' English performance, it is found that the quality of student performance is strongly related to the quality of daily homework, and if it is related to the teacher's gender, professional title, etc., it is recommended that schools should pay more attention to homework during the teaching process.

Journal ArticleDOI
TL;DR: The Scalable Association Rule Learning (SARL) algorithm as mentioned in this paper is a heuristic that efficiently learns gene-disease association rules and gene-gene association rules from large-scale microarray datasets.
Abstract: Association rule learning algorithms have been applied to microarray datasets to find association rules among genes. With the development of microarray technology, larger datasets have been generated recently that challenge the current association rule learning algorithms. Specifically, the large number of items per transaction significantly increases the running time and memory consumption of such tasks. In this paper, we propose the Scalable Association Rule Learning (SARL) heuristic that efficiently learns gene-disease association rules and gene–gene association rules from large-scale microarray datasets. The rules are ranked based on their importance. Our experiments show the SARL algorithm outperforms the Apriori algorithm by one to three orders of magnitude.

Book ChapterDOI
01 Jan 2022
TL;DR: This research proposes two location-based recommendation systems by using the collaborative and content-based filtering recommendation technique to find the optimal index K (clusters).
Abstract: Location-based services encompass a spectrum of services. Today, it is easier to locate or search for our favorite restaurant, shop, etc., under these services. It helps us get access to important and up-to-date information about their surroundings on a single tap. This research proposes two location-based recommendation systems by using the collaborative and content-based filtering recommendation techniques. The first one is a personalized location-based recommender that uses the content filtering technique. In this recommender, the behavioral patterns are extracted from the user’s location history and then provide personalized recommendations based on patterns. Apriori algorithm has been used to extract user-specific behavioral patterns based on time zone, weekday, and location type. The second one is a generalized location-based recommender that uses the collaborative filtering technique. It employs the K-means clustering algorithm and the silhouette metric and elbow method to find the optimal index K (clusters).

Journal ArticleDOI
TL;DR: In this paper, the authors proposed three different phases namely the pre-processing phase, FIM phase and ARM phase to enhance the consistency that signifies the total number of frequently discovered frequent itemsets.
Abstract: The concept based on data mining has drawn considerable attention from various database professionals and research scholars. The progression of computer-based advancements, namely database management and data storage has facilitated the storage of large data and the data mining approaches are employed to gain valuable information from huge databases. Recently, several techniques to association rule mining (ARM) and frequent itemset mining (FIM) have been established; yet the efficiency based on execution time and scalability continues to be seen as a significant limitation that results in poor solution quality. Therefore, it is necessary to enhance the consistency that signifies the total number of frequently discovered frequent itemsets. This paper proposes three different phases namely the pre-processing phase, FIM phase and ARM phase. In the first pre-processing phase, the Twitter databases are pre-processed and converted into a suitable format for FIM. Here, the tweets are converted into related feature sets and items. In the second FIM phase, an improved Apriori algorithm is utilized in mining and extracting the frequent itemsets. Then in the final phase, an adaptive billiard inspired optimization (ABIO) algorithm which is the integration of neural network (NN) optimization algorithm and billiard inspired optimization (BIO) algorithm is proposed for the optimal generation of association rules with minimum support and confidence from the huge itemsets. Finally, the recent tweets based on covidvaccine, BTSlivestreaming, KFC, McDonald’s as well as lockdown achieved using the hashtag is evaluated for various performance measures, like precision, recall, [Formula: see text]-measure, execution time and memory utilization. Also, comparative analyses are performed to evaluate the efficiency of the proposed technique.
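
The FIM and ARM phases described above can be sketched with the textbook Apriori procedure. This is an illustration only: it uses the standard level-wise algorithm rather than the paper's improved variant, it enumerates rules exhaustively instead of using ABIO, and the function names (`apriori`, `rules`) and the tokenized "tweets" are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """FIM phase: return {itemset: support} for itemsets meeting min_support."""
    sets = [frozenset(t) for t in transactions]
    n = len(sets)
    current = {frozenset([item]) for t in sets for item in t}
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in sets if c <= t) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def rules(frequent, min_confidence):
    """ARM phase: list (antecedent, consequent, support, confidence) tuples."""
    found = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # confidence(A -> B) = support(A and B) / support(A)
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    found.append((antecedent, itemset - antecedent, support, confidence))
    return found


# Hypothetical pre-processed tweets, each reduced to a set of items.
tweets = [
    {"vaccine", "lockdown", "covid"},
    {"vaccine", "covid"},
    {"lockdown", "covid"},
    {"vaccine", "lockdown", "covid"},
    {"kfc"},
]
frequent = apriori(tweets, min_support=0.5)
strong = rules(frequent, min_confidence=0.9)
for antecedent, consequent, support, confidence in strong:
    print(sorted(antecedent), "->", sorted(consequent), support, confidence)
```

Looking up `frequent[antecedent]` is safe because every subset of a frequent itemset is itself frequent, which is the same property the prune step relies on.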

Journal ArticleDOI
TL;DR: In this article, the authors explored adjustment methods for college students' mental health based on virtual reality under the background of positive psychology, and proposed an anomaly mining algorithm based on clustering to quickly find anomalous data health problems.
Abstract: Colleges and universities are in an important position to train builders and successors of the socialist cause whilst promoting quality education. Mental health education is an important foundation and condition for comprehensively improving students' overall quality. This research explores adjustment methods for college students' mental health based on virtual reality under the background of positive psychology. It discusses the importance of system requirements analysis in the software development process, analyzes the system's functional requirements, safety requirements, and software and hardware requirements, and uses the Apriori algorithm to explore the influencing factors of college students' mental health. Based on the system engineering method and using the data mining clustering method undertake detailed analysis and research on the mental health of college students, it then designs an anomaly mining algorithm based on clustering to quickly find anomalous data health problems. The interface design of the system is concise and the operation is simple. Users can conveniently input, query, and count information according to the various controls on the interface, which fully embodies human-oriented characteristics. Exploration of the characteristics of students' frequent Internet access ensures the efficiency, accuracy, and comprehensiveness of the evaluation and consultation work, facilitating psychological counseling for teachers and students, and saving paper. By establishing a data mining model, mining the database, and learning about different student groups and their respective characteristics, we discuss our research on student psychology and summarize the mental health status and gender, adaptation and anxiety, introversion, emotionality, and calmness of college students. We also consider the relationship between sex, negative, and courage. 
Using positive psychology theory, we examine the positive experiences of students and interconnected qualities, to build a mental health practice system. In the experiment, the happiness index evaluation of the virtual reality treatment system group was significant, P = 0.002 < 0.05. Mental health education plays an important role in cultivating the healthy psychology of college students, developing their psychological potential, enhancing their adaptability, and improving their personality. This analysis based on actual data provides a reliable basis for psychological educators to improve the efficiency and effectiveness of school psychological counseling and to facilitate schools in establishing new methods of early prevention and intervention for psychological disorders, enabling institutions to create a healthy atmosphere for college students.

Journal ArticleDOI
TL;DR: A data mining model has been proposed to enhance the accuracy of predicting and to find association rules for frequent item sets, and the K-means clustering algorithm has been used to reduce the size of the dataset in order to enhance the runtime for the proposed model.
Abstract: Nowadays the e-commerce environment plays an important role in exchanging commodity knowledge between consumers. Accurately predicting customer purchase patterns in the e-commerce market is one of the critical applications of data mining. In order to achieve high profit in e-commerce, the relationship between customer and merchandise is very important. Moreover, the number of e-commerce websites increases rapidly, and competition has become just a mouse-click away. That is why staying in business and improving profit require accurately predicting purchase behavior and targeting customers with personalized services according to their preferences. In this paper, a data mining model has been proposed to enhance the accuracy of predicting and to find association rules for frequent item sets. Also, the K-means clustering algorithm has been used to reduce the size of the dataset in order to enhance the runtime for the proposed model. The proposed model has used four different classifiers which are C4.5, J48, CS-MC4, and MLR. Also, the Apriori algorithm is used to provide recommendations for items based on previous purchases. The proposed model has been tested on the Northwind Traders dataset and the results achieve an accuracy of 95.2% when the number of clusters is 8. Keywords: Apriori PT algorithm; C4.5; CS-MC4; Data mining; decision tree; e-commerce; K-means

Journal ArticleDOI
TL;DR: It was found that FP-Growth calculation takes more modest time than Apriori calculation to yield novel results, and the applied Apriori algorithm confirms to have higher accuracy than the FP-Growth algorithm.
Abstract: Aim: To predict the novel and to forecast sales for festival season hypermarkets. Materials and Methods: A total of 484 samples were collected from market datasets available in Kaggle. For this two algorithms were used, one is the FP-Growth algorithm and another is Apriori algorithm. Both the algorithms were executed and compared for accuracy. Result: Apriori achieved accuracy, precision, sensitivity and specificity of 73%, 75%, 78%, and 80%, respectively, compared to 71%, 73%, 76%, 75%, and 78% by FP-Growth algorithm, 87.4%, 88.2%, 89.2%, and 93%, respectively, compared to 80.1%, 83.39%, 84%, and 86.20% by Apriori algorithm. The results were obtained with a level of significance (p<=0.310). Conclusion: The applied Apriori algorithm confirms to have higher accuracy than the FP-Growth algorithm. It was additionally found that FP-Growth calculation takes more modest time than Apriori calculation to yield novel results.