Proceedings Article

Application of Association Rule Mining: A case study on team India

21 Feb 2013, pp. 1-6
TL;DR: The outcome of the analysis reveals that Team India has performed well in the last ten years as compared to entire period since the team started playing its first match.


Abstract: This paper applies an Association Rule Mining algorithm to sports management, specifically to mining relationships from data on the performance of the Indian cricket team in one day international (ODI) matches. This analysis will help in determining factors associated with the match outcome so as to enable the team to formulate match-winning strategies. Data has been obtained from secondary sources to obtain deeper insights on playing conditions and the match outcome. The association among factors such as outcome of the toss, playing at a home ground or playing abroad, batting first or batting second, and the match outcome, i.e., win or loss, is examined. The outcome of the analysis reveals that Team India has performed well in the last ten years (from 2001 to 2010) as compared to the entire period since the team started playing its first match (from 1974 to 2010).
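
To make the kind of rule examined in this abstract concrete, the following is a minimal Python sketch of support and confidence for rules of the form {conditions} -> {result}. The match records, the item labels such as `toss=won`, the thresholds, and the helper names (`support`, `confidence`, `outcome_rules`) are all hypothetical and chosen for illustration; they are not the paper's data, results, or implementation.

```python
from itertools import combinations

# Hypothetical match records (illustrative only, NOT data from the paper).
# Each match is a set of "feature=value" items.
matches = [
    {"toss=won", "venue=home", "bat=first", "result=win"},
    {"toss=won", "venue=home", "bat=second", "result=win"},
    {"toss=lost", "venue=away", "bat=first", "result=loss"},
    {"toss=won", "venue=away", "bat=second", "result=loss"},
    {"toss=lost", "venue=home", "bat=second", "result=win"},
    {"toss=lost", "venue=away", "bat=second", "result=loss"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) from the transactions."""
    denom = support(antecedent, transactions)
    if denom == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), transactions) / denom


def outcome_rules(transactions, min_support=0.3, min_confidence=0.8):
    """Enumerate rules {1 or 2 conditions} -> {result=...} above both thresholds."""
    items = sorted({i for t in transactions for i in t
                    if not i.startswith("result=")})
    outcomes = sorted({i for t in transactions for i in t
                       if i.startswith("result=")})
    rules = []
    for size in (1, 2):
        for lhs in combinations(items, size):
            for rhs in outcomes:
                s = support(set(lhs) | {rhs}, transactions)
                if s < min_support:
                    continue
                c = confidence(lhs, {rhs}, transactions)
                if c >= min_confidence:
                    rules.append((lhs, rhs, s, c))
    return rules


if __name__ == "__main__":
    for lhs, rhs, s, c in outcome_rules(matches):
        print(f"{set(lhs)} -> {rhs}  support={s:.2f} confidence={c:.2f}")
```

With real match data, the same two measures could be computed separately for different periods (for example 1974 to 2010 versus 2001 to 2010) to compare the rules that emerge in each.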


Citations

Proceedings Article
01 Jan 2014
TL;DR: A prediction system that takes in historical match data as well as the instantaneous state of a match, and predicts future match events culminating in a victory or loss is built, demonstrating the performance of the algorithms in predicting the number of runs scored, one of the most important determinants of match outcome.


Abstract: Cricket is a popular sport played by 16 countries, is the second most watched sport in the world after soccer, and enjoys a multi-million dollar industry. There is tremendous interest in simulating cricket and more importantly in predicting the outcome of games, particularly in their one-day international format. The complex rules governing the game, along with the numerous natural parameters affecting the outcome of a cricket match, present significant challenges for accurate prediction. Multiple diverse parameters, including but not limited to cricketing skills and performances, match venues and even weather conditions, can significantly affect the outcome of a game. The sheer number of parameters, along with their interdependence and variance, creates a non-trivial challenge to create an accurate quantitative model of a game. Unlike other sports such as basketball and baseball, which are well researched from a sports analytics perspective, for cricket these tasks have yet to be investigated in depth. In this paper, we build a prediction system that takes in historical match data as well as the instantaneous state of a match, and predicts future match events culminating in a victory or loss. We model the game using a subset of match parameters, using a combination of linear regression and nearest-neighbor clustering algorithms. We describe our model and algorithms and finally present quantitative results, demonstrating the performance of our algorithms in predicting the number of runs scored, one of the most important determinants of match outcome.


35 citations




Cites background from "Application of Association Rule Min..."

  • ...Raj and Padma [15] analyze the Indian cricket team’s One-Day International (ODI) match data and mine association rules from a set of features, namely toss, home or away game, batting first or second and game outcome....



Proceedings Article
T.P. Singh, Vishal Singla, Parteek Bhatia
01 Oct 2015
Abstract: Currently, in One Day International (ODI) cricket matches, the first-innings score is predicted on the basis of the Current Run Rate, calculated as the number of runs scored per over bowled. It does not account for factors such as the number of wickets fallen or the venue of the match. Furthermore, there is no method to predict the outcome of the match during the second innings. This paper proposes a model with two methods. The first predicts the first-innings score not only from the current run rate but also from the number of wickets fallen, the venue of the match, and the batting team. The second predicts the outcome of the match in the second innings, using the same attributes as the first method along with the target given to the batting team. The two methods have been implemented using a Linear Regression Classifier and a Naive Bayes Classifier for the first and second innings, respectively. In both methods, the 50 overs of a match are divided into 5-over intervals, and at each interval the above attributes are recorded for all non-curtailed matches played between 2002 and 2014, for every team independently. The results show that the error of the Linear Regression classifier in estimating the final score is lower than that of the Current Run Rate method, and that the accuracy of Naive Bayes in predicting the match outcome rises from 68% in the 0–5 over interval to 91% by the end of the 45th over.
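
The Current Run Rate baseline described in this abstract can be sketched in a few lines of Python. The function names and the choice to measure overs as legal balls divided by six are assumptions made for illustration; the projection simply assumes the current rate holds for the whole innings, which is the simplification the abstract criticizes, and it is not the paper's regression model.

```python
def current_run_rate(runs: int, balls_bowled: int) -> float:
    """Runs per over, with overs measured as legal balls bowled / 6."""
    if balls_bowled <= 0:
        raise ValueError("balls_bowled must be positive")
    return runs * 6 / balls_bowled


def projected_score(runs: int, balls_bowled: int, total_overs: int = 50) -> float:
    """Naive projection: assume the current run rate holds for every over.

    Ignores wickets fallen, venue, and batting team.
    """
    return current_run_rate(runs, balls_bowled) * total_overs


if __name__ == "__main__":
    # Hypothetical state: 120 runs after 20 overs (120 balls).
    print(current_run_rate(120, 120))
    print(projected_score(120, 120))
```

Because the projection depends only on runs and balls, a side at 120 for 1 and a side at 120 for 7 after 20 overs receive the same estimate.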


28 citations


Cites background from "Application of Association Rule Min..."

  • ...There are certain rules defined to get the batsman out by the bowlers or the fielders....



Journal Article
TL;DR: Attempts are made to investigate the feasibility of using collective knowledge obtained from microposts posted on Twitter to predict the winner of a Cricket match before the start of the game.


Abstract: Social media has become a platform of first choice where one can express his/her feelings with freedom. The sports and matches being played are also discussed on social media such as Twitter. In this article, efforts are made to investigate the feasibility of using collective knowledge obtained from microposts posted on Twitter to predict the winner of a Cricket match. For predictions, we use three different methods that depend on the total number of tweets before the game for each team, fans' sentiments toward each team, and fans' score predictions on Twitter. By combining these three methods, we predict the winning team of a Cricket game before the game starts. Our results are promising enough to be used for winning team forecasts. Furthermore, the effectiveness of supervised learning algorithms is evaluated, where Support Vector Machine (SVM) has shown an advantage over other classifiers.


24 citations


Cites background from "Application of Association Rule Min..."

  • ...Raj and Padma [11] analyzed the data of Indian Cricket team ODI matches and mine association rules from the following set of features: toss, home/away game, batting first or second and game result....



Proceedings Article
01 May 2015
TL;DR: This paper has applied association rule mining technique on individual Indian players' career-record to obtain the underlying unknown relations of several factors impacting the players' performances, which could help in selecting the best-suited team-combination in a given match condition.


Abstract: Performance analysis in every sport is essential to find out the weaknesses and strengths of the players. In a team game like cricket, analysis of career data is indispensable for gaining insight into the players' performance, which helps the selectors to do their job flawlessly and also helps the players themselves to identify their weaknesses and their strengths. When the time comes for world cup cricket, every team looks for its best team combination to be available on the ground to achieve the desired result in its favour. Association rule mining techniques reveal unknown information from a huge set of data, and this technique can be used to extract information from the performance data of the players. In this paper, we have applied an association rule mining technique to individual Indian players' career records to obtain the underlying unknown relations of several factors impacting the players' performances. This analysis could help in selecting the best-suited team combination in a given match condition. The relation among various intrinsic factors such as venue of the match, batting first or second, batting position (of batsmen), strike-rate and runs (of batsmen), and economy-rate and wickets taken (of bowlers) is analyzed. The result of this study could be useful for the Indian team captain and team manager in decision making and strategy planning and could also boost the chances of success for team India in the world cup 2015.


6 citations


Cites methods from "Application of Association Rule Min..."

  • ...Association rule mining technique was used by Raj and Padma [6] for analysing India’s performance in ODIs....


  • ...[6] Raj, K. Antony Arokia Durai, and Panchapakesan Padma....



Posted Content
Abstract: Recently, data mining studies are being successfully conducted to estimate several parameters in a variety of domains. Data mining techniques have attracted the attention of the information industry and society as a whole, due to a large amount of data and the imminent need to turn it into useful knowledge. However, the effective use of data in some areas is still under development, as is the case in sports, which in recent years, has presented a slight growth; consequently, many sports organizations have begun to see that there is a wealth of unexplored knowledge in the data extracted by them. Therefore, this article presents a systematic review of sports data mining. Regarding years 2010 to 2018, 31 types of research were found in this topic. Based on these studies, we present the current panorama, themes, the database used, proposals, algorithms, and research opportunities. Our findings provide a better understanding of the sports data mining potentials, besides motivating the scientific community to explore this timely and interesting topic.


4 citations


References

Proceedings Article
01 Jun 1993
TL;DR: An efficient algorithm is presented that generates all significant association rules between items in the database of customer transactions and incorporates buffer management and novel estimation and pruning techniques.


Abstract: We are given a large database of customer transactions. Each transaction consists of items purchased by a customer in a visit. We present an efficient algorithm that generates all significant association rules between items in the database. The algorithm incorporates buffer management and novel estimation and pruning techniques. We also present results of applying this algorithm to sales data obtained from a large retailing company, which shows the effectiveness of the algorithm.


15,011 citations


Proceedings Article
01 Jul 1998
TL;DR: Two new algorithms for solving this problem that are fundamentally different from the known algorithms are presented, and empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems.


Abstract: We consider the problem of discovering association rules between items in a large database of sales transactions. We present two new algorithms for solving this problem that are fundamentally different from the known algorithms. Empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems. We also show how the best features of the two proposed algorithms can be combined into a hybrid algorithm, called AprioriHybrid. Scale-up experiments show that AprioriHybrid scales linearly with the number of transactions. AprioriHybrid also has excellent scale-up properties with respect to the transaction size and the number of items in the database.
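
As a rough illustration of the level-wise, candidate generate-and-test style of frequent itemset mining that the Apriori family is known for, here is a minimal Python sketch. It is a textbook-style simplification, not the paper's algorithms or the AprioriHybrid variant named in the abstract; the function name `apriori` and the absolute `min_count` threshold are assumptions for illustration.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {frozenset(itemset): count} for itemsets seen in >= min_count transactions."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        # Test step: count each surviving candidate against the transactions.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_count}
        result.update(frequent)
        k += 1
    return result


if __name__ == "__main__":
    # Hypothetical transactions for illustration.
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in sorted(apriori(data, 3).items(),
                                 key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), count)
```

The prune step relies on the fact that any subset of a frequent itemset must also be frequent, so a candidate with an infrequent subset can be discarded before counting.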


10,858 citations


Journal Article
Jiawei Han, Jian Pei, Yiwen Yin
16 May 2000
TL;DR: This study proposes a novel frequent pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develops an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth.


Abstract: Mining frequent patterns in transaction databases, time-series databases, and many other kinds of databases has been studied popularly in data mining research. Most of the previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there exist prolific patterns and/or long patterns. In this study, we propose a novel frequent pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develop an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth. Efficiency of mining is achieved with three techniques: (1) a large database is compressed into a highly condensed, much smaller data structure, which avoids costly, repeated database scans, (2) our FP-tree-based mining adopts a pattern fragment growth method to avoid the costly generation of a large number of candidate sets, and (3) a partitioning-based, divide-and-conquer method is used to decompose the mining task into a set of smaller tasks for mining confined patterns in conditional databases, which dramatically reduces the search space. Our performance study shows that the FP-growth method is efficient and scalable for mining both long and short frequent patterns, and is about an order of magnitude faster than the Apriori algorithm and also faster than some recently reported new frequent pattern mining methods.


5,773 citations


Proceedings Article
Jong Soo Park, Ming-Syan Chen, Philip S. Yu
22 May 1995
TL;DR: The number of candidate 2-itemsets generated by the proposed algorithm is, in orders of magnitude, smaller than that by previous methods, thus resolving the performance bottleneck, and allows us to effectively trim the transaction database size at a much earlier stage of the iterations, thereby reducing the computational cost for later iterations significantly.


Abstract: In this paper, we examine the issue of mining association rules among items in a large database of sales transactions. The mining of association rules can be mapped into the problem of discovering large itemsets where a large itemset is a group of items which appear in a sufficient number of transactions. The problem of discovering large itemsets can be solved by constructing a candidate set of itemsets first and then, identifying, within this candidate set, those itemsets that meet the large itemset requirement. Generally this is done iteratively for each large k-itemset in increasing order of k where a large k-itemset is a large itemset with k items. To determine large itemsets from a huge number of candidate large itemsets in early iterations is usually the dominating factor for the overall data mining performance. To address this issue, we propose an effective hash-based algorithm for the candidate set generation. Explicitly, the number of candidate 2-itemsets generated by the proposed algorithm is, in orders of magnitude, smaller than that by previous methods, thus resolving the performance bottleneck. Note that the generation of smaller candidate sets enables us to effectively trim the transaction database size at a much earlier stage of the iterations, thereby reducing the computational cost for later iterations significantly. Extensive simulation study is conducted to evaluate performance of the proposed algorithm.
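
The idea of using a hash table to shrink the set of candidate 2-itemsets can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: the function name `frequent_pairs_with_hashing`, the bucket count, and the use of Python's built-in `hash` are all choices made here, and the transaction-trimming step mentioned in the abstract is omitted.

```python
from itertools import combinations


def frequent_pairs_with_hashing(transactions, min_count, num_buckets=7):
    """Return (candidate_pairs, frequent_pairs) using a bucket filter on pairs.

    Pairs are represented as sorted tuples, so items must be mutually comparable.
    """
    transactions = [sorted(set(t)) for t in transactions]
    item_counts = {}
    buckets = [0] * num_buckets

    # Pass 1: count single items and hash every pair into a bucket counter.
    for t in transactions:
        for item in t:
            item_counts[item] = item_counts.get(item, 0) + 1
        for pair in combinations(t, 2):
            buckets[hash(pair) % num_buckets] += 1

    frequent_items = sorted(i for i, c in item_counts.items() if c >= min_count)

    # A pair can only be frequent if both items are frequent AND its bucket
    # total reaches min_count (the bucket total is an upper bound on its count).
    candidates = {pair for pair in combinations(frequent_items, 2)
                  if buckets[hash(pair) % num_buckets] >= min_count}

    # Pass 2: count only the surviving candidates.
    pair_counts = dict.fromkeys(candidates, 0)
    for t in transactions:
        for pair in combinations(t, 2):
            if pair in pair_counts:
                pair_counts[pair] += 1
    frequent = {p: c for p, c in pair_counts.items() if c >= min_count}
    return candidates, frequent


if __name__ == "__main__":
    # Hypothetical transactions for illustration.
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"},
            {"b", "c"}, {"a", "b", "c"}, {"a", "d"}]
    cands, freq = frequent_pairs_with_hashing(data, 3)
    print(sorted(cands))
    print(sorted(freq.items()))
```

Hash collisions can let an infrequent pair through as a candidate, but they never exclude a truly frequent pair, so the second pass still yields exact counts. With more buckets, fewer infrequent pairs survive the filter.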


1,597 citations


"Application of Association Rule Min..." refers methods in this paper

  • ...Most of the previous studies have proposed different mining algorithm which is similar to or modified version of apriori algorithm [2] [10] [12] [11] [8]....



Proceedings Article
Hannu Toivonen
03 Sep 1996
TL;DR: New algorithms are presented that reduce database activity considerably by picking a random sample, using this sample to find all association rules that probably hold in the whole database, and then verifying the results against the rest of the database.


Abstract: Discovery of association rules is an important database mining problem. Current algorithms for finding association rules require several passes over the analyzed database, and obviously the role of I/O overhead is very significant for very large databases. We present new algorithms that reduce the database activity considerably. The idea is to pick a random sample, to find using this sample all association rules that probably hold in the whole database, and then to verify the results with the rest of the database. The algorithms thus produce exact association rules, not approximations based on a sample. The approach is, however, probabilistic, and in those rare cases where our sampling method does not produce all association rules, the missing rules can be found in a second pass. Our experiments show that the proposed algorithms can find association rules very efficiently in only one database


1,230 citations


"Application of Association Rule Min..." refers background in this paper

  • ...Some of the sampling methods proposed in the past are simple random sampling, finding associations from sampled transactions (FAST), and epsilon approximation sample enabled (EASE) [13] [5] [4]....




Performance Metrics
No. of citations received by the paper in previous years

Year  Citations
2020  1
2019  4
2018  3
2017  1
2015  3
2014  3