Journal Article

Educational data mining: A survey from 1995 to 2005

01 Jul 2007 - Expert Systems With Applications (Pergamon) - Vol. 33, Iss. 1, pp. 135-146
TL;DR: This paper surveys the application of data mining to traditional educational systems, particular web-based courses, well-known learning content management systems, and adaptive and intelligent web-based educational systems.
Abstract: Currently there is an increasing interest in data mining and educational systems, making educational data mining a new and growing research community. This paper surveys the application of data mining to traditional educational systems, particular web-based courses, well-known learning content management systems, and adaptive and intelligent web-based educational systems. Each of these systems has different data sources and objectives for knowledge discovery. After preprocessing the available data in each case, data mining techniques can be applied: statistics and visualization; clustering, classification and outlier detection; association rule mining and pattern mining; and text mining. The success of the plentiful work needs much more specialized work in order for educational data mining to become a mature area.
Citations
Journal Article
01 Nov 2010
TL;DR: The most relevant studies carried out in educational data mining to date are surveyed and the different groups of user, types of educational environments, and the data they provide are described.
Abstract: Educational data mining (EDM) is an emerging interdisciplinary research area that deals with the development of methods to explore data originating in an educational context. EDM uses computational approaches to analyze educational data in order to study educational questions. This paper surveys the most relevant studies carried out in this field to date. First, it introduces EDM and describes the different groups of user, types of educational environments, and the data they provide. It then goes on to list the most typical/common tasks in the educational environment that have been resolved through data-mining techniques, and finally, some of the most promising future lines of research are discussed.

1,723 citations


Cites background from "Educational data mining: A survey f..."

  • ...The first one [221] is a former review of Romero and Ventura with 81...


Proceedings Article
01 Oct 2009
TL;DR: This paper reviews the history and current trends in the field of EDM, discussing the increased emphasis on prediction, the emergence of work using existing models to make scientific discoveries, and the reduction in the frequency of relationship mining within the EDM community.
Abstract: We review the history and current trends in the field of Educational Data Mining (EDM). We consider the methodological profile of research in the early years of EDM, compared to in 2008 and 2009, and discuss trends and shifts in the research conducted by this community. In particular, we discuss the increased emphasis on prediction, the emergence of work using existing models to make scientific discoveries ("discovery with models"), and the reduction in the frequency of relationship mining within the EDM community. We discuss two ways that researchers have attempted to categorize the diversity of research in educational data mining research, and review the types of research problems that these methods have been used to address. The most cited papers in EDM between 1995 and 2005 are listed, and their influence on the EDM community (and beyond the EDM community) is discussed.

1,217 citations

Journal Article
TL;DR: This work describes the full process for mining e-learning data step by step as well as how to apply the main data mining techniques used, such as statistics, visualization, classification, clustering and association rule mining of Moodle data.
Abstract: Educational data mining is an emerging discipline, concerned with developing methods for exploring the unique types of data that come from the educational context. This work is a survey of the specific application of data mining in learning management systems and a case study tutorial with the Moodle system. Our objective is to introduce it both theoretically and practically to all users interested in this new research area, and in particular to online instructors and e-learning administrators. We describe the full process for mining e-learning data step by step as well as how to apply the main data mining techniques used, such as statistics, visualization, classification, clustering and association rule mining of Moodle data. We have used free data mining tools so that any user can immediately begin to apply data mining without having to purchase a commercial tool or program a specific personalized tool.

1,049 citations

Journal Article
Rebecca Ferguson
TL;DR: This review of the field begins with an examination of the technological, educational and political factors that have driven the development of analytics in educational settings, and goes on to chart the emergence of learning analytics.
Abstract: Learning analytics is a significant area of technology-enhanced learning that has emerged during the last decade. This review of the field begins with an examination of the technological, educational and political factors that have driven the development of analytics in educational settings. It goes on to chart the emergence of learning analytics, including their origins in the 20th century, the development of data-driven analytics, the rise of learning-focused perspectives and the influence of national economic concerns. It next focuses on the relationships between learning analytics, educational data mining and academic analytics. Finally, it examines developing areas of learning analytics research, and identifies a series of future challenges.

1,029 citations


Cites background from "Educational data mining: A survey f..."

  • ...Overall, data mining is a field of computing that applies a variety of techniques (for example, decision tree construction, rule induction, artificial neural networks, instance-based learning, Bayesian learning, logic programming and statistical algorithms) to databases in order to discover and display previously unknown, and potentially useful, data patterns (Chatti et al., this issue; Romero and Ventura, 2007)....


  • ...EDM emerged from the analysis of logs of student-computer interaction and, until 2005, relationship-mining methods were the most prominent type of EDM research, followed by prediction methods (Baker and Yacef, 2009)....


  • ...As learning analytics emerge from the wide fields of analytics and data mining, disambiguating themselves from academic analytics and EDM, researchers will need to build strong connections with the learning sciences....


  • ...As a result, the literature of the two diverged and the key EDM references identified by Romero and Ventura (2007) were displaced in the analytics literature by generic references to overviews of the EDM field (Romero and Ventura, 2007; Baker and Yacef, 2009)....


Journal Article
TL;DR: Key milestones and the current state of affairs in the field of EDM are reviewed, together with specific applications, tools, and future insights.
Abstract: Applying data mining (DM) in education is an emerging interdisciplinary research field also known as educational data mining (EDM). It is concerned with developing methods for exploring the unique types of data that come from educational environments. Its goal is to better understand how students learn and identify the settings in which they learn to improve educational outcomes and to gain insights into and explain educational phenomena. Educational information systems can store a huge amount of potential data from multiple sources coming in different formats and at different granularity levels. Each particular educational problem has a specific objective with special characteristics that require a different treatment of the mining problem. The issues mean that traditional DM techniques cannot be applied directly to these types of data and problems. As a consequence, the knowledge discovery process has to be adapted and some specific DM techniques are needed. This paper introduces and reviews key milestones and the current state of affairs in the field of EDM, together with specific applications, tools, and future insights. © 2012 Wiley Periodicals, Inc.

885 citations

References
Proceedings Article
01 Jun 1993
TL;DR: An efficient algorithm is presented that generates all significant association rules between items in the database of customer transactions and incorporates buffer management and novel estimation and pruning techniques.
Abstract: We are given a large database of customer transactions. Each transaction consists of items purchased by a customer in a visit. We present an efficient algorithm that generates all significant association rules between items in the database. The algorithm incorporates buffer management and novel estimation and pruning techniques. We also present results of applying this algorithm to sales data obtained from a large retailing company, which shows the effectiveness of the algorithm.

15,645 citations


"Educational data mining: A survey f..." refers background in this paper

  • ...Mining association rules between sets of items in large databases was first stated by Agrawal, Imielinski, and Swami (1993) and it opened a brand new family of algorithms....
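
To make the notion of an association rule over customer transactions concrete, here is a minimal brute-force sketch. It is not the algorithm of the cited paper (it has no buffer management, estimation, or pruning); it only computes the standard support and confidence measures and enumerates every candidate itemset, which is feasible only for toy data. The function names, thresholds, and the sample transactions are all hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def rules(transactions, min_support, min_confidence):
    """Enumerate rules X -> y (single-item consequent) by brute force.

    Illustrative only: this checks every itemset, so it is exponential
    in the number of distinct items.
    """
    items = sorted(set().union(*transactions))
    found = []
    for size in range(2, len(items) + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s == 0 or s < min_support:
                continue
            for y in candidate:
                x = frozenset(candidate) - {y}
                # confidence(X -> y) = support(X and y) / support(X)
                conf = s / support(x, transactions)
                if conf >= min_confidence:
                    found.append((tuple(sorted(x)), y, s, conf))
    return found


# Hypothetical toy transactions (one set of purchased items per visit).
transactions = [
    frozenset({"bread", "milk", "butter"}),
    frozenset({"bread", "milk"}),
    frozenset({"milk", "butter"}),
    frozenset({"bread", "butter"}),
]

for antecedent, consequent, s, conf in rules(transactions, 0.5, 0.6):
    print(f"{antecedent} -> {consequent}: support={s:.2f}, confidence={conf:.2f}")
```

A rule is reported only when the whole itemset is frequent enough (support) and the consequent appears often enough among transactions containing the antecedent (confidence).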


Proceedings Article
06 Mar 1995
TL;DR: Three algorithms are presented to solve the problem of mining sequential patterns over databases of customer transactions, and empirically evaluating their performance using synthetic data shows that two of them have comparable performance.
Abstract: We are given a large database of customer transactions, where each transaction consists of customer-id, transaction time, and the items bought in the transaction. We introduce the problem of mining sequential patterns over such databases. We present three algorithms to solve this problem, and empirically evaluate their performance using synthetic data. Two of the proposed algorithms, AprioriSome and AprioriAll, have comparable performance, albeit AprioriSome performs a little better when the minimum number of customers that must support a sequential pattern is low. Scale-up experiments show that both AprioriSome and AprioriAll scale linearly with the number of customer transactions. They also have excellent scale-up properties with respect to the number of transactions per customer and the number of items in a transaction.

5,663 citations


"Educational data mining: A survey f..." refers background in this paper

  • ...Sequential pattern mining (Agrawal & Srikant, 1995) attempts to find inter-session patterns such as the presence of a set of items followed by another item in a time-ordered set of sessions or episodes....
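
As a rough illustration of what "a set of items followed by another item" means, the sketch below checks whether one customer's time-ordered transactions contain a given sequential pattern and computes the fraction of customers that support it. It does not implement AprioriSome or AprioriAll; it only evaluates a pattern supplied by the caller. The function names and the sample data are hypothetical.

```python
def contains(customer_sequence, pattern):
    """True if `pattern` occurs in order within `customer_sequence`.

    Both arguments are lists of item sets ordered by time. Each pattern
    element must be a subset of a distinct, later transaction.
    """
    pos = 0
    for itemset in customer_sequence:
        if pos < len(pattern) and pattern[pos] <= itemset:
            pos += 1
    return pos == len(pattern)


def sequence_support(sequences, pattern):
    """Fraction of customers whose sequence contains `pattern`."""
    hits = sum(1 for seq in sequences.values() if contains(seq, pattern))
    return hits / len(sequences)


# Hypothetical data: customer id -> transactions in time order.
sequences = {
    "c1": [{"bread"}, {"milk"}, {"butter"}],
    "c2": [{"bread", "milk"}, {"butter"}],
    "c3": [{"butter"}, {"bread"}],
}

# "bread, later followed by butter" holds for c1 and c2 but not c3.
print(sequence_support(sequences, [{"bread"}, {"butter"}]))
```

Support is counted per customer rather than per transaction, so a customer who matches the pattern several times still counts once.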


Journal Article
TL;DR: Web usage mining is the application of data mining techniques to discover usage patterns from Web data, in order to understand and better serve the needs of Web-based applications. The paper describes its three phases (preprocessing, pattern discovery, and pattern analysis) in detail.
Abstract: Web usage mining is the application of data mining techniques to discover usage patterns from Web data, in order to understand and better serve the needs of Web-based applications. Web usage mining consists of three phases, namely preprocessing, pattern discovery, and pattern analysis. This paper describes each of these phases in detail. Given its application potential, Web usage mining has seen a rapid increase in interest, from both the research and practice communities. This paper provides a detailed taxonomy of the work in this area, including research efforts as well as commercial offerings. An up-to-date survey of the existing work is also provided. Finally, a brief overview of the WebSIFT system as an example of a prototypical Web usage mining system is given.

2,227 citations

Book Chapter
01 Apr 2003
TL;DR: A challenging research goal is the development of adaptive and intelligent Web-based educational systems (W-AIES) that offer some amount of adaptivity and intelligence.
Abstract: Currently, Web-based educational systems form one of the fastest growing areas in educational technology research and development. Benefits of Web-based education are independence of teaching and learning with respect to time and space. Courseware installed and maintained in one place may be used by a huge number of users all over the world. A challenging research goal is the development of adaptive and intelligent Web-based educational systems (W-AIES) that offer some amount of adaptivity and intelligence.

679 citations

Journal Article

627 citations


"Educational data mining: A survey f..." refers background in this paper


  • ...Web-based education is a form of distance education delivered over the Internet (Johnson et al., 2000)....


  • ...Li and Zaïane (2004) use more information channels to model user navigational behavior: web access logs, the structure of a visited web site, and the content of visited web pages. Avouris, Komis, Fiotakis, Margaritis, and Voyiatzaki (2005) expand automatically generated log files by introducing contextual information as additional events and by associating comments and static files. Monk (2005) combines data on the activity with content and user profiles in a composite information model....
