Showing papers on "Educational data mining" published in 2012


Journal ArticleDOI
Rebecca Ferguson1
TL;DR: This review of the field begins with an examination of the technological, educational and political factors that have driven the development of analytics in educational settings, and goes on to chart the emergence of learning analytics.
Abstract: Learning analytics is a significant area of technology-enhanced learning that has emerged during the last decade. This review of the field begins with an examination of the technological, educational and political factors that have driven the development of analytics in educational settings. It goes on to chart the emergence of learning analytics, including their origins in the 20th century, the development of data-driven analytics, the rise of learning-focused perspectives and the influence of national economic concerns. It next focuses on the relationships between learning analytics, educational data mining and academic analytics. Finally, it examines developing areas of learning analytics research, and identifies a series of future challenges.

1,029 citations


Proceedings ArticleDOI
29 Apr 2012
TL;DR: This paper argues for increased and formal communication and collaboration between these communities in order to share research, methods, and tools for data mining and analysis in the service of developing both LAK and EDM fields.
Abstract: Growing interest in data and analytics in education, teaching, and learning raises the priority for increased, high-quality research into the models, methods, technologies, and impact of analytics. Two research communities -- Educational Data Mining (EDM) and Learning Analytics and Knowledge (LAK) have developed separately to address this need. This paper argues for increased and formal communication and collaboration between these communities in order to share research, methods, and tools for data mining and analysis in the service of developing both LAK and EDM fields.

801 citations


Journal ArticleDOI
TL;DR: A reference model for LA is described based on four dimensions, namely data and environments (what?), stakeholders (who?), objectives (why?) and methods (how?).
Abstract: Recently, there has been increasing interest in learning analytics in Technology-Enhanced Learning (TEL). Generally, learning analytics deals with the development of methods that harness educational datasets to support the learning process. Learning analytics (LA) is a multi-disciplinary field involving machine learning, artificial intelligence, information retrieval, statistics and visualisation. LA is also a field in which several related areas of research in TEL converge. These include academic analytics, action analytics and educational data mining. In this paper, we investigate the connections between LA and these related fields. We describe a reference model for LA based on four dimensions, namely data and environments (what?), stakeholders (who?), objectives (why?) and methods (how?). We then review recent publications on LA and its related fields and map them to the four dimensions of the reference model. Furthermore, we identify various challenges and research opportunities in the area of LA in relation to each dimension.

561 citations


01 Oct 2012
TL;DR: This issue brief is intended to help policymakers and administrators understand how analytics and data mining have been—and can be—applied for educational improvement.
Abstract: The authors are grateful for the deliberations of our technical working group (TWG) of academic experts in educational data mining and learning analytics. These experts provided constructive guidance and comments for this issue brief. The TWG comprised Ryan S. In data mining and data analytics, tools and techniques once confined to research laboratories are being adopted by forward-looking industries to generate business intelligence for improving decision making. Higher education institutions are beginning to use analytics for improving the services they provide and for increasing student grades and retention. The U.S. Department of Education's National Education Technology Plan, as one part of its model for 21st-century learning powered by technology, envisions ways of using data from online learning systems to improve instruction. With analytics and data mining experiments in education starting to proliferate, sorting out fact from fiction and identifying research possibilities and practical applications are not easy. This issue brief is intended to help policymakers and administrators understand how analytics and data mining have been—and can be—applied for educational improvement. At present, educational data mining tends to focus on developing new tools for discovering patterns in data. These patterns are generally about the microconcepts involved in learning: one-digit multiplication, subtraction with carries, and so on. Learning analytics—at least as it is currently contrasted with data mining—focuses on applying tools and techniques at larger scales, such as in courses and at schools and postsecondary institutions. But both disciplines work with patterns and prediction: If we can discern the pattern in the data and make sense of what is happening, we can predict what should come next and take the appropriate action. 
Educational data mining and learning analytics are used to research and build models in several areas that can influence online learning systems. One area is user modeling, which encompasses what a learner knows, what a learner's behavior and motivation are, what the user experience is like, and how satisfied users are with online learning. At the simplest level, analytics can detect when a student in an online course is going astray and nudge him or her on to a course correction. At the most complex, they hold promise of detecting boredom from patterns of key clicks and redirecting the student's attention. Because these data are gathered in real time, there is a real possibility of continuous improvement via multiple feedback loops that operate at different time scales—immediate to the student …

509 citations


Journal Article
TL;DR: It is a current goal at RWTH Aachen University to enhance its VLE with user-friendly tools for Learning Analytics, in order to equip their teachers and tutors with means to evaluate the effectiveness of TEL within their instructional design and courses offered.
Abstract: Introduction Learning Management Systems (LMS) or Virtual Learning Environments (VLE) are widely used and have become part of the common toolkits of educators (Schroeder, 2009). One of the main goals of the integration of traditional teaching methods with technology enhancements is the improvement of teaching and learning quality in large university courses with many students. But does utilizing a VLE automatically improve teaching and learning? In our experience, many teachers just upload existing files, like lecture slides, handouts and exercises, when starting to use a VLE. Thereby the availability of learning resources is improved. For improving teaching and learning it could be helpful to create more motivating, challenging, and engaging learning materials and, e.g., collaborative scenarios to improve learning among large groups of students. Teachers could, e.g., use audio and video recordings of their lectures or provide interactive, demonstrative multimedia examples and quizzes. If they put effort into the design of such online learning activities, they need tools that help them observe the consequences of their actions and evaluate their teaching interventions. They need to have appropriate access to data to assess changing behaviors and performances of their students to estimate the level of improvement that has been achieved in the learning environment. With the establishment of TEL, a new research field, called Learning Analytics, is emerging (Elias, 2011). This research field borrows and synthesizes techniques from different related fields, such as Educational Data Mining (EDM), Academic Analytics, Social Network Analysis or Business Intelligence (BI), to harness them for converting educational data into useful information and thereon to motivate actions, like self-reflecting on one's previous teaching or learning activities, to foster improved teaching and learning.
The main goal of BI is to turn enterprise data into useful information for management decision support. However, Learning Analytics, Academic Analytics, as well as EDM more specifically focus on tools and methods for exploring data coming from educational contexts. While Academic Analytics take a university-wide perspective, including also e.g., organizational and financial issues (Campbell & Oblinger, 2007), Learning Analytics as well as EDM focus specifically on data about teaching and learning. Siemens (2010) defines Learning Analytics as "the use of intelligent data, learner-produced data, and analysis models to discover information and social connections, and to predict and advise on learning." It can support teachers and students to take action based on the evaluation of educational data. However, the technology to deliver this potential is still very young and research on understanding the pedagogical usefulness of Learning Analytics is still in its infancy (Johnson et al., 2011b; Johnson et al., 2012). It is a current goal at RWTH Aachen University to enhance its VLE--the learning and teaching portal L2P (Gebhardt et al., 2007)--with user-friendly tools for Learning Analytics, in order to equip their teachers and tutors with means to evaluate the effectiveness of TEL within their instructional design and courses offered. These teachers still face difficulties, deterring them from integrating cyclical reflective research activities, comparable to Action Research, into everyday practice. Action Research is characterized by a continuing effort to closely interlink, relate and confront action and reflection, to reflect upon one's conscious and unconscious doings in order to develop one's actions, and to act reflectively in order to develop one's knowledge." (Altrichter et al., 2005, p. 6). 
A pre-eminent barrier is the additional workload, originating from tasks of collecting, integrating, and analyzing raw data from log files of their VLE (Altenbernd-Giani et al., 2009). To tackle these issues, we have developed the "exploratory Learning Analytics Toolkit" (eLAT). …

298 citations


Posted Content
TL;DR: In the present investigation, an experimental methodology was adopted to generate a database, and the raw data was preprocessed by filling in missing values, transforming values from one form into another, and selecting relevant attributes/variables for construction of a Bayes classification prediction model.
Abstract: Nowadays the amount of data stored in educational databases is increasing rapidly. These databases contain hidden information for the improvement of students' performance. Performance in higher education in India is a turning point in the academic life of all students. This academic performance is influenced by many factors; it is therefore essential to develop a predictive data mining model for students' performance so as to identify the difference between high learners and slow learners. In the present investigation, an experimental methodology was adopted to generate a database. The raw data was preprocessed by filling in missing values, transforming values from one form into another, and selecting relevant attributes/variables. As a result, we had 300 student records, which were used for construction of the Bayes classification prediction model. Keywords- Data Mining, Educational Data Mining, Predictive Model, Classification.

283 citations
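
The Bayes classification step described in the entry above can be illustrated with a minimal sketch. This is not the authors' model: the attribute names (`attendance`, `prior_grade`), the toy records, and the use of Laplace smoothing are all assumptions made purely for illustration.

```python
from collections import Counter, defaultdict
import math

def train_naive_bayes(records, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, attribute) -> Counter of values
    domain = defaultdict(set)            # attribute -> set of values seen in training
    for record, label in zip(records, labels):
        for attr, value in record.items():
            value_counts[(label, attr)][value] += 1
            domain[attr].add(value)
    return class_counts, value_counts, domain

def predict(model, record):
    """Return the class with the highest log-posterior under naive Bayes."""
    class_counts, value_counts, domain = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for attr, value in record.items():
            count = value_counts[(label, attr)][value]
            # Laplace smoothing (an assumption) avoids zero probabilities
            score += math.log((count + 1) / (n + len(domain[attr])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical, hand-made student records (not data from the paper)
records = [
    {"attendance": "good", "prior_grade": "high"},
    {"attendance": "good", "prior_grade": "low"},
    {"attendance": "poor", "prior_grade": "low"},
    {"attendance": "poor", "prior_grade": "high"},
    {"attendance": "good", "prior_grade": "high"},
    {"attendance": "poor", "prior_grade": "low"},
]
labels = ["pass", "pass", "fail", "fail", "pass", "fail"]
model = train_naive_bayes(records, labels)
```

With this toy data, `predict(model, {"attendance": "good", "prior_grade": "low"})` returns `"pass"`, because attendance separates the two classes in the hand-made records.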


Journal Article
TL;DR: This issue reflects the rapid maturation of learning analytics as a domain of research and indicates LA as a field with potential for improving teaching and learning.
Abstract: The early stages of the internet and world wide web drew attention to the communication and connective capacities of global networks. The ability to collaborate and interact with colleagues from around the world provided academics with new models of teaching and learning. Today, online education is a fast growing segment of the education sector. A side effect, to date not well explored, of digital learning is the collection of data and analytics in order to understand and inform teaching and learning. As learners engage in online or mobile learning, data trails are created. These data trails indicate social networks, learning dispositions, and how different learners come to understand core course concepts. Aggregate and large-scale data can also provide predictive value about the types of learning patterns and activity that might indicate risk of failure or drop out. The Society for Learning Analytics Research defines learning analytics as the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs (http://www.solaresearch.org/mission/about/). As numerous papers in this issue reference, data analytics has drawn the attention of academics and academic leaders. High expectations exist for learning analytics to provide new insights into educational practices and ways to improve teaching, learning, and decision-making. The appropriateness of these expectations is the subject of researchers in the young but rapidly growing learning analytics field. Learning analytics currently sits at a crossroads between technical and social learning theory fields. On the one hand, the algorithms that form recommender systems, personalization models, and network analysis require deep technical expertise. The impact of these algorithms, however, is felt in the social system of learning. 
As a consequence, researchers in learning analytics have devoted significant attention to bridging these gaps and bringing these communities in contact with each other through conversations and conferences. The LAK12 conference in Vancouver, for example, included invited panels and presentations from the educational data mining community. The SoLAR steering committee also includes representation from the International Educational Data Mining Society (http://www.educationaldatamining.org). This issue reflects the rapid maturation of learning analytics as a domain of research. The papers in this issue indicate LA as a field with potential for improving teaching and learning. Less clear, currently, is the long-term trajectory of LA as a discipline. LA borrows from numerous fields including computer science, sociology, learning sciences, machine learning, statistics, and "big data". Coalescing as a field will require leadership, openness, collaboration, and a willingness for researchers to approach learning analytics as a holistic process that includes both technical and social domains. This issue includes ten articles: Buckingham Shum and Ferguson describe social learning analytics (SLA) as a subset of learning analytics. SLA is concerned with the process of learning, instead of heavily favoring summative assessment. SLA emphasizes that "new skills and ideas are not solely individual achievements, but are developed, carried forward, and passed on through interaction and collaboration". As a consequence, analytics in social systems must account for connected and distributed interaction activity. Hung, Hsu, and Rice explore the role of data mining in K-12 online education program reviews, providing educators with institutional decision-making support, in addition to identifying the characteristics of successful and at-risk students.
Greller and Drachsler propose a generic framework for learning analytics, intended to serve as a guide in setting up LA services within an educational institution. …

211 citations


01 Jan 2012
TL;DR: This paper used educational data mining to improve graduate students’ performance, and overcome the problem of low grades of graduate students.
Abstract: Educational data mining is concerned with developing methods for discovering knowledge from data that come from the educational domain. In this paper we used educational data mining to improve graduate students' performance and overcome the problem of low grades among graduate students. In our case study we try to extract useful knowledge from graduate students' data collected from the college of Science and Technology – Khanyounis. The data cover a fifteen-year period (1993-2007). After preprocessing the data, we applied data mining techniques to discover association, classification, clustering and outlier detection rules. For each of these four tasks, we present the extracted knowledge and describe its importance in the educational domain.

188 citations


Posted Content
TL;DR: The comparative analysis of the results indicates that the prediction helped the weaker students to improve and led to better overall results.
Abstract: Nowadays the amount of data stored in educational databases is increasing rapidly. These databases contain hidden information for the improvement of students' performance. Educational data mining is used to study the data available in the educational field and bring out the hidden knowledge in it. Classification methods such as decision trees and Bayesian networks can be applied to educational data to predict students' performance in examinations. This prediction will help to identify the weak students and help them to score better marks. The C4.5, ID3 and CART decision tree algorithms are applied to engineering students' data to predict their performance in the final exam. The outcome of the decision tree predicted the number of students who are likely to pass, fail or be promoted to the next year. The results provide steps to improve the performance of the students who were predicted to fail or be promoted. After the declaration of the results of the final examination, the marks obtained by the students were fed into the system and the results were analyzed for the next session. The comparative analysis of the results indicates that the prediction helped the weaker students to improve and led to better overall results.

173 citations
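
The decision tree algorithms named in the entry above differ mainly in how they score candidate splits: ID3 uses information gain, C4.5 uses gain ratio, and CART uses Gini impurity. As a minimal sketch of the ID3 criterion only, the code below computes information gain on a hand-made table. The attribute names (`internal_marks`, `attendance`) and the rows are hypothetical and are not taken from the paper.

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels, attributes):
    """ID3 picks the attribute with the highest information gain at each node."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))

# Hypothetical toy data (an assumption, for illustration only)
rows = [
    {"internal_marks": "high", "attendance": "good"},
    {"internal_marks": "high", "attendance": "poor"},
    {"internal_marks": "low", "attendance": "good"},
    {"internal_marks": "low", "attendance": "poor"},
]
labels = ["pass", "pass", "fail", "fail"]
root = best_attribute(rows, labels, ["internal_marks", "attendance"])
```

Here `internal_marks` separates the classes perfectly (gain of 1 bit) while `attendance` gives no gain, so it would be chosen as the root split.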


Journal ArticleDOI
TL;DR: Students’ key demographic characteristics and their marks in a small number of written assignments can constitute the training set for a regression method in order to predict the student’s performance.
Abstract: The use of machine learning techniques for educational purposes (or educational data mining) is an emerging field aimed at developing methods of exploring data from computational educational settings and discovering meaningful patterns. The stored data (virtual courses, e-learning log files, demographic and academic data of students, admissions/registration info, and so on) can be useful for machine learning algorithms. In this article, we cite the most current articles that use machine learning techniques for educational purposes and we present a case study for predicting students' marks. Students' key demographic characteristics and their marks in a small number of written assignments can constitute the training set for a regression method in order to predict the student's performance. Finally, a prototype version of a software support tool for tutors has been constructed.

170 citations


Journal ArticleDOI
TL;DR: The paper presents the use of virtual appliances, a fully functional computer simulated over a regular one and configured with all the required tools needed in a learning experience and a prediction model is presented based on those observations with the highest correlation.
Abstract: The interactions that students have with each other, with the instructors, and with educational resources are valuable indicators of the effectiveness of a learning experience. The increasing use of information and communication technology allows these interactions to be recorded so that analytic or mining techniques can be used to gain a deeper understanding of the learning process and propose improvements. But with the increasing variety of tools being used, monitoring student progress is becoming a challenge. The paper answers two questions. The first is how feasible it is to monitor the learning activities occurring in a student's personal workspace. The second is how to use the recorded data for the prediction of student achievement in a course. To address these research questions, the paper presents the use of virtual appliances, a fully functional computer simulated over a regular one and configured with all the required tools needed in a learning experience. Students carry out activities in this environment, in which a monitoring scheme has been previously configured. A case study is presented in which a comprehensive set of observations was collected. The data is shown to have a significant correlation with student academic achievement, thus validating the approach as a prediction mechanism. Finally, a prediction model is presented based on the observations with the highest correlation.
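
Selecting "the observations with the highest correlation", as the abstract above puts it, can be sketched as ranking each recorded observation by its Pearson correlation with the final grade. The paper does not state which correlation measure was used, so Pearson is an assumption here, and the observation names (`compiler_runs`, `browser_visits`) and values are invented for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def rank_by_correlation(observations, grades):
    """Sort observation types by absolute correlation with grades, strongest first."""
    scores = {name: pearson(values, grades) for name, values in observations.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical per-student event counts and grades (not data from the paper)
observations = {
    "compiler_runs": [1, 2, 3, 4, 5],
    "browser_visits": [3, 1, 4, 1, 5],
}
grades = [2, 4, 6, 8, 10]
ranked = rank_by_correlation(observations, grades)
```

The top-ranked observations would then be the candidate inputs to a prediction model. Correlation on a small sample says nothing about causation, so this ranking is only a screening step.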

Journal Article
TL;DR: This study explores how the interaction of students with each other and with their instructors predicts their learning outcomes (as measured by their final grades) and aims to enrich the existing body of literature, while augmenting the understanding of effective learning strategies across a variety of new delivery modes.
Abstract: Introduction According to a recent survey conducted by Campus Computing (campuscomputing.net) and WCET (wcet.info), almost 88% of the surveyed institutions reported having used an LMS (Learning Management System) as a medium for course delivery for both on-campus and online offerings. In addition to various student information management systems (SISs), LMSs are providing the educational community with a goldmine of unexploited data about students' learning characteristics, behaviours, and patterns. The turning of such raw data into useful information and knowledge will enable institutes of higher education (HEIs) to rethink and improve students' learning experiences by using the data to streamline their teaching and learning processes, to extract and analyse students' learning and navigation patterns and behaviours, to analyse threaded discussion and interaction logs, and to provide feedback to students and to faculty about the unfolding of their students' learning experiences (Hung & Crooks, 2009; Garcia, Romero, Ventura, & de Castro, 2011). To this end, data mining has emerged as a powerful analytical and exploratory tool supported by faster multi-core 64 CPUs with larger memories, and by powerful database reporting tools. Originating in corporate business practices, data mining is multidisciplinary by nature and springs from several different disciplines including computer science, artificial intelligence, statistics, and biometrics. Using various approaches (such as classification, clustering, association rules, and visualization), data mining has been gaining momentum in higher education, which is now using a variety of applications, most notably in enrolment, learning patterns, personalization, and threaded discussion analysis. 
By discovering hidden relationships, patterns, and interdependencies, and by correlating raw/unstructured institutional data, data mining is beginning to facilitate the decision-making process in higher educational institutions. This interest in data mining is timely and critical, particularly as universities are diversifying their delivery modes to include more online and mobile learning environments. EDM has the potential to help HEIs understand the dynamics and patterns of a variety of learning environments and to provide insightful data for rethinking and improving students' learning experiences. This paper is focused on understanding live video streaming (LVS) students' learning behaviours, their interactions, and their learning outcomes. More specifically, this study explores how the interaction of students with each other and with their instructors predicts their learning outcomes (as measured by their final grades). By investigating these interrelated dimensions, this study aims to enrich the existing body of literature, while augmenting the understanding of effective learning strategies across a variety of new delivery modes. This paper is divided into four sections. It begins by reviewing the literature dealing with the use of data mining in administrative and academic environments, followed by a short discussion of the way in which data mining is used to understand various dimensions of learning. The second section explains the purpose and the research questions explored in this paper. The third section describes the background of the study and details its methodological approach (sampling, data collection, and analysis). The paper concludes by highlighting key findings, by discussing the study's limitations, and by proposing several recommendations for distance education administrators and practitioners. 
Data mining applications in administrative and academic environments At the intersection of several disciplines including computer science, statistics, and psychometrics (Garcia et al., 2011), data mining has thrived in business practices as a knowledge discovery tool intended to transform raw data into high-level knowledge for decision support (Hen & Lee, 2008). …

Proceedings ArticleDOI
01 Oct 2012
TL;DR: This work introduces ECD and relates its elements to the broad range of digital inputs relevant to modern assessment and discusses the relation between EDM and psychometric activities in educational assessment.
Abstract: Evidence-centered design (ECD) is a comprehensive framework for describing the conceptual, computational and inferential elements of educational assessment. It emphasizes the importance of articulating inferences one wants to make and the evidence needed to support those inferences. At first blush, ECD and educational data mining (EDM) might seem in conflict: structuring situations to evoke particular kinds of evidence, versus discovering meaningful patterns in available data. However, a dialectic between the two stances increases understanding and improves practice. We first introduce ECD and relate its elements to the broad range of digital inputs relevant to modern assessment. We then discuss the relation between EDM and psychometric activities in educational assessment. We illustrate points with examples from the Cisco Networking Academy, a global program in which information technology is taught through a blended program of face-to-face classroom instruction, an online curriculum, and online assessments.

Proceedings ArticleDOI
01 Oct 2012
TL;DR: Science Assistments, an interactive environment, which assesses students’ inquiry skills as they engage in inquiry using science microworlds, is presented, which focuses on the student model, the task model, and the evidence model in the conceptual assessment framework.
Abstract: We present Science Assistments, an interactive environment, which assesses students’ inquiry skills as they engage in inquiry using science microworlds. We frame our variables, tasks, assessments, and methods of analyzing data in terms of evidence-centered design. Specifically, we focus on the student model, the task model, and the evidence model in the conceptual assessment framework. In order to support both assessment and the provision of scaffolding, the environment makes inferences about student inquiry skills using models developed through a combination of text replay tagging [cf. Sao Pedro et al. 2011], a method for rapid manual coding of student log files, and educational data mining. Models were developed for multiple inquiry skills, with particular focus on detecting if students are testing their articulated hypotheses, and if they are designing controlled experiments. Student-level cross-validation was applied to validate that this approach can automatically and accurately identify these inquiry skills for new students. The resulting detectors also can be applied at run-time to drive scaffolding intervention.
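
Student-level cross-validation, mentioned in the abstract above, means that all log records from a given student fall in the same fold, so a detector is always evaluated on students it has never seen. The sketch below shows one simple way to build such folds. The function name and the round-robin assignment of students to folds are assumptions, not the authors' procedure; scikit-learn's `GroupKFold` offers a library alternative.

```python
def student_level_folds(student_ids, k):
    """Yield (train_indices, test_indices) pairs in which no student
    contributes records to both the training and the test side."""
    unique_students = sorted(set(student_ids))
    for fold in range(k):
        # Round-robin assignment of students (not records) to folds
        held_out = set(unique_students[fold::k])
        test_idx = [i for i, s in enumerate(student_ids) if s in held_out]
        train_idx = [i for i, s in enumerate(student_ids) if s not in held_out]
        yield train_idx, test_idx

# Hypothetical record-to-student mapping: one entry per logged action
student_ids = ["s1", "s1", "s2", "s3", "s3", "s4"]
folds = list(student_level_folds(student_ids, k=2))
```

Splitting at the record level instead would let the same student's actions appear on both sides of the split, which tends to overstate how well a detector generalizes to new students.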

Proceedings ArticleDOI
07 Jan 2012
TL;DR: This paper describes how to apply the main data mining techniques such as prediction, classification, relationship mining, clustering, and social area networking to educational data.
Abstract: Educational data mining is an emerging trend, concerned with developing techniques for exploring, and analyzing the huge data that come from the educational context. EDM is poised to leverage an enormous amount of research from data mining community and apply that research to educational problems in learning, cognition and assessment. In recent years, Educational data mining has proven to be more successful at many of these educational statistics problems due to enormous computing power and data mining algorithms. This paper surveys the history and applications of data mining techniques in the educational field. The objective is to introduce data mining to traditional educational system, web-based educational system, intelligent tutoring system, and e-learning. This paper describes how to apply the main data mining techniques such as prediction, classification, relationship mining, clustering, and social area networking to educational data.

Journal Article
TL;DR: This study investigated an innovative approach of program evaluation through analyses of student learning logs, demographic data, and end-of-course evaluation surveys in an online K–12 supplemental program and explored potential EDM applications at the K–12 level that have already been broadly adopted in higher education institutions.
Abstract: This study investigated an innovative approach of program evaluation through analyses of student learning logs, demographic data, and end-of-course evaluation surveys in an online K–12 supplemental program. The results support the development of a program evaluation model for decision making on teaching and learning at the K–12 level. A case study was conducted with a total of 7,539 students (whose activities resulted in 23,854,527 learning logs in 883 courses). Clustering analysis was applied to reveal students' shared characteristics, and decision tree analysis was applied to predict student performance and satisfaction levels toward course and instructor. This study demonstrated how data mining can be incorporated into program evaluation in order to generate in-depth information for decision making. In addition, it explored potential EDM applications at the K–12 level that have already been broadly adopted in higher education institutions.

Book ChapterDOI
14 Jun 2012
TL;DR: A new technique to represent, classify, and use programs written by novices as a base for automatic hint generation for programming tutors is described, and it is shown that the algorithms can be used to generate hints over 80 percent of the time.
Abstract: We describe a new technique to represent, classify, and use programs written by novices as a base for automatic hint generation for programming tutors. The proposed linkage graph representation is used to record and reuse student work as a domain model, and we use an overlay comparison to compare in-progress work with complete solutions in a twist on the classic approach to hint generation. Hint annotation is a time consuming component of developing intelligent tutoring systems. Our approach uses educational data mining and machine learning techniques to automate the creation of a domain model and hints from student problem-solving data. We evaluate the approach with a sample of partial and complete, novice programs and show that our algorithms can be used to generate hints over 80 percent of the time. This promising rate shows that the approach has potential to be a source for automatically generated hints for novice programmers.

Journal ArticleDOI
TL;DR: This work presents a brief overview of EDM and introduces four selected EDM papers representing a crosscut of different application areas for data mining in education.
Abstract: Educational Data Mining (EDM) is an emerging multidisciplinary research area, in which methods and techniques for exploring data originating from various educational information systems have been developed. EDM is both a learning science, as well as a rich application area for data mining, due to the growing availability of educational data. EDM contributes to the study of how students learn, and the settings in which they learn. It enables data-driven decision making for improving the current educational practice and learning material. We present a brief overview of EDM and introduce four selected EDM papers representing a crosscut of different application areas for data mining in education.

Journal ArticleDOI
TL;DR: A Moodle module is proposed that allows automatic extraction of data needed for educational data mining analysis and deploys models developed in this study, which included data preprocessing, parameter optimization and attribute selection steps, which enhanced the overall performance.
Abstract: In this research we applied classification models for prediction of students’ performance, and cluster models for grouping students based on their cognitive styles in an e-learning environment. The classification models described in this paper should help teachers, students, and business people engage early with students who are likely to excel in a selected topic. Clustering students based on cognitive styles and their overall performance should enable better adaptation of the learning materials to their learning styles. The approach is tested using well-established data mining algorithms and evaluated by several evaluation measures. The model-building process included data preprocessing, parameter optimization, and attribute selection steps, which enhanced the overall performance. Additionally, we propose a Moodle module that allows automatic extraction of data needed for educational data mining analysis and deploys models developed in this study.

Proceedings ArticleDOI
01 Oct 2012
TL;DR: The development and refinement of evidence rules and measurement models within the evidence model of the evidence-centered design (ECD) framework are described in the context of the Packet Tracer digital learning environment of the Cisco Networking Academy.
Abstract: In this paper we describe the development and refinement of evidence rules and measurement models within the evidence model of the evidence-centered design (ECD) framework in the context of the Packet Tracer digital learning environment of the Cisco Networking Academy. Using Packet Tracer, learners design, configure, and troubleshoot computer networks within an interactive interface. This leads to product data, which result from the students' final submitted network configurations, and process data, which are log file entries detailing how they got to the final configurations. We discuss how an iterative cycle of empirical analyses and discussions with subject-matter experts is essential for identifying and accumulating evidence about skill profiles of learners and their development. We present results from descriptive, exploratory, and confirmatory diagnostic modeling analyses for both data types, which required bringing to bear a diversity of tools from multivariate statistics, modern psychometrics, and educational data mining. We close the paper with a discussion of the implications of this work for evidence-based argumentation guided by ECD principles within digital learning environments more generally.

Journal ArticleDOI
01 Aug 2012
TL;DR: This paper presents an approach based on grammar guided genetic programming, G3P-MI, which classifies students in order to predict their final grade based on features extracted from logged data in a web based education system.
Abstract: A considerable amount of e-learning content is available via virtual learning environments. These platforms keep track of learners' activities, including the content viewed, assignment submissions, time spent, and quiz results, all of which provide a unique opportunity to apply data mining methods. This paper presents an approach based on grammar guided genetic programming, G3P-MI, which classifies students in order to predict their final grade based on features extracted from logged data in a web based education system. Our proposal works with multiple instance learning, a relatively new learning framework that can eliminate the great number of missing values that appear when the problem is represented by traditional supervised learning. Experiments are carried out on data sets with information about several courses and demonstrate that G3P-MI achieves better accuracy and yields a trade-off between such contradictory metrics as sensitivity and specificity compared to the most popular techniques of multiple instance learning. This method could be quite useful for early identification of students at risk, especially in very large classes, and allows the instructor to provide information about the most relevant activities to help students have a better chance to pass a course.

Proceedings ArticleDOI
18 Oct 2012
TL;DR: This paper describes an educational data mining (EDM) case study based on data collected from the learning management system (LMS) of the e-learning center and the electronic education system of Iran University of Science and Technology (IUST).
Abstract: In this paper, we describe an educational data mining (EDM) case study based on data collected from the learning management system (LMS) of the e-learning center and the electronic education system of Iran University of Science and Technology (IUST). Our main goal is to illustrate the applications of EDM in the domain of e-learning and online courses by implementing a model to predict academic dismissal as well as the GPA of graduated students. Monitoring and supporting freshmen and first-year students is considered very important in many educational institutions. Consequently, if there are ways to estimate the probability of dismissal, dropout, and other challenges on the way to graduation, together with capable tools to predict GPA or even semester-by-semester grades, university officials can design more efficient strategies for education systems, especially e-learning systems, whose problems are less well known and more complicated. To achieve this goal, we used a common data mining methodology called CRISP. Our results show that educational attributes can be predicted with confidence. There is currently increasing interest in data mining and educational systems, making educational data mining a new and growing research community.

Journal Article
TL;DR: A clustering study of teachers' usage patterns while using an educational digital library tool, called the Instructional Architect, showed that increased teaching experience and comfort with technology were related to teachers' effectiveness in using the IA.
Abstract: Teachers and students increasingly enjoy unprecedented access to abundant web resources and digital libraries to enhance and enrich their classroom experiences. However, due to the distributed nature of such systems, conventional educational research methods, such as surveys and observations, provide only limited snapshots. In addition, educational data mining, as an emergent research approach, has seldom been used to explore teachers' online behaviors when using digital libraries. Building upon results from a preliminary study, this article presents results from a clustering study of teachers' usage patterns while using an educational digital library tool, called the Instructional Architect. The clustering approach employed a robust statistical model called latent class analysis. In addition, frequent itemsets mining was used to clean and extract common patterns from the clusters initially generated. The final clusters identified three groups of teachers in the IA: key brokers, insular classroom practitioners, and inactive islanders. Identified clusters were triangulated with data collected in teachers' registration profiles. Results showed that increased teaching experience and comfort with technology were related to teachers' effectiveness in using the IA.
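
As an illustration of the frequent itemsets mining step mentioned in this abstract, the following is a minimal brute-force sketch, not the authors' implementation. The function name `frequent_itemsets`, the `max_size` cap, and the activity labels in the usage note are all hypothetical; the abstract does not say which algorithm or features were used.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets meeting min_support.

    Each transaction is an iterable of items (here: hypothetical
    activity labels for one teacher). Support is the fraction of
    transactions that contain every item in the itemset.
    """
    n = len(transactions)
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # de-duplicate, fix ordering
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {
        itemset: count / n
        for itemset, count in counts.items()
        if count / n >= min_support
    }
```

For example, with the made-up sessions `[["browse", "create"], ["browse", "create", "share"], ["browse"], ["create", "share"]]` and `min_support=0.5`, the pair `("browse", "create")` is kept while `("browse", "share")` is dropped. Enumerating every combination is only practical for small item vocabularies; Apriori-style pruning would be the usual choice at scale.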

Journal ArticleDOI
TL;DR: This paper presents a data mining method for enrollment management for an MCA course, in the context of educational data mining (EDM), a growing research community at the intersection of data mining and pedagogy.
Abstract: In the last two decades, the number of Higher Education Institutions (HEIs) in India has grown rapidly. This has caused cut-throat competition among these institutions to attract students for admission. Most of the institutions are opened in self-finance mode, so they are constantly short of funds. Therefore, institutions focus on student numbers rather than on the quality of education. The Indian education sector has a lot of data that can produce valuable information. Knowledge Discovery and Data Mining (KDD) is a multidisciplinary area focusing on methodologies for extracting useful knowledge from data, and there are several useful KDD tools for extracting that knowledge. This knowledge can be used to increase the quality of education. However, educational institutions do not apply any knowledge discovery process to these data. Nowadays a new research community, educational data mining (EDM), is growing at the intersection of data mining and pedagogy. In this paper we present a data mining method for enrollment management for an MCA course.

Journal ArticleDOI
TL;DR: A computational method is presented that efficiently estimates the ability of students from the log files of a Web-based learning environment capturing their problem-solving processes, by approximating the posterior distribution of the student's ability obtained from the conventional Bayes Modal Estimation approach with a simple Gaussian function.
Abstract: This paper presents a computational method that can efficiently estimate the ability of students from the log files of a Web-based learning environment capturing their problem solving processes. The computational method developed in this study approximates the posterior distribution of the student's ability obtained from the conventional Bayes Modal Estimation (BME) approach to a simple Gaussian function in order to reduce the amount of computations required in the subsequent ability update processes. To verify the correctness and usefulness of this method, the abilities of 407 college students who solved 61 physics problems in a Web-based learning environment were estimated from the log files of the learning environment. The reduced chi-squared statistic and Pearson's chi-square test for the goodness of fit indicate that the estimated abilities were able to successfully explain the observed problem solving performance of students within error. The educational implications of estimating the ability of students in Web-based learning environments were also discussed.
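
The idea of replacing an ability posterior with a Gaussian can be sketched as follows. This is an illustrative Laplace-style approximation under assumptions the abstract does not state: a one-parameter logistic (Rasch) response model, a Gaussian prior, and hypothetical names such as `gaussian_posterior`. It should not be read as the paper's actual procedure.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gaussian_posterior(responses, difficulties,
                       prior_mean=0.0, prior_var=1.0, iters=25):
    """Approximate the posterior over ability theta by a Gaussian.

    Assumed model (not from the source): P(correct) = sigmoid(theta - b)
    for an item of difficulty b, with a Gaussian prior on theta.
    Returns (mean, variance): the posterior mode found by Newton's
    method, and the inverse curvature of the log-posterior at that mode.
    """
    theta = prior_mean
    for _ in range(iters):
        probs = [sigmoid(theta - b) for b in difficulties]
        # Gradient and second derivative of the log-posterior.
        grad = sum(r - p for r, p in zip(responses, probs))
        grad -= (theta - prior_mean) / prior_var
        hess = -sum(p * (1.0 - p) for p in probs) - 1.0 / prior_var
        theta -= grad / hess  # Newton step (hess is always negative)
    probs = [sigmoid(theta - b) for b in difficulties]
    information = sum(p * (1.0 - p) for p in probs) + 1.0 / prior_var
    return theta, 1.0 / information
```

Because the result is itself Gaussian, it can be passed back in as `prior_mean` and `prior_var` when the next batch of responses arrives, which is one way the cost of later ability updates could stay small.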

Journal ArticleDOI
TL;DR: The analysis of students' use of recorded lectures at two Universities in the Netherlands shows that recorded lectures are viewed to prepare for exams and assignments and suggests that students who do this have a significantly higher chance of passing the exams.
Abstract: This study analyses the interactions of students with the recorded lectures. We report on an analysis of students' use of recorded lectures at two Universities in the Netherlands. The data logged by the lecture capture system (LCS) is used and combined with collected survey data. We describe the process of data pre-processing and analysis of the resulting full dataset and then focus on the usage for the course with the most learner sessions. We found discrepancies as well as similarities between students' verbal reports and actual usage as logged by the recorded lecture servers. The analysis shows that recorded lectures are viewed to prepare for exams and assignments. The data suggests that students who do this have a significantly higher chance of passing the exams. Given the discrepancies between verbal reports and actual usage, research should no longer rely on verbal reports alone.

Book ChapterDOI
16 Jul 2012
TL;DR: This paper presents a first detector of what they term WTF ("Without Thinking Fastidiously") behavior, based on data from the Phase Change microworld in the Science ASSISTments environment, and discusses implications for understanding how and why students conduct inquiry without thinking fastidiously while learning in science inquiry microworlds.
Abstract: In recent years, there has been increased interest and research on identifying the various ways that students can deviate from expected or desired patterns while using educational software. This includes research on gaming the system, player transformation, haphazard inquiry, and failure to use key features of the learning system. Detection of these sorts of behaviors has helped researchers to better understand these behaviors, thus allowing software designers to develop interventions that can remediate them and/or reduce their negative impacts on user outcomes. In this paper, we present a first detector of what we term WTF ("Without Thinking Fastidiously") behavior, based on data from the Phase Change microworld in the Science ASSISTments environment. In WTF behavior, the student is interacting with the software, but their actions appear to have no relationship to the intended learning task. We discuss the detector development process, validate the detectors with human labels of the behavior, and discuss implications for understanding how and why students conduct inquiry without thinking fastidiously while learning in science inquiry microworlds.

Book ChapterDOI
14 Jun 2012
TL;DR: An automated detector is presented which is able to identify shallow learners, who are likely to need different intervention than students who have not yet learned at all, and is developed using a step regression approach.
Abstract: Recent research has extended student modeling to infer not just whether a student knows a skill or set of skills, but also whether the student has achieved robust learning --- learning that leads the student to be able to transfer their knowledge and prepares them for future learning (PFL). However, a student may fail to have robust learning in two fashions: they may have no learning, or they may have shallow learning (learning that applies only to the current skill, and does not support transfer or PFL). Within this paper, we present an automated detector which is able to identify shallow learners, who are likely to need different intervention than students who have not yet learned at all. This detector is developed using a step regression approach, with data from college students learning introductory genetics from an intelligent tutoring system.

Proceedings ArticleDOI
29 Apr 2012
TL;DR: This panel is proposed as a means of promoting mutual learning and continued dialogue between the Educational Data Mining and Learning Analytics communities.
Abstract: This panel is proposed as a means of promoting mutual learning and continued dialogue between the Educational Data Mining and Learning Analytics communities. EDM has been developing as a community for longer than the LAK conference, so what if anything makes the LAK community different, and where is the common ground?

Proceedings Article
01 Jun 2012
TL;DR: In their experiments, the authors used discretized first response time data to predict students’ correctness of the next question, and leveraged the result into a Knowledge Tracing model and confirmed the value of student first response time in modeling student knowledge.
Abstract: The field of educational data mining has been using the Knowledge Tracing model, which only looks at the correctness of a student's first response, for tracking student knowledge. Recently, many other features have been studied to extend the Knowledge Tracing model to better model student knowledge. The goal of this paper is to analyze whether the first response time of a student on a question can be incorporated into the Knowledge Tracing model to improve its prediction accuracy. In our experiments, we used discretized first response time data to predict students’ correctness of the next question, and leveraged the result into a Knowledge Tracing model. Our analysis confirmed the value of student first response time in modeling student knowledge.
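
For readers unfamiliar with Knowledge Tracing, the standard Bayesian Knowledge Tracing update that uses only response correctness can be sketched as below. The function and parameter names are illustrative choices, and the sketch deliberately leaves out response time, since the abstract does not specify how the authors combined it with the model.

```python
def predict_correct(p_know, p_slip, p_guess):
    """Probability that the next response is correct."""
    return p_know * (1.0 - p_slip) + (1.0 - p_know) * p_guess


def bkt_update(p_know, correct, p_slip, p_guess, p_learn):
    """One step of standard Bayesian Knowledge Tracing.

    p_know:  prior probability that the skill is known
    correct: whether the observed first response was correct
    p_slip:  P(incorrect | skill known)
    p_guess: P(correct | skill not known)
    p_learn: P(skill becomes known after this practice opportunity)
    """
    if correct:
        numerator = p_know * (1.0 - p_slip)
        denominator = numerator + (1.0 - p_know) * p_guess
    else:
        numerator = p_know * p_slip
        denominator = numerator + (1.0 - p_know) * (1.0 - p_guess)
    posterior = numerator / denominator  # Bayes rule on the observation
    # Account for the chance of learning the skill at this step.
    return posterior + (1.0 - posterior) * p_learn
```

Starting from `p_know=0.5` with `p_slip=0.1`, `p_guess=0.2`, and `p_learn=0.1`, a correct response raises the knowledge estimate and an incorrect one lowers it; `predict_correct` then turns the estimate into a next-question prediction.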