Journal ArticleDOI

Prospects and Challenges of Using Machine Learning for Academic Forecasting

TL;DR: Machine learning remains a promising forecasting technology with the power to enhance academic forecasting, helping the education industry plan and make better decisions that enrich the quality of education.
Abstract: The study examines the prospects and challenges of machine learning (ML) applications in academic forecasting. Predicting academic activities through machine learning algorithms offers an enhanced means to accurately forecast academic events, including students' academic performance and learning styles. Machine learning algorithms such as K-nearest neighbor (KNN), random forest, bagging, artificial neural network (ANN), and Bayesian neural network (BNN) are currently being applied in the education sector to predict future events. Many gaps in traditional forecasting techniques have been bridged by artificial intelligence-based machine learning algorithms, thereby aiding timely decision-making by education stakeholders. Educational institutions deploy ML algorithms to predict students' learning behaviours and academic achievements, giving them the opportunity to detect at-risk students early and to develop strategies that help those students overcome their weaknesses. However, despite the benefits of the ML approach, some limitations could affect its correctness or deployment in forecasting academic events, e.g., proneness to error, data-acquisition difficulties, and time-consuming processes. Nonetheless, we suggest that machine learning remains one of the most promising forecasting technologies, with the power to enable effective academic forecasting that would assist the education industry in planning and making better decisions to enrich the quality of education.
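A minimal sketch of the at-risk-detection workflow the abstract describes, using a bagging ensemble (one of the algorithms the study names) on synthetic data; the feature names, dataset, and 0.5 risk threshold are illustrative assumptions, not the study's actual setup:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in: features play the role of engagement/assessment
# signals, and label 1 marks a student who eventually fails (hypothetical).
X, y = make_classification(n_samples=600, n_features=6, weights=[0.8, 0.2],
                           random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)

# Bagging ensemble of decision trees predicts each student's failure risk.
clf = BaggingClassifier(n_estimators=50, random_state=1).fit(X_tr, y_tr)
risk = clf.predict_proba(X_te)[:, 1]   # estimated probability of failing
at_risk = np.where(risk > 0.5)[0]      # students flagged for early support
print(f"{len(at_risk)} of {len(X_te)} students flagged as at-risk")
```

Flagged students could then be routed to targeted interventions, which is the early-detection benefit the abstract attributes to ML deployment.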


Citations
Proceedings ArticleDOI
13 Oct 2022
TL;DR: This paper focuses on additional external factors, such as geographical location, parent education, and health status, that can affect students' performance beyond their grades in any course.
Abstract: India's education system is very old, and because of the country's large student population there are serious difficulties in analyzing and predicting students' performance. In the Indian context, every institution has its own standards for evaluating student success, and there is no established procedure for monitoring and analyzing a student's performance and progress. One major factor is the lack of research on existing prediction approaches, which makes it difficult to determine the optimal prediction methodology for visualizing student academic growth and performance. Another is the lack of research into the factors that can affect students' academic performance and achievement. In this paper, the focus is on additional external factors, such as geographical location, parent education, and health status, that can affect students' performance beyond their grades in any course, making it more effective to visualize and analyze student performance. For the experimental work, data were collected from the UCI repository, and results were obtained from two machine learning algorithms (KNN and logistic regression). A performance analysis of the two algorithms is also carried out, based on the accuracy of their results and on comparison with existing work.

1 citation
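A hedged sketch of the kind of KNN-versus-logistic-regression comparison this paper runs, with synthetic features standing in for the UCI student data (the real dataset, feature set, and hyperparameters are not specified here and are assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic features stand in for grades plus external factors such as
# geographical location, parent education, and health status.
X, y = make_classification(n_samples=500, n_features=8, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Fit both classifiers and compare held-out accuracy, as the paper does.
accs = {}
for name, model in [("KNN", KNeighborsClassifier(n_neighbors=5)),
                    ("LogReg", LogisticRegression(max_iter=1000))]:
    model.fit(X_tr, y_tr)
    accs[name] = accuracy_score(y_te, model.predict(X_te))
print(accs)
```

On the real student data, the same loop would report which of the two algorithms better predicts performance once the external factors are included.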

References
Journal ArticleDOI
01 Oct 2001
TL;DR: Internal estimates monitor error, strength, and correlation; these are used to show the response to increasing the number of features used in the splitting, and the ideas are also applicable to regression.
Abstract: Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to AdaBoost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International Conference, 1996, 148–156) but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation, and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.

79,257 citations
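The "internal estimates" in Breiman's abstract correspond to out-of-bag evaluation: each tree is scored on the samples its bootstrap resample left out, so no separate hold-out set is needed. A minimal sketch using scikit-learn's implementation on synthetic data (the data and settings are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=400, n_features=10, n_informative=4,
                           random_state=0)

# max_features="sqrt" is the random feature selection at each split;
# oob_score=True turns on the internal (out-of-bag) error estimate.
rf = RandomForestClassifier(n_estimators=200, max_features="sqrt",
                            oob_score=True, random_state=0)
rf.fit(X, y)

print(f"OOB accuracy: {rf.oob_score_:.3f}")
# Variable importance, the other internal estimate the abstract mentions.
print("top features:", rf.feature_importances_.argsort()[::-1][:3])
```

Varying `max_features` and re-reading `oob_score_` reproduces, in miniature, the abstract's study of how error responds to the number of features used in the splitting.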

Journal ArticleDOI
27 Mar 2018-PLOS ONE
TL;DR: The post-sample accuracy of popular ML methods is found to be dominated by that of statistical methods across both accuracy measures used and for all forecasting horizons examined, and their computational requirements are considerably greater than those of statistical methods.
Abstract: Machine Learning (ML) methods have been proposed in the academic literature as alternatives to statistical ones for time series forecasting. Yet, scant evidence is available about their relative performance in terms of accuracy and computational requirements. The purpose of this paper is to evaluate such performance across multiple forecasting horizons using a large subset of 1045 monthly time series used in the M3 Competition. After comparing the post-sample accuracy of popular ML methods with that of eight traditional statistical ones, we found that the former are dominated by the latter across both accuracy measures used and for all forecasting horizons examined. Moreover, we observed that their computational requirements are considerably greater than those of statistical methods. The paper discusses the results, explains why the accuracy of ML models is below that of statistical ones, and proposes some possible ways forward. The empirical results found in our research stress the need for objective and unbiased ways to test the performance of forecasting methods, which can be achieved through sizable and open competitions allowing meaningful comparisons and definite conclusions.

800 citations
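A toy version of the paper's comparison protocol: hold out the last year of a monthly series, forecast it with a statistical baseline (seasonal naive) and a learned autoregressive model, and score both with sMAPE. The series is synthetic and the Ridge-on-lags model is a stand-in for the ML methods the paper benchmarks, not its actual setup:

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic monthly series: trend + annual seasonality + noise.
rng = np.random.default_rng(0)
t = np.arange(120)
y = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 1, 120)
test = y[108:]                          # last 12 months held out

# Statistical baseline: seasonal naive repeats last observed year.
naive_fc = y[96:108]

# ML-style method: linear autoregression on the previous 12 months.
lags = 12
X_tr = np.array([y[i:i + lags] for i in range(108 - lags)])
y_tr = y[lags:108]
model = Ridge(alpha=1.0).fit(X_tr, y_tr)

window = list(y[96:108])
ml_fc = []
for _ in range(12):                     # recursive multi-step forecast
    pred = model.predict(np.array(window[-lags:])[None, :])[0]
    ml_fc.append(pred)
    window.append(pred)

def smape(actual, forecast):
    a, f = np.asarray(actual), np.asarray(forecast)
    return np.mean(2 * np.abs(f - a) / (np.abs(a) + np.abs(f)))

print("seasonal naive sMAPE:", round(smape(test, naive_fc), 3))
print("ridge AR sMAPE:", round(smape(test, ml_fc), 3))
```

Which method wins depends on the series; the paper's finding is that across the M3 monthly subset the statistical side dominated, which is exactly the kind of conclusion this per-series protocol aggregates.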

Proceedings Article
27 Aug 1998
TL;DR: A multi-classifier meta-learning approach addresses very large databases with skewed class distributions and non-uniform cost per error; empirical results from a credit card fraud detection task indicate that the approach can significantly reduce loss due to illegitimate transactions.
Abstract: Very large databases with skewed class distributions and non-uniform cost per error are not uncommon in real-world data mining tasks. We devised a multi-classifier meta-learning approach to address these three issues. Our empirical results from a credit card fraud detection task indicate that the approach can significantly reduce loss due to illegitimate transactions.

499 citations
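One simple way to realize the multi-classifier idea on skewed data is to partition the majority class, pair each partition with the full minority class, train a base classifier per partition, and combine their predictions. The sketch below uses probability averaging as a crude stand-in for the paper's meta-learning combiner, on synthetic data with roughly 5% "fraud":

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score

# Skewed synthetic data: class 1 ("fraud") is about 5% of samples.
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05],
                           random_state=0)
X_tr, y_tr, X_te, y_te = X[:1500], y[:1500], X[1500:], y[1500:]

maj = np.where(y_tr == 0)[0]
mino = np.where(y_tr == 1)[0]
rng = np.random.default_rng(0)
rng.shuffle(maj)

# One base classifier per majority-class partition, each trained on a
# less-skewed subset; averaged probabilities approximate the combiner.
probs = []
for chunk in np.array_split(maj, 4):
    idx = np.concatenate([chunk, mino])
    clf = LogisticRegression(max_iter=1000, class_weight="balanced")
    clf.fit(X_tr[idx], y_tr[idx])
    probs.append(clf.predict_proba(X_te)[:, 1])
pred = (np.mean(probs, axis=0) > 0.5).astype(int)
print("minority recall:", round(recall_score(y_te, pred), 3))
```

In a real fraud setting the 0.5 threshold would be replaced by a cost-sensitive one, reflecting the paper's non-uniform cost per error.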

Journal ArticleDOI
TL;DR: The emergence of the use of artificial intelligence in teaching and learning in higher education is explored, and the educational implications of emerging technologies for the way students learn and how institutions teach and evolve are investigated.
Abstract: This paper explores the emergence of the use of artificial intelligence in teaching and learning in higher education. It investigates the educational implications of emerging technologies for the way students learn and how institutions teach and evolve. Recent technological advancements and the increasing speed of adopting new technologies in higher education are explored in order to predict the future nature of higher education in a world where artificial intelligence is part of the fabric of our universities. We pinpoint some challenges for institutions of higher education and student learning in the adoption of these technologies for teaching, learning, student support, and administration, and explore further directions for research.

400 citations

Journal ArticleDOI
TL;DR: An ensemble approach for feature selection is presented, which aggregates several individual feature lists obtained by different feature selection methods so that a more robust and efficient feature subset can be obtained.
Abstract: Sentiment analysis is an important research direction of natural language processing, text mining, and web mining, which aims to extract subjective information from source materials. The main challenge encountered in machine learning method-based sentiment classification is the abundant amount of data available. This amount makes it difficult to train the learning algorithms in a feasible time and degrades the classification accuracy of the built model. Hence, feature selection becomes an essential task in developing robust and efficient classification models whilst reducing the training time. In text mining applications, individual filter-based feature selection methods have been widely utilized owing to their simplicity and relatively high performance. This paper presents an ensemble approach for feature selection, which aggregates several individual feature lists obtained by different feature selection methods so that a more robust and efficient feature subset can be obtained. In order to aggregate the individual feature lists, a genetic algorithm has been utilized. Experimental evaluations indicated that the proposed aggregation model is an efficient method and that it outperforms individual filter-based feature selection methods on sentiment classification.

274 citations
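A hedged sketch of ensemble feature selection: rank features with two individual filter methods, then fuse the rankings. The paper aggregates with a genetic algorithm; mean-rank fusion below is a much simpler stand-in for the same aggregation idea, on synthetic data rather than text features:

```python
import numpy as np
from scipy.stats import rankdata
from sklearn.datasets import make_classification
from sklearn.feature_selection import chi2, f_classif

X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)
X = X - X.min(axis=0)          # chi2 requires non-negative feature values

# Two individual filter-based rankings of the same feature set.
chi2_scores, _ = chi2(X, y)
f_scores, _ = f_classif(X, y)

# Aggregate: average the two rank lists (lower mean rank = better feature).
ranks = (rankdata(-chi2_scores) + rankdata(-f_scores)) / 2
selected = np.argsort(ranks)[:5]
print("selected feature indices:", sorted(selected.tolist()))
```

The fused subset is typically more stable than either individual filter's top-5, which is the robustness benefit the abstract claims for aggregation.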