Open Access Journal Article
Comparison of Dimension Reduction Methods for Automated Essay Grading
TLDR
The results show that using learning materials as training data for the grading model outperforms the k-NN-based grading methods, and that the division of the learning materials in the training data is crucial.
Abstract:
Automatic Essay Assessor (AEA) is a system for automatic essay grading that uses information retrieval techniques such as Latent Semantic Analysis (LSA), Probabilistic Latent Semantic Analysis (PLSA), and Latent Dirichlet Allocation (LDA). Before grading, the system calibrates its scoring mechanism using learning materials and relatively few teacher-graded essays. We performed a series of experiments using LSA, PLSA, and LDA for document comparisons in AEA. In addition to comparing the methods on a theoretical level, we compared their applicability to essay grading with empirical data. The results show that using learning materials as training data for the grading model outperforms the k-NN-based grading methods. We also found that LSA yielded slightly more accurate grading than PLSA and LDA. Finally, the division of the learning materials in the training data is crucial: dividing the learning materials into sentences works better than dividing them into paragraphs.
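As a rough illustration of the kind of LSA-based document comparison the abstract refers to, the sketch below builds a term-by-sentence matrix from learning material split into sentences, reduces it with a truncated SVD, folds an essay into the reduced space, and compares it to the material by cosine similarity. It is not the AEA scoring mechanism, which the abstract does not specify. The function names, the toy sentences, the raw term counts, the choice of k = 2, and the use of the maximum cosine similarity as a score are all assumptions made for illustration.

```python
import numpy as np

def build_term_doc_matrix(docs):
    """Raw term counts: rows are terms, columns are documents (here, sentences)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    return A, index

def lsa_fit(A, k):
    """Truncated SVD; returns term basis, singular values, and document vectors."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k].T

def fold_in(text, index, U_k, s_k):
    """Project a new text into the latent space (q^T U_k S_k^{-1})."""
    q = np.zeros(len(index))
    for w in text.lower().split():
        if w in index:  # words outside the training vocabulary are ignored
            q[index[w]] += 1.0
    return (q @ U_k) / s_k

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def similarity_to_material(essay, index, U_k, s_k, doc_vecs):
    """Hypothetical score: best cosine match against any material sentence."""
    e = fold_in(essay, index, U_k, s_k)
    return max(cosine(e, d) for d in doc_vecs)

# Toy learning material, already divided into sentences (assumed data).
material = [
    "the cell membrane controls what enters the cell",
    "mitochondria produce energy for the cell",
    "plants use sunlight to make sugar",
    "chlorophyll absorbs sunlight in plant leaves",
]
A, index = build_term_doc_matrix(material)
U_k, s_k, doc_vecs = lsa_fit(A, k=2)

on_topic = similarity_to_material(
    "mitochondria make energy in the cell", index, U_k, s_k, doc_vecs)
off_topic = similarity_to_material(
    "unrelated text about football", index, U_k, s_k, doc_vecs)
print(on_topic > off_topic)
```

In a real system the similarity values would still have to be mapped to grades, for example by calibrating against the teacher-graded essays mentioned in the abstract. That step is omitted here.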