scispace - formally typeset
Search or ask a question
Journal ArticleDOI

Powergrading: a Clustering Approach to Amplify Human Effort for Short Answer Grading

31 Oct 2013-Vol. 1, pp 391-402
TL;DR: This paper used a similarity metric between student responses, and then used this metric to group responses into clusters and subclusters, which allowed teachers to grade multiple responses with a single action, provide rich feedback to groups of similar answers, and discover modalities of misunderstanding among students.
Abstract: We introduce a new approach to the machine-assisted grading of short answer questions. We follow past work in automated grading by first training a similarity metric between student responses, but then go on to use this metric to group responses into clusters and subclusters. The resulting groupings allow teachers to grade multiple responses with a single action, provide rich feedback to groups of similar answers, and discover modalities of misunderstanding among students; we refer to this amplification of grader effort as “powergrading.” We develop the means to further reduce teacher effort by automatically performing actions when an answer key is available. We show results in terms of grading progress with a small “budget” of human actions, both from our method and an LDA-based approach, on a test corpus of 10 questions answered by 698 respondents.

Content maybe subject to copyright    Report

Citations
More filters
Proceedings Article
06 Jul 2015
TL;DR: A neural network method is introduced to encode programs as a linear mapping from an embedded precondition space to an embedded postcondition space and an algorithm for feedback at scale is proposed using these linear maps as features.
Abstract: Providing feedback, both assessing final work and giving hints to stuck students, is difficult for open-ended assignments in massive online classes which can range from thousands to millions of students. We introduce a neural network method to encode programs as a linear mapping from an embedded precondition space to an embedded postcondition space and propose an algorithm for feedback at scale using these linear maps as features. We apply our algorithm to assessments from the Code.org Hour of Code and Stanford University's CS1 course, where we propagate human comments on student assignments to orders of magnitude more submissions.

143 citations

Journal ArticleDOI
10 Mar 2015
TL;DR: OverCode is presented, a system for visualizing and exploring thousands of programming solutions that allows teachers to more quickly develop a high-level view of students' understanding and misconceptions, and to provide feedback that is relevant to more students' solutions.
Abstract: In MOOCs, a single programming exercise may produce thousands of solutions from learners. Understanding solution variation is important for providing appropriate feedback to students at scale. The wide variation among these solutions can be a source of pedagogically valuable examples and can be used to refine the autograder for the exercise by exposing corner cases. We present OverCode, a system for visualizing and exploring thousands of programming solutions. OverCode uses both static and dynamic analysis to cluster similar solutions, and lets teachers further filter and cluster solutions based on different criteria. We evaluated OverCode against a nonclustering baseline in a within-subjects study with 24 teaching assistants and found that the OverCode interface allows teachers to more quickly develop a high-level view of students' understanding and misconceptions, and to provide feedback that is relevant to more students' solutions.

128 citations

Posted Content
TL;DR: A technique for providing feedback on syntax errors that uses Recurrent neural networks (RNNs) to model syntactically valid token sequences that can fix the error by either replacing or inserting the predicted token sequence at the error location.
Abstract: We present a method for automatically generating repair feedback for syntax errors for introductory programming problems. Syntax errors constitute one of the largest classes of errors (34%) in our dataset of student submissions obtained from a MOOC course on edX. The previous techniques for generating automated feed- back on programming assignments have focused on functional correctness and style considerations of student programs. These techniques analyze the program AST of the program and then perform some dynamic and symbolic analyses to compute repair feedback. Unfortunately, it is not possible to generate ASTs for student pro- grams with syntax errors and therefore the previous feedback techniques are not applicable in repairing syntax errors. We present a technique for providing feedback on syntax errors that uses Recurrent neural networks (RNNs) to model syntactically valid token sequences. Our approach is inspired from the recent work on learning language models from Big Code (large code corpus). For a given programming assignment, we first learn an RNN to model all valid token sequences using the set of syntactically correct student submissions. Then, for a student submission with syntax errors, we query the learnt RNN model with the prefix to- ken sequence to predict token sequences that can fix the error by either replacing or inserting the predicted token sequence at the error location. We evaluate our technique on over 14, 000 student submissions with syntax errors. Our technique can completely re- pair 31.69% (4501/14203) of submissions with syntax errors and in addition partially correct 6.39% (908/14203) of the submissions.

109 citations

Proceedings ArticleDOI
01 Sep 2017
TL;DR: This work investigates how several basic neural approaches similar to those used for automated essay scoring perform on short answer scoring, and shows that neural architectures can outperform a strong non-neural baseline, but performance and optimal parameter settings vary across the more diverse types of prompts typical of shortanswer scoring.
Abstract: Neural approaches to automated essay scoring have recently shown state-of-the-art performance The automated essay scoring task typically involves a broad notion of writing quality that encompasses content, grammar, organization, and conventions This differs from the short answer content scoring task, which focuses on content accuracy The inputs to neural essay scoring models – ngrams and embeddings – are arguably well-suited to evaluate content in short answer scoring tasks We investigate how several basic neural approaches similar to those used for automated essay scoring perform on short answer scoring We show that neural architectures can outperform a strong non-neural baseline, but performance and optimal parameter settings vary across the more diverse types of prompts typical of short answer scoring

99 citations

Proceedings ArticleDOI
27 May 2018
TL;DR: A novel Neuro-symbolic approach that combines neural networks with constraint-based reasoning is proposed that is able to repair syntax errors in 60% of submissions, and finds functionally correct repairs for 23.8% of submitted programs.
Abstract: Automatic correction of programs is a challenging problem with numerous real world applications in security, verification, and education. One application that is becoming increasingly important is the correction of student submissions in online courses for providing feedback. Most existing program repair techniques analyze Abstract Syntax Trees (ASTs) of programs, which are unfortunately unavailable for programs with syntax errors. In this paper, we propose a novel Neuro-symbolic approach that combines neural networks with constraint-based reasoning. Specifically, our method first uses a Recurrent Neural Network (RNN) to perform syntax repairs for the buggy programs; subsequently, the resulting syntactically-fixed programs are repaired using constraint-based techniques to ensure functional correctness. The RNNs are trained using a corpus of syntactically correct submissions for a given programming assignment, and are then queried to fix syntax errors in an incorrect programming submission by replacing or inserting the predicted tokens at the error location. We evaluate our technique on a dataset comprising of over 14,500 student submissions with syntax errors. Our method is able to repair syntax errors in 60% (8689) of submissions, and finds functionally correct repairs for 23.8% (3455) submissions.

74 citations

References
More filters
Journal ArticleDOI
TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.
Abstract: We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model.

30,570 citations

Proceedings Article
03 Jan 2001
TL;DR: This paper proposed a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hof-mann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
Abstract: We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hof-mann's aspect model, also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms. We present empirical results on applications of this model to problems in text modeling, collaborative filtering, and text classification.

25,546 citations

Book
01 Jan 1990
TL;DR: An electrical signal transmission system, applicable to the transmission of signals from trackside hot box detector equipment for railroad locomotives and rolling stock, wherein a basic pulse train is transmitted whereof the pulses are of a selected first amplitude and represent a train axle count.
Abstract: 1. Introduction. 2. Partitioning Around Medoids (Program PAM). 3. Clustering large Applications (Program CLARA). 4. Fuzzy Analysis. 5. Agglomerative Nesting (Program AGNES). 6. Divisive Analysis (Program DIANA). 7. Monothetic Analysis (Program MONA). Appendix 1. Implementation and Structure of the Programs. Appendix 2. Running the Programs. Appendix 3. Adapting the Programs to Your Needs. Appendix 4. The Program CLUSPLOT. References. Author Index. Subject Index.

10,537 citations

Journal ArticleDOI
TL;DR: An algorithm for suffix stripping is described, which has been implemented as a short, fast program in BCPL, and performs slightly better than a much more elaborate system with which it has been compared.
Abstract: The automatic removal of suffixes from words in English is of particular interest in the field of information retrieval. An algorithm for suffix stripping is described, which has been implemented as a short, fast program in BCPL. Although simple, it performs slightly better than a much more elaborate system with which it has been compared. It effectively works by treating complex suffixes as compounds made up of simple suffixes, and removing the simple suffixes in a number of steps. In each step the removal of the suffix is made to depend upon the form of the remaining stem, which usually involves a measure of its syllable length.

7,572 citations

Journal ArticleDOI
Ali S. Hadi1
TL;DR: This book make understandable the cluster analysis is based notion of starsmodern treatment, which efficiently finds accurate clusters in data and discusses various types of study the user set explicitly but also proposes another.
Abstract: The Wiley-Interscience Paperback Series consists of selected books that have been made more accessible to consumers in an effort to increase In both the increasingly important and distribution we show how these methods. Our experiments demonstrate that together can deal with most applications technometrics. In an appropriate visualization technique is to these new. The well written and efficiently finds accurate clusters in data including. Of applied value for several preprocessing tasks discontinuity preserving smoothing feature clusters! However the model based notion of domain knowledge from real data repositories in data. Discusses various types of study the user set explicitly but also propose another. This book make understandable the cluster analysis is based notion of starsmodern treatment.

7,423 citations