Learning Program Embeddings to Propagate Feedback on Student Code
TL;DR: A neural network method is introduced to encode programs as a linear mapping from an embedded precondition space to an embedded postcondition space and an algorithm for feedback at scale is proposed using these linear maps as features.
Abstract: Providing feedback, both assessing final work and giving hints to stuck students, is difficult for open-ended assignments in massive online classes, which can range from thousands to millions of students. We introduce a neural network method to encode programs as a linear mapping from an embedded precondition space to an embedded postcondition space, and we propose an algorithm for feedback at scale that uses these linear maps as features. We apply our algorithm to assessments from the Code.org Hour of Code and Stanford University's CS1 course, where we propagate human comments on student assignments to orders of magnitude more submissions.
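The core representational idea can be sketched numerically: a program A is embedded as a matrix M_A that maps the embedding of a precondition (the world state before execution) to a predicted embedding of the postcondition (the state after execution). The names, dimension, and squared-error loss below are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # embedding dimension (illustrative)

# M_A: a learned per-program linear map; pre: an embedded precondition.
M_A = rng.normal(size=(d, d))
pre = rng.normal(size=d)

# The program's effect is modeled as applying its matrix to the
# precondition embedding, yielding a predicted postcondition embedding.
post_pred = M_A @ pre

# Training would push post_pred toward the embedding of the
# postcondition actually observed when running the program.
post_true = rng.normal(size=d)
loss = np.sum((post_pred - post_true) ** 2)
```

Because each program is reduced to a single matrix, these maps can later serve directly as feature vectors for propagating feedback across submissions.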
...Learning rates are set using Adagrad (Duchi et al., 2011)....
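Adagrad (Duchi et al., 2011) adapts a separate learning rate per parameter by dividing a base rate by the root of the accumulated squared gradients. A minimal sketch, using a toy quadratic objective rather than the paper's model:

```python
import numpy as np

def adagrad_step(w, grad, cache, lr=0.1, eps=1e-8):
    """One Adagrad update: per-parameter rates shrink as squared
    gradients accumulate in `cache`."""
    cache += grad ** 2
    w -= lr * grad / (np.sqrt(cache) + eps)
    return w, cache

# Toy objective 0.5 * ||w||^2, whose gradient is simply w.
w = np.array([1.0, -2.0])
cache = np.zeros_like(w)
for _ in range(500):
    w, cache = adagrad_step(w, w.copy(), cache)
# w is driven close to the minimizer at the origin.
```

Frequently updated parameters thus get smaller steps over time, which is convenient when gradients for different program constructs arrive at very different rates.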
"Learning Program Embeddings to Prop..." refers background or methods in this paper
...The programs for these assignments operate in maze worlds where an agent can move, turn, and test for conditions of its current location....
...Our models are related to recent work from the NLP and deep learning communities on recursive neural networks, particularly for modeling semantics in sentences or symbolic expressions (Socher et al., 2013; 2011; Zaremba et al., 2014; Bowman, 2013)....
...on recursive neural networks (called the NPM-RNN model) in which we parametrize a matrix MA in this new model with an RNN whose architecture follows the abstract syntax tree (similar to the way in which RNN architectures might take the form of a parse tree in an NLP setting (Socher et al., 2013))....
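The recursive construction can be sketched as follows: leaf tokens of the abstract syntax tree carry their own matrices, and internal nodes combine child matrices through a shared nonlinear rule, so the program's matrix mirrors the tree's shape. The leaf vocabulary and the exact combination rule here are illustrative assumptions, not the paper's precise NPM-RNN parametrization.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 3  # matrix dimension (illustrative)

# Hypothetical leaf matrices for two maze-world primitives.
leaf_matrices = {
    "move": rng.normal(size=(d, d)),
    "turnLeft": rng.normal(size=(d, d)),
}
W = rng.normal(size=(d, d))  # shared combination weights

def embed(ast):
    """ast is either a leaf token (str) or ('seq', left, right)."""
    if isinstance(ast, str):
        return leaf_matrices[ast]
    _, left, right = ast
    # Combine children nonlinearly, in the spirit of recursive neural
    # networks over parse trees (Socher et al., 2013).
    return np.tanh(W @ embed(left) @ embed(right))

# Program matrix for: move; (turnLeft; move)
M_A = embed(("seq", "move", ("seq", "turnLeft", "move")))
```

Tying the recursion to the syntax tree lets structurally similar programs share parameters, rather than learning one independent matrix per submission.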
"Learning Program Embeddings to Prop..." refers methods in this paper
...We use random search (Bergstra & Bengio, 2012) to optimize over hyperparameters (e.g., regularization parameters, matrix dimensions, and minibatch size)....
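Random search (Bergstra & Bengio, 2012) samples hyperparameter configurations independently and keeps the best-scoring one. A minimal sketch over the kinds of hyperparameters the excerpt names; `evaluate` is a toy stand-in for training a model and measuring validation loss, and the search ranges are assumptions:

```python
import random

def evaluate(cfg):
    # Toy objective standing in for validation loss; it pretends a
    # small regularization value and dimension 64 are best.
    return abs(cfg["reg"] - 1e-3) + abs(cfg["dim"] - 64) / 100

random.seed(0)
best_cfg, best_loss = None, float("inf")
for _ in range(50):
    cfg = {
        "reg": 10 ** random.uniform(-5, 0),       # log-uniform regularization
        "dim": random.choice([16, 32, 64, 128]),  # matrix dimension
        "batch": random.choice([32, 64, 128]),    # minibatch size
    }
    loss = evaluate(cfg)
    if loss < best_loss:
        best_cfg, best_loss = cfg, loss
```

Sampling scale-type parameters log-uniformly, as above for regularization, is the usual choice since plausible values span several orders of magnitude.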