Open Access Proceedings Article

Learning Program Embeddings to Propagate Feedback on Student Code

TLDR
A neural network method is introduced to encode programs as a linear mapping from an embedded precondition space to an embedded postcondition space, and an algorithm for feedback at scale is proposed using these linear maps as features.
Abstract
Providing feedback, both assessing final work and giving hints to stuck students, is difficult for open-ended assignments in massive online classes, which can range from thousands to millions of students. We introduce a neural network method to encode programs as a linear mapping from an embedded precondition space to an embedded postcondition space, and we propose an algorithm for feedback at scale using these linear maps as features. We apply our algorithm to assessments from the Code.org Hour of Code and Stanford University's CS1 course, where we propagate human comments on student assignments to orders of magnitude more submissions.
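As a rough illustration of the idea (not the paper's exact model, which learns the state embeddings jointly with a neural network), the sketch below fits each program's linear map from given precondition embeddings to postcondition embeddings by ridge regression and then propagates an instructor's comment to an ungraded submission via its nearest graded neighbour in matrix space; all function names are hypothetical.

    import numpy as np

    def fit_program_matrix(pre_embeds, post_embeds, reg=1e-3):
        # Ridge-regularised least squares for A with post ≈ A @ pre, where
        # pre_embeds and post_embeds are (n_states, d) arrays of embedded
        # memory states observed before/after running the program.
        d = pre_embeds.shape[1]
        gram = pre_embeds.T @ pre_embeds + reg * np.eye(d)
        return post_embeds.T @ pre_embeds @ np.linalg.inv(gram)  # (d, d) map

    def propagate_feedback(query_matrix, graded_matrices, graded_comments):
        # 1-nearest-neighbour propagation: copy the human comment attached to
        # the most similar graded program, measured by Frobenius distance.
        dists = [np.linalg.norm(query_matrix - M) for M in graded_matrices]
        return graded_comments[int(np.argmin(dists))]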


Citations
Proceedings Article

Convolutional neural networks over tree structures for programming language processing

TL;DR: In this article, a tree-based convolutional neural network (TBCNN) is proposed for programming language processing, in which a convolution kernel is designed over programs' abstract syntax trees to capture structural information.
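A minimal sketch of one tree-based convolution step over an AST, assuming node vectors are already given and simplifying the paper's position-interpolated child weights to a single shared child matrix (names hypothetical):

    import numpy as np

    def tree_conv_pool(node_vecs, children, W_self, W_child, b):
        # For every AST node, mix its own vector with the mean of its children's
        # vectors, apply a nonlinearity, then max-pool over all nodes.
        outs = []
        for vec, kids in zip(node_vecs, children):
            kid_mean = node_vecs[kids].mean(axis=0) if kids else np.zeros_like(vec)
            outs.append(np.tanh(W_self @ vec + W_child @ kid_mean + b))
        return np.max(outs, axis=0)  # fixed-size vector for the whole program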
Posted Content

A Survey of Machine Learning for Big Code and Naturalness

TL;DR: This article presents a taxonomy based on the underlying design principles of each model and uses it to navigate the literature and discuss cross-cutting and application-specific challenges and opportunities.
Journal ArticleDOI

A Survey of Machine Learning for Big Code and Naturalness

TL;DR: Research at the intersection of machine learning, programming languages, and software engineering has recently taken important steps toward learnable probabilistic models of source code that exploit the abundance of patterns in code; this survey reviews and organizes that work.
Proceedings Article

code2seq: Generating Sequences from Structured Representations of Code

TL;DR: This model represents a code snippet as the set of compositional paths in its abstract syntax tree and uses attention to select the relevant paths while decoding; it significantly outperforms previous models that were specifically designed for programming languages, as well as state-of-the-art NMT models.
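A toy sketch of the attention step over encoded AST paths (the path encoder and the LSTM decoder are omitted; names are hypothetical):

    import numpy as np

    def attend_over_paths(path_vecs, decoder_state, W_a):
        # Score each encoded AST path against the current decoder state and
        # return the attention-weighted context vector for this decoding step.
        scores = path_vecs @ (W_a @ decoder_state)   # one score per path
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                     # softmax over paths
        return weights @ path_vecs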
Proceedings Article

DeepFix: Fixing Common C Language Errors by Deep Learning

TL;DR: DeepFix is a multi-layered sequence-to-sequence neural network with attention, trained to predict erroneous program locations along with the required correct statements; it fixed 1881 programs completely and 1338 programs partially.
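Schematically, the repair procedure described above is iterative: predict a fix, apply it, and re-check. The sketch below is only an assumption based on that summary; predict_fix and compiles stand in for the trained network and the compiler.

    def iterative_repair(program_lines, predict_fix, compiles, max_rounds=5):
        # Repeatedly apply the network's (line_number, replacement) suggestion
        # until the program compiles or no further fix is proposed.
        for _ in range(max_rounds):
            if compiles(program_lines):
                return program_lines, True
            fix = predict_fix(program_lines)
            if fix is None:
                break
            line_no, replacement = fix
            program_lines = (program_lines[:line_no] + [replacement]
                             + program_lines[line_no + 1:])
        return program_lines, compiles(program_lines)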
References
Proceedings ArticleDOI

Learning task-dependent distributed representations by backpropagation through structure

TL;DR: A connectionist architecture together with a novel supervised learning scheme which is capable of solving inductive inference tasks on complex symbolic structures of arbitrary size is presented.
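The core of such architectures is a recursive encoder that composes child representations bottom-up; gradients then flow back through the same tree structure. A forward-pass-only sketch with hypothetical names:

    import numpy as np

    def encode_tree(tree, leaf_vecs, W, b):
        # A leaf (token label) is looked up; an internal node is the nonlinear
        # composition of its two children's encodings.
        if isinstance(tree, str):
            return leaf_vecs[tree]
        left, right = tree
        child = np.concatenate([encode_tree(left, leaf_vecs, W, b),
                                encode_tree(right, leaf_vecs, W, b)])
        return np.tanh(W @ child + b)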
Journal ArticleDOI

Functional maps: a flexible representation of maps between shapes

TL;DR: A novel representation of maps between pairs of shapes that allows efficient inference and manipulation, supports algebraic operations such as map sum, difference, and composition, and enables applications such as function or annotation transfer without establishing point-to-point correspondences.
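In the reduced-basis setting this amounts to a small matrix C mapping function coefficients on one shape to coefficients on the other, closely analogous to the linear maps between embedded spaces above. A least-squares sketch from corresponding descriptor functions (basis construction and the paper's additional constraints are omitted; names hypothetical):

    import numpy as np

    def fit_functional_map(coeffs_src, coeffs_dst):
        # coeffs_src, coeffs_dst: (k, n_funcs) coefficients of corresponding
        # functions in each shape's basis; solve coeffs_dst ≈ C @ coeffs_src.
        C_T, *_ = np.linalg.lstsq(coeffs_src.T, coeffs_dst.T, rcond=None)
        return C_T.T

    def transfer_annotation(C, coeffs_on_src):
        # Transferring a function (e.g. a segmentation indicator) is a single
        # matrix-vector product, with no point-to-point correspondences needed.
        return C @ coeffs_on_src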
Proceedings ArticleDOI

Hilbert space embeddings of conditional distributions with applications to dynamical systems

TL;DR: This paper derives a kernel estimate for the conditional embedding, shows its connection to ordinary embeddings, and uses it to build a nonparametric method for modeling dynamical systems in which the belief state of the system is maintained as a conditional embedding.
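That kernel estimate is compact: the embedding of P(Y | X = x) is a weighted sum of feature maps of the observed y_i, with weights (K + n*lambda*I)^{-1} k_x. A numpy sketch with an RBF kernel (function names are hypothetical):

    import numpy as np

    def rbf_gram(A, B, gamma=1.0):
        # Gram matrix of the Gaussian RBF kernel between rows of A and rows of B.
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    def conditional_embedding_weights(X, x_query, lam=1e-3, gamma=1.0):
        # Weights alpha such that mu_{Y|x} ≈ sum_i alpha_i * phi(y_i); an
        # expectation E[g(Y) | X = x] is then approximated by alpha @ g(Y_samples).
        n = len(X)
        K = rbf_gram(X, X, gamma)
        k_x = rbf_gram(X, x_query[None, :], gamma)[:, 0]
        return np.linalg.solve(K + lam * n * np.eye(n), k_x)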
Journal ArticleDOI

Kernel Embeddings of Conditional Distributions: A Unified Kernel Framework for Nonparametric Inference in Graphical Models

TL;DR: Kernel embeddings of conditional distributions are presented as a unified, nonparametric framework for inference in graphical models over large volumes of high-dimensional, continuous-valued measurements, as arise in applications ranging from computer vision to computational biology.
Journal ArticleDOI

Powergrading: a Clustering Approach to Amplify Human Effort for Short Answer Grading

TL;DR: A similarity metric between student responses is used to group responses into clusters and subclusters, allowing teachers to grade multiple responses with a single action, provide rich feedback to groups of similar answers, and discover modalities of misunderstanding among students.
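A greedy "leader" clustering over response vectors gives the flavor, with cosine similarity standing in for the learned metric (a hypothetical sketch, not the paper's exact procedure):

    import numpy as np

    def cluster_responses(vectors, threshold=0.8):
        # Each response joins the first cluster whose representative is similar
        # enough, otherwise it starts a new cluster; a teacher can then attach
        # one grade or comment to an entire cluster at once.
        clusters, reps = [], []
        for i, v in enumerate(vectors):
            v = v / np.linalg.norm(v)
            sims = [r @ v for r in reps]
            if sims and max(sims) >= threshold:
                clusters[int(np.argmax(sims))].append(i)
            else:
                clusters.append([i])
                reps.append(v)
        return clusters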