Learning Program Embeddings to Propagate Feedback on Student Code

Home
/
Papers
/
Learning Program Embeddings to Propagate Feedback on Student Code

Proceedings Article•

Learning Program Embeddings to Propagate Feedback on Student Code

Chris Piech¹, Jonathan Huang², Andy Nguyen¹, Mike Phulsuksombati¹, Mehran Sahami¹, Leonidas J. Guibas¹ - Show less +2 more•Institutions (2)

Stanford University¹, Google²

06 Jul 2015-pp 1093-1102

TL;DR: A neural network method is introduced to encode programs as a linear mapping from an embedded precondition space to an embedded postcondition space and an algorithm for feedback at scale is proposed using these linear maps as features.

read less

Abstract: Providing feedback, both assessing final work and giving hints to stuck students, is difficult for open-ended assignments in massive online classes which can range from thousands to millions of students. We introduce a neural network method to encode programs as a linear mapping from an embedded precondition space to an embedded postcondition space and propose an algorithm for feedback at scale using these linear maps as features. We apply our algorithm to assessments from the Code.org Hour of Code and Stanford University's CS1 course, where we propagate human comments on student assignments to orders of magnitude more submissions.

...read moreread less

Content maybe subject to copyright Report

Citations

PDF

Open Access

More filters

Proceedings Article•

Convolutional neural networks over tree structures for programming language processing

[...]

Lili Mou¹, Ge Li¹, Lu Zhang¹, Tao Wang², Zhi Jin¹ - Show less +1 more•Institutions (2)

Peking University¹, Stanford University²

12 Feb 2016

TL;DR: In this article, a tree-based convolutional neural network (TBCNN) is proposed for programming language processing, in which a convolution kernel is designed over programs' abstract syntax trees to capture structural information.

...read moreread less

Abstract: Programming language processing (similar to natural language processing) is a hot research topic in the field of software engineering; it has also aroused growing interest in the artificial intelligence community. However, different from a natural language sentence, a program contains rich, explicit, and complicated structural information. Hence, traditional NLP models may be inappropriate for programs. In this paper, we propose a novel tree-based convolutional neural network (TBCNN) for programming language processing, in which a convolution kernel is designed over programs' abstract syntax trees to capture structural information. TBCNN is a generic architecture for programming language processing; our experiments show its effectiveness in two different program analysis tasks: classifying programs according to functionality, and detecting code snippets of certain patterns. TBCNN outperforms baseline methods, including several neural models for NLP.

...read moreread less

551 citations

Posted Content•

A Survey of Machine Learning for Big Code and Naturalness

[...]

Miltiadis Allamanis¹, Earl T. Barr², Premkumar Devanbu³, Charles Sutton⁴•Institutions (4)

Microsoft¹, University College London², University of California, Davis³, University of Edinburgh⁴

18 Sep 2017-arXiv: Software Engineering

TL;DR: This article presents a taxonomy based on the underlying design principles of each model and uses it to navigate the literature and discuss cross-cutting and application-specific challenges and opportunities.

...read moreread less

Abstract: Research at the intersection of machine learning, programming languages, and software engineering has recently taken important steps in proposing learnable probabilistic models of source code that exploit code's abundance of patterns. In this article, we survey this work. We contrast programming languages against natural languages and discuss how these similarities and differences drive the design of probabilistic models. We present a taxonomy based on the underlying design principles of each model and use it to navigate the literature. Then, we review how researchers have adapted these models to application areas and discuss cross-cutting and application-specific challenges and opportunities.

...read moreread less

503 citations

Cites background from "Learning Program Embeddings to Prop..."

...[155] Syntax + State Student Feedback Distributed Student Feedback...
[...]

Journal Article•DOI•

A Survey of Machine Learning for Big Code and Naturalness

[...]

Miltiadis Allamanis¹, Earl T. Barr², Premkumar Devanbu³, Charles Sutton⁴•Institutions (4)

Microsoft¹, University College London², University of California, Davis³, University of Edinburgh⁴

31 Jul 2018-ACM Computing Surveys

TL;DR: A survey of machine learning, programming languages, and software engineering has recently taken important steps in proposing learnable probabilistic models of source code that exploit the abundance of patterns of code.

...read moreread less

Abstract: Research at the intersection of machine learning, programming languages, and software engineering has recently taken important steps in proposing learnable probabilistic models of source code that exploit the abundance of patterns of code. In this article, we survey this work. We contrast programming languages against natural languages and discuss how these similarities and differences drive the design of probabilistic models. We present a taxonomy based on the underlying design principles of each model and use it to navigate the literature. Then, we review how researchers have adapted these models to application areas and discuss cross-cutting and application-specific challenges and opportunities.

...read moreread less

502 citations

Proceedings Article•

code2seq: Generating Sequences from Structured Representations of Code

[...]

Uri Alon¹, Shaked Brody², Omer Levy³, Eran Yahav²•Institutions (3)

University of Missouri–Kansas City¹, Technion – Israel Institute of Technology², Facebook³

27 Sep 2018

TL;DR: This model represents a code snippet as the set of compositional paths in its abstract syntax tree and uses attention to select the relevant paths while decoding and significantly outperforms previous models that were specifically designed for programming languages, as well as state-of-the-art NMT models.

...read moreread less

Abstract: The ability to generate natural language sequences from source code snippets has a variety of applications such as code summarization, documentation, and retrieval. Sequence-to-sequence (seq2seq) models, adopted from neural machine translation (NMT), have achieved state-of-the-art performance on these tasks by treating source code as a sequence of tokens. We present ${\rm {\scriptsize CODE2SEQ}}$: an alternative approach that leverages the syntactic structure of programming languages to better encode source code. Our model represents a code snippet as the set of compositional paths in its abstract syntax tree (AST) and uses attention to select the relevant paths while decoding. We demonstrate the effectiveness of our approach for two tasks, two programming languages, and four datasets of up to $16$M examples. Our model significantly outperforms previous models that were specifically designed for programming languages, as well as state-of-the-art NMT models. An interactive online demo of our model is available at this http URL. Our code, data and trained models are available at this http URL.

...read moreread less

486 citations

Proceedings Article•

DeepFix: Fixing Common C Language Errors by Deep Learning

[...]

Rahul Gupta¹, Soham Pal¹, Aditya Kanade¹, Shirish Shevade¹•Institutions (1)

Indian Institute of Science¹

12 Feb 2017

TL;DR: DeepFix is a multi-layered sequence-to-sequence neural network with attention which is trained to predict erroneous program locations along with the required correct statements and could fix 1881 programs completely and 1338 programs partially.

...read moreread less

Abstract: The problem of automatically fixing programming errors is a very active research topic in software engineering. This is a challenging problem as fixing even a single error may require analysis of the entire program. In practice, a number of errors arise due to programmer's inexperience with the programming language or lack of attention to detail. We call these common programming errors. These are analogous to grammatical errors in natural languages. Compilers detect such errors, but their error messages are usually inaccurate. In this work, we present an end-to-end solution, called DeepFix, that can fix multiple such errors in a program without relying on any external tool to locate or fix them. At the heart of DeepFix is a multi-layered sequence-to-sequence neural network with attention which is trained to predict erroneous program locations along with the required correct statements. On a set of 6971 erroneous C programs written by students for 93 programming tasks, DeepFix could fix 1881 (27%) programs completely and 1338 (19%) programs partially.

...read moreread less

415 citations

Cites methods from "Learning Program Embeddings to Prop..."

...Piech et al. (2015) proposed a neural network based approach to find program representations and used them for automatically propagating instructor feedback to students in a massive course....
[...]

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29

Collapse

References

PDF

Open Access

More filters

Proceedings Article•DOI•

Autonomously Generating Hints by Inferring Problem Solving Policies

[...]

Chris Piech¹, Mehran Sahami¹, Jonathan Huang¹, Leonidas J. Guibas¹•Institutions (1)

Stanford University¹

14 Mar 2015

TL;DR: This paper autonomously generate hints for the Code.org `Hour of Code,' (which is to the best of the authors' knowledge the largest online course to date) using historical student data, and discovers that this statistic is highly predictive of a student's future success.

...read moreread less

Abstract: Exploring the whole sequence of steps a student takes to produce work, and the patterns that emerge from thousands of such sequences is fertile ground for a richer understanding of learning. In this paper we autonomously generate hints for the Code.org `Hour of Code,' (which is to the best of our knowledge the largest online course to date) using historical student data. We first develop a family of algorithms that can predict the way an expert teacher would encourage a student to make forward progress. Such predictions can form the basis for effective hint generation systems. The algorithms are more accurate than current state-of-the-art methods at recreating expert suggestions, are easy to implement and scale well. We then show that the same framework which motivated the hint generating algorithms suggests a sequence-based statistic that can be measured for each learner. We discover that this statistic is highly predictive of a student's future success.

...read moreread less

127 citations

"Learning Program Embeddings to Prop..." refers background in this paper

...While there are many dimensions that “characterize” a program including aspects such as style or time/space complexity, we begin by first focussing on capturing the most basic aspect of a program — its function....
[...]
...There have been a number of recent papers (Huang et al., 2013; Basu et al., 2013; Nguyen et al., 2014; Brooks et al., 2014; Lan et al., 2015; Piech et al., 2015) on using large homework submission datasets to improve student feedback....
[...]

Proceedings Article•DOI•

Codewebs: scalable homework search for massive open online programming courses

[...]

Andy Nguyen¹, Chris Piech¹, Jonathan Huang¹, Leonidas J. Guibas¹•Institutions (1)

Stanford University¹

07 Apr 2014

TL;DR: A method for decomposing online homework submissions into a vocabulary of "code phrases", and based on this vocabulary, a queryable index that allows for fast searches into the massive dataset of student homework submissions is designed.

...read moreread less

Abstract: Massive open online courses (MOOCs), one of the latest internet revolutions have engendered hope that constant iterative improvement and economies of scale may cure the ``cost disease" of higher education. While scalable in many ways, providing feedback for homework submissions (particularly open-ended ones) remains a challenge in the online classroom. In courses where the student-teacher ratio can be ten thousand to one or worse, it is impossible for instructors to personally give feedback to students or to understand the multitude of student approaches and pitfalls. Organizing and making sense of massive collections of homework solutions is thus a critical web problem. Despite the challenges, the dense solution space sampling in highly structured homeworks for some MOOCs suggests an elegant solution to providing quality feedback to students on a massive scale. We outline a method for decomposing online homework submissions into a vocabulary of "code phrases", and based on this vocabulary, we architect a queryable index that allows for fast searches into the massive dataset of student homework submissions. To demonstrate the utility of our homework search engine we index over a million code submissions from users worldwide in Stanford's Machine Learning MOOC and (a) semi-automatically learn shared structure amongst homework submissions and (b) generate specific feedback for student mistakes. Codewebs is a tool that leverages the redundancy of densely sampled, highly structured homeworks in order to force-multiply teacher effort. Giving articulate, instant feedback is a crucial component of the online learning process and thus by building a homework search engine we hope to take a step towards higher quality free education.

...read moreread less

105 citations

"Learning Program Embeddings to Prop..." refers background or methods in this paper

...There have been a number of recent papers (Huang et al., 2013; Basu et al., 2013; Nguyen et al., 2014; Brooks et al., 2014; Lan et al., 2015; Piech et al., 2015) on using large homework submission datasets to improve student feedback....
[...]
...Using equivalences found using similar amount of effort as in previous work, we are able to achieve 90% precision with recall of 39%, 48% and 13%, for the three problems respectively....
[...]
...Moreover, we extended this baseline by amalgamating functionally equivalent code (Nguyen et al., 2014)....
[...]
...In other words, we want to predict a postcondition Q out of some space of possible postconditions....
[...]
...To incorporate functionality, (Nguyen et al., 2014) proposed a method that discovers program modifications that do not appear to change the semantic meaning of code....
[...]

Proceedings Article•DOI•

Divide and correct: using clusters to grade short answers at scale

[...]

Michael Brooks¹, Sumit Basu², Charles E. Jacobs², Lucy Vanderwende²•Institutions (2)

University of Washington¹, Microsoft²

04 Mar 2014-Legal Studies

TL;DR: A cluster-based interface is proposed that allows teachers to read, grade, and provide feedback on large groups of answers at once and is found to allow teachers to grade substantially faster, to give more feedback to students, and to develop a high-level view of students' understanding and misconceptions.

...read moreread less

Abstract: In comparison to multiple choice or other recognition-oriented forms of assessment, short answer questions have been shown to offer greater value for both students and teachers; for students they can improve retention of knowledge, while for teachers they provide more insight into student understanding. Unfortunately, the same open-ended nature which makes them so valuable also makes them more difficult to grade at scale. To address this, we propose a cluster-based interface that allows teachers to read, grade, and provide feedback on large groups of answers at once. We evaluated this interface against an unclustered baseline in a within-subjects study with 25 teachers, and found that the clustered interface allows teachers to grade substantially faster, to give more feedback to students, and to develop a high-level view of students' understanding and misconceptions.

...read moreread less

71 citations

"Learning Program Embeddings to Prop..." refers background in this paper

...While there are many dimensions that “characterize” a program including aspects such as style or time/space complexity, we begin by first focussing on capturing the most basic aspect of a program — its function....
[...]
...There have been a number of recent papers (Huang et al., 2013; Basu et al., 2013; Nguyen et al., 2014; Brooks et al., 2014; Lan et al., 2015; Piech et al., 2015) on using large homework submission datasets to improve student feedback....
[...]

Proceedings Article•

Syntactic and Functional Variability of a Million Code Submissions in a Machine Learning MOOC.

[...]

Jonathan Huang¹, Chris Piech¹, Andy Nguyen¹, Leonidas J. Guibas¹•Institutions (1)

Stanford University¹

01 Jan 2013

TL;DR: The syntax and functional similarity of the submissions are mapped out in order to explore the variation in solutions in the first offering of Stanford's Machine Learning Massive Open-Access Online Course.

...read moreread less

Abstract: In the first offering of Stanford’s Machine Learning Massive Open-Access Online Course (MOOC) there were over a million programming submissions to 42 assignments — a dense sampling of the range of possible solutions. In this paper we map out the syntax and functional similarity of the submissions in order to explore the variation in solutions. While there was a massive number of submissions, there is a much smaller set of unique approaches. This redundancy in student solutions can be leveraged to “force multiply” teacher feedback. Fig. 1. The landscape of solutions for “gradient descent for linear regression” representing over 40,000 student code submissions with edges drawn between syntactically similar submissions and colors corresponding to performance on a battery of unit tests (red submissions passed all unit tests).

...read moreread less

70 citations

"Learning Program Embeddings to Prop..." refers background in this paper

...Given a program A (where we consider a program to generally be any executable code whether a full submission or a subtree of a submission), and a precondition P , we thus would like to learn features of A that are useful for predicting the outcome of running A when P holds....
[...]
...Some authors have done this without an explicit featurization of the code — for example, the AST edit distance has been a popular choice (Huang et al., 2013; Rogers et al., 2014)....
[...]
...While there are many dimensions that “characterize” a program including aspects such as style or time/space complexity, we begin by first focussing on capturing the most basic aspect of a program — its function....
[...]
...There have been a number of recent papers (Huang et al., 2013; Basu et al., 2013; Nguyen et al., 2014; Brooks et al., 2014; Lan et al., 2015; Piech et al., 2015) on using large homework submission datasets to improve student feedback....
[...]

Posted Content•

Learning Program Embeddings to Propagate Feedback on Student Code

[...]

Chris Piech¹, Jonathan Huang², Andy Nguyen¹, Mike Phulsuksombati¹, Mehran Sahami¹, Leonidas J. Guibas¹ - Show less +2 more•Institutions (2)

Stanford University¹, Google²

22 May 2015-arXiv: Learning

TL;DR: In this paper, a neural network method is proposed to encode programs as a linear mapping from an embedded precondition space to an embedded postcondition space and propose an algorithm for feedback at scale using these linear maps as features.

...read moreread less

Abstract: Providing feedback, both assessing final work and giving hints to stuck students, is difficult for open-ended assignments in massive online classes which can range from thousands to millions of students. We introduce a neural network method to encode programs as a linear mapping from an embedded precondition space to an embedded postcondition space and propose an algorithm for feedback at scale using these linear maps as features. We apply our algorithm to assessments from the this http URL Hour of Code and Stanford University's CS1 course, where we propagate human comments on student assignments to orders of magnitude more submissions.

...read moreread less

68 citations