Home
/
Authors
/
Mike Phulsuksombati

Author

Mike Phulsuksombati

Bio: Mike Phulsuksombati is an academic researcher from Stanford University. The author has contributed to research in topics: Postcondition. The author has an hindex of 2, co-authored 2 publications receiving 189 citations.

Topics: Postcondition

Papers

PDF

Open Access

More filters

Proceedings Article•

Learning Program Embeddings to Propagate Feedback on Student Code

[...]

Chris Piech¹, Jonathan Huang², Andy Nguyen¹, Mike Phulsuksombati¹, Mehran Sahami¹, Leonidas J. Guibas¹ - Show less +2 more•Institutions (2)

Stanford University¹, Google²

06 Jul 2015

TL;DR: A neural network method is introduced to encode programs as a linear mapping from an embedded precondition space to an embedded postcondition space and an algorithm for feedback at scale is proposed using these linear maps as features.

...read moreread less

Abstract: Providing feedback, both assessing final work and giving hints to stuck students, is difficult for open-ended assignments in massive online classes which can range from thousands to millions of students. We introduce a neural network method to encode programs as a linear mapping from an embedded precondition space to an embedded postcondition space and propose an algorithm for feedback at scale using these linear maps as features. We apply our algorithm to assessments from the Code.org Hour of Code and Stanford University's CS1 course, where we propagate human comments on student assignments to orders of magnitude more submissions.

...read moreread less

143 citations

Posted Content•

Learning Program Embeddings to Propagate Feedback on Student Code

[...]

Chris Piech¹, Jonathan Huang², Andy Nguyen¹, Mike Phulsuksombati¹, Mehran Sahami¹, Leonidas J. Guibas¹ - Show less +2 more•Institutions (2)

Stanford University¹, Google²

22 May 2015-arXiv: Learning

TL;DR: In this paper, a neural network method is proposed to encode programs as a linear mapping from an embedded precondition space to an embedded postcondition space and propose an algorithm for feedback at scale using these linear maps as features.

...read moreread less

Abstract: Providing feedback, both assessing final work and giving hints to stuck students, is difficult for open-ended assignments in massive online classes which can range from thousands to millions of students. We introduce a neural network method to encode programs as a linear mapping from an embedded precondition space to an embedded postcondition space and propose an algorithm for feedback at scale using these linear maps as features. We apply our algorithm to assessments from the this http URL Hour of Code and Stanford University's CS1 course, where we propagate human comments on student assignments to orders of magnitude more submissions.

...read moreread less

68 citations

Cited by

PDF

Open Access

More filters

Proceedings Article•

Deep knowledge tracing

[...]

Chris Piech¹, Jonathan Bassen¹, Jonathan Huang¹, Surya Ganguli¹, Mehran Sahami¹, Leonidas J. Guibas¹, Jascha Sohl-Dickstein¹ - Show less +3 more•Institutions (1)

Stanford University¹

07 Dec 2015

TL;DR: The utility of using Recurrent Neural Networks to model student learning and the learned model can be used for intelligent curriculum design and allows straightforward interpretation and discovery of structure in student tasks are explored.

...read moreread less

Abstract: Knowledge tracing—where a machine models the knowledge of a student as they interact with coursework—is a well established problem in computer supported education. Though effectively modeling student knowledge would have high educational impact, the task has many inherent challenges. In this paper we explore the utility of using Recurrent Neural Networks (RNNs) to model student learning. The RNN family of models have important advantages over previous methods in that they do not require the explicit encoding of human domain knowledge, and can capture more complex representations of student knowledge. Using neural networks results in substantial improvements in prediction performance on a range of knowledge tracing datasets. Moreover the learned model can be used for intelligent curriculum design and allows straightforward interpretation and discovery of structure in student tasks. These results suggest a promising new line of research for knowledge tracing and an exemplary application task for RNNs.

...read moreread less

595 citations

Proceedings Article•

Convolutional neural networks over tree structures for programming language processing

[...]

Lili Mou¹, Ge Li¹, Lu Zhang¹, Tao Wang², Zhi Jin¹ - Show less +1 more•Institutions (2)

Peking University¹, Stanford University²

12 Feb 2016

TL;DR: In this article, a tree-based convolutional neural network (TBCNN) is proposed for programming language processing, in which a convolution kernel is designed over programs' abstract syntax trees to capture structural information.

...read moreread less

Abstract: Programming language processing (similar to natural language processing) is a hot research topic in the field of software engineering; it has also aroused growing interest in the artificial intelligence community. However, different from a natural language sentence, a program contains rich, explicit, and complicated structural information. Hence, traditional NLP models may be inappropriate for programs. In this paper, we propose a novel tree-based convolutional neural network (TBCNN) for programming language processing, in which a convolution kernel is designed over programs' abstract syntax trees to capture structural information. TBCNN is a generic architecture for programming language processing; our experiments show its effectiveness in two different program analysis tasks: classifying programs according to functionality, and detecting code snippets of certain patterns. TBCNN outperforms baseline methods, including several neural models for NLP.

...read moreread less

551 citations

Posted Content•

A Survey of Machine Learning for Big Code and Naturalness

[...]

Miltiadis Allamanis¹, Earl T. Barr², Premkumar Devanbu³, Charles Sutton⁴•Institutions (4)

Microsoft¹, University College London², University of California, Davis³, University of Edinburgh⁴

18 Sep 2017-arXiv: Software Engineering

TL;DR: This article presents a taxonomy based on the underlying design principles of each model and uses it to navigate the literature and discuss cross-cutting and application-specific challenges and opportunities.

...read moreread less

Abstract: Research at the intersection of machine learning, programming languages, and software engineering has recently taken important steps in proposing learnable probabilistic models of source code that exploit code's abundance of patterns. In this article, we survey this work. We contrast programming languages against natural languages and discuss how these similarities and differences drive the design of probabilistic models. We present a taxonomy based on the underlying design principles of each model and use it to navigate the literature. Then, we review how researchers have adapted these models to application areas and discuss cross-cutting and application-specific challenges and opportunities.

...read moreread less

503 citations

Journal Article•DOI•

A Survey of Machine Learning for Big Code and Naturalness

[...]

Miltiadis Allamanis¹, Earl T. Barr², Premkumar Devanbu³, Charles Sutton⁴•Institutions (4)

Microsoft¹, University College London², University of California, Davis³, University of Edinburgh⁴

31 Jul 2018-ACM Computing Surveys

TL;DR: A survey of machine learning, programming languages, and software engineering has recently taken important steps in proposing learnable probabilistic models of source code that exploit the abundance of patterns of code.

...read moreread less

Abstract: Research at the intersection of machine learning, programming languages, and software engineering has recently taken important steps in proposing learnable probabilistic models of source code that exploit the abundance of patterns of code. In this article, we survey this work. We contrast programming languages against natural languages and discuss how these similarities and differences drive the design of probabilistic models. We present a taxonomy based on the underlying design principles of each model and use it to navigate the literature. Then, we review how researchers have adapted these models to application areas and discuss cross-cutting and application-specific challenges and opportunities.

...read moreread less

502 citations

Proceedings Article•

code2seq: Generating Sequences from Structured Representations of Code

[...]

Uri Alon¹, Shaked Brody², Omer Levy³, Eran Yahav²•Institutions (3)

University of Missouri–Kansas City¹, Technion – Israel Institute of Technology², Facebook³

27 Sep 2018

TL;DR: This model represents a code snippet as the set of compositional paths in its abstract syntax tree and uses attention to select the relevant paths while decoding and significantly outperforms previous models that were specifically designed for programming languages, as well as state-of-the-art NMT models.

...read moreread less

Abstract: The ability to generate natural language sequences from source code snippets has a variety of applications such as code summarization, documentation, and retrieval. Sequence-to-sequence (seq2seq) models, adopted from neural machine translation (NMT), have achieved state-of-the-art performance on these tasks by treating source code as a sequence of tokens. We present ${\rm {\scriptsize CODE2SEQ}}$: an alternative approach that leverages the syntactic structure of programming languages to better encode source code. Our model represents a code snippet as the set of compositional paths in its abstract syntax tree (AST) and uses attention to select the relevant paths while decoding. We demonstrate the effectiveness of our approach for two tasks, two programming languages, and four datasets of up to $16$M examples. Our model significantly outperforms previous models that were specifically designed for programming languages, as well as state-of-the-art NMT models. An interactive online demo of our model is available at this http URL. Our code, data and trained models are available at this http URL.

...read moreread less

486 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34

Collapse