Introductory Programming: A Systematic Literature Review
Andrew Luxton-Reilly
University of Auckland
New Zealand
andrew@cs.auckland.ac.nz
Simon
University of Newcastle
Australia
simon@newcastle.edu.au
Ibrahim Albluwi
Princeton University
United States of America
isma@cs.princeton.edu
Brett A. Becker
University College Dublin
Ireland
brett.becker@ucd.ie
Michail Giannakos
Norwegian University of Science and
Technology
Norway
michailg@ntnu.no
Amruth N. Kumar
Ramapo College of New Jersey
United States of America
amruth@ramapo.edu
Linda Ott
Michigan Technological University
United States of America
linda@mtu.edu
James Paterson
Glasgow Caledonian University
United Kingdom
james.paterson@gcu.ac.uk
Michael James Scott
Falmouth University
United Kingdom
michael.scott@falmouth.ac.uk
Judy Sheard
Monash University
Australia
judy.sheard@monash.edu
Claudia Szabo
University of Adelaide
Australia
claudia.szabo@adelaide.edu.au
ABSTRACT
As computing becomes a mainstream discipline embedded in the school curriculum and acts as an enabler for an increasing range of academic disciplines in higher education, the literature on introductory programming is growing. Although there have been several reviews that focus on specific aspects of introductory programming, there has been no broad overview of the literature exploring recent trends across the breadth of introductory programming.
This paper is the report of an ITiCSE working group that conducted a systematic review in order to gain an overview of the introductory programming literature. Partitioning the literature into papers addressing the student, teaching, the curriculum, and assessment, we explore trends, highlight advances in knowledge over the past 15 years, and indicate possible directions for future research.
CCS CONCEPTS
• Social and professional topics → Computing education;
KEYWORDS
ITiCSE working group; CS1; introductory programming; novice
programming; systematic literature review; systematic review; lit-
erature review; review; SLR; overview
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ITiCSE’18, July 2–4, 2018, Larnaca, Cyprus
© 2018 Copyright held by the owner/author(s).
ACM ISBN 978-x-xxxx-xxxx-x/YY/MM.
https://doi.org/10.1145/nnnnnnn.nnnnnnn
ACM Reference Format:
Andrew Luxton-Reilly, Simon, Ibrahim Albluwi, Brett A. Becker, Michail Gi-
annakos, Amruth N. Kumar, Linda Ott, James Paterson, Michael James Scott,
Judy Sheard, and Claudia Szabo. 2018. Introductory Programming: A Sys-
tematic Literature Review. In Proceedings of 23rd Annual ACM Conference on
Innovation and Technology in Computer Science Education (ITiCSE’18). ACM,
New York, NY, USA, 52 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn
1 INTRODUCTION
Teaching students to program is a complex process. A 2003 review by Robins et al. [554] provided a comprehensive overview of novice programming research prior to that year. The first paragraph of the review sets the general tone:
Learning to program is hard [...] Novice programmers suffer from a wide range of difficulties and deficits. Programming courses are generally regarded as difficult, and often have the highest dropout rates. [554, p137]
However, more recent studies have suggested that the situation is not as dire as previously suggested. Studies indicate that dropout rates among computing students are not alarmingly high [64, 692], and it has been suggested that the difficulties faced by novices may be a consequence of unrealistic expectations rather than intrinsic subject complexity [393]. Better outcomes are likely to arise from focusing less on student deficits and more on actions that the computing community can take to improve teaching practice. In this paper we investigate the literature related to introductory programming and summarise the main findings and challenges for the computing community.

ITiCSE'18, July 2–4, 2018, Larnaca, Cyprus Luxton-Reilly, Simon, Albluwi, Becker, Giannakos, Kumar, Ott, Paterson, Scott, Sheard, Szabo
Although there have been several reviews of published work involving novice programmers since 2003, those reviews have generally focused on highly specific aspects, such as student misconceptions [530], teaching approaches [679], program comprehension [578], potentially seminal papers [502], research methods applied [598], automated feedback for exercises [321], competency-enhancing games [675], student anxiety [478], and program visualisation [631].
A review conducted contemporaneously with our own, by Medeiros et al. [436], is somewhat broader in scope than those mentioned above, but not as broad as our own. It investigates the skills and background that best prepare a student for programming, the difficulties encountered by novice programmers, and the challenges faced by their instructors.
2 SCOPE
We review papers published between 2003 and 2017 inclusive. Pub-
lications outside this range are not included in the formal analysis,
but may be included in discussion where appropriate.
In selecting papers for review, we make a clear distinction between those involving introductory programming (the focus of our review) and those about other aspects of introductory computing. For example, the literature of computing includes many papers on aspects of computational thinking. This review addresses such papers only where they have a clear focus on introductory programming.
We have limited our scope to units of teaching corresponding to
introductory programming courses, thus ruling out shorter and less
formal units such as boot camps and other outreach activities. As
it became apparent that we needed to reduce the scope still further,
we also excluded work on introductory programming courses at
school level (also known as K–12) and work explicitly concerning
introductory computing courses for non-computing students (also
known as non-majors). Some papers in these areas are still included
in our discussion, but only if they contribute to our principal focus
on introductory programming courses for students in computing
degrees.
As recommended by an ITiCSE working group on worldwide terminology in computing education [609], this report tends in general to avoid the term 'computer science', preferring instead the less restrictive term 'computing'.
3 METHOD
The working group conducted a systematic literature review by adapting the guidelines proposed by Kitchenham [335]. In this review, we followed a highly structured process that involved:
(1) Specifying research questions
(2) Conducting searches of databases
(3) Selecting studies
(4) Filtering the studies by evaluating their pertinence
(5) Extracting data
(6) Synthesising the results
(7) Writing the review report
3.1 Research Questions
This review aims to explore the literature of introductory programming by identifying publications that are of interest to the computing community, the contributions of these publications, and the evidence for any research findings that they report. The specific research questions are:
RQ1 What aspects of introductory programming have been the focus of the literature?
RQ2 What developments have been reported in introductory programming education between 2003 and 2017?
RQ3 What evidence has been reported when addressing different aspects of introductory programming?
3.2 Conducting Searches
Selecting search terms for a broad and inclusive review of the introductory programming literature proved challenging. Terms that are too general result in an unwieldy set of papers, while terms that are too specific are likely to miss relevant papers. After some trial and error with a range of databases, we selected a combined search phrase that seemed to capture the area of interest:
"introductory programming" OR "introduction to pro-
gramming" OR "novice programming" OR "novice
programmers" OR "CS1" OR "CS 1" OR "learn pro-
gramming" OR "learning to program" OR "teach pro-
gramming"
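As an illustration (not the databases' own query syntax), this search phrase can be expressed as a single regular expression applied to a paper's title, abstract, and keywords. The function name and the word-boundary handling below are our own assumptions, not part of the actual search procedure:

```python
import re

# The combined search phrase as one case-insensitive regular expression.
# The \b word boundaries are our own addition to reduce accidental
# substring matches.
SEARCH_TERMS = [
    "introductory programming", "introduction to programming",
    "novice programming", "novice programmers", "CS1", "CS 1",
    "learn programming", "learning to program", "teach programming",
]
PATTERN = re.compile(
    "|".join(r"\b" + re.escape(term) + r"\b" for term in SEARCH_TERMS),
    re.IGNORECASE,
)

def matches_search(title: str, abstract: str = "", keywords: str = "") -> bool:
    """Return True if any search term appears in the given fields."""
    return PATTERN.search(" ".join((title, abstract, keywords))) is not None
```

Even with word boundaries, a short term such as "CS1" matches papers from unrelated fields, which is the source of the false positives discussed in Section 3.3.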
To check whether this search phrase was appropriate, we applied
it to a trial set of papers and compared the outcome with our own
thoughts as to which papers from that set would fall within the
scope of our review. We chose the papers from the proceedings of
ICER 2017 and ITiCSE 2017, 94 papers in all. Ten members of the
working group individually decided whether each paper was relevant to the review. The members then formed five pairs, discussed any differences, and resolved them.
The inter-rater reliability of this process was measured with the Fleiss-Davies kappa [148], which measures the agreement when a fixed set of raters classify a number of items into a fixed set of categories. In principle we were classifying the papers into just two categories, yes and no, but some members were unable to make this decision for some papers, introducing a third category of undecided. The Fleiss-Davies kappa for individual classification was 61%. It has been observed [607] that classification in pairs is more reliable than individual classification, and this was borne out with our paired classification, which resulted in a Fleiss-Davies kappa of 73% and the disappearance of the undecided category.
When measuring inter-rater reliability, an agreement of less than 40% is generally considered to be poor, between 40% and 75% is considered fair to good, and more than 75% is rated excellent [45].
By this criterion, our individual agreement was good and our paired
agreement was substantially better. This process resulted in the
selection of 29 papers, those that a majority of pairs agreed were
pertinent to our review.
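The agreement statistic can be sketched as follows. This is the standard Fleiss formulation of kappa; the Fleiss-Davies variant used in the study differs in how chance agreement is corrected, so the sketch is illustrative rather than a reproduction of the reported 61% and 73%:

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a fixed set of raters classifying items into fixed
    categories.  `ratings` is a list of rows, one per item; each row holds
    the count of raters who chose each category, and every row sums to the
    same number of raters."""
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    # Mean per-item agreement: proportion of agreeing rater pairs per item.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ) / n_items
    # Chance agreement from the marginal proportion of each category.
    p_e = sum(
        (sum(row[j] for row in ratings) / (n_items * n_raters)) ** 2
        for j in range(len(ratings[0]))
    )
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical example: 3 papers rated by 2 raters as (relevant, not
# relevant).  The raters agree on two papers and split on the third.
example = [[2, 0], [0, 2], [1, 1]]
```

For this example the statistic evaluates to 1/3: observed agreement is well above zero, but much of it is attributable to chance.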
We then automatically applied the search terms to the same set of 94 papers, resulting in a selection of 32 papers: 25 of the 29 that we had selected and seven false positives, papers that were indicated by the search terms but not selected by us. This proportion of false positives was not a major concern, because every selected paper was going to be examined by at least one member of the team and could be eliminated at that point. There were also four false negatives, papers that we deemed relevant but that were not identified by the search terms. False negatives are of greater concern because they represent relevant papers that will not be identified by the search; but, unable to find a better combination of search terms, we accepted that some 14% of pertinent papers would not be identified by our search.
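The precision and recall implied by these numbers can be checked directly, using only the counts reported above:

```python
# Counts from the 94-paper trial set reported above.
true_positives = 25   # selected both by the working group and by the search
false_positives = 7   # matched the search terms but judged not relevant
false_negatives = 4   # judged relevant but missed by the search terms

precision = true_positives / (true_positives + false_positives)  # 25/32
recall = true_positives / (true_positives + false_negatives)     # 25/29
missed = 1 - recall  # the "some 14%" of pertinent papers not identified
```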
The search terms were then applied to the title, abstract, and keyword fields of the ACM Full Text Collection, IEEE Xplore, ScienceDirect, SpringerLink and Scopus databases. The search was conducted on 27 May 2018, and identified the following numbers of papers:
ACM Full Text Collection: 2199
IEEE Xplore: 710
ScienceDirect (Elsevier): 469
SpringerLink (most relevant 1000): 1000
Scopus (most relevant 2000): 2000; 678 after removal of duplicates
Total: 5056
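The duplicate-removal step can be sketched as follows. The normalised-title matching shown here is an assumed approach, since the paper does not describe the exact procedure used:

```python
def normalise(title: str) -> str:
    """Reduce a title to lowercase alphanumerics so that differences in
    case, punctuation, and spacing do not hide duplicates."""
    return "".join(ch for ch in title.lower() if ch.isalnum())

def deduplicate(records):
    """records: iterable of (source, title) pairs.  Keeps the first
    occurrence of each normalised title and drops later duplicates."""
    seen, unique = set(), []
    for source, title in records:
        key = normalise(title)
        if key not in seen:
            seen.add(key)
            unique.append((source, title))
    return unique
```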
3.3 Selecting Studies
The next stage of a systematic review is to select the papers that will
form the basis for the review. The search results were divided among
the authors, who examined each title and abstract, and the corre-
sponding full paper if required, to determine its relevance to the
review. We eliminated papers that were irrelevant, papers that were
less than four pages long (such as posters), and papers that were
clearly identied as work in progress. The biggest reductions were
seen in the more general ScienceDirect and SpringerLink databases,
where, for example, ‘CS1’ can refer to conditioned stimulus 1 in a
behavioural studies paper, cesium 1 in a paper on molecular struc-
tures, and connecting segment 1 in a paper on pathology. This
process reduced the pool of papers by more than half, as shown
below.
ACM Full Text Collection: 1126 (51%)
IEEE Xplore: 448 (63%)
ScienceDirect (Elsevier): 62 (13%)
SpringerLink (most relevant 1000): 204 (20%)
Scopus: 349 (51%)
Total: 2189 (43%)
3.4 Filtering and Data Analysis
Following the selection of papers, the team collectively devised a set of topics that might cover the papers we had been seeing. This began with a brainstormed list of topics, which was then refined and rationalised. The topics were then gathered into four high-level groups: the student, teaching, curriculum, and assessment. The first three of these groups have together been called the 'didactic triangle' of teaching [70, 314]. While it is not one of these three core elements, assessment is a clear link among them, being set by the teacher and used to assess the student's grasp of the curriculum.
The 2189 papers were divided among the authors, each of whom classified approximately 200 papers using the abstract and, where necessary, the full text of the paper. During this phase some 500 further papers were excluded upon perusal of their full text and some 25 papers because the research team was unable to access them. The remaining 1666 papers were classified into at least one and often several of the categories. This classifying, or tagging, constituted the first phase of the analysis: the data extracted from each paper were the groups and subgroups into which the paper appeared to fit.
Small groups then focused on particular topics to undertake the
remaining steps of the systematic process: evaluating the pertinence
of the papers, extracting the relevant data, synthesising the results,
and writing the report. The data extracted at this stage were brief
summaries of the points of interest of each paper as pertaining
to the topics under consideration. As the groups examined each
candidate paper in more depth, a few papers were reclassied by
consensus, and other papers were eliminated from the review. At the
completion of this phase, some of the initial topics were removed
because we had found few or no papers on them (for example,
competencies in the curriculum group), and one or two new topics
had emerged from examination of the papers (for example, course
orientation in the teaching group).
In a systematic literature review conducted according to Kitchenham's guidelines [335], candidate papers would at this point have been filtered according to quality. This process was followed in a limited manner: as indicated in the two foregoing paragraphs, some papers were eliminated upon initial perusal and others upon closer examination. However, our focus was more on the pertinence of papers to our subject area than on their inherent quality, so at this stage we could be said to have deviated somewhat from Kitchenham's guidelines.
A further deviation from Kitchenham's guidelines arises from the number of papers identified by our search. It would be impractical to list every paper that has addressed every topic, and even more impractical to discuss any papers in depth. Therefore our intent is to give a thorough view of the topics that have been discussed in the literature, referring to a sample of the papers that have covered each topic. Except where the context suggests otherwise, every reference in the following sections should be understood as preceded by an implicit 'for example'.
4 OVERVIEW OF INTRODUCTORY
PROGRAMMING RESEARCH
Table 1 shows the number of papers in each group and the sub-
groups into which some or all of these papers were classied. The
majority of papers fall into the teaching category, most of them
describing either teaching tools or the various forms of delivery
that have been explored in the classroom. A substantial number of
papers focus on the students themselves. We see fewer papers that
discuss course content or the competencies that students acquire
through the study of programming. The smallest of the broad topic
areas is assessment, which is interesting since assessment is such a
critical component of courses and typically drives both teaching
and learning.
Given the comprehensive nature of this review, it is inevitable that some papers will be discussed in more than one section. For example, a paper exploring how students use a Facebook group to supplement their interactions in the university's learning management system [408] is discussed under both student behaviour

Table 1: Initial classification of 1666 papers, some classified into two or more groups or subgroups

Group (papers): optional subgroups
The student (489): student learning, underrepresented groups, student attitudes, student behaviour, student engagement, student ability, the student experience, code reading, tracing, writing, and debugging
Teaching (905): teaching tools, pedagogical approaches, theories of learning, infrastructure
Curriculum (258): competencies, programming languages, paradigms
Assessment (192): assessment tools, approaches to assessment, feedback on assessment, academic integrity
(section 5.1.3) and teaching infrastructure (section 6.5). While fur-
ther rationalisation might have been possible, we consider that
readers are best served by a structure that considers broad cate-
gories and then surveys the papers relevant to each, even if that
entails some duplication.
Figure 1 shows the number of papers that we identified in the data set, arranged by year. It is clear that the number of publications about introductory programming courses is increasing over time.
To check whether introductory programming is a growing focus
in the ACM SIGCSE conferences, we counted the papers in our data
set that were published each year in ICER (which began in 2005),
ITiCSE, or the SIGCSE Technical Symposium, and compared this
with the total number of papers published each year in those three
conferences. Figure 2 shows that publication numbers in the three
main ACM SIGCSE conferences remain fairly stable between 2005
and 2017. Publications from these venues focusing on introductory
programming, although somewhat variable, also remain relatively
stable across the same period. We conclude that the main growth
in publications is occurring outside SIGCSE venues, which might
indicate that programming education is of growing interest to the
broader research community. Alternatively, it might indicate that
authors are seeking more venues because there is no growth in the
numbers of papers accepted by the SIGCSE venues.
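The venue comparison behind Figure 2 amounts to two per-year counts. A minimal sketch, assuming a simple (year, venue, selected) record layout and venue labels that are our own and not described in the paper:

```python
from collections import Counter

SIGCSE_VENUES = {"ICER", "ITiCSE", "SIGCSE"}  # illustrative venue labels

def yearly_counts(papers):
    """papers: iterable of (year, venue, selected) tuples, where `selected`
    marks papers identified by our search.  Returns the two per-year series
    plotted in Figure 2: all papers at the three venues, and the subset
    focusing on introductory programming."""
    total, in_study = Counter(), Counter()
    for year, venue, selected in papers:
        if venue in SIGCSE_VENUES:
            total[year] += 1
            if selected:
                in_study[year] += 1
    return total, in_study
```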
5 THE STUDENT
This section explores publications that focus primarily on the student. This includes work on learning disciplinary content knowledge, student perceptions and experiences of introductory programming, and identifiable subgroups of students studying programming.
Table 2 gives an overview of the categories and corresponding num-
bers of papers. The sum of the numbers in the table does not match
the number in Table 1 because some papers were classied into
more than one category.
Figure 1: Introductory programming publications identified by our search (paper counts per year, 2003–2017)
Figure 2: Introductory programming publications identified by our search and published in ICER, ITiCSE or SIGCSE, compared with total publications in ICER, ITiCSE and SIGCSE (paper counts per year)
Table 2: Classication of papers focused on students
Category N Description
Content
Theory 17 Models of student understanding
Literacy 58 Code reading, writing, debugging
Behaviour 69 Measurements of student activity
Ability 169 Measuring student ability
Sentiment
Attitudes 105 Student attitudes
Engagement 61 Measuring/improving engagement
Experience 18 Experiences of programming
Subgroups
At risk 17 Students at risk of failing
Underrep. 25 Women and minorities

5.1 Content
The categories in this section relate to measuring what students
learn and how they learn it. We begin by considering work that
applies a cognitive lens to student understanding. We then move
to publications on what we term code literacy (i.e., reading, writ-
ing, and debugging of code), before moving to measurable student
behaviour. The section concludes by considering broad ways that
student ability is addressed in research. Figure 3 illustrates the
growth of papers focusing on the interaction between students and
the content taught in introductory programming courses.
Figure 3: Number of publications focusing on interaction between students and the content (theory, literacy, behaviour and ability), by year
5.1.1 Theory.
Several papers grounded in various theoretical perspectives
study the thinking processes of novice programmers. The num-
ber of papers focusing on the theoretical perspectives is relatively
small (no more than 3 papers in any given year), with no discernible
trend over the period of our study.
From a constructivist point of view, learners construct their own mental models of the phenomena they interact with [59]. Several papers have investigated novice programmers' viable and non-viable mental models of concepts such as variables [628], parameter passing [401], value and reference assignment [399], and how objects are stored in memory [627]. Students were found to hold misconceptions and non-viable mental models of these fundamental concepts even after completing their introductory programming courses. To address this issue, Sorva [627, 628] recommends the use of visualisation tools with techniques from variation theory, while Ma et al. [398, 400] recommend the use of visualisation tools with techniques from cognitive conflict theory. Interestingly, both Madison and Gifford [401] and Ma et al. [399] report that students holding non-viable mental models sometimes still manage to do well on related programming tasks, suggesting that assessment techniques beyond conventional code-writing tasks might be needed to reveal certain misconceptions.
Vagianou [674] suggests a conceptual framework and a graphical representation that can be used to help students construct a viable mental model of program-memory interaction. Vagianou argues that program-memory interaction exhibits the characteristics of a threshold concept, being troublesome, transformative, and potentially irreversible. Sorva [629] distinguishes between threshold concepts and fundamental ideas, proposing that threshold concepts act as 'portals' that transform the students' understanding, while fundamental ideas "run threadlike across a discipline and beyond". Sorva, who has conducted a comprehensive review of research on mental models, misconceptions, and threshold concepts [630], suggests that abstraction and state might be fundamental ideas while program dynamics, information hiding, and object interaction might be threshold concepts.
Lister [381] and Teague et al. [655–658] apply a neo-Piagetian perspective to explore how students reason about code. They discuss the different cognitive developmental stages of novice programmers and use these stages to explain and predict the ability or otherwise of students to perform tasks in code reading and writing. The most important pedagogical implication of this work is that instructional practice should first identify the neo-Piagetian level that students are at and then explicitly train them to reason at higher levels. They contrast this with conventional practices, where they argue that teaching often happens at a cognitive level that is higher than that of many students [381, 656–658].
Due to the qualitative nature of research done both on mental
models and on neo-Piagetian cognitive stages, more work is needed
to quantitatively study what has been observed. For example, it is
still not clear how widespread the observed mental models are or
which neo-Piagetian cognitive stages are more prevalent among novice programmers in different courses or at different times in the
same course. The small numbers of participants in these qualitative
studies suggest the need for more studies that replicate, validate,
and expand them.
5.1.2 Code 'Literacy'.
Literacy, a term traditionally applied to the reading and writing of natural language, is concerned with making sense of the world and communicating effectively. In modern usage this has broadened to refer to knowledge and competency in a specific area, for example, 'computer literacy'. Here, however, we apply the term in the traditional sense to coding, using it to mean how students make sense of code and how they communicate solutions to problems by writing executable programs. We distinguish between reading and writing, and consider research that seeks insights into the students' processes in each.
Code reading and tracing. The process of reading programs is essential both in learning to program and in the practice of programming by experts. We found 28 papers reporting on issues related to students' code reading. These included papers where the reading process involved tracing the way a computer would execute the program, which adds a significant dimension to 'making sense' that is not present in the largely linear process of reading natural-language text.
A number of papers study the reading process in order to gain insight into students' program comprehension, for example by relating reading behaviour to well-known program comprehension models [13] or the education-focused block model [702]. There has been recent interest in the application of eye-tracking techniques to novice programmers, for example to study changes in reading
