Open Access

Remix and reuse of source code in software production

TLDR
This study explores how source code snippets in programming books and on the web are changing software development practice. It provides a comprehensive view of code copying across 6,190 PHP-language applications and uses these data to explore the concept of a “remix” method of software production.
Abstract
The means of producing information and the infrastructure for disseminating it are constantly changing. The web mobilizes information in electronic formats, making it easier to copy, modify, remix, and redistribute. This has changed how information is produced, distributed, and used. People are not just consuming information; they are actively producing, remixing, and sharing information, using the web as a platform for creativity and production. This is true of software development as well. Programmers, and researchers who study software development, frequently remark that programmers copy and paste code. Although this practice is widely acknowledged, it is rarely studied directly or explicitly accounted for in models of software development. However, this attitude is changing as software becomes more ubiquitous and software development practice shifts away from the formal models of software engineering towards a post-modernist perspective. This study explores how source code snippets in programming books and on the web are changing software development practice. By examining program source code using clone detection algorithms, this study provides a comprehensive view of code copying across 6,190 PHP-language applications. These data are used to explore the concept of a “remix” method of software production, where software and systems are built out of copied and pasted snippets of code. These findings are contrasted with both traditional models of information production coming from informetrics (e.g., authorship, citation analysis) and models from software engineering (e.g., the Lego Hypothesis). Explanations for the observed phenomena borrow metaphors from linguistics, which provide a richer explanation of copy-paste programming than the Lego Hypothesis offers. The focus and findings of this study ultimately point to a pressing demand for further research centered on the notion of software as information.
Software and software repositories hold a large amount of information about how software is produced, used, adapted, and maintained. Software informatics is proposed as an organizing label for the science of information, practice, and communication around software. It studies the individual, collaborative, and social aspects of software production and use, spanning multiple representations of software, from design to source code to application.
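The abstract mentions clone detection algorithms without naming a specific one. As a rough illustration of the general idea only, and not the method used in the study, the following sketch fingerprints sliding windows of normalized lines and reports windows that occur in more than one file. The function names `normalize` and `find_clones`, the window size of 3, and the sample PHP snippets are all assumptions made for this example; the comment stripping is deliberately naive and would mishandle `//` or `#` inside string literals.

```python
import hashlib
import re
from collections import defaultdict


def normalize(line: str) -> str:
    """Drop a trailing // or # comment (naively) and remove all whitespace."""
    line = re.sub(r"(//|#).*$", "", line)
    return re.sub(r"\s+", "", line)


def find_clones(files: dict[str, str], window: int = 3) -> dict[str, list[tuple[str, int]]]:
    """Return fingerprints of `window`-line chunks that appear in more than one file.

    Each fingerprint maps to a list of (file name, 1-based starting line) pairs.
    """
    index = defaultdict(list)
    for name, source in files.items():
        # Remember original line numbers, then skip lines that normalize to nothing.
        lines = [(i, normalize(text)) for i, text in enumerate(source.splitlines(), start=1)]
        lines = [(i, text) for i, text in lines if text]
        for k in range(len(lines) - window + 1):
            chunk = "\n".join(text for _, text in lines[k:k + window])
            digest = hashlib.sha1(chunk.encode("utf-8")).hexdigest()
            index[digest].append((name, lines[k][0]))
    # Keep only chunks shared by at least two distinct files.
    return {h: locs for h, locs in index.items() if len({f for f, _ in locs}) > 1}


# Hypothetical input: two small PHP files sharing a copied loop.
example_files = {
    "a.php": "<?php\n$x = 1;\nfor ($i = 0; $i < 10; $i++) {\n    $sum += $i;   // add\n}\necho $sum;\n",
    "b.php": "<?php\n$y = 2;\n\nfor ($i = 0; $i < 10; $i++) {\n  $sum += $i;\n}\nprint $sum;\n",
}

for locations in find_clones(example_files).values():
    print(locations)  # [('a.php', 3), ('b.php', 4)]
```

Exact-match fingerprints of this kind only catch copies that differ in whitespace and comments; detecting snippets with renamed variables or inserted statements would need token-level or tree-level comparison.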


Citations
Journal Article

On software maintenance process improvement based on code clone analysis

TL;DR: This paper extends the functionality of Gemini to cope with the problems, applies the extended Gemini to several software systems, and evaluates the applicability of the new functions.
Journal ArticleDOI

Development nature matters: An empirical study of code clones in JavaScript applications

TL;DR: A large-scale clone detection experiment on JavaScript, a dynamically typed programming language, across two application domains (web pages and standalone projects) showed that, unlike JavaScript standalone projects, JavaScript web applications have 95% inter-file clones and 91–97% widely scattered clones, indicating that features of programming languages and technologies affect how developers duplicate code.
Posted Content

Software cloning in extreme programming environment

TL;DR: This paper summarizes the author's overview talk on software cloning analysis, highlights code cloning in an Extreme Programming environment, and finds clone detection to be an effective tool for refactoring.