scispace - formally typeset
Search or ask a question
Author

Jens Dietrich

Other affiliations: Massey University
Bio: Jens Dietrich is an academic researcher from Victoria University of Wellington. The author has contributed to research in topics: Java & Computer science. The author has an hindex of 20, co-authored 89 publications receiving 1560 citations. Previous affiliations of Jens Dietrich include Massey University.


Papers
More filters
Proceedings ArticleDOI
30 Nov 2010
TL;DR: The Qualitas Corpus, a large curated collection of open source Java systems, is described, which reduces the cost of performing large empirical studies of code and supports comparison of measurements of the same artifacts.
Abstract: In order to increase our ability to use measurement to support software development practise we need to do more analysis of code. However, empirical studies of code are expensive and their results are difficult to compare. We describe the Qualitas Corpus, a large curated collection of open source Java systems. The corpus reduces the cost of performing large empirical studies of code and supports comparison of measurements of the same artifacts. We discuss its design, organisation, and issues associated with its development.

390 citations

Proceedings ArticleDOI
29 Mar 2005
TL;DR: The developed software design ontology is discussed, and how this approach relates to the meta-modelling architecture of the OMG, and an effective pattern scanner for the Java language is presented.
Abstract: Design patterns have been used successfully in the last decade to reuse and communicate object-oriented design. However, the documentation of pattern usage is often very poor. This motivates the use of tools which can detect and document design patterns found in software. A couple of approaches have been proposed in recent years. The approach introduced is based on a formal description of design patterns using the Web ontology language OWL. Software artefacts used to define design patterns in a formal and machine processable fashion are represented by uniform resource identifiers (URIs). This yields a description that is open and extensible, and facilitates the sharing of design among software engineers. We discuss the developed software design ontology, and how this approach relates to the meta-modelling architecture of the OMG. In the second part, an effective pattern scanner for the Java language is presented. This scanner is based on the ontology developed in part one and uses reflection and AST analysis to verify constraints. Various applications of this scanner are discussed.

75 citations

Proceedings ArticleDOI
27 Feb 2014
TL;DR: This paper has studied the extent of the problem on the qualitas corpus, a data set consisting of Java open-source programs widely used in empirical studies and found that the above mentioned issues do occur in practice, albeit not on a wide scale.
Abstract: It has become common practice to build programs by using libraries. While the benefits of reuse are well known, an often overlooked risk are system runtime failures due to API changes in libraries that evolve independently. Traditionally, the consistency between a program and the libraries it uses is checked at build time when the entire system is compiled and tested. However, the trend towards partially upgrading systems by redeploying only evolved library versions results in situations where these crucial verification steps are skipped. For Java programs, partial upgrades create additional interesting problems as the compiler and the virtual machine use different rule sets to enforce contracts between the providers and the consumers of APIs. We have studied the extent of the problem on the qualitas corpus, a data set consisting of Java open-source programs widely used in empirical studies. In this paper, we describe the study and report its key findings. We found that the above mentioned issues do occur in practice, albeit not on a wide scale.

72 citations

Proceedings ArticleDOI
14 Mar 2016
TL;DR: This work proposes a preliminary classification of such false positives with the aim of facilitating a better understanding of the effects of anti-patterns and code smells in practice, and hopes that the development and further refinement of such a classification can support researchers and tool vendors in their endeavour to develop more pragmatic, context-relevant detection and analysis tools for anti- patterns.
Abstract: Anti-patterns and code smells are archetypes used for describing software design shortcomings that can negatively affect software quality, in particular maintainability. Tools, metrics and methodologies have been developed to identify these archetypes, based on the assumption that they can point at problematic code. However, recent empirical studies have shown that some of these archetypes are ubiquitous in real world programs, and many of them are found not to be as detrimental to quality as previously conjectured. We are therefore interested in revisiting common anti-patterns and code smells, and building a catalogue of cases that constitute candidates for "false positives". We propose a preliminary classification of such false positives with the aim of facilitating a better understanding of the effects of anti-patterns and code smells in practice. We hope that the development and further refinement of such a classification can support researchers and tool vendors in their endeavour to develop more pragmatic, context-relevant detection and analysis tools for anti-patterns and code smells.

70 citations

Journal ArticleDOI
TL;DR: A novel approach to the formal definition of design patterns is proposed that is based on the idea that design patterns are knowledge that is shared across a community and that is by nature distributed and inconsistent.

65 citations


Cited by
More filters
Journal Article
TL;DR: AspectJ as mentioned in this paper is a simple and practical aspect-oriented extension to Java with just a few new constructs, AspectJ provides support for modular implementation of a range of crosscutting concerns.
Abstract: Aspect] is a simple and practical aspect-oriented extension to Java With just a few new constructs, AspectJ provides support for modular implementation of a range of crosscutting concerns. In AspectJ's dynamic join point model, join points are well-defined points in the execution of the program; pointcuts are collections of join points; advice are special method-like constructs that can be attached to pointcuts; and aspects are modular units of crosscutting implementation, comprising pointcuts, advice, and ordinary Java member declarations. AspectJ code is compiled into standard Java bytecode. Simple extensions to existing Java development environments make it possible to browse the crosscutting structure of aspects in the same kind of way as one browses the inheritance structure of classes. Several examples show that AspectJ is powerful, and that programs written using it are easy to understand.

2,947 citations

Book
01 Nov 2002
TL;DR: Drive development with automated tests, a style of development called “Test-Driven Development” (TDD for short), which aims to dramatically reduce the defect density of code and make the subject of work crystal clear to all involved.
Abstract: From the Book: “Clean code that works” is Ron Jeffries’ pithy phrase. The goal is clean code that works, and for a whole bunch of reasons: Clean code that works is a predictable way to develop. You know when you are finished, without having to worry about a long bug trail.Clean code that works gives you a chance to learn all the lessons that the code has to teach you. If you only ever slap together the first thing you think of, you never have time to think of a second, better, thing. Clean code that works improves the lives of users of our software.Clean code that works lets your teammates count on you, and you on them.Writing clean code that works feels good.But how do you get to clean code that works? Many forces drive you away from clean code, and even code that works. Without taking too much counsel of our fears, here’s what we do—drive development with automated tests, a style of development called “Test-Driven Development” (TDD for short). In Test-Driven Development, you: Write new code only if you first have a failing automated test.Eliminate duplication. Two simple rules, but they generate complex individual and group behavior. Some of the technical implications are:You must design organically, with running code providing feedback between decisionsYou must write your own tests, since you can’t wait twenty times a day for someone else to write a testYour development environment must provide rapid response to small changesYour designs must consist of many highly cohesive, loosely coupled components, just to make testing easy The two rules imply an order to the tasks ofprogramming: 1. Red—write a little test that doesn’t work, perhaps doesn’t even compile at first 2. Green—make the test work quickly, committing whatever sins necessary in the process 3. Refactor—eliminate all the duplication created in just getting the test to work Red/green/refactor. The TDD’s mantra. Assuming for the moment that such a style is possible, it might be possible to dramatically reduce the defect density of code and make the subject of work crystal clear to all involved. If so, writing only code demanded by failing tests also has social implications: If the defect density can be reduced enough, QA can shift from reactive to pro-active workIf the number of nasty surprises can be reduced enough, project managers can estimate accurately enough to involve real customers in daily developmentIf the topics of technical conversations can be made clear enough, programmers can work in minute-by-minute collaboration instead of daily or weekly collaborationAgain, if the defect density can be reduced enough, we can have shippable software with new functionality every day, leading to new business relationships with customers So, the concept is simple, but what’s my motivation? Why would a programmer take on the additional work of writing automated tests? Why would a programmer work in tiny little steps when their mind is capable of great soaring swoops of design? Courage. Courage Test-driven development is a way of managing fear during programming. I don’t mean fear in a bad way, pow widdle prwogwammew needs a pacifiew, but fear in the legitimate, this-is-a-hard-problem-and-I-can’t-see-the-end-from-the-beginning sense. If pain is nature’s way of saying “Stop!”, fear is nature’s way of saying “Be careful.” Being careful is good, but fear has a host of other effects: Makes you tentativeMakes you want to communicate lessMakes you shy from feedbackMakes you grumpy None of these effects are helpful when programming, especially when programming something hard. So, how can you face a difficult situation and: Instead of being tentative, begin learning concretely as quickly as possible.Instead of clamming up, communicate more clearly.Instead of avoiding feedback, search out helpful, concrete feedback.(You’ll have to work on grumpiness on your own.) Imagine programming as turning a crank to pull a bucket of water from a well. When the bucket is small, a free-spinning crank is fine. When the bucket is big and full of water, you’re going to get tired before the bucket is all the way up. You need a ratchet mechanism to enable you to rest between bouts of cranking. The heavier the bucket, the closer the teeth need to be on the ratchet. The tests in test-driven development are the teeth of the ratchet. Once you get one test working, you know it is working, now and forever. You are one step closer to having everything working than you were when the test was broken. Now get the next one working, and the next, and the next. By analogy, the tougher the programming problem, the less ground should be covered by each test. Readers of Extreme Programming Explained will notice a difference in tone between XP and TDD. TDD isn’t an absolute like Extreme Programming. XP says, “Here are things you must be able to do to be prepared to evolve further.” TDD is a little fuzzier. TDD is an awareness of the gap between decision and feedback during programming, and techniques to control that gap. “What if I do a paper design for a week, then test-drive the code? Is that TDD?” Sure, it’s TDD. You were aware of the gap between decision and feedback and you controlled the gap deliberately. That said, most people who learn TDD find their programming practice changed for good. “Test Infected” is the phrase Erich Gamma coined to describe this shift. You might find yourself writing more tests earlier, and working in smaller steps than you ever dreamed would be sensible. On the other hand, some programmers learn TDD and go back to their earlier practices, reserving TDD for special occasions when ordinary programming isn’t making progress. There are certainly programming tasks that can’t be driven solely by tests (or at least, not yet). Security software and concurrency, for example, are two topics where TDD is not sufficient to mechanically demonstrate that the goals of the software have been met. Security relies on essentially defect-free code, true, but also on human judgement about the methods used to secure the software. Subtle concurrency problems can’t be reliably duplicated by running the code. Once you are finished reading this book, you should be ready to: Start simplyWrite automated testsRefactor to add design decisions one at a time This book is organized into three sections. An example of writing typical model code using TDD. The example is one I got from Ward Cunningham years ago, and have used many times since, multi-currency arithmetic. In it you will learn to write tests before code and grow a design organically.An example of testing more complicated logic, including reflection and exceptions, by developing a framework for automated testing. This example also serves to introduce you to the xUnit architecture that is at the heart of many programmer-oriented testing tools. In the second example you will learn to work in even smaller steps than in the first example, including the kind of self-referential hooha beloved of computer scientists.Patterns for TDD. Included are patterns for the deciding what tests to write, how to write tests using xUnit, and a greatest hits selection of the design patterns and refactorings used in the examples. I wrote the examples imagining a pair programming session. If you like looking at the map before wandering around, you may want to go straight to the patterns in Section 3 and use the examples as illustrations. If you prefer just wandering around and then looking at the map to see where you’ve been, try reading the examples through and refering to the patterns when you want more detail about a technique, then using the patterns as a reference. Several reviewers have commented they got the most out of the examples when they started up a programming environment and entered the code and ran the tests as they read. A note about the examples. Both examples, multi-currency calculation and a testing framework, appear simple. There are (and I have seen) complicated, ugly, messy ways of solving the same problems. I could have chosen one of those complicated, ugly, messy solutions to give the book an air of “reality.” However, my goal, and I hope your goal, is to write clean code that works. Before teeing off on the examples as being too simple, spend 15 seconds imagining a programming world in which all code was this clear and direct, where there were no complicated solutions, only apparently complicated problems begging for careful thought. TDD is a practice that can help you lead yourself to exactly that careful thought.

1,864 citations

Book
01 Jan 1975
TL;DR: The major change in the second edition of this book is the addition of a new chapter on probabilistic retrieval, which I think is one of the most interesting and active areas of research in information retrieval.
Abstract: The major change in the second edition of this book is the addition of a new chapter on probabilistic retrieval. This chapter has been included because I think this is one of the most interesting and active areas of research in information retrieval. There are still many problems to be solved so I hope that this particular chapter will be of some help to those who want to advance the state of knowledge in this area. All the other chapters have been updated by including some of the more recent work on the topics covered. In preparing this new edition I have benefited from discussions with Bruce Croft, The material of this book is aimed at advanced undergraduate information (or computer) science students, postgraduate library science students, and research workers in the field of IR. Some of the chapters, particularly Chapter 6 * , make simple use of a little advanced mathematics. However, the necessary mathematical tools can be easily mastered from numerous mathematical texts that now exist and, in any case, references have been given where the mathematics occur. I had to face the problem of balancing clarity of exposition with density of references. I was tempted to give large numbers of references but was afraid they would have destroyed the continuity of the text. I have tried to steer a middle course and not compete with the Annual Review of Information Science and Technology. Normally one is encouraged to cite only works that have been published in some readily accessible form, such as a book or periodical. Unfortunately, much of the interesting work in IR is contained in technical reports and Ph.D. theses. For example, most the work done on the SMART system at Cornell is available only in reports. Luckily many of these are now available through the National Technical Information Service (U.S.) and University Microfilms (U.K.). I have not avoided using these sources although if the same material is accessible more readily in some other form I have given it preference. I should like to acknowledge my considerable debt to many people and institutions that have helped me. Let me say first that they are responsible for many of the ideas in this book but that only I wish to be held responsible. My greatest debt is to Karen Sparck Jones who taught me to research information retrieval as an experimental science. Nick Jardine and Robin …

822 citations