Showing papers on "Static program analysis" published in 2016


Proceedings ArticleDOI
25 Aug 2016
TL;DR: This work introduces learning-based clone detection techniques in which everything needed to represent terms and fragments in source code is mined from the repository; compared with a traditional structure-oriented technique, the approach detected clones that were either undetected or suboptimally reported by the prominent tool Deckard.
Abstract: Code clone detection is an important problem for software maintenance and evolution. Many approaches consider either structure or identifiers, but none of the existing detection techniques model both sources of information. These techniques also depend on generic, handcrafted features to represent code fragments. We introduce learning-based detection techniques where everything for representing terms and fragments in source code is mined from the repository. Our code analysis supports a framework, which relies on deep learning, for automatically linking patterns mined at the lexical level with patterns mined at the syntactic level. We evaluated our novel learning-based approach for code clone detection with respect to feasibility from the point of view of software maintainers. We sampled and manually evaluated 398 file- and 480 method-level pairs across eight real-world Java systems; 93% of the file- and method-level samples were evaluated to be true positives. Among the true positives, we found pairs mapping to all four clone types. We compared our approach to a traditional structure-oriented technique and found that our learning-based approach detected clones that were either undetected or suboptimally reported by the prominent tool Deckard. Our results affirm that our learning-based approach is suitable for clone detection and a tenable technique for researchers.

532 citations


Journal ArticleDOI
TL;DR: Through a case study of the Qt, VTK, and ITK projects, it is found that code review coverage, participation, and expertise share a significant link with software quality.
Abstract: Software code review, i.e., the practice of having other team members critique changes to a software system, is a well-established best practice in both open source and proprietary software domains. Prior work has shown that formal code inspections tend to improve the quality of delivered software. However, the formal code inspection process mandates strict review criteria (e.g., in-person meetings and reviewer checklists) to ensure a base level of review quality, while the modern, lightweight code reviewing process does not. Although recent work explores the modern code review process, little is known about the relationship between modern code review practices and long-term software quality. Hence, in this paper, we study the relationship between post-release defects (a popular proxy for long-term software quality) and: (1) code review coverage, i.e., the proportion of changes that have been code reviewed, (2) code review participation, i.e., the degree of reviewer involvement in the code review process, and (3) code reviewer expertise, i.e., the level of domain-specific expertise of the code reviewers. Through a case study of the Qt, VTK, and ITK projects, we find that code review coverage, participation, and expertise share a significant link with software quality. Hence, our results empirically confirm the intuition that poorly-reviewed code has a negative impact on software quality in large systems using modern reviewing tools.

237 citations


Proceedings ArticleDOI
14 Mar 2016
TL;DR: The results show that ASAT use is widespread, but not ubiquitous, and that projects typically do not enforce a strict policy on ASAT use.
Abstract: The use of automatic static analysis has been a software engineering best practice for decades. However, we still do not know a lot about its use in real-world software projects: How prevalent is the use of Automated Static Analysis Tools (ASATs) such as FindBugs and JSHint? How do developers use these tools, and how does their use evolve over time? We research these questions in two studies on nine different ASATs for Java, JavaScript, Ruby, and Python with a population of 122 and 168,214 open-source projects. To compare warnings across the ASATs, we introduce the General Defect Classification (GDC) and provide a grounded-theory-derived mapping of 1,825 ASAT-specific warnings to 16 top-level GDC classes. Our results show that ASAT use is widespread, but not ubiquitous, and that projects typically do not enforce a strict policy on ASAT use. Most ASAT configurations deviate slightly from the default, but hardly any introduce new custom analyses. Only a very small set of default ASAT analyses is widely changed. Finally, most ASAT configurations, once introduced, never change. If they do, the changes are small and have a tendency to occur within one day of the configuration's initial introduction.

174 citations


Proceedings ArticleDOI
18 Jul 2016
TL;DR: This work proposes DroidRA, an instrumentation-based approach that addresses reflective calls in Android apps in a non-invasive way and boosts an app so that it can be analyzed immediately, including by static analyzers that are not reflection-aware.
Abstract: Android developers heavily use reflection in their apps for legitimate reasons, but also significantly for hiding malicious actions. Unfortunately, current state-of-the-art static analysis tools for Android are challenged by the presence of reflective calls which they usually ignore. Thus, the results of their security analysis, e.g., for private data leaks, are inconsistent given the measures taken by malware writers to elude static detection. We propose the DroidRA instrumentation-based approach to address this issue in a non-invasive way. With DroidRA, we reduce the resolution of reflective calls to a composite constant propagation problem. We leverage the COAL solver to infer the values of reflection targets and app, and we eventually instrument this app to include the corresponding traditional Java call for each reflective call. Our approach allows to boost an app so that it can be immediately analyzable, including by such static analyzers that were not reflection-aware. We evaluate DroidRA on benchmark apps as well as on real-world apps, and demonstrate that it can allow state-of-the-art tools to provide more sound and complete analysis results.

147 citations


Proceedings Article
09 Jul 2016
TL;DR: Experimental results on widely-used software projects indicate that NP-CNN significantly outperforms the state-of-the-art methods in locating the buggy source files.
Abstract: Bug reports provide an effective way for end-users to disclose potential bugs hidden in a software system, while automatically locating the potential buggy source code according to a bug report remains a great challenge in software maintenance. Many previous studies treated the source code as natural language by representing both the bug report and source code based on bag-of-words feature representations, and correlate the bug report and source code by measuring similarity in the same lexical feature space. However, these approaches fail to consider the structure information of source code which carries additional semantics beyond the lexical terms. Such information is important in modeling program functionality. In this paper, we propose a novel convolutional neural network NP-CNN, which leverages both lexical and program structure information to learn unified features from natural language and source code in programming language for automatically locating the potential buggy source code according to bug report. Experimental results on widely-used software projects indicate that NP-CNN significantly outperforms the state-of-the-art methods in locating the buggy source files.

137 citations


Journal ArticleDOI
TL;DR: The main goal of this survey is to analyze the effectiveness of different classes of software obfuscation against the continuously improving deobfuscation techniques and off-the-shelf code analysis tools.
Abstract: Software obfuscation has always been a controversially discussed research area. While theoretical results indicate that provably secure obfuscation in general is impossible, its widespread application in malware and commercial software shows that it is nevertheless popular in practice. Still, it remains largely unexplored to what extent today’s software obfuscations keep up with state-of-the-art code analysis and where we stand in the arms race between software developers and code analysts. The main goal of this survey is to analyze the effectiveness of different classes of software obfuscation against the continuously improving deobfuscation techniques and off-the-shelf code analysis tools. The answer very much depends on the goals of the analyst and the available resources. On the one hand, many forms of lightweight static analysis have difficulties with even basic obfuscation schemes, which explains the unbroken popularity of obfuscation among malware writers. On the other hand, more expensive analysis techniques, in particular when used interactively by a human analyst, can easily defeat many obfuscations. As a result, software obfuscation for the purpose of intellectual property protection remains highly challenging.

133 citations


Book ChapterDOI
17 Jul 2016
TL;DR: This tool paper introduces the Souffle architecture, usage and demonstrates its applicability for large-scale code analysis on the OpenJDK7 library as a use case.
Abstract: Souffle is an open source programming framework that performs static program analysis expressed in Datalog on very large code bases, including points-to analysis on OpenJDK7 (1.4M program variables, 350K objects, 160K methods) in under a minute. Souffle is being successfully used for Java security analyses at Oracle Labs due to (1) its high-performance, (2) support for rapid program analysis development, and (3) customizability. Souffle incorporates the highly flexible Datalog-based program analysis paradigm while exhibiting performance results that are on-par with manually developed state-of-the-art tools. In this tool paper, we introduce the Souffle architecture, usage and demonstrate its applicability for large-scale code analysis on the OpenJDK7 library as a use case.

128 citations


Proceedings ArticleDOI
14 May 2016
TL;DR: This paper complements traditional code ownership heuristics using code review activity, and suggests that reviewing activity captures an important aspect of code ownership, and should be included in approximations of it in future studies.
Abstract: Code ownership establishes a chain of responsibility for modules in large software systems. Although prior work uncovers a link between code ownership heuristics and software quality, these heuristics rely solely on the authorship of code changes. In addition to authoring code changes, developers also make important contributions to a module by reviewing code changes. Indeed, recent work shows that reviewers are highly active in modern code review processes, often suggesting alternative solutions or providing updates to the code changes. In this paper, we complement traditional code ownership heuristics using code review activity. Through a case study of six releases of the large Qt and OpenStack systems, we find that: (1) 67%-86% of developers did not author any code changes for a module, but still actively contributed by reviewing 21%-39% of the code changes, (2) code ownership heuristics that are aware of reviewing activity share a relationship with software quality, and (3) the proportion of reviewers without expertise shares a strong, increasing relationship with the likelihood of having post-release defects. Our results suggest that reviewing activity captures an important aspect of code ownership, and should be included in approximations of it in future studies.

122 citations


Proceedings ArticleDOI
02 Oct 2016
TL;DR: This paper presents and discusses the experiences applying the ThingML approach to different domains, which includes a modeling language and tool designed for supporting code generation and a highly customizable multi-platform code generation framework.
Abstract: One of the selling points of Model-Driven Software Engineering (MDSE) is the increase in productivity offered by automatically generating code from models. However, the practical adoption of code generation remains relatively slow and limited to niche applications. Tooling issues are often pointed out but more fundamentally, experience shows that: (i) models and modeling languages used for other purposes are not necessarily well suited for code generation and (ii) code generators are often seen as black-boxes which are not easy to trust and produce sub-optimal code. This paper presents and discusses our experiences applying the ThingML approach to different domains. ThingML includes a modeling language and tool designed for supporting code generation and a highly customizable multi-platform code generation framework. The approach is implemented in an open-source tool providing a family of code generators targeting heterogeneous platforms. It has been evaluated through several case studies and is being used in the development of a commercial ambient assisted living system.

115 citations


Proceedings ArticleDOI
18 Jul 2016
TL;DR: New code parsing algorithms in the open source Dyninst tool kit are presented, including a new model for describing jump tables that improves the ability to precisely determine the control flow targets, a new interprocedural analysis to determine when a function is non-returning, and techniques for handling tail calls.
Abstract: Binary code analysis is an enabling technique for many applications. Modern compilers and run-time libraries have introduced significant complexities to binary code, which negatively affect the capabilities of binary analysis tool kits to analyze binary code, and may cause tools to report inaccurate information about binary code. Analysts may hence be confused and applications based on these tool kits may have degrading quality. We examine the problem of constructing control flow graphs from binary code and labeling the graphs with accurate function boundary annotations. We identified several challenging code constructs that represent hard-to-analyze aspects of binary code, and show code examples for each code construct. As part of this discussion, we present new code parsing algorithms in our open source Dyninst tool kit that support these constructs, including a new model for describing jump tables that improves our ability to precisely determine the control flow targets, a new interprocedural analysis to determine when a function is non-returning, and techniques for handling tail calls. We evaluated how various tool kits fare when handling these code constructs with real software as well as test binaries patterned after each challenging code construct we found in real software.

112 citations


Proceedings ArticleDOI
25 Aug 2016
TL;DR: This paper proposes a new approach - Bugram - that leverages n-gram language models instead of rules to detect bugs, and suggests that Bugram is complementary to existing rule-based bug detection approaches.
Abstract: To improve software reliability, many rule-based techniques have been proposed to infer programming rules and detect violations of these rules as bugs. These rule-based approaches often rely on the highly frequent appearances of certain patterns in a project to infer rules. It is known that if a pattern does not appear frequently enough, rules are not learned, thus missing many bugs. In this paper, we propose a new approach—Bugram—that leverages n-gram language models instead of rules to detect bugs. Bugram models program tokens sequentially, using the n-gram language model. Token sequences from the program are then assessed according to their probability in the learned model, and low probability sequences are marked as potential bugs. The assumption is that low probability token sequences in a program are unusual, which may indicate bugs, bad practices, or unusual/special uses of code of which developers may want to be aware. We evaluate Bugram in two ways. First, we apply Bugram on the latest versions of 16 open source Java projects. Results show that Bugram detects 59 bugs, 42 of which are manually verified as correct, 25 of which are true bugs and 17 are code snippets that should be refactored. Among the 25 true bugs, 23 cannot be detected by PR-Miner. We have reported these bugs to developers, 7 of which have already been confirmed by developers (4 of them have already been fixed), while the rest await confirmation. Second, we further compare Bugram with three additional graph- and rule-based bug detection tools, i.e., JADET, Tikanga, and GrouMiner. We apply Bugram on 14 Java projects evaluated in these three studies. Bugram detects 21 true bugs, at least 10 of which cannot be detected by these three tools. Our results suggest that Bugram is complementary to existing rule-based bug detection approaches.
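
The scoring idea in this abstract (train an n-gram model on token sequences, then flag low-probability sequences) can be illustrated with a minimal Python sketch. This is not Bugram's implementation: the function names, the add-one smoothing, and the toy token sequences are all assumptions made for illustration.

```python
from collections import Counter
from math import log

def train_ngrams(sequences, n=3):
    """Count n-grams and their (n-1)-token contexts over token sequences."""
    ngrams, contexts = Counter(), Counter()
    for tokens in sequences:
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            ngrams[gram] += 1
            contexts[gram[:-1]] += 1
    return ngrams, contexts

def log_prob(tokens, ngrams, contexts, vocab_size, n=3):
    """Log-probability of a sequence under the model.

    Add-one smoothing is an assumption of this sketch, used so that
    unseen n-grams get a small non-zero probability.
    """
    total = 0.0
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        total += log((ngrams[gram] + 1) / (contexts[gram[:-1]] + vocab_size))
    return total

def rank_suspicious(candidates, ngrams, contexts, vocab_size, n=3):
    """Order candidate sequences from least to most probable."""
    return sorted(
        candidates,
        key=lambda toks: log_prob(toks, ngrams, contexts, vocab_size, n),
    )

# Toy corpus: the usual call pattern appears 20 times, a deviation once.
corpus = [["open", "read", "close"]] * 20 + [["open", "read", "exit"]]
vocab = {tok for seq in corpus for tok in seq}
ngrams, contexts = train_ngrams(corpus, n=3)
ranked = rank_suspicious(
    [["open", "read", "close"], ["open", "read", "exit"]],
    ngrams, contexts, len(vocab), n=3,
)
# ranked[0] is the least probable sequence, i.e. the first one to inspect.
```

Candidates of equal length are compared here so that the scores are directly comparable; ranking sequences of different lengths would need some form of normalization.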

Journal ArticleDOI
TL;DR: This approach combines taint analysis, which finds candidate vulnerabilities, with data mining, to predict the existence of false positives, and proposes doing automatic code correction by inserting fixes in the source code.
Abstract: Although a large research effort on web application security has been going on for more than a decade, the security of web applications continues to be a challenging problem. An important part of that problem derives from vulnerable source code, often written in unsafe languages like PHP. Source code static analysis tools are a solution to find vulnerabilities, but they tend to generate false positives, and require considerable effort for programmers to manually fix the code. We explore the use of a combination of methods to discover vulnerabilities in source code with fewer false positives. We combine taint analysis, which finds candidate vulnerabilities, with data mining, to predict the existence of false positives. This approach brings together two approaches that are apparently orthogonal: humans coding the knowledge about vulnerabilities (for taint analysis), joined with the seemingly orthogonal approach of automatically obtaining that knowledge (with machine learning, for data mining). Given this enhanced form of detection, we propose doing automatic code correction by inserting fixes in the source code. Our approach was implemented in the WAP tool, and an experimental evaluation was performed with a large set of PHP applications. Our tool found 388 vulnerabilities in 1.4 million lines of code. Its accuracy and precision were approximately 5% better than PhpMinerII's and 45% better than Pixy's.
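
As a rough illustration of the taint-analysis half of this approach, the Python sketch below tracks taint through a toy straight-line program and reports sinks that receive unsanitized input. It is not the WAP tool's algorithm and does not parse PHP; the statement encoding and the names `escape`, `echo`, and `find_tainted_sinks` are hypothetical, and the data-mining step that predicts false positives is not shown.

```python
def find_tainted_sinks(program, sources, sanitizers, sinks):
    """Flow-sensitive taint tracking over a straight-line toy program.

    Each statement is (target, func, args): `target` is the variable
    assigned (or None), `func` the operation applied, and `args` the
    variable names it reads. Returns (line number, sink) pairs where a
    tainted value reaches a sink.
    """
    tainted = set()
    findings = []
    for lineno, (target, func, args) in enumerate(program, start=1):
        arg_tainted = any(a in tainted or a in sources for a in args)
        if func in sinks and arg_tainted:
            findings.append((lineno, func))
        if target is not None:
            if arg_tainted and func not in sanitizers:
                tainted.add(target)      # taint propagates to the target
            else:
                tainted.discard(target)  # clean value overwrites the target
    return findings

# Toy program, one statement per "line".
program = [
    ("name", "assign", ["$_GET"]),   # 1: name comes from user input
    ("safe", "escape", ["name"]),    # 2: sanitized copy
    (None,   "echo",   ["safe"]),    # 3: sink receives sanitized data
    (None,   "echo",   ["name"]),    # 4: sink receives raw user input
]
findings = find_tainted_sinks(
    program,
    sources={"$_GET"},
    sanitizers={"escape"},
    sinks={"echo"},
)
```

Only the statement on line 4 is reported, because the value echoed on line 3 passed through the sanitizer first.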

Proceedings ArticleDOI
16 May 2016
TL;DR: A code readability model based on a richer set of features, including the ones proposed in this paper, achieves a significantly higher accuracy as compared to all of the state-of-the-art readability models.
Abstract: Code reading is one of the most frequent activities in software maintenance; before implementing changes, it is necessary to fully understand source code often written by other developers. Thus, readability is a crucial aspect of source code that may significantly influence program comprehension effort. In general, models used to estimate software readability take into account only structural aspects of source code, e.g., line length and a number of comments. However, source code is a particular form of text; therefore, a code readability model should not ignore the textual aspects of source code encapsulated in identifiers and comments. In this paper, we propose a set of textual features aimed at measuring code readability. We evaluated the proposed textual features on 600 code snippets manually evaluated (in terms of readability) by 5K+ people. The results demonstrate that the proposed features complement classic structural features when predicting code readability judgments. Consequently, a code readability model based on a richer set of features, including the ones proposed in this paper, achieves a significantly higher accuracy as compared to all of the state-of-the-art readability models.

Proceedings ArticleDOI
14 May 2016
TL;DR: This paper presents a technique that complements partial verification results with automatic test case generation, using a code instrumentation that causes dynamic symbolic execution to abort tests that lead to verified executions, prune parts of the search space, and prioritize tests that cover more properties that are not fully verified.
Abstract: Most techniques to detect program errors, such as testing, code reviews, and static program analysis, do not fully verify all possible executions of a program. They leave executions unverified when they do not check certain properties, fail to verify properties, or check properties under certain unsound assumptions such as the absence of arithmetic overflow. In this paper, we present a technique to complement partial verification results by automatic test case generation. In contrast to existing work, our technique supports the common case that the verification results are based on unsound assumptions. We annotate programs to reflect which executions have been verified, and under which assumptions. These annotations are then used to guide dynamic symbolic execution toward unverified program executions. Our main technical contribution is a code instrumentation that causes dynamic symbolic execution to abort tests that lead to verified executions, to prune parts of the search space, and to prioritize tests that cover more properties that are not fully verified. We have implemented our technique for the .NET static analyzer Clousot and the dynamic symbolic execution tool Pex. It produces smaller test suites (by up to 19.2%), covers more unverified executions (by up to 7.1%), and reduces testing time (by up to 52.4%) compared to combining Clousot and Pex without our technique.

Proceedings ArticleDOI
17 Mar 2016
TL;DR: This work introduces a new program synthesis methodology for Datalog specifications to produce highly efficient monolithic C++ analyzers and demonstrates its competitiveness with state-of-the-art handcrafted tools.
Abstract: Designing and crafting a static program analysis is challenging due to the complexity of the task at hand. Among the challenges are modelling the semantics of the input language, finding suitable abstractions for the analysis, and handwriting efficient code for the analysis in a traditional imperative language such as C++. Hence, the development of static program analysis tools is costly in terms of development time and resources for real world languages. To overcome, or at least alleviate the costs of developing a static program analysis, Datalog has been proposed as a domain specific language (DSL). With Datalog, a designer expresses a static program analysis in the form of a logical specification. While a domain specific language approach aids in the ease of development of program analyses, it is commonly accepted that such an approach has worse runtime performance than handcrafted static analysis tools. In this work, we introduce a new program synthesis methodology for Datalog specifications to produce highly efficient monolithic C++ analyzers. The synthesis technique requires the re-interpretation of the semi-naive evaluation as a scaffolding for translation using partial evaluation. To achieve high-performance, we employ staged-compilation techniques and specialize the underlying relational data structures for a given Datalog specification. Experimentation on benchmarks for large-scale program analysis validates the superior performance of our approach over available Datalog tools and demonstrates our competitiveness with state-of-the-art handcrafted tools.
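
The semi-naive evaluation mentioned in this abstract can be sketched in Python for a single recursive Datalog rule. The sketch interprets the rule directly, whereas the paper synthesizes specialized C++; the function name and the reachability example are assumptions chosen for illustration.

```python
def transitive_closure(edges):
    """Semi-naive evaluation of the Datalog program

        path(x, y) :- edge(x, y).
        path(x, z) :- path(x, y), edge(y, z).

    Each round joins only the tuples discovered in the previous round
    (`delta`) with `edge`, instead of re-joining the whole `path` relation.
    """
    # Index edge by its first column to make the join cheap.
    succ = {}
    for x, y in edges:
        succ.setdefault(x, set()).add(y)

    path = set(edges)
    delta = set(edges)
    while delta:
        derived = {(x, z) for (x, y) in delta for z in succ.get(y, ())}
        delta = derived - path   # keep only facts not seen before
        path |= delta
    return path

chain = {("a", "b"), ("b", "c"), ("c", "d")}
closure = transitive_closure(chain)
```

The loop stops once a round derives no new facts, which also guarantees termination on cyclic graphs because the set of possible `path` tuples is finite.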

Proceedings ArticleDOI
14 May 2016
TL;DR: This paper hypothesizes that code anomalies tend to "flock together" to realize a design problem, and analyzes to what extent groups of inter-related code anomalies, named agglomerations, suffice to locate design problems.
Abstract: Design problems affect every software system. Diverse software systems have been discontinued or reengineered due to design problems. As design documentation is often informal or nonexistent, design problems need to be located in the source code. The main difficulty in identifying a design problem in the implementation stems from the fact that such a problem is often scattered through several program elements. Previous work assumed that code anomalies, popularly known as code smells, may provide sufficient hints about the location of a design problem. However, each code anomaly alone may represent only a partial embodiment of a design problem. In this paper, we hypothesize that code anomalies tend to "flock together" to realize a design problem. We analyze to what extent groups of inter-related code anomalies, named agglomerations, suffice to locate design problems. We analyze more than 2200 agglomerations found in seven software systems of different sizes and from different domains. Our analysis indicates that certain forms of agglomerations are consistent indicators of both congenital and evolutionary design problems, with accuracy often higher than 80%.

Proceedings ArticleDOI
10 Jun 2016
TL;DR: A new static-analysis-enabled approach to trimming unused code from both Java applications and Java Runtime Environment (JRE) automatically is proposed, built on top of the Soot framework and evaluated based on a set of criteria: code size, code complexity, memory footprint, execution and garbage collection time, and security.
Abstract: Modern software engineering practice increasingly brings redundant code into software products, which has caused a phenomenon called bloatware, leading to software system maintenance, performance and reliability issues as well as security problems. With the rapid advances of smart devices and a more connected world, it has never been more important to trim bloatware to improve the leanness, agility, reliability, performance, and security of the interconnected software and network systems. Previous methods have limited scopes and are usually not fully automated. In this paper, we propose a new static-analysis-enabled approach to trimming unused code from both Java applications and Java Runtime Environment (JRE) automatically. We have built a tool called JRed on top of the Soot framework. We have conducted a fairly comprehensive evaluation of JRed based on a set of criteria: code size, code complexity, memory footprint, execution and garbage collection time, and security. Our experimental results show that Java application size can be reduced by 44.5% on average and the JRE code can be reduced by more than 82.5% on average. The code complexity is significantly reduced according to a set of well-known metrics. Furthermore, we report that by trimming redundant code, 48.6% of the known security vulnerabilities in JRE 6 update 45 have been removed.

Proceedings ArticleDOI
14 Mar 2016
TL;DR: The past of program-comprehension research is explored, the current state is discussed, and what future research on program comprehension might bring is outlined.
Abstract: Program comprehension is the main activity of the software developers. Although there has been substantial research to support the programmer, the high amount of time developers need to understand source code remained constant over thirty years. Beside more complex software, what might be the reason? In this paper, I explore the past of program-comprehension research, discuss the current state, and outline what future research on program comprehension might bring.

Proceedings ArticleDOI
14 May 2016
TL;DR: This paper describes CloudBuild, the build service infrastructure developed within Microsoft over the last few years, which is responsible for all aspects of a continuous integration workflow, including builds, test and code analysis, as well as drops, package and symbol creation and storage.
Abstract: Thousands of Microsoft engineers build and test hundreds of software products several times a day. It is essential that this continuous integration scales, guarantees short feedback cycles, and functions reliably with minimal human intervention. This paper describes CloudBuild, the build service infrastructure developed within Microsoft over the last few years. CloudBuild is responsible for all aspects of a continuous integration workflow, including builds, test and code analysis, as well as drops, package and symbol creation and storage. CloudBuild supports multiple build languages as long as they fulfill a coarse grained, file IO based contract. CloudBuild uses content based caching to run build-related tasks only when needed. Lastly, it builds on many machines in parallel. CloudBuild offers a reliable build service in the presence of unreliable components. It aims to rapidly onboard teams and hence has to support non-deterministic build tools and specification languages that under-declare dependencies. We will outline how we addressed these challenges and characterize the operations of CloudBuild. CloudBuild has on-boarded hundreds of codebases with only man-months of effort each. Some of these codebases are used by thousands of developers. The speed ups of build and test range from 1.3× to 10×, and service availability is 99%.
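
The content-based caching mentioned in this abstract can be pictured with a small Python sketch: a task is keyed by a hash of its name and the contents of its declared inputs, and it is re-run only when that key changes. This is not CloudBuild's design; the `ContentCache` class, its in-memory store, and the toy "compile" action are assumptions made for illustration.

```python
import hashlib

class ContentCache:
    """Cache task results keyed by the content of the task's inputs."""

    def __init__(self):
        self._store = {}
        self.hits = 0

    def _key(self, task_name, inputs):
        # Hash the task name plus every (file name, content hash) pair in a
        # stable order, so the key depends only on content, not timestamps.
        h = hashlib.sha256(task_name.encode("utf-8"))
        for name in sorted(inputs):
            h.update(b"\0" + name.encode("utf-8") + b"\0")
            h.update(hashlib.sha256(inputs[name]).digest())
        return h.hexdigest()

    def run(self, task_name, inputs, action):
        """Return the cached result, or run `action(inputs)` and cache it.

        `inputs` maps file names to their contents as bytes.
        """
        key = self._key(task_name, inputs)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        result = action(inputs)
        self._store[key] = result
        return result

calls = []

def fake_compile(inputs):
    # Stand-in for an expensive build step; records each real execution.
    calls.append(sorted(inputs))
    return sum(len(content) for content in inputs.values())

cache = ContentCache()
first = cache.run("compile", {"a.c": b"int x;"}, fake_compile)
second = cache.run("compile", {"a.c": b"int x;"}, fake_compile)      # unchanged
third = cache.run("compile", {"a.c": b"int x = 1;"}, fake_compile)   # edited
```

In this sketch a task that reads files it has not declared would get stale cache hits, which is one way to see why the abstract singles out specification languages that under-declare dependencies as a challenge.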

Proceedings ArticleDOI
21 Mar 2016
TL;DR: HornDroid is the first static analysis tool for Android to come with a formal proof of soundness, which covers the core of the analysis technique: besides yielding correctness assurances, this proof allowed us to identify some critical corner-cases that affect the soundness guarantees provided by some of the previous static analysis tools for Android.
Abstract: We present HornDroid, a new tool for the static analysis of information flow properties in Android applications. The core idea underlying HornDroid is to use Horn clauses for soundly abstracting the semantics of Android applications and to express security properties as a set of proof obligations that are automatically discharged by an off-the-shelf SMT solver. This approach makes it possible to fine-tune the analysis in order to achieve a high degree of precision while still using off-the-shelf verification tools, thereby leveraging the recent advances in this field. As a matter of fact, HornDroid outperforms state-of-the-art Android static analysis tools on benchmarks proposed by the community. Moreover, HornDroid is the first static analysis tool for Android to come with a formal proof of soundness, which covers the core of the analysis technique: besides yielding correctness assurances, this proof allowed us to identify some critical corner-cases that affect the soundness guarantees provided by some of the previous static analysis tools for Android.

Proceedings ArticleDOI
01 Oct 2016
TL;DR: This paper reviews 79 alarms handling studies collected through a systematic literature search and classify the approaches proposed into seven categories, finding that the categorized alarms handling approaches are complementary and they can be combined together in different ways.
Abstract: Static analysis tools have showcased their importance and usefulness in automated detection of code anomalies and defects. However, the large number of alarms reported and cost incurred in their manual inspections have been the major concerns with the usage of static analysis tools. Existing studies addressing these concerns differ greatly in their approaches to handle the alarms, varying from automatic postprocessing of alarms, supporting the tool-users during manual inspections of the alarms, to designing of light-weight static analysis tools. A comprehensive study of approaches for handling alarms is, however, not found. In this paper, we review 79 alarms handling studies collected through a systematic literature search and classify the approaches proposed into seven categories. The literature search is performed by combining the keywords-based database search and snowballing. Our review is intended to provide an overview of various alarms handling approaches, their merits and shortcomings, and different techniques used in their implementations. Our findings include that the categorized alarms handling approaches are complementary and they can be combined together in different ways. The categorized approaches and techniques employed in them can help the designers and developers of static analysis tools to make informed choices.

Proceedings ArticleDOI
18 Jul 2016
TL;DR: A new approach in which static analysis tools learn to detect vulnerabilities automatically using machine learning is presented, which uses a sequence model to learn to characterize vulnerabilities based on a set of annotated source code slices.
Abstract: The state of web security remains troubling as web applications continue to be favorite targets of hackers. Static analysis tools are important mechanisms for programmers to deal with this problem as they search for vulnerabilities automatically in the application source code, allowing programmers to remove them. However, developing these tools requires explicitly coding knowledge about how to discover each kind of vulnerability. This paper presents a new approach in which static analysis tools learn to detect vulnerabilities automatically using machine learning. The approach uses a sequence model to learn to characterize vulnerabilities based on a set of annotated source code slices. This model takes into consideration the order in which the code elements appear and are executed in the slices. The model created can then be used as a static analysis tool to discover and identify vulnerabilities in source code. The approach was implemented in the DEKANT tool and evaluated experimentally with a set of open source PHP applications and WordPress plugins, finding 16 zero-day vulnerabilities.

Proceedings ArticleDOI
14 May 2016
TL;DR: Rather than focusing on one particular preprocessor, FeatureIDE provides tool support that can easily be adopted for further preprocessors; it currently supports development with CPP, Antenna, and Munge.
Abstract: Preprocessors are a common way to implement variability in software. They are used in numerous software systems, such as operating systems and databases. Due to the ability of preprocessors to enable and disable code fragments, not all parts of the program are active at the same time. Thus, programmers and tools need to handle the interactions resulting from annotations in the program. With our Eclipse-based tool FeatureIDE, we provide tool support to tackle multiple challenges with preprocessors, such as code comprehension, feature traceability, separation of concerns, and program analysis. With FeatureIDE, instead of focusing on one particular preprocessor, we provide tool support, which can easily be adopted for further preprocessors. Currently, we support development with CPP, Antenna, and Munge. https://youtu.be/jVe7f32mLCQ

Proceedings ArticleDOI
14 May 2016
TL;DR: This paper presents and evaluates two syntactical similarity metrics, one of them is specifically designed to run fast, in combination with two carefully selected and self-tuning clustering algorithms to automatically detect groups of similar code changes.
Abstract: Several research tools and projects require groups of similar code changes as input. Examples are recommendation and bug finding tools that can provide valuable information to developers based on such data. With the help of similar code changes they can simplify the application of bug fixes and code changes to multiple locations in a project. But despite their benefit, the practical value of existing tools is limited, as users need to manually specify the input data, i.e., the groups of similar code changes. To overcome this drawback, this paper presents and evaluates two syntactical similarity metrics, one of them is specifically designed to run fast, in combination with two carefully selected and self-tuning clustering algorithms to automatically detect groups of similar code changes. We evaluate the combinations of metrics and clustering algorithms by applying them to several open source projects and also publish the detected groups of similar code changes online as a reference dataset. The automatically detected groups of similar code changes work well when used as input for LASE, a recommendation system for code changes.

Proceedings ArticleDOI
14 Mar 2016
TL;DR: Early results show that lower maintainability indeed triggers more code refactoring in practice and that these refactorings significantly decrease complexity, code lines, coupling, and clone metrics; however, comment-related metrics also decrease in the refactored code.
Abstract: It is very common in various fields that there is a gap between theoretical results and their practical applications. This is true for code refactoring as well, which has a solid theoretical background while being used in development practice at the same time. However, more and more studies suggest that developers perform code refactoring entirely differently than the theory would suggest. Our paper encourages the further investigation of code refactorings in practice by providing an extensive open dataset of source code metrics and applied refactorings through several releases of 7 open-source systems. As a first step of processing this dataset, we examined the quality attributes of the refactored source code classes and the values of source code metrics improved by those refactorings. Our early results show that lower maintainability indeed triggers more code refactorings in practice and these refactorings significantly decrease complexity, code lines, coupling and clone metrics. However, we observed a decrease in comment related metrics in the refactored code.

Proceedings ArticleDOI
01 Oct 2016
TL;DR: This study strongly validates the use of compilation/decompilation as a normalisation technique and reduces false classifications to zero for six of the tools in a broad, thorough study of source code similarity detection.
Abstract: Source code analysis to detect code cloning, code plagiarism, and code reuse suffers from the problem of pervasive code modifications, i.e. transformations that may have a global effect. We compare 30 similarity detection techniques and tools against pervasive code modifications. We evaluate the tools using two experimental scenarios for Java source code. These are (1) pervasive modifications created with tools for source code and bytecode obfuscation and (2) source code normalisation through compilation and decompilation using different decompilers. Our experimental results show that highly specialised source code similarity detection techniques and tools can perform better than more general, textual similarity measures. Our study strongly validates the use of compilation/decompilation as a normalisation technique. Its use reduced false classifications to zero for six of the tools. This broad, thorough study is the largest in existence and potentially an invaluable guide for future users of similarity detection in source code.

Proceedings ArticleDOI
25 Aug 2016
TL;DR: This paper presents a tool that is based on the modular interactions inferred by static code analysis, which is combined with symbolic execution and directed inter-procedural path exploration, which provides an advantage in terms of statement coverage and ability to uncover more vulnerabilities.
Abstract: Concolic (concrete+symbolic) execution has recently gained popularity as an effective means to uncover non-trivial vulnerabilities in software, such as subtle buffer overflows. However, symbolic execution tools that are designed to optimize statement coverage often fail to cover potentially vulnerable code because of complex system interactions and scalability issues of constraint solvers. In this paper, we present a tool (MACKE) that is based on the modular interactions inferred by static code analysis, which is combined with symbolic execution and directed inter-procedural path exploration. This provides an advantage in terms of statement coverage and ability to uncover more vulnerabilities. Our tool includes a novel feature in the form of interactive vulnerability report generation that helps developers prioritize bug fixing based on severity scores. A demo of our tool is available at https://youtu.be/icC3jc3mHEU.

Journal ArticleDOI
01 Mar 2016
TL;DR: Analysis of the techniques for injecting malicious code into NoSQL data stores provides examples of new NoSQL injections as well as Cross-Site Request Forgery attacks, allowing attackers to bypass perimeter defenses such as firewalls.
Abstract: NoSQL data storage systems have become very popular due to their scalability and ease of use. Unfortunately, they lack the security measures and awareness that are required for data protection. Although the new data models and query formats of NoSQL data stores make old attacks such as SQL injections irrelevant, they give attackers new opportunities for injecting their malicious code into the statements passed to the database. Analysis of the techniques for injecting malicious code into NoSQL data stores provides examples of new NoSQL injections as well as Cross-Site Request Forgery attacks, allowing attackers to bypass perimeter defenses such as firewalls. Analysis of the source of these vulnerabilities and present methodologies can mitigate such attacks. Because code analysis alone is insufficient to prevent attacks in today's typical large-scale deployment, certain mitigations should be done throughout the entire software life cycle.
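
To make the injection opportunity described above concrete, the following is a minimal sketch that is not taken from the article. It simulates a MongoDB-style filter in plain Python, with no real database involved. The `matches` helper, the `USERS` records, and both `login_*` functions are hypothetical names introduced for illustration; only the `$ne` operator name is borrowed from MongoDB query syntax.

```python
import json

# Hypothetical in-memory "collection"; no real data store is used.
USERS = [{"user": "alice", "password": "s3cret"}]


def matches(doc, query):
    """Tiny subset of MongoDB-style matching: equality and {"$ne": value}."""
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):
            if "$ne" in cond:
                if value == cond["$ne"]:
                    return False
            else:
                return False  # operators other than $ne are not modelled here
        elif value != cond:
            return False
    return True


def login_unsafe(body):
    """Passes parsed JSON straight into the filter, so operators can be injected."""
    data = json.loads(body)
    query = {"user": data["user"], "password": data["password"]}
    return any(matches(doc, query) for doc in USERS)


def login_checked(body):
    """Rejects any credential that is not a plain string before querying."""
    data = json.loads(body)
    if not all(isinstance(data.get(k), str) for k in ("user", "password")):
        return False
    query = {"user": data["user"], "password": data["password"]}
    return any(matches(doc, query) for doc in USERS)
```

Sending the body `{"user": "alice", "password": {"$ne": ""}}` to `login_unsafe` succeeds without knowing the password, while `login_checked` rejects it. Type checking of this kind is only one possible mitigation and, as the abstract notes, code-level measures alone are not sufficient.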

Journal ArticleDOI
TL;DR: A probabilistic model of the lexeme distribution, with parameters automatically estimated by the Expectation-Maximization algorithm, is combined with six source code zones to improve clustering quality so that results are close to a theoretical upper bound proved in the paper.
Abstract: In this paper, we present a software clustering approach that leverages the information conveyed by the zone in which each lexeme appears in the classes of object oriented systems. We define six zones in the source code: Class Name, Attribute Name, Method Name, Parameter Name, Comment, and Source Code Statement. These zones may convey information with different levels of relevance, and so their contribution should be differently weighed according to the software system under study. To this aim, we define a probabilistic model of the lexemes distribution whose parameters are automatically estimated by the Expectation-Maximization algorithm. The weights of the zones are then exploited to compute similarities among source code classes, which are then grouped by a k-Medoid clustering algorithm. To assess the validity of our solution in the software architecture recovery field, we applied our approach to 19 software systems from different application domains. We observed that the use of our probabilistic model and the defined zones improves the quality of clustering results so that they are close to a theoretical upper bound we have proved.
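
The zone-weighting idea can be illustrated with a short sketch, which is not the authors' implementation. It assumes the zone weights are already known (the paper estimates them with Expectation-Maximization, which is omitted here) and uses cosine similarity as a stand-in similarity measure; the k-Medoid clustering step is also left out. The zone keys and function names are hypothetical.

```python
import math
from collections import Counter

# The six zones named in the abstract, as hypothetical dictionary keys.
ZONES = ("class_name", "attribute_name", "method_name",
         "parameter_name", "comment", "statement")


def weighted_vector(zone_lexemes, weights):
    """Merge per-zone lexeme occurrences into one vector, scaled by zone weight."""
    vec = Counter()
    for zone in ZONES:
        for lexeme in zone_lexemes.get(zone, []):
            vec[lexeme] += weights[zone]
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse lexeme vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity(class_a, class_b, weights):
    """Zone-weighted similarity between two classes."""
    return cosine(weighted_vector(class_a, weights),
                  weighted_vector(class_b, weights))
```

With two classes that share a class-name lexeme but differ in their comments, raising the weight of the class-name zone raises their similarity, which is the effect the learned weights are meant to capture.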

Journal ArticleDOI
01 Jan 2016
TL;DR: This paper proposes AMA (Amrita Malware Analyzer), a mostly static framework that detects malicious code through static code analysis of web pages, performing a probable-plaintext attack using strings likely contained in malicious web pages.
Abstract: The JavaScript language, through its dynamic features, provides user interactivity with websites. It also poses serious security threats to both users and websites. On top of this, obfuscation is widely used to hide malicious purpose and to evade detection by antivirus software. Malware embedded in web pages is regularly used as part of targeted attacks. To hinder detection by antivirus scanners, the malicious code is usually obfuscated, often with encodings such as hexadecimal, Unicode, base64, and escaped characters, and rarely with substitution ciphers such as Vigenère, Caesar, and Atbash. Malicious iframes are injected into websites using JavaScript and are hidden from the user's view in order to prevent detection. To defend against obfuscated malicious JavaScript code, we propose a mostly static approach called AMA (Amrita Malware Analyzer), a framework capable of detecting the presence of malicious code through static code analysis of web pages. To this end, the framework performs a probable-plaintext attack using strings likely contained in malicious web pages. However, this approach targets only a few of the many possible obfuscation strategies. An evaluation based on the links provided in the Malware Domain List demonstrates a high level of accuracy.
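
A probable-plaintext check of the kind described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the AMA implementation: the marker strings in `KNOWN_PLAINTEXT` are hypothetical, and only hexadecimal, base64, Caesar, and Atbash decodings are tried.

```python
import base64
import binascii
import string

# Hypothetical indicator strings; a real list would be curated from known malicious pages.
KNOWN_PLAINTEXT = ("<iframe", "document.write", "eval(")


def caesar(text, shift):
    """Shift ASCII letters by `shift` positions, leaving other characters alone."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - 97 + shift) % 26 + 97))
        elif ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - 65 + shift) % 26 + 65))
        else:
            out.append(ch)
    return "".join(out)


def atbash(text):
    """Mirror the alphabet (a<->z, b<->y, ...); the cipher is its own inverse."""
    table = str.maketrans(
        string.ascii_lowercase + string.ascii_uppercase,
        string.ascii_lowercase[::-1] + string.ascii_uppercase[::-1],
    )
    return text.translate(table)


def candidates(blob):
    """Yield (label, decoded_text) for every decoding that succeeds."""
    yield "plain", blob
    try:
        yield "hex", bytes.fromhex(blob).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        pass
    try:
        yield "base64", base64.b64decode(blob, validate=True).decode("utf-8")
    except (binascii.Error, ValueError, UnicodeDecodeError):
        pass
    yield "atbash", atbash(blob)
    for shift in range(1, 26):
        yield f"caesar-{shift}", caesar(blob, shift)


def detect(blob):
    """Return the labels of decodings that reveal a known plaintext string."""
    return [label for label, text in candidates(blob)
            if any(marker in text.lower() for marker in KNOWN_PLAINTEXT)]
```

For example, calling `detect` on the base64 encoding of a hidden `<iframe ...>` tag reports the `base64` label, while an ordinary string such as `hello world` yields an empty list. As the abstract points out, a check like this covers only a few of the possible obfuscation strategies.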