
Showing papers by "Eric Bodden" published in 2021


Journal ArticleDOI
TL;DR: This paper has designed an extensive CrySL rule set for the Java Cryptography Architecture (JCA), and empirically evaluated it by analyzing 10,000 current Android apps and all 204,788 current Java software artefacts on Maven Central.
Abstract: Various studies have empirically shown that the majority of Java and Android applications misuse cryptographic libraries, causing devastating breaches of data security. It is crucial to detect such misuses early in the development process. To detect cryptography misuses, one must define secure uses first, a process mastered primarily by cryptography experts but not by developers. In this paper, we present CrySL, a specification language for bridging the cognitive gap between cryptography experts and developers. CrySL enables cryptography experts to specify the secure usage of the cryptographic libraries they provide. We have implemented a compiler that translates such CrySL specifications into a context-sensitive and flow-sensitive demand-driven static analysis. The analysis then helps developers by automatically checking a given Java or Android app for compliance with the CrySL-encoded rules. We have designed an extensive CrySL rule set for the Java Cryptography Architecture (JCA), and empirically evaluated it by analyzing 10,000 current Android apps and all 204,788 current Java software artefacts on Maven Central. Our results show that misuse of cryptographic APIs is still widespread, with 95 percent of apps and 63 percent of Maven artefacts containing at least one misuse. Our easily extensible CrySL rule set covers more violations than previous special-purpose tools that contain hard-coded rules, while still offering a more precise analysis.

29 citations


Journal ArticleDOI
TL;DR: It is found that more than 87% (56%, resp.) of the vulnerable Java classes considered occur in Maven Central in re-bundled (re-packaged, resp.) form.

9 citations


Posted Content
TL;DR: This paper investigates the impact of various sources of variability on widely used Java cryptographic libraries including JCA, Bouncy Castle, and Google Tink and motivates an extension to the CrySL language named MetaCrySL, which builds on meta programming concepts.
Abstract: APIs are the primary mechanism for developers to gain access to externally defined services and tools. However, previous research has revealed API misuses that violate the contract of APIs to be prevalent. Such misuses can have harmful consequences, especially in the context of cryptographic libraries. Various API misuse detectors have been proposed to address this issue, including CogniCrypt, one of the most versatile such detectors, which uses the CrySL language to specify cryptographic API usage contracts. Nonetheless, existing approaches to detect API misuse have not been designed for systematic reuse, ignoring the fact that different versions of a library, different versions of a platform, and different recommendations or guidelines might introduce variability in the correct usage of an API. Yet, little is known about how such variability impacts the specification of the correct API usage. This paper investigates this question by analyzing the impact of various sources of variability on widely used Java cryptographic libraries including JCA, Bouncy Castle, and Google Tink. The results of our investigation show that sources of variability like new versions of the API and security standards significantly impact the specifications. We then use the insights gained from our investigation to motivate an extension to the CrySL language named MetaCrySL, which builds on meta programming concepts. We evaluate MetaCrySL by specifying usage rules for a family of Android versions and illustrate that MetaCrySL can model all forms of variability we identified and drastically reduce the size of a family of specifications for the correct usage of cryptographic APIs.

7 citations


Proceedings ArticleDOI
20 Aug 2021
TL;DR: In this article, the authors investigate the integration of cloud-based static application security testing (SAST) tools into continuous integration (CI) or continuous delivery (CD) for assuring code quality and security.
Abstract: Integrating static analyses into continuous integration (CI) or continuous delivery (CD) has become the best practice for assuring code quality and security. Static Application Security Testing (SAST) tools fit well into CI/CD, because CI/CD allows time for deep static analyses on large code bases and prevents vulnerabilities in the early stages of the development lifecycle. In CI/CD, the SAST tools usually run in the cloud and provide findings via a web interface. Recent studies show that developers prefer seeing the findings of these tools directly in their IDEs. Most tools with IDE integration run lightweight static analyses and can give feedback at coding time, but SAST tools used in CI/CD take longer to run and usually are not able to do so. Can developers interact directly with a cloud-based SAST tool that is typically used in CI/CD through their IDE? We investigated whether such a mechanism can integrate cloud-based SAST tools better into developers’ workflow than web-based solutions. We interviewed developers to understand their expectations of an IDE solution. Guided by these interviews, we implemented an IDE prototype for an existing cloud-based SAST tool. With a usability test using this prototype, we found that the IDE solution promoted more frequent tool interactions. In particular, developers performed code scans three times more often. This indicates better integration of the cloud-based SAST tool into developers’ workflow. Furthermore, while our study did not show a statistically significant improvement in developers’ code-fixing performance, it did show a promising reduction in time for fixing vulnerable code.

4 citations


Journal ArticleDOI
TL;DR: ModGuard is presented, a novel static analysis based on Doop which complements the Java module system with an analysis to automatically identify instances that escape their declaring module and is an effective aid in identifying integrity and confidentiality violations of sensitive instances.
Abstract: With version 9, Java has been given the new module system Jigsaw. Major goals were to simplify maintainability of the JDK and improve its security by encapsulating modules’ internal types. While the module system successfully limits the visibility of internal types, it does not prevent sensitive data from escaping. Since the module system reasons about types only, objects are allowed to escape even if that module declares the type as internal. Finding such unintended escapes is important, as they may violate a module’s integrity and confidentiality, but is a complex task as it requires one to reason about pointers and type hierarchy. We thus present ModGuard, a novel static analysis based on Doop which complements the Java module system with an analysis to automatically identify instances that escape their declaring module. Along with ModGuard we contribute a complete formal definition of a module’s entrypoints, i.e., the method implementations that a module actually allows other modules to directly invoke. We further make available a novel micro-benchmark suite MIC9Bench to show the effectiveness but also current shortcomings of ModGuard, and to enable comparative studies in the future. Finally, we describe a case study that we conducted using Apache Tomcat, which shows that a migration of applications towards Jigsaw modules does not prevent sensitive instances from escaping, yet also shows that ModGuard is an effective aid in identifying integrity and confidentiality violations of sensitive instances.

3 citations


Proceedings ArticleDOI
27 Mar 2021
TL;DR: In this article, the authors present an extensible framework for comparative analysis of Python call-graphs, which link call sites to potential call targets in a program, and evaluate three call-graph generation frameworks: Code2flow, Pyan, and Wala.
Abstract: As one of the most popular programming languages, Python has become a relevant target language for static analysis tools. The primary data structure for performing an inter-procedural static analysis is the call-graph (CG), which links call sites to potential call targets in a program. There exist multiple algorithms for constructing call-graphs, tailored to specific languages. However, comparatively few implementations target Python. Moreover, there is still a lack of empirical evidence as to how these few algorithms perform in terms of precision and recall. This paper thus presents EVAL_CG, an extensible framework for comparative analysis of Python call-graphs. We conducted two experiments which run the CG algorithms on different Python programming constructs and real-world applications. In both experiments, we evaluate three CG generation frameworks, namely Code2flow, Pyan, and Wala. We record precision, recall, and running time, and identify sources of unsoundness of each framework. Our evaluation shows that none of the current CG construction frameworks produce a sound CG. Moreover, the static CGs contain many spurious edges. Code2flow is also comparatively slow. Hence, further research is needed to support CG generation for Python programs.
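The precision and recall measurements mentioned in this abstract can be illustrated with a minimal sketch. It assumes a call-graph is represented as a set of (caller, callee) edges and that a ground-truth edge set is available; the function name `precision_recall`, the example edges, and the convention of returning 1.0 for an empty set are illustrative assumptions, not part of EVAL_CG.

```python
from typing import Set, Tuple

# An edge links a caller to a potential call target (assumed representation).
Edge = Tuple[str, str]


def precision_recall(reported: Set[Edge], ground_truth: Set[Edge]) -> Tuple[float, float]:
    """Compare a statically built call-graph against a ground-truth edge set."""
    true_positives = len(reported & ground_truth)
    # Assumed convention: an empty set yields 1.0 instead of dividing by zero.
    precision = true_positives / len(reported) if reported else 1.0
    recall = true_positives / len(ground_truth) if ground_truth else 1.0
    return precision, recall


# Hypothetical example data, not taken from the paper.
ground_truth = {("main", "parse"), ("main", "run"), ("run", "helper"), ("run", "log")}
reported = {("main", "parse"), ("main", "run"), ("parse", "log")}

precision, recall = precision_recall(reported, ground_truth)
spurious_edges = reported - ground_truth  # edges that lower precision
missing_edges = ground_truth - reported   # edges that make the graph unsound
```

In this toy data, one reported edge is spurious and two true edges are missing, so the graph is both imprecise and unsound in the sense the abstract describes.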

3 citations


Proceedings ArticleDOI
12 Jul 2021
TL;DR: In this article, the authors present a prototype Jupyter notebook annotator, HeaderGen, that automatically creates a narrative structure in notebooks by classifying and annotating code cells based on the machine learning workflow.
Abstract: Jupyter notebooks are now widely adopted by data analysts as they provide a convenient environment for presenting computational results in a literate-programming document that combines code snippets, rich text, and inline visualizations. Literate-programming documents are intended to be computational narratives that are supplemented with self-explanatory text, but recent studies have shown that this is lacking in practice. Efforts in the software engineering community to increase code comprehension in literate programming are limited. To address this, as a first step, this paper presents a prototype Jupyter notebook annotator, HeaderGen, that automatically creates a narrative structure in notebooks by classifying and annotating code cells based on the machine learning workflow. HeaderGen generates a markdown cell header for each code cell by statically analyzing the notebook, and in addition, associates these cell headers with a clickable table of contents for easier navigation. Further, we discuss our vision and opportunities based on this prototype.
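The idea of placing a markdown header above each code cell can be sketched roughly as follows. This is not HeaderGen's implementation: the abstract describes a static analysis, whereas this sketch substitutes a crude keyword lookup, and the `PHASE_KEYWORDS` table, the phase names, and the functions `classify_cell` and `annotate` are all hypothetical. The only thing taken as given is the notebook JSON layout, in which a notebook holds a list of cells, each with a `cell_type` and a `source`.

```python
from typing import Dict, Tuple

# Hypothetical keyword table standing in for a real static analysis.
PHASE_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "Data Loading": ("read_csv", "load_dataset"),
    "Model Training": (".fit(",),
    "Evaluation": ("accuracy_score", ".score("),
}


def classify_cell(source: str) -> str:
    """Return the first workflow phase whose keyword occurs in the cell source."""
    for phase, keywords in PHASE_KEYWORDS.items():
        if any(keyword in source for keyword in keywords):
            return phase
    return "Other"


def annotate(notebook: dict) -> dict:
    """Return a copy of the notebook with a markdown header before each code cell."""
    annotated_cells = []
    for cell in notebook["cells"]:
        if cell["cell_type"] == "code":
            source = "".join(cell["source"])  # source may be a string or a list of lines
            header = {
                "cell_type": "markdown",
                "metadata": {},
                "source": [f"## {classify_cell(source)}"],
            }
            annotated_cells.append(header)
        annotated_cells.append(cell)
    return {**notebook, "cells": annotated_cells}


# Hypothetical three-cell notebook used only for illustration.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "code", "metadata": {}, "execution_count": None, "outputs": [],
         "source": ["import pandas as pd\n", "df = pd.read_csv('data.csv')\n"]},
        {"cell_type": "markdown", "metadata": {}, "source": ["Some notes"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None, "outputs": [],
         "source": ["model.fit(X, y)\n"]},
    ],
}

annotated = annotate(notebook)
markdown_texts = ["".join(c["source"]) for c in annotated["cells"] if c["cell_type"] == "markdown"]
```

Existing markdown cells are left untouched and the input notebook is not mutated, so the annotation step can be re-run on the original file.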

2 citations


Journal ArticleDOI
01 Feb 2021
TL;DR: Architectural Runtime Verification (ARV) is an approach specifically designed for the integrator: a generic way to analyze system behavior at the architecture level using the principles of Runtime Verification.
Abstract: Analyzing runtime behavior as part of debugging complex component-based systems used in the vehicle industry is an important aspect of the integration process. It is a laborious task that involves many manual steps. One reason for this is that, as of today, the analysis is usually not performed on the architecture level, where the system has initially been designed. Instead, it relies on source code debugging or visualizing signals and events. With an ever-growing complexity of such systems, it becomes increasingly difficult to find errors that manifest at integration level, i.e., when the components interact with each other in a complex environment. Architectural Runtime Verification (ARV) is an approach specifically designed for the integrator—a generic way to analyze system behavior on architecture level using the principles of Runtime Verification. This paper draws on our initial publication. It provides further details and an evaluation of the ideas using a database hosted in the cloud.

1 citation



Proceedings ArticleDOI
24 May 2021
TL;DR: In this article, the authors demonstrate a proof-of-concept solution to strengthen information hiding in Java, and show that this approach is backward compatible and that it blocks 84% of all information-hiding attacks in a large-scale sample set at an average performance overhead below 2%.
Abstract: The Java runtime is installed on billions of devices worldwide, and over the years it has been a primary attack vector for online criminals. In this work, we address the fact that many attack vectors exploit weaknesses in Java's information hiding, making use of illegal access to private members of system classes. To study to what extent such attacks can be mitigated, and at what cost, this paper demonstrates a proof-of-concept solution to strengthen information hiding. Experiments show that this approach is backward compatible, and that it blocks 84% of all information-hiding attacks in a large-scale sample set at an average performance overhead below 2%. Based on our experiments, we suggest a solution to strengthen information hiding for productive use that has the potential to outperform our proof of concept in terms of robustness and performance, and also would block the remaining information-hiding attacks. Finally, we conclude with general advice on the design of secure software.