
Showing papers on "Static program analysis published in 2023"


Proceedings ArticleDOI
04 Jan 2023
TL;DR: In this paper, the authors present an approach to identifying personal data processing, helping developers and code reviewers draft privacy analyses and comply with regulations such as the General Data Protection Regulation (GDPR).
Abstract: Code review is a critical step in the software development life cycle, which assesses and boosts the code's effectiveness and correctness, pinpoints security issues, and raises its quality by adhering to best practices. Due to the increased need for personal data protection motivated by legislation, code reviewers need to understand where personal data is located in software systems and how it is handled. Although most recent work on code review focuses on security vulnerabilities, privacy-related techniques are not easy for code reviewers to apply, making their inclusion in the code review process challenging. In this paper, we present ongoing work on a new approach to identifying personal data processing, helping developers and code reviewers draft privacy analyses and comply with regulations such as the General Data Protection Regulation (GDPR).

1 citation


Journal ArticleDOI
TL;DR: Wang et al., as mentioned in this paper, presented a vulnerability prediction model based on machine learning techniques applied to metrics extracted from program source code, which can identify the vulnerable functions of a C/C++ software program based on code metrics.
Abstract: Detecting security vulnerabilities in the source code of software systems is one of the most important challenges in the field of software security. We need an effective solution to discover and patch vulnerabilities before our valuable information is compromised. Security testing is a type of software testing that checks whether software is vulnerable to cyber attacks. This study pursued three main objectives: (1) the first goal is to identify the vulnerable functions of a C/C++ software program based on code metrics, which can reduce the cost of software security testing and redirect the related activities to the identified vulnerable functions rather than the entire software; (2) the second goal is to identify the type of attack related to the vulnerable function; and (3) the ultimate goal is to analyze the relationship between code metrics and vulnerabilities, which can help us understand which code structures are most likely to contain vulnerable code. This paper first creates a comprehensive view of the source code of the target software using graph concepts. Second, a set of source code metrics is calculated by crawling the related graph using a static analysis approach. Finally, the vulnerability prediction model presented in this paper is based on machine learning techniques applied to the metrics extracted from program source code. Compared to previous work, new achievements have been made in this paper; one of the most important is the very high accuracy of the proposed model in detecting the type of vulnerability. Moreover, 15 code metrics are used to predict vulnerabilities. Our analysis of feature importance indicates which code structures are most likely to be vulnerable. Experimental results on 10 real projects (OpenSSL, SQLite, FreeType, LibTiff, Libxslt, Binutils, FFmpeg, ImageMagick, OpenSC, and rdesktop) indicated that the security testing predictor proposed in this paper could correctly predict, on average, 89% of the actually vulnerable functions of the source code and 86% of the vulnerability types of the detected functions.
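To make the metric-driven idea concrete, here is a minimal sketch (not the paper's implementation; the metric names, values, and labels are hypothetical placeholders) of training a classifier on per-function code metrics to flag likely-vulnerable functions:

```python
# Hypothetical sketch: classify functions as vulnerable from code metrics.
from sklearn.ensemble import RandomForestClassifier

# One row of metrics per function (all values invented):
# [lines_of_code, cyclomatic_complexity, num_params, max_nesting_depth]
X = [[120, 14, 3, 5], [30, 2, 1, 1], [210, 22, 6, 7], [45, 4, 2, 2]]
y = [1, 0, 1, 0]  # 1 = function known vulnerable, 0 = clean

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict([[150, 18, 4, 6]]))   # flag a previously unseen function
print(clf.feature_importances_)         # which metrics matter most
```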

1 citation


Posted ContentDOI
04 Apr 2023
TL;DR: In this paper, the authors present an analysis platform that integrates several static analysis tools and enables Git-based repositories to continuously monitor warnings across their version history; it provides a visualization component in the form of a dashboard to display security trends and hotspots, and can be used to create a database of security alerts at a scale well-suited for machine learning applications such as bug or vulnerability detection.
Abstract: Static analysis tools come in many forms and configurations, allowing them to handle various tasks in a (secure) development process: code style linting, bug/vulnerability detection, verification, etc., and to adapt to the specific requirements of a software project, thus reducing the number of false positives. The wide range of configuration options poses a hurdle to their use by software developers, as the tools cannot be deployed out-of-the-box. However, static analysis tools only develop their full benefit if they are integrated into the software development workflow and used regularly. Vulnerability management should be integrated via version history to identify hotspots, for example. We present an analysis platform that integrates several static analysis tools and enables Git-based repositories to continuously monitor warnings across their version history. The framework is easily extensible with other tools and programming languages. We provide a visualization component in the form of a dashboard to display security trends and hotspots. Our tool can also be used to create a database of security alerts at a scale well-suited for machine learning applications such as bug or vulnerability detection.
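A minimal sketch of the underlying idea, assuming GitPython and pyflakes are installed and using a placeholder repository path and file name (the platform itself integrates more tools and languages):

```python
# Sketch: count pyflakes warnings for one file across its Git history.
import io
from git import Repo                    # GitPython
from pyflakes.api import check
from pyflakes.reporter import Reporter

repo = Repo(".")                        # placeholder: any Git repository
history = []
for commit in repo.iter_commits(paths="app.py"):   # placeholder file name
    source = (commit.tree / "app.py").data_stream.read().decode("utf-8")
    out, err = io.StringIO(), io.StringIO()
    history.append((commit.hexsha[:8], check(source, "app.py", Reporter(out, err))))

for sha, n_warnings in reversed(history):  # oldest first: a trend line
    print(sha, n_warnings)                 # feed a dashboard or ML dataset
```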

Proceedings ArticleDOI
26 Jun 2023
TL;DR: In this paper, the authors propose a solver rating system based on static taint analysis, which selects the most efficient solver for the program dependencies of critical paths to reduce the false positives and time cost of static vulnerability detection.
Abstract: Static program analysis is of great value for source code vulnerability detection, but it is often limited by scalability bottlenecks. Constraint solvers become inefficient on the complex program dependencies of millions of lines of source code, and it is difficult for a single solver to balance accuracy against time cost. This paper discusses the program dependence and constraint solving of static value-flow analysis, and implements a solver rating system based on static taint analysis, which selects the most efficient solver for the program dependencies of each critical path to reduce the false positives and time cost of static vulnerability detection. Through tests on the Juliet test suites and several real-world projects, we found that the overall performance of the system was better than that of single SMT solvers or default scheduling strategies.
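For illustration, a toy path-feasibility check with one SMT solver (assuming the z3-solver Python package; the constraints are invented). A rating system like the paper's would instead choose among several solvers per critical path:

```python
# Toy feasibility check of a single path condition with Z3.
from z3 import Int, Solver, sat

idx = Int("idx")                 # a value assumed to be tainted user input
s = Solver()
s.add(idx >= 0, idx < 10)        # branch constraints along the path
s.add(idx + 20 > 24)             # condition under which the sink misbehaves

if s.check() == sat:
    print("path feasible, e.g. idx =", s.model()[idx])   # report warning
else:
    print("path infeasible: the warning would be a false positive")
```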


Proceedings ArticleDOI
22 May 2023
TL;DR: In this paper, the authors define five code properties used by humans and automatic detectors to identify code smells, and demonstrate how these code properties can be mapped to the 22 code smells defined by Martin Fowler.
Abstract: Code smells are structures in code that imply potential maintainability problems and may negatively impact software quality. One of the critical challenges with code smells is that their definitions are often vague, difficult to comprehend, and subjective, making them hard to reliably and consistently detect and analyze by humans and automated systems. Most existing code smell detection approaches rely heavily on human interpretation and are typically supported by structural code metrics. Unfortunately, many of these approaches are incomplete and do not cover a range of code properties that could indicate potential code smells. This paper analyzes code smell detection approaches to identify code properties used for code smell detection and analysis. Informed by our previous work and the literature, we define five code properties used by humans and automatic detectors to identify code smells. We demonstrate how various code properties can be mapped to the 22 code smells defined by Martin Fowler. The resulting catalog of properties can help software engineers and code maintainability researchers analyze code smells and build automated code smell detectors that examine properties beyond the traditional structural metrics.
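As a concrete illustration of one structural property, a minimal sketch (threshold and property choice are ours, not the paper's) that flags Long Method candidates by statement count:

```python
# Sketch: flag "Long Method" candidates by statement count (threshold ours).
import ast

THRESHOLD = 20  # illustrative cut-off, not from the paper

def long_methods(source: str):
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            n_stmts = sum(isinstance(n, ast.stmt) for n in ast.walk(node)) - 1
            if n_stmts > THRESHOLD:
                yield node.name, n_stmts

source = open("example.py").read()      # placeholder module under analysis
for name, n in long_methods(source):
    print(f"Long Method? {name}: {n} statements")
```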

Journal ArticleDOI
TL;DR: In this paper, the authors present SEAL, an integrated approach that combines low-level program analysis with high-level repository information; it maps repository information onto a low-level intermediate program representation, making it available for state-of-the-art program analysis.
Abstract: Software projects are complex technical and organizational systems involving large numbers of artifacts and developers. To understand and tame software complexity, a wide variety of program analysis techniques have been developed for bug detection, program comprehension, verification, and more. At the same time, repository mining techniques aim at obtaining insights into the inner socio-technical workings of software projects at a larger scale. While both program analysis and repository mining have been successful on their own, they are largely isolated, which leaves considerable potential for synergies untapped. We present SEAL, the first integrated approach that combines low-level program analysis with high-level repository information. SEAL maps repository information, mined from the development history of a project, onto a low-level intermediate program representation, making it available for state-of-the-art program analysis. SEAL’s integrated approach allows us to efficiently address software engineering problems that span multiple levels of abstraction, from low-level data flow to high-level organizational information. To demonstrate its merits and practicality, we use SEAL to determine which code changes modify central parts of a given software project and how authors interact (indirectly) with each other through code, and we demonstrate that putting static analysis results into a socio-technical context improves their expressiveness and interpretability.

Proceedings ArticleDOI
01 Mar 2023
TL;DR: In this paper, a meta-learner combining a learning-based model and a knowledge-based model predicts the type of issue from a review comment; fine-tuned pre-trained language models then predict the types of issues and locate the parts of the code that need to be revised by developers.
Abstract: Modern code review is a process for early detection and reduction of issues, which assists in ensuring the quality of the source code, detecting anomalies, and identifying potential improvements. However, it is a highly manual activity that requires a lot of resources and time. Recent research has addressed these problems by attempting to entirely automate this task (i.e., generating code reviews). However, we believe that removing the reviewer from this process is not the best option for its optimal functioning, especially considering the high error rates of the proposed approaches. Furthermore, full automation is still far from achievable given the complexity of a task that requires human intelligence. In this work, we aim to assist the reviewer in the code review process. We propose an approach for detecting the type of issue and locating the parts of the code that need to be revised by developers. In the first phase, we propose a meta-learner that combines a learning-based model and a knowledge-based model to predict the type of issue from the review comment. Then, we use this component to create and label a large dataset composed of quadruplets. We use this dataset to fine-tune a pre-trained language model to predict the types of issues (e.g., naming, resource handling, etc.) that need to be addressed in the original code snippet. Furthermore, we fine-tune another pre-trained language model to locate these issues in the source code submitted by developers. We evaluate the performance of our approach using a test set not considered during training. Our results show that our model accurately locates and predicts the types of issues.
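To illustrate the meta-learner idea at a toy scale (the labels, keyword rules, and training comments below are hypothetical, not the authors' dataset or model):

```python
# Toy meta-learner: keyword rules first, ML classifier as fallback.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

RULES = {"rename": "naming", "typo": "naming",
         "leak": "resource handling", "close(": "resource handling"}

comments = ["please rename this variable", "stream is never closed",
            "check for null here", "extract this into a helper"]
labels = ["naming", "resource handling", "validation", "refactoring"]
ml = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(comments, labels)

def issue_type(comment: str) -> str:
    lowered = comment.lower()
    for keyword, label in RULES.items():    # knowledge-based component
        if keyword in lowered:
            return label
    return ml.predict([comment])[0]         # learning-based component

print(issue_type("possible resource leak in this method"))
```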

Journal ArticleDOI
TL;DR: Wang et al., as discussed by the authors, performed a quantitative analysis of the code smell introduction and removal practices of DL developers and found that regular and irregular DL developers differ little in both code smell introduction and removal practices.
Abstract: With the increasing popularity of Deep Learning (DL) software development, code quality issues arise in DL software development. Code smells are one of the factors that reduce the quality of source code. Several previous studies investigated the prevalence of code smells in DL software systems to evaluate the quality of DL source code. However, there is still a lack of understanding of individual DL developers' awareness of code smells. To more deeply understand the code smell risk in DL software development, it is necessary to investigate the code smell awareness of DL developers. In this paper, we present an empirical study on the code smell practices of DL developers. Specifically, we performed a quantitative analysis of the code smell introduction and removal practices of DL developers. We collected a dataset of the code smell introduction and removal history of DL developers from several open source DL software GitHub repositories. We then quantitatively analyzed the collected dataset and observed the following three findings on the code smell introduction and removal practices of DL developers. First, DL developers tend to perform code smell introduction more than code smell removal. Second, DL developers have a slightly broader code smell introduction scope than code smell removal scope. Third, regular and irregular DL developers differ little in both code smell introduction and removal practices. The results indicate that DL developers have very poor awareness of code smell risk. Our findings suggest that DL software development project managers should provide a helpful guideline that encourages DL developers to actively participate in code smell removal tasks.

DissertationDOI
07 Mar 2023
TL;DR: The RefactorErl tool as mentioned in this paper is an extensive static analyzer framework developed for Erlang, which offers several refactorings and code comprehension support for developers, and it can be used for parallelization, debugging or change impact analysis.
Abstract: Functional programming languages are becoming more popular nowadays, so there is high demand for new tools that can support the development process. There are two main types of such tools: one operates on dynamic information by running the code, the other performs static analysis on the source code of the program. Erlang is a functional programming language designed for developing real-world applications. The RefactorErl tool is an extensive static analyzer framework developed for Erlang. The tool offers several refactorings and code comprehension support for developers. In general, the focus of my research was to develop new static analysis methods. These methods extract compound semantic information from the source code, and the results can be used by other analysis methods. My results relate to control flow, control dependence, impact analysis, and the communication model of Erlang programs. In my dissertation, I have presented formal control flow rules based on the semantics of the language. The rules are compositional and can be used to build the control flow graph of an expression or a function. The resulting control flow graphs have been used in further analysis techniques, such as discovering parallelizable components in legacy source code. The information available in the control flow graphs can be used in many ways: for parallelization, debugging, or change impact analysis. The control dependence graph is a more compact representation than the control flow graph; it includes only direct control dependencies, while the control flow graph contains every execution path of a program. I have extended the control dependence graph with data dependencies, which we call the Erlang dependence graph. I have presented an impact analysis method based on the dependence graph. This method can be used for relevant test case selection: it selects only those test cases that could be affected by a change. The presented method can be used not only for impact analysis of refactorings, but can be generalized to arbitrary modifications. I have also presented a method for extracting the communication model from Erlang source code. I have described the algorithms that can be used to identify the processes and the possible communication between them. I have extended the model with the analysis of hidden communication. The Erlang Term Storage (ets) tables can be used as shared memory between processes; any reading or writing operation is treated as an interaction with other processes accessing the same table. I have added the analysis of generic server behaviors as another extension to the process model. This introduces another type of process and additional hidden communication. The results can be used in code comprehension techniques, and the results from the communication model could be used in impact analysis as well.

Posted ContentDOI
01 May 2023
TL;DR: In this paper , the authors investigate the correlation between the complexity of a software code-base and the efficiency of software code reviews and inspections using a hybrid approach, utilizing automated static code analysis tools and manual code reviews to detect potential code quality concerns.
Abstract: The process of software development is complex and requires thorough quality control measures to ensure that software products adhere to the required standards. The primary goal of this study was to investigate the correlation between the complexity of a software code base and the efficiency of software code reviews and inspections. The study used a hybrid approach, utilizing automated static code analysis tools and manual code reviews to detect potential code quality concerns. Additionally, we used cyclomatic complexity and other metrics to measure code complexity. Criteria were defined to select technical personnel from industry for the purpose of conducting a code review and a feedback survey. Afterward, the responses obtained from the survey were used to determine the impact of code-base complexity on the efficiency of software code reviews and inspections.
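For reference, a rough sketch of computing cyclomatic complexity per function with Python's ast module; the counted node types are a simplification of the metric:

```python
# Rough per-function cyclomatic complexity: 1 + number of decision points.
import ast

DECISIONS = (ast.If, ast.For, ast.While, ast.ExceptHandler,
             ast.And, ast.Or, ast.IfExp)   # simplified set of node types

def cyclomatic_complexity(func: ast.AST) -> int:
    return 1 + sum(isinstance(n, DECISIONS) for n in ast.walk(func))

tree = ast.parse(open("module_under_review.py").read())  # placeholder file
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        print(node.name, cyclomatic_complexity(node))
```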

OtherDOI
11 Apr 2023
TL;DR: This chapter, as discussed by the authors, presents Binary Ninja, a powerful reverse engineering tool; it analyzes the control flow of a program, compares source code snippets to disassembled code, and provides the analysis of an unknown algorithm without source code access or decompiled pseudocode.
Abstract: Static analysis is the precursor of dynamic analysis. This chapter provides a brief overview of static analysis tools the readers can use for reverse engineering. Command-line tools can be a useful part of the reverse engineering process, especially in the initial information-gathering phase. Binary Ninja is a powerful reverse engineering tool. It comes with unique features and a cloud-based version worth highlighting. The chapter shows how pointers work in assembly, analyzes the control flow of a program, and compares source code snippets to disassembled code. It also provides the analysis of an unknown algorithm without source code access or decompiled pseudocode. The chapter is meant to help the readers practice analyzing the conditional flow of disassembled code and dissecting the meaning of every instruction.

Proceedings ArticleDOI
01 Mar 2023
TL;DR: Wang et al., as discussed by the authors, proposed a slice-based code change representation approach, which considers data and control dependencies between changed code and unchanged code, together with a pre-trained sparse Transformer model that learns code change representations with three pre-training tasks.
Abstract: Code changes are at the very core of software development and maintenance. Deep learning techniques have been used to build models from massive numbers of code changes to solve software engineering tasks, e.g., commit message generation and bug-fix commit identification. However, existing code change representation learning approaches represent a code change as lexical tokens or syntactical AST (abstract syntax tree) paths, limiting their capability to learn the semantics of code changes. Besides, they mostly do not consider noisy or tangled code changes, hurting the accuracy of the solved tasks. To address the above problems, we first propose a slice-based code change representation approach which considers data and control dependencies between changed code and unchanged code. Then, we propose a pre-trained sparse Transformer model, named CCS2VEC, to learn code change representations with three pre-training tasks. Our experiments, fine-tuning our pre-trained model on three downstream tasks, demonstrate the improvement of CCS2VEC over the state-of-the-art CC2VEC.

Posted ContentDOI
03 May 2023
TL;DR: In this paper, the authors report the challenges of cascading warnings generated from two versions of a program, investigate program differencing tools, and extend them to perform warning cascading automatically.
Abstract: Static analysis is widely used for software assurance. However, static analysis tools can report an overwhelming number of warnings, many of which are false positives. When static analysis is applied to a new version of a program, a large number of warnings may be relevant only to the old version. Inspecting these warnings is a waste of time and can prevent developers from finding the new bugs in the new version. In this paper, we report the challenges of cascading warnings generated from two versions of programs. We investigated program differencing tools and extended them to perform warning cascading automatically. Specifically, we used a text-based diff tool, namely SCALe, an abstract syntax tree (AST) based diff tool, namely GumTree, and a control flow graph (CFG) based diff tool, namely Hydrogen. We report our experience of applying these tools, and we hope our findings can give developers an understanding of the pros and cons of each approach. In our evaluation, we used 96 pairs of benchmark programs for which we know the ground-truth bugs and fixes, as well as 12 pairs of real-world open-source projects. Our tools and data are available at https://github.com/WarningCas/WarningCascading_Data.
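A small sketch of the text-based cascading idea (file names and the warning line are placeholders): map an old-version warning line to the new version with difflib, and treat unmatched lines as warnings that need re-inspection:

```python
# Sketch: cascade an old-version warning line number into the new version.
import difflib

old_src = open("old/module.c").read().splitlines()   # placeholder paths
new_src = open("new/module.c").read().splitlines()

def cascade_line(old_line):
    matcher = difflib.SequenceMatcher(None, old_src, new_src)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal" and i1 <= old_line < i2:
            return j1 + (old_line - i1)   # same line, shifted position
    return None   # line changed or deleted: warning must be re-inspected

print(cascade_line(41))   # 0-based line of a hypothetical old warning
```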

Journal ArticleDOI
TL;DR: In this article, the authors present CertiCAN, a tool produced using the Coq proof assistant and based on the Prosa library for the formal verification of CAN analysis results, which can certify the results of RTaW-Pegase, an industrial CAN analysis tool.
Abstract: We present CertiCAN, a tool produced using the Coq proof assistant and based on the Prosa library for the formal verification of CAN analysis results. Result verification is a process that is lightweight and flexible compared to tool verification. Indeed, the formal verification of an industrial analyzer needs access to the source code, requires the proof of many optimizations or implementation tricks and new proof effort at each software update. In contrast, CertiCAN only relies on the result provided by such a tool and remains independent of the tool itself or its updates. Furthermore, it is usually faster to check a result than to produce it. All these reasons make CertiCAN a practical choice for industrial purposes. CertiCAN is based on the formal verification and combined use of two well-known CAN analysis techniques completed with additional optimizations. Experiments demonstrate that CertiCAN is computationally efficient and faster than the underlying combined analysis. It is able to certify the results of RTaW-Pegase, an industrial CAN analysis tool, even for large systems. This result paves the way for a broader acceptance of formal tools for the certification of real-time systems analysis results.
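For context, the kind of fixed-point equation such CAN analyses certify, in a simplified form of the classic worst-case response-time analysis (the notation is assumed here, not taken from the paper):

```latex
% Sketch: simplified classic CAN worst-case response-time analysis.
% C_i: frame transmission time, B_i: blocking by lower-priority frames,
% T_j: period, J_j: queuing jitter, \tau_{bit}: time to transmit one bit.
w_i^{(n+1)} = B_i + \sum_{j \in hp(i)}
    \left\lceil \frac{w_i^{(n)} + J_j + \tau_{bit}}{T_j} \right\rceil C_j ,
\qquad
R_i = J_i + w_i + C_i
```

The queueing delay $w_i$ is iterated to a fixed point; checking a claimed $R_i$ is generally cheaper than deriving it, which is what makes result verification attractive here.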

Journal ArticleDOI
TL;DR: In this article, the authors propose Touba, a model that uses a syntax-metric parser engine to detect insecure software code bloats and security vulnerabilities; it assesses and analyzes the discovered cases and provides an interactive method for code review and statistical analysis.
Abstract: The term “code smell” or “bad smell” refers to code that has been written incorrectly and reflects severe defects in software design. Some code smells, in particular, cause security vulnerabilities in software code. Until now, identification of such code has mainly been done through software tools rather than process methods or models. Based on the Mikado methodology, this paper proposes a model that uses a syntax-metric parser engine to detect insecure software code bloats and security vulnerabilities. This model, named Touba, assesses and analyzes the discovered cases and provides an interactive method for code review and statistical analysis. Employing the proposed model to test the Juliet Test Suites shows its outstanding performance in terms of the selected measures of precision, recall, and F-measure. The obtained results show that the proposed model performs better than the existing tools by 20.3% in accuracy, 16.76% in recall, and 18.61% in F-measure on average. These results indicate the effectiveness of the proposed security vulnerability identification model as the main contribution of this investigation.

Proceedings ArticleDOI
01 Mar 2023
TL;DR: WasmA, as discussed by the authors, is a static analysis framework for WebAssembly that determines the information needed by static client analyses, such as call, control-flow, and data-flow graphs, to reveal security issues.
Abstract: The usage of WebAssembly (Wasm) is increasing not only in the web browser, but also as a backend technology on servers. Since Wasm introduces several security issues, such as the possibility to obfuscate malicious code and cryptomining, an adequate analysis framework is needed for creating analyses that reveal such issues. Existing state-of-the-art analysis approaches lack soundness, fail to fully provide essential information to client analyses, or entail a considerable amount of overhead due to their dynamic nature. To meet this challenge, we developed WasmA, a static analysis framework for WebAssembly that determines the information needed by static client analyses, such as call, control-, and data-flow graphs. In the evaluation, we show that WasmA is performant, generic, and extensible, and thus competitive with state-of-the-art tools. The implementation of a cryptominer detection tool on top of WasmA shows its applicability. WasmA provides the required functionality while taking a comparatively resource-efficient approach; as a result, WasmA outperforms the state of the art.

Posted ContentDOI
10 Feb 2023
TL;DR: In this article, the authors report the results of a classroom experiment spanning 3 academic semesters, involving 65 submissions that carried out code review activity against 690 rules using PMD.
Abstract: Static analysis tools are frequently used to scan the source code and detect deviations from the project coding guidelines. Given their importance, linters are often introduced in classrooms to educate students on how to detect and potentially avoid these code anti-patterns. However, little is known about their effectiveness in raising students' awareness, given that these linters tend to generate a large number of false positives. To increase awareness of potential coding issues that violate coding standards, in this paper we reflect on our experience with teaching the use of static analysis for the purpose of evaluating its effectiveness in helping students improve software quality. This paper discusses the results of an experiment in the classroom over a period of 3 academic semesters, involving 65 submissions that carried out code review activity against 690 rules using PMD. The results of the quantitative and qualitative analysis show that the presence of a set of PMD quality issues influences the acceptance or rejection of the issues, that design- and best-practice-related categories take a longer time to be resolved, and that students acknowledge the potential of using static analysis tools during code review. Through this experiment, code review can become a vital part of the educational computing plan. We envision our findings enabling educators to support students with code review strategies, raising students' awareness of static analysis tools and scaffolding their coding skills.

Posted ContentDOI
20 Apr 2023
TL;DR: In this article, a method combining machine learning with a static analysis tool (i.e., Infer) is proposed to automatically repair source code: resource leak bugs are fixed in IR space, and a sequence-to-sequence model proposes fixes in source code space.
Abstract: We propose a method combining machine learning with a static analysis tool (i.e., Infer) to automatically repair source code. Machine learning methods perform well for producing idiomatic source code. However, their output is sometimes difficult to trust, as language models can output incorrect code with high confidence. Static analysis tools are trustable, but also less flexible and produce non-idiomatic code. In this paper, we propose to fix resource leak bugs in IR space, and to use a sequence-to-sequence model to propose fixes in source code space. We also study several decoding strategies, and use Infer to filter the output of the model. On a dataset of CodeNet submissions with potential resource leak bugs, our method is able to find a function with the same semantics that does not raise a warning, with around 97% precision and 66% recall.

Journal ArticleDOI
TL;DR: Zhang et al., as discussed by the authors, proposed a machine-learning-based system that can produce human-readable summarizations of functionality in the context of code vulnerability analysis; they generated the first assembly-code-to-function-summary dataset and leverage an encoder-decoder architecture.
Abstract: Reverse engineering is the process of understanding the inner working of a software system without having the source code. It is critical for firmware security validation, software vulnerability research, and malware analysis. However, it often requires a significant amount of manual effort. Recently, data-driven solutions were proposed to reduce manual effort by identifying code clones on the assembly or the source level. However, security analysts still have to understand the matched assembly or source code to develop an understanding of the functionality, and it is assumed that such a matched candidate always exists. This research bridges the gap by introducing the problem of assembly code summarization. Given assembly code as input, we propose a machine-learning-based system that can produce human-readable summarizations of the functionalities in the context of code vulnerability analysis. We generate the first assembly-code-to-function-summary dataset and propose to leverage the encoder-decoder architecture. With the attention mechanism, it is possible to understand what aspects of the assembly code had the largest impact on generating the summary. Our experiment shows that the proposed solution achieves high accuracy and a high Bilingual Evaluation Understudy (BLEU) score. Finally, we have performed case studies on real-life CVE vulnerability cases to better understand the proposed method’s performance and practical implications.

Proceedings ArticleDOI
01 Mar 2023
TL;DR: Kim et al., as mentioned in this paper, implemented and evaluated TravTrans, which is based on the powerful text prediction language model GPT-2, by training on abstract syntax trees instead of treating code as plain text.
Abstract: Automatic code completion is one of the most popular developer assistance features and is usually performed using static program analysis. But studies have shown that static analysis can be riddled with false positives. Another issue with static-analysis-based code completion is that the recommendation does not take into account the history/context in which the software is operating and relies on type/alphabetical ordering of suggestions. A recent development that has shown promise in this direction is the use of language models such as transformers, trained on real-world code from GitHub, to provide context-sensitive, accurate code completion suggestions. Studies on transformer-based code completion have shown that program structure can be leveraged; i.e., training transformers on structural representations of code (specifically ASTs) can have a positive impact on the accuracy of code completion. To this end, the work by Kim et al. implemented and evaluated TravTrans, which is based on the powerful text prediction language model GPT-2, by training on abstract syntax trees instead of treating code as plain text. Using an alternative source code representation such as the AST provides the already potent language model with an additional layer of program semantic awareness. But TravTrans adopted several rigid choices regarding various components of the transformer architecture, like embedding sizes and sliding windows. TravTrans also suffers from out-of-vocabulary issues. In this paper, we reproduce the TravTrans model and perform a deeper, fine-grained analysis of the impact of various architectural and code-level settings on the prediction. As a result of our fine-grained analysis, we also identify several aspects that need improvement, such as the fact that the model performs particularly poorly with code involving dictionaries and lists. We also offer solutions to a few of the issues, such as the out-of-vocabulary issue. Finally, our results motivate the need for a customizable transformer architecture for coding tasks.
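A tiny sketch of the "code as AST" idea: serializing a Python function's AST into a pre-order token sequence of the sort a GPT-2-style model could be trained on (an illustration, not TravTrans's actual encoding):

```python
# Sketch: flatten a Python AST into a pre-order token sequence.
import ast

def preorder(node: ast.AST):
    yield node
    for child in ast.iter_child_nodes(node):
        yield from preorder(child)

def ast_tokens(source: str) -> list:
    tokens = []
    for node in preorder(ast.parse(source)):
        tokens.append(type(node).__name__)      # structural token
        if isinstance(node, ast.Name):
            tokens.append(node.id)              # keep identifiers as values
    return tokens

print(ast_tokens("def add(a, b):\n    return a + b"))
# ['Module', 'FunctionDef', 'arguments', 'arg', 'arg', 'Return', 'BinOp', ...]
```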

Proceedings ArticleDOI
01 Mar 2023
TL;DR: In this paper, the authors train a CodeT5 model with additional data from the target project to complete code that appropriately fits a specific project; this approach is called domain adaptation and is often used in neural machine translation.
Abstract: Code completion has the benefit of improving coding speed and reducing the chance of inducing bugs. In recent years, DL-based code completion techniques have been proposed. In particular, pre-trained models have shown outstanding performance because they can complete code by considering the context before and after the completion point. While such a model can generate a set of candidate codes, some of those might need to be modified by developers because projects can have different coding rules. In this study, to complete code that appropriately fits a specific project, we train the CodeT5 model with additional data from the target project. This fine-tuning approach is called domain adaptation, and is often used in neural machine translation. Our preliminary experiment shows that our domain-adapted model improves the perfect prediction rate by 5.3% and the edit distance rate by 3.4%, compared to the model fine-tuned on an out-of-domain dataset. Furthermore, we discover that the improvement is greater with a larger repository size. The model trained with a small dataset, however, hardly improves or even performs worse.

Posted ContentDOI
05 Jun 2023
TL;DR: In this article, the authors propose a static evaluation framework to quantify static errors in Python code completions by leveraging Abstract Syntax Trees, which is not only more efficient but also applicable to code in the wild.
Abstract: Large language models trained on code have shown great potential to increase the productivity of software developers. Several execution-based benchmarks have been proposed to evaluate the functional correctness of model-generated code on simple programming problems. Nevertheless, it is expensive to perform the same evaluation on complex real-world projects considering the execution cost. In contrast, static analysis tools such as linters, which can detect errors without running the program, have not been well explored for evaluating code generation models. In this work, we propose a static evaluation framework to quantify static errors in Python code completions, by leveraging Abstract Syntax Trees. Compared with execution-based evaluation, our method is not only more efficient, but also applicable to code in the wild. For experiments, we collect code context from open source repos to generate one million function bodies using public models. Our static analysis reveals that Undefined Name and Unused Variable are the most common errors among others made by language models. Through extensive studies, we also show the impact of sampling temperature, model size, and context on static errors in code completions.
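A simplified sketch of what AST-based static checking of generated code can look like; real linters such as pyflakes handle scoping far more carefully, and the two rules below are naive approximations of the paper's Undefined Name and Unused Variable categories:

```python
# Naive sketch: approximate "undefined name" / "unused variable" checks.
import ast, builtins

def naive_static_errors(source: str) -> list:
    try:
        tree = ast.parse(source)
    except SyntaxError as e:
        return [f"syntax error: {e.msg}"]
    bound, used = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            (bound if isinstance(node.ctx, ast.Store) else used).add(node.id)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            bound.add(node.name)
    undefined = used - bound - set(dir(builtins))
    unused = bound - used
    return ([f"undefined name: {n}" for n in sorted(undefined)] +
            [f"unused variable: {n}" for n in sorted(unused)])

print(naive_static_errors("def f(x):\n    y = x + z\n    return x"))
```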

Journal ArticleDOI
TL;DR: In this paper, the authors investigate what types of comments novice students document in their source code and further categorize those comments using a machine learning approach; code commenting is a common practice among developers to explain the purpose of the code and improve code comprehension and readability.
Abstract: Code comments are considered an efficient way to document the functionality of a particular block of code. Code commenting is a common practice among developers to explain the purpose of the code in order to improve code comprehension and readability. Researchers have investigated the effect of code comments on software development tasks and demonstrated the use of comments in several ways, including maintenance, reusability, bug detection, etc. Given the importance of code comments, it becomes vital for novice developers to brush up on their code commenting skills. In this study, we initially investigated what types of comments novice students document in their source code and then categorized those comments using a machine learning approach. The work involves an initial manual classification of code comments and then building a machine learning model to classify student code comments automatically. The findings of our study revealed that novice developers'/students' comments are mainly related to Literal (26.66%) and Insufficient (26.66%) categories. Further, we proposed and extended the taxonomy of such source code comments by adding a few more categories, i.e., License (5.18%), Profile (4.80%), Irrelevant (4.80%), Commented Code (4.44%), Autogenerated (1.48%), and Improper (1.10%). Moreover, we assessed our approach with three different machine-learning classifiers. Our implementation of machine learning models found that the Decision Tree resulted in the overall highest accuracy, i.e., 85%. This study helps in predicting the type of code comments for a novice developer using a machine learning approach that can be implemented to generate automated feedback for students, thus saving teachers the time needed for manual one-on-one feedback, which is a time-consuming activity.
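An illustrative sketch of the classification step with toy data (the comments and labels below are invented; only the category names come from the paper's taxonomy), using the Decision Tree classifier the study reports as most accurate:

```python
# Toy sketch: TF-IDF features + Decision Tree on invented comments.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

comments = ["# increment i by 1", "# TODO", "# MIT License 2023",
            "# Author: Jane Doe"]
labels = ["Literal", "Insufficient", "License", "Profile"]

model = make_pipeline(TfidfVectorizer(), DecisionTreeClassifier(random_state=0))
model.fit(comments, labels)
print(model.predict(["# Apache License 2.0", "# add one to the counter"]))
```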

Proceedings ArticleDOI
14 Jun 2023
TL;DR: In this paper, a study involving 11 static code analysers and one AI-powered chatbot, ChatGPT, was carried out to assess their effectiveness in detecting 92 vulnerabilities representing the top 10 known vulnerability categories in web applications, as classified by OWASP.
Abstract: The prevalence and significance of web services in our daily lives make it imperative to ensure that they are, as much as possible, free from vulnerabilities. However, developing a complex piece of software free from any security vulnerabilities is hard, if not impossible. One way to progress towards achieving this holy grail is by using static code analysis tools to root out any common or known vulnerabilities that may accidentally be introduced during the development process. Static code analysis tools have significantly contributed to addressing the problem above, but are imperfect. It is conceivable that static code analysis can be improved by using AI-powered tools, which have recently increased in popularity. However, there is still very little work analysing the effectiveness of both types of tools, and this is a research gap that our paper aims to fill. We carried out a study involving 11 static code analysers and one AI-powered chatbot, ChatGPT, to assess their effectiveness in detecting 92 vulnerabilities representing the top 10 known vulnerability categories in web applications, as classified by OWASP. We particularly focused on PHP vulnerabilities since PHP is one of the most widely used languages in web applications, yet it has few security mechanisms to help its software developers. We found that the success rate of ChatGPT in terms of finding security vulnerabilities in PHP is around 62-68%. At the same time, the best traditional static code analyser tested has a success rate of 32%. Even combining several traditional static code analysers (with the best features on certain aspects of detection) would only achieve a rate of 53%, which is still significantly lower than ChatGPT's success rate. Nonetheless, ChatGPT has a very high false positive rate of 91%. In comparison, the worst false positive rate of any traditional static code analyser is 82%. These findings highlight the promising potential of ChatGPT for improving the static code analysis process but reveal certain caveats (especially regarding accuracy) in its current state. Our findings suggest that one interesting possibility to explore in future work would be to pick the best of both worlds by combining traditional static code analysers with ChatGPT to find security vulnerabilities more effectively.

Posted ContentDOI
15 Feb 2023
TL;DR: In this article, a backward analysis is used to obtain path sensitivity for a variable, which contributes to the overall improvement of static type analysis for dynamically typed languages such as Python.
Abstract: Precise and fast static type analysis for dynamically typed languages is very difficult. This is mainly because the lack of static type information makes it difficult to approximate all possible values of a variable. Indeed, existing static type analysis methods are imprecise or slow. In this paper, we propose a novel method to improve the precision of static type analysis for Python code, in which a backward analysis is used to obtain path sensitivity. By doing so, our method aims to obtain more precise static type information, which contributes to the overall improvement of static analysis. To show the effectiveness of our method, we conducted a preliminary experiment comparing our implementation and an existing analysis tool with respect to precision and time efficiency. The results show that our method provides more precise type analysis with fewer false positives than the existing static type analysis tool. They also show that our method increases analysis time, but the time remains within a practical range.
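For contrast, a naive flow-insensitive sketch that merely collects the set of literal types assigned to each variable; it ignores path conditions entirely, which is exactly the imprecision a path-sensitive backward analysis targets:

```python
# Naive flow-insensitive sketch: set of literal types assigned per variable.
import ast

def literal_type_sets(source: str) -> dict:
    types = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Constant):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    types.setdefault(target.id, set()).add(
                        type(node.value.value).__name__)
    return types

src = "x = 1\nif flag:\n    x = 'one'\n"
print(literal_type_sets(src))   # {'x': {'int', 'str'}} -- branches merged
```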

Journal ArticleDOI
TL;DR: In this article, the authors present an analysis of example usage barriers programmers faced in a previous study of React.js novices, and they build REVEAL to detect and repair the common errors they identified in copied code.
Abstract: Programmers typically learn APIs on-the-fly through web examples. Incompatibilities and omissions in copied example code can create barriers for these learners. We present an analysis of example usage barriers programmers faced in a previous study of React.js novices. We show that a small set of errors prevented programmers from using most found code examples. In response, we built REVEAL to detect and repair the common errors we identified in copied code. We describe the formative evaluation of REVEAL and show that REVEAL users were more likely to successfully integrate code examples than participants in the previous study.

Book ChapterDOI
01 Jan 2023
TL;DR: In this paper, the authors divide the malware analysis team into two different teams, each with a specific mission and provided with toolkits to research and analyze malicious code and clarify its harmful effects on the system.
Abstract: This research consolidates and standardizes a centralized anti-malware approach, giving the malware analysis team a total solution that makes malware investigation and analysis simpler and thereby helps trace the source of the malicious code. By aggregating the world's leading solutions as well as some self-developed solutions, it provides highly efficient centralized malware detection, analysis, and treatment. The solution divides the malware analysis team into two different teams, each with a specific mission. The first team is provided with toolkits to research and analyze malicious code and clarify the harmful effects of malicious code on the system. The second team is tasked with receiving analytical information from team one and using it to find the source of the malicious code across a number of network intelligence sources, thereby finding solutions to deal with the malicious code, identifying its source, and identifying the hacker group attacking the organization. The teaming and the provision of these solution-specific tools give the organization's malware research team a complete malware handling process as well as useful tools for analyzing, handling, and tracing the source of malicious code.

Proceedings ArticleDOI
17 Mar 2023
TL;DR: In this article, the authors use graph neural networks and graph convolutional neural networks (GCN) to detect vulnerabilities in Java code and report the vulnerability content.
Abstract: Nowadays, using different tools and apps is a basic part of everyday life, but the security issues arising from source code plagiarism among tools and apps can bring huge losses to companies and even countries, so detecting vulnerabilities or malicious code in software is an important part of protecting information and assessing software security. This project studies Java code vulnerability detection based on graph representation learning and software homology comparison. Building on deep learning, it uses a large amount of vulnerable source code, extracts its features, and transforms them into graphs so that source code under test can be compared and the vulnerability content reported. The main work and results of this project are as follows: 1. Feature information of the relevant Java code is extracted from the example dataset and saved as JSON files; vector files, bytecode files, and dot files are generated, and nodes and edges are extracted in batches from the contents of the dot files for subsequent machine learning use (see the sketch below); the sequence of steps forms a logically self-consistent pipeline that ensures the integrity of the project. 2. Based on a study of graph neural networks and graph convolutional neural networks, suitable models are selected for machine learning on the generated files, and the models are manually tuned to ensure good learning results and feedback. 3. The negative samples of the training dataset come from the shared SARD dataset, which contains 46,636 Java vulnerability source files along with the dataset's supporting environment; the negative samples of the test dataset also come from SARD, while the positive samples were generated by the responsible project members. 4. Based on graph neural networks (GNN) and graph convolutional neural networks (GCN), the project designs and implements a complete automated vulnerability detection system for Java code. 5. After an extensive search of related domestic and foreign papers and materials, no project or content identical to this one was found; the problems addressed and the ideas involved are the team's own thinking and solutions.
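A small sketch of the node/edge extraction step (assuming networkx with pydot installed; the dot file name is a placeholder), turning an exported code graph into index pairs a GNN/GCN framework can consume:

```python
# Sketch: load an exported dot graph and emit node/edge data for a GNN.
import networkx as nx            # requires the pydot extra for read_dot

graph = nx.nx_pydot.read_dot("java_method_cfg.dot")  # placeholder file
nodes = list(graph.nodes(data=True))   # node ids plus attributes (labels)
edges = list(graph.edges())

index = {node_id: i for i, (node_id, _) in enumerate(nodes)}
edge_index = [(index[u], index[v]) for u, v in edges]   # GNN-style edges
print(len(nodes), "nodes,", len(edges), "edges")
print(edge_index[:5])
```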

Posted ContentDOI
10 Apr 2023
TL;DR: QChecker, as discussed by the authors, is a static analysis tool that supports finding bugs in quantum programs written in Qiskit; static analysis can help find bugs and potential problems in software that may only appear at runtime.
Abstract: Static analysis is the process of analyzing software code without executing the software. It can help find bugs and potential problems in software that may only appear at runtime. Although many static analysis tools have been developed for classical software, due to the nature of quantum programs, these existing tools are unsuitable for analyzing quantum programs. This paper presents QChecker, a static analysis tool that supports finding bugs in quantum programs in Qiskit. QChecker consists of two main modules: a module for extracting program information based on abstract syntax tree (AST), and a module for detecting bugs based on patterns. We evaluate the performance of QChecker using the Bugs4Q benchmark. The evaluation results show that QChecker can effectively detect various bugs in quantum programs.
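An illustrative pattern-based check in the spirit of AST bug detection for quantum programs (a toy pattern of our own, not one of QChecker's actual rules): warn when code calls execute/run but never measures:

```python
# Toy AST pattern: circuit is executed/run but nothing is ever measured.
import ast

def missing_measurement(source: str) -> bool:
    calls = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            if isinstance(node.func, ast.Attribute):
                calls.add(node.func.attr)
            elif isinstance(node.func, ast.Name):
                calls.add(node.func.id)
    return (("execute" in calls or "run" in calls)
            and not calls & {"measure", "measure_all"})

program = "qc = QuantumCircuit(2)\nqc.h(0)\nresult = execute(qc, backend)\n"
print(missing_measurement(program))   # True: likely missing a measurement
```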