Author

Yulei Sui

Other affiliations: Northwestern Polytechnical University, Australian Artificial Intelligence Institute, University of New South Wales

Bio: Yulei Sui is an academic researcher from University of Technology, Sydney. The author has contributed to research in topics: Computer science & Pointer analysis. The author has an h-index of 19 and has co-authored 84 publications receiving 1027 citations. Previous affiliations of Yulei Sui include Northwestern Polytechnical University & Australian Artificial Intelligence Institute.

[Chart: papers published per year, 2009 to 2023]

Papers

Proceedings Article

SVF: interprocedural static value-flow analysis in LLVM

Yulei Sui¹, Jingling Xue¹

University of New South Wales¹

17 Mar 2016

TL;DR: SVF, which is fully implemented in LLVM, allows value-flow construction and pointer analysis to be performed in an iterative manner, thereby providing increasingly improved precision for both.

Abstract: This paper presents SVF, a tool that enables scalable and precise interprocedural Static Value-Flow analysis for C programs by leveraging recent advances in sparse analysis. SVF, which is fully implemented in LLVM, allows value-flow construction and pointer analysis to be performed in an iterative manner, thereby providing increasingly improved precision for both. SVF accepts points-to information generated by any pointer analysis (e.g., Andersen’s analysis) and constructs an interprocedural memory SSA form, in which the def-use chains of both top-level and address-taken variables are captured. Such value-flows can be subsequently exploited to support various forms of program analysis or enable more precise pointer analysis (e.g., flow-sensitive analysis) to be performed sparsely. By dividing a pointer analysis into three loosely coupled components: Graph, Rules and Solver, SVF provides an extensible interface for users to write their own solutions easily. SVF is publicly available at http://unsw-corg.github.io/SVF.

241 citations

Proceedings Article

Static memory leak detection using full-sparse value-flow analysis

Yulei Sui¹, Ding Ye¹, Jingling Xue¹

University of New South Wales¹

15 Jul 2012

TL;DR: Saber is the first to use a full-sparse value-flow analysis for leak detection in C programs, and is effective at detecting 211 leaks in the 15 SPEC2000 C programs and five applications, while keeping the false positive rate at 18.5%.

Abstract: We introduce a static detector, Saber, for detecting memory leaks in C programs. Leveraging recent advances on sparse pointer analysis, Saber is the first to use a full-sparse value-flow analysis for leak detection. Saber tracks the flow of values from allocation to free sites using a sparse value-flow graph (SVFG) that captures def-use chains and value flows via assignments for all memory locations represented by both top-level and address-taken pointers. By exploiting field-, flow- and context-sensitivity during different phases of the analysis, Saber detects leaks in a program by solving a graph reachability problem on its SVFG. Saber, which is fully implemented in Open64, is effective at detecting 211 leaks in the 15 SPEC2000 C programs and five applications, while keeping the false positive rate at 18.5%. We have also compared Saber with Fastcheck (which analyzes allocated objects flowing only into top-level pointers) and Sparrow (which handles all allocated objects using abstract interpretation) using the 15 SPEC2000 C programs. Saber is as accurate as Sparrow but is 14.2X faster and reports 40.7% more bugs than Fastcheck at a slightly higher false positive rate but is only 3.7X slower.

105 citations

Journal Article

Reinforcement-Learning-Guided Source Code Summarization via Hierarchical Attention

Wenhua Wang¹, Yuqun Zhang¹, Yulei Sui, Yao Wan², Zhou Zhao², Jian Wu², Philip S. Yu, Guandong Xu³

Southern University of Science and Technology¹, Zhejiang University², IT University³

10 Mar 2020-IEEE Transactions on Software Engineering

TL;DR: This paper presents a new code summarization approach using hierarchical attention network by incorporating multiple code features, including type-augmented abstract syntax trees and program control flows into a deep reinforcement learning (DRL) framework for comment generation.

Abstract: Code summarization (aka comment generation) provides a high-level natural language description of the function performed by code, which can benefit the software maintenance, code categorization and retrieval. To the best of our knowledge, the state-of-the-art approaches follow an encoder-decoder framework which encodes source code into a hidden space and later decodes it into a natural language space. Such approaches suffer from the following drawbacks: (a) they are mainly input by representing code as a sequence of tokens while ignoring code hierarchy; (b) most of the encoders only input simple features (e.g., tokens) while ignoring the features that can help capture the correlations between comments and code; (c) the decoders are typically trained to predict subsequent words by maximizing the likelihood of subsequent ground truth words, while in real world, they are expected to generate the entire word sequence from scratch. As a result, such drawbacks lead to inferior and inconsistent comment generation accuracy. To address the above limitations, this paper presents a new code summarization approach using hierarchical attention network by incorporating multiple code features, including type-augmented abstract syntax trees and program control flows. Such features, along with plain code sequences, are injected into a deep reinforcement learning (DRL) framework (e.g., actor-critic network) for comment generation. Our approach assigns weights (pays “attention”) to tokens and statements when constructing the code representation to reflect the hierarchical code structure under different contexts regarding code features (e.g., control flows and abstract syntax trees).
Our reinforcement learning mechanism further strengthens the prediction results through the actor network and the critic network, where the actor network provides the confidence of predicting subsequent words based on the current state, and the critic network computes the reward values of all the possible extensions of the current state to provide global guidance for explorations. Eventually, we employ an advantage reward to train both networks and conduct a set of experiments on a real-world dataset. The experimental results demonstrate that our approach outperforms the baselines by around 22% to 45% in BLEU-1 and outperforms the state-of-the-art approaches by around 5% to 60% in terms of S-BLEU and C-BLEU.

73 citations

Journal Article

Detecting Memory Leaks Statically with Full-Sparse Value-Flow Analysis

Yulei Sui¹, Ding Ye¹, Jingling Xue¹

University of New South Wales¹

01 Feb 2014-IEEE Transactions on Software Engineering

TL;DR: Saber is the first to use a full-sparse value-flow analysis for detecting memory leaks statically, and compares favorably with several static leak detectors in terms of accuracy and scalability.

Abstract: We introduce a static detector, Saber, for detecting memory leaks in C programs. Leveraging recent advances on sparse pointer analysis, Saber is the first to use a full-sparse value-flow analysis for detecting memory leaks statically. Saber tracks the flow of values from allocation to free sites using a sparse value-flow graph (SVFG) that captures def-use chains and value flows via assignments for all memory locations represented by both top-level and address-taken pointers. By exploiting field-, flow- and context-sensitivity during different phases of the analysis, Saber detects memory leaks in a program by solving a graph reachability problem on its SVFG. Saber, which is fully implemented in Open64, is effective at detecting 254 leaks in the 15 SPEC2000 C programs and seven applications, while keeping the false positive rate at 18.3 percent. Saber compares favorably with several static leak detectors in terms of accuracy (leaks and false alarms reported) and scalability (LOC analyzed per second). In particular, compared with Fastcheck (which analyzes allocated objects flowing only into top-level pointers) using the 15 SPEC2000 C programs, Saber detects 44.1 percent more leaks at a slightly higher false positive rate but is only a few times slower.

72 citations

Proceedings Article

Typestate-guided fuzzer for discovering use-after-free vulnerabilities

Haijun Wang¹, Xiaofei Xie², Yi Li², Cheng Wen¹, Yuekang Li², Yang Liu², Shengchao Qin³, Hongxu Chen², Yulei Sui⁴

Shenzhen University¹, Nanyang Technological University², Teesside University³, University of Technology, Sydney⁴

27 Jun 2020

TL;DR: This work proposes to model UaF vulnerabilities as typestate properties, develops a typestate-guided fuzzer, named UAFL, for discovering vulnerabilities violating typestate properties, and shows that UAFL substantially outperforms the state-of-the-art fuzzers in terms of the time taken to discover vulnerabilities.

Abstract: Existing coverage-based fuzzers usually use the individual control flow graph (CFG) edge coverage to guide the fuzzing process, which has shown great potential in finding vulnerabilities. However, CFG edge coverage is not effective in discovering vulnerabilities such as use-after-free (UaF). This is because, to trigger UaF vulnerabilities, one needs not only to cover individual edges, but also to traverse some (long) sequence of edges in a particular order, which is challenging for existing fuzzers. To this end, we propose to model UaF vulnerabilities as typestate properties, and develop a typestate-guided fuzzer, named UAFL, for discovering vulnerabilities violating typestate properties. Given a typestate property, we first perform a static typestate analysis to find operation sequences potentially violating the property. Our fuzzing process is then guided by the operation sequences in order to progressively generate test cases triggering property violations. In addition, we also employ an information flow analysis to improve the efficiency of the fuzzing process. We have performed a thorough evaluation of UAFL on 14 widely-used real-world programs. The experiment results show that UAFL substantially outperforms the state-of-the-art fuzzers, including AFL, AFLFast, FairFuzz, MOpt, Angora and QSYM, in terms of the time taken to discover vulnerabilities. We have discovered 10 previously unknown vulnerabilities, and received 5 new CVEs.

71 citations

Cited by

Vulnerability analysis of computer systems based on algebraic structures and National Vulnerability Database data feeds

Воробьев Антон Александрович

01 Jan 2013

TL;DR: This work applies Boolean algebras to develop a mathematical model describing the exploits in the NVD data source under a classification based on the concept of measurement, and proves that the proposed quasimeasure is a measure in the sense of measure theory.

Abstract: This work continues earlier studies on the analysis of vulnerabilities in computer systems. It applies Boolean algebras to develop a mathematical model describing the exploits in the NVD data source under a classification based on the concept of measurement. A quasimeasure is proposed for the Boolean algebra and is proved to be a measure in the sense of measure theory. The algebraic structure is also shown to be an algebra of events.

264 citations

Proceedings Article

SVF: interprocedural static value-flow analysis in LLVM

Yulei Sui¹, Jingling Xue¹

University of New South Wales¹

17 Mar 2016

TL;DR: SVF, which is fully implemented in LLVM, allows value-flow construction and pointer analysis to be performed in an iterative manner, thereby providing increasingly improved precision for both.

241 citations

Proceedings Article

Learning Selective Self-Mutual Attention for RGB-D Saliency Detection

Nian Liu¹, Ni Zhang, Junwei Han

Zayed University¹

14 Jun 2020

TL;DR: This paper proposes to fuse attention learned in both modalities and, inspired by the Non-local model, integrates the self-attention and each other's attention to propagate long-range contextual dependencies, thus incorporating multi-modal information to learn attention and propagate contexts more accurately.

Abstract: Saliency detection on RGB-D images is receiving more and more research interests recently. Previous models adopt the early fusion or the result fusion scheme to fuse the input RGB and depth data or their saliency maps, which incur the problem of distribution gap or information loss. Some other models use the feature fusion scheme but are limited by the linear feature fusion methods. In this paper, we propose to fuse attention learned in both modalities. Inspired by the Non-local model, we integrate the self-attention and each other's attention to propagate long-range contextual dependencies, thus incorporating multi-modal information to learn attention and propagate contexts more accurately. Considering the reliability of the other modality's attention, we further propose a selection attention to weight the newly added attention term. We embed the proposed attention module in a two-stream CNN for RGB-D saliency detection. Furthermore, we also propose a residual fusion module to fuse the depth decoder features into the RGB stream. Experimental results on seven benchmark datasets demonstrate the effectiveness of the proposed model components and our final saliency model. Our code and saliency maps are available at https://github.com/nnizhang/S2MA.

238 citations

Proceedings Article

Hawkeye: Towards a Desired Directed Grey-box Fuzzer

Hongxu Chen¹, Yinxing Xue², Yuekang Li¹, Bihuan Chen³, Xiaofei Xie¹, Xiuheng Wu¹, Yang Liu¹

Nanyang Technological University¹, University of Science and Technology of China², Fudan University³

15 Oct 2018

TL;DR: Hawkeye is implemented as a fuzzing framework and evaluated on various real-world programs under different scenarios, showing that it can reach the target sites and reproduce the crashes much faster than state-of-the-art grey-box fuzzers such as AFL and AFLGo.

Abstract: Grey-box fuzzing is a practically effective approach to test real-world programs. However, most existing grey-box fuzzers lack directedness, i.e. the capability of executing towards user-specified target sites in the program. To emphasize existing challenges in directed fuzzing, we propose Hawkeye to feature four desired properties of directed grey-box fuzzers. Owing to a novel static analysis on the program under test and the target sites, Hawkeye precisely collects the information such as the call graph, function and basic block level distances to the targets. During fuzzing, Hawkeye evaluates exercised seeds based on both static information and the execution traces to generate the dynamic metrics, which are then used for seed prioritization, power scheduling and adaptive mutating. These strategies help Hawkeye to achieve better directedness and gravitate towards the target sites. We implemented Hawkeye as a fuzzing framework and evaluated it on various real-world programs under different scenarios. The experimental results showed that Hawkeye can reach the target sites and reproduce the crashes much faster than state-of-the-art grey-box fuzzers such as AFL and AFLGo. Specially, Hawkeye can reduce the time to exposure for certain vulnerabilities from about 3.5 hours to 0.5 hour. By now, Hawkeye has detected more than 41 previously unknown crashes in projects such as Oniguruma, MJS with the target sites provided by vulnerability prediction tools; all these crashes are confirmed and 15 of them have been assigned CVE IDs.

174 citations

Proceedings Article

MemorySanitizer: fast detector of uninitialized memory use in C++

Evgeniy Stepanov¹, Konstantin Serebryany¹

Google¹

07 Feb 2015

TL;DR: MemorySanitizer is a dynamic tool that detects uses of uninitialized memory in C and C++; it is based on compile-time instrumentation, relies on bit-precise shadow memory at run-time, and demonstrates the benefits of compile-time instrumentation over dynamic binary instrumentation.

Abstract: This paper presents MemorySanitizer, a dynamic tool that detects uses of uninitialized memory in C and C++. The tool is based on compile time instrumentation and relies on bit-precise shadow memory at run-time. Shadow propagation technique is used to avoid false positive reports on copying of uninitialized memory. MemorySanitizer finds bugs at a modest cost of 2.5× in execution time and 2× in memory usage; the tool has an optional origin tracking mode that provides better reports with moderate extra overhead. The reports with origins are more detailed compared to reports from other similar tools; such reports contain names of local variables and the entire history of the uninitialized memory including intermediate stores. In this paper we share our experience in deploying the tool at a large scale and demonstrate the benefits of compile-time instrumentation over dynamic binary instrumentation.

134 citations
