
Showing papers by "Hai Yu published in 2020"


Proceedings ArticleDOI
27 Jun 2020
TL;DR: Watchman, a technique that continuously monitors dependency conflicts in the PyPI ecosystem, is designed and implemented based on an empirical study that identified several key factors leading to DC issues and their regressions.
Abstract: The PyPI ecosystem has indexed millions of Python libraries to allow developers to automatically download and install dependencies of their projects based on the specified version constraints. Despite the convenience brought by automation, version constraints in Python projects can easily conflict, resulting in build failures. We refer to such conflicts as Dependency Conflict (DC) issues. Although DC issues are common in Python projects, developers lack tool support to gain comprehensive knowledge for diagnosing their root causes. In this paper, we conducted an empirical study on 235 real-world DC issues. We studied the manifestation patterns and fixing strategies of these issues and found several key factors that can lead to DC issues and their regressions. Based on our findings, we designed and implemented Watchman, a technique to continuously monitor dependency conflicts in the PyPI ecosystem. In our evaluation, Watchman analyzed PyPI snapshots between 11 Jul 2019 and 16 Aug 2019 and found 117 potential DC issues. We reported these issues to the developers of the corresponding projects. So far, 63 issues have been confirmed, 38 of which have been quickly fixed by applying our suggested patches.
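A minimal sketch of the core check behind such DC issues, assuming the third-party packaging library and a hypothetical list of released versions; this only illustrates the conflict condition and is not Watchman itself:

# Given two version constraints that different dependencies place on the same
# library, test whether any released version satisfies both of them.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

def find_conflict(constraint_a: str, constraint_b: str, released) -> bool:
    """Return True if no released version satisfies both constraints."""
    combined = SpecifierSet(constraint_a) & SpecifierSet(constraint_b)
    return not any(Version(v) in combined for v in released)

# Example: the project requires requests>=2.20 while a transitive dependency pins requests<2.0.
released_versions = ["1.2.3", "2.19.1", "2.20.0", "2.25.1"]  # hypothetical snapshot, not real PyPI data
if find_conflict(">=2.20", "<2.0", released_versions):
    print("Potential DC issue: no release satisfies both constraints")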

32 citations


Journal ArticleDOI
TL;DR: This paper proposes a new strategy, based on the watershed transformation, with two distinct criteria for the global and local refinement of boundary pixels, reducing the compromise between boundary adherence and compactness.
Abstract: Superpixels are widely used in computer vision applications, as they reduce the running costs of subsequent processing while preserving the original performance. In most existing algorithms, the boundary adherence and the compactness of superpixels necessarily inhibit each other, because the color/gradient information is balanced against the position constraints and the set criteria treat all pixels indiscriminately. In this paper, we present a two-phase superpixel segmentation method based on the watershed transformation. After designing a new approach for calculating the flooding priority, we propose a new strategy with two distinct criteria for the global and local refinement of boundary pixels. These criteria reduce the compromise between boundary adherence and compactness. Unlike indiscriminate standards, our method applies different treatments to pixels in different environments, preserving color homogeneity in content-rich areas while improving the regularity of superpixels in content-plain regions. The superior accuracy and computing time of the proposed method are verified in comparison experiments with several state-of-the-art methods.
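For context, a baseline compact-watershed superpixel segmentation with scikit-image is sketched below; it only illustrates the boundary-adherence vs. compactness trade-off the paper targets and is not the proposed two-phase refinement (the sample image, marker count, and compactness value are arbitrary):

# Compact watershed on a gradient image: compactness > 0 trades boundary
# adherence for more regular superpixel shapes.
from skimage import data, color, filters, segmentation, util

image = util.img_as_float(data.astronaut())      # bundled sample RGB image
gradient = filters.sobel(color.rgb2gray(image))  # flooding surface

labels = segmentation.watershed(
    gradient,
    markers=400,        # an int asks for this many regularly placed seed markers
    compactness=0.001,  # larger values -> more compact, less boundary-adherent
)
overlay = segmentation.mark_boundaries(image, labels)
print(labels.max(), "superpixels produced")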

13 citations


Journal ArticleDOI
Zhiliang Zhu, Yanjie Song, Wei Zhang, Hai Yu, Yuli Zhao
TL;DR: A novel CS-based compression-encryption framework (CS-CEF) that uses the intrinsic property of CS to provide strong plaintext sensitivity for the compression-encryption scheme at a low additional computation cost.
Abstract: In this paper, we find that compressive sensing (CS) with a chaotic measurement matrix has a strong sensitivity to the plaintext. However, because of the quantization executed after CS, the plaintext sensitivity produced by CS may be greatly weakened. Thus, we propose a novel CS-based compression-encryption framework (CS-CEF) that uses the intrinsic property of CS to provide strong plaintext sensitivity for the compression-encryption scheme at a low additional computation cost. Meanwhile, a simple and efficient substitution box (S-box) construction algorithm (SbCA) based on chaos is designed. Compared with existing S-box construction methods, the simulation results show that the proposed S-box has stronger cryptographic characteristics. Based on the above work, we develop an efficient and secure image compression-encryption scheme using the S-box (CSb-CES) under the proposed CS-CEF. The simulations and security analysis illustrate that the proposed CSb-CES achieves higher efficiency and security than several state-of-the-art CS-based compression-encryption schemes.
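A minimal numpy sketch of the "chaotic measurement matrix" ingredient, with the logistic map as a stand-in chaotic system; this is not the paper's CS-CEF, its quantization step, or its S-box construction, and all parameters below are illustrative:

import numpy as np

def logistic_sequence(x0: float, length: int, mu: float = 3.99) -> np.ndarray:
    """Iterate the logistic map x <- mu * x * (1 - x)."""
    seq = np.empty(length)
    x = x0
    for i in range(length):
        x = mu * x * (1.0 - x)
        seq[i] = x
    return seq

def chaotic_measurement_matrix(m: int, n: int, key: float) -> np.ndarray:
    """Build an m x n measurement matrix from a key-seeded chaotic sequence."""
    chaos = logistic_sequence(key, m * n)
    phi = (2.0 * chaos - 1.0).reshape(m, n)  # map (0, 1) -> (-1, 1)
    return phi / np.sqrt(m)                  # simple energy normalization

n, m = 256, 64                               # signal length, number of measurements
key = 0.3141592653                           # secret initial condition (hypothetical key)
x = np.zeros(n)
x[np.random.choice(n, 8, replace=False)] = np.random.randn(8)   # a sparse test signal
y = chaotic_measurement_matrix(m, n, key) @ x                    # key-dependent compressed measurements
print(y.shape)  # (64,)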

9 citations


Journal ArticleDOI
Wei Zhang, Shuwen Wang, Weijie Han, Hai Yu, Zhiliang Zhu
06 Jan 2020 - Entropy
TL;DR: A method to generate a random Hamiltonian path within digital images, which is equivalent to a permutation in image encryption, is designed, and an adjusted Bernoulli map is proposed to ensure the randomness of the generated paths.
Abstract: In graph theory, a Hamiltonian path is a path that visits each vertex exactly once. In this paper, we design a method to generate a random Hamiltonian path within digital images, which is equivalent to a permutation in image encryption. Building such a Hamiltonian path across bit planes can shuffle the distribution of the pixels' bits, and a similar idea can be applied to the substitution of pixel grey levels. To ensure the randomness of the generated Hamiltonian paths, an adjusted Bernoulli map is proposed. By adopting these techniques, a bit-level image encryption scheme is devised. Evaluation of the simulation results shows that the proposed scheme achieves fair performance. In addition, we pinpoint a common flaw in calculating the correlation coefficients of adjacent pixels; after the enhancement, the correlation coefficient becomes a stricter criterion for image encryption algorithms.
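As a toy illustration of "permutation by Hamiltonian path order", the sketch below uses a serpentine (boustrophedon) scan, which visits every pixel exactly once while each step moves to a 4-neighbor, and re-lays the pixels out in visiting order; the paper's adjusted Bernoulli map produces far more varied random paths, which this sketch does not attempt:

import numpy as np

def serpentine_path(h: int, w: int) -> np.ndarray:
    """Return (row, col) pixel coordinates in serpentine visiting order."""
    coords = []
    for r in range(h):
        cols = range(w) if r % 2 == 0 else range(w - 1, -1, -1)
        coords.extend((r, c) for c in cols)
    return np.array(coords)

def permute_by_path(img: np.ndarray) -> np.ndarray:
    """The k-th pixel on the path moves to raster position k."""
    h, w = img.shape
    path = serpentine_path(h, w)
    return img[path[:, 0], path[:, 1]].reshape(h, w)

img = np.arange(16, dtype=np.uint8).reshape(4, 4)   # tiny test "image"
print(permute_by_path(img))                          # odd rows appear reversed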

9 citations


Journal ArticleDOI
TL;DR: This work applies the theory of the naming game to the problem of multiple-source localization in information propagation over social networks and proposes a method that can locate the sources without knowing their number.

5 citations


Journal ArticleDOI
TL;DR: A dynamic observer deployment method is introduced that considerably reduces the number of observations and the time needed to locate the information source, and that calculates the probability of each node being the source from the information provided by the observers.
Abstract: We study the problem of locating the source of information propagation in social networks based on the network topology and a set of observations. We propose a concise and novel method to accurately locate the information source using naming game theory. This study introduces a dynamic deployment method that considerably reduces the number of observations and the time needed to locate the source. Moreover, it calculates the probability of each node being the source based on the information provided by the observers. The method can potentially be applied to various information propagation models. The simulation results reveal that the method is able to estimate the information source within a small number of hops from the true source.
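For orientation only, the sketch below shows a simple distance-versus-arrival-time baseline for observer-based source localization using networkx; it is not the naming-game method or the dynamic deployment strategy above, the graph and observer set are hypothetical, and the spread is idealized so that a node's arrival time equals its hop distance from the true source:

import networkx as nx

G = nx.barabasi_albert_graph(200, 3, seed=1)          # stand-in social network
true_source = 42
dist_from_source = nx.single_source_shortest_path_length(G, true_source)

observers = [0, 10, 50, 120, 199]                     # hypothetical observer nodes
arrival = {o: dist_from_source[o] for o in observers} # idealized arrival times

def score(candidate: int) -> float:
    """Lower is better: variance of (arrival time - distance to candidate)."""
    d = nx.single_source_shortest_path_length(G, candidate)
    residuals = [arrival[o] - d[o] for o in observers]
    mean = sum(residuals) / len(residuals)
    return sum((r - mean) ** 2 for r in residuals)

estimate = min(G.nodes, key=score)
print("estimated source:", estimate, "true source:", true_source)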

3 citations


Proceedings ArticleDOI
Kai Shi, Chenni Wu, Yuechen Wang, Hai Yu, Zhiliang Zhu
23 Oct 2020
TL;DR: This paper proposes a wind turbine condition monitoring method based on the variable importance of a random forest built from SCADA data, and applies the method to four real cases from wind farms in China.
Abstract: SCADA data lack the sensor signals, such as vibration and strain measurements, used in traditional wind turbine condition monitoring, and they are updated at a low frequency, generally one record per 10 minutes, which is also too coarse for failure prediction. Monitoring the working condition of wind turbines from SCADA data is therefore difficult. To this end, this paper proposes a wind turbine condition monitoring method based on the variable importance of a random forest built from SCADA data. First, to minimize the misjudgment caused by individual outliers, we divide the SCADA time series into segments of a fixed time period T. Second, we use the decrease-accuracy method to calculate the variable importance of a random forest, which serves as the feature vector of each segment and characterizes the turbine's condition. Third, we compare a specific turbine's variable importance with the standard feature of healthy turbines to obtain their proximity. Fourth, the monitoring baseline is determined according to the 3σ rule, and a deterioration function is applied to construct the failure probability model. To show its effectiveness, we apply the proposed method to four real cases from wind farms in China.
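A compact sketch of this pipeline on synthetic SCADA-like data, with scikit-learn's permutation importance standing in for the decrease-accuracy measure; the signal names, segment length, and generated data are hypothetical, and the 3σ threshold is computed over simulated healthy segments:

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)

def segment_importance(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Variable importance of a random forest fitted on one SCADA segment."""
    rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
    result = permutation_importance(rf, X, y, n_repeats=5, random_state=0)
    return result.importances_mean

def make_segment(n=144, degraded=False):
    """Synthetic segment: wind speed, rotor speed, nacelle temperature -> power."""
    wind = rng.uniform(3, 15, n)
    rotor = wind * (0.8 if degraded else 1.0) + rng.normal(0, 0.3, n)
    temp = rng.normal(40, 2, n)
    power = 0.5 * wind ** 3 * (0.7 if degraded else 1.0) + rng.normal(0, 20, n)
    return np.column_stack([wind, rotor, temp]), power

# Healthy baseline and 3-sigma monitoring threshold over importance vectors.
healthy = np.array([segment_importance(*make_segment()) for _ in range(20)])
baseline = healthy.mean(axis=0)
dists = np.linalg.norm(healthy - baseline, axis=1)
threshold = dists.mean() + 3 * dists.std()

# Score a new (simulated degraded) segment against the baseline.
X_new, y_new = make_segment(degraded=True)
d_new = np.linalg.norm(segment_importance(X_new, y_new) - baseline)
print("alarm" if d_new > threshold else "normal", round(d_new, 3), round(threshold, 3))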

2 citations


Posted Content
TL;DR: In this article, the authors propose Sensor, an automated testing technique that synthesizes test cases using ingredients from the project under test to trigger inconsistent behaviors of APIs that have the same signatures in conflicting library versions.
Abstract: Java projects are often built on top of various third-party libraries. If multiple versions of a library exist on the classpath, JVM will only load one version and shadow the others, which we refer to as dependency conflicts. This would give rise to semantic conflict (SC) issues, if the library APIs referenced by a project have identical method signatures but inconsistent semantics across the loaded and shadowed versions of libraries. SC issues are difficult for developers to diagnose in practice, since understanding them typically requires domain knowledge. Although adapting the existing test generation technique for dependency conflict issues, Riddle, to detect SC issues is feasible, its effectiveness is greatly compromised. This is mainly because Riddle randomly generates test inputs, while the SC issues typically require specific arguments in the tests to be exposed. To address that, we conducted an empirical study of 75 real SC issues to understand the characteristics of such specific arguments in the test cases that can capture the SC issues. Inspired by our empirical findings, we propose an automated testing technique Sensor, which synthesizes test cases using ingredients from the project under test to trigger inconsistent behaviors of the APIs with the same signatures in conflicting library versions. Our evaluation results show that Sensor is effective and useful: it achieved a precision of 0.803 and a recall of 0.760 on open-source projects and a precision of 0.821 on industrial projects; it detected 150 semantic conflict issues in 29 projects, 81.8% of which had been confirmed as real bugs.
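A toy Python analogy of a semantic conflict and of why specific arguments matter: two hypothetical versions of a function share a signature but interpret the argument differently, and only a non-trivial input exposes the divergence; Sensor itself targets Java classpath shadowing and synthesizes such inputs from the project under test:

def parse_timeout_v1(value: str) -> int:
    """Version 1 (hypothetical): timeout given in seconds."""
    return int(value)

def parse_timeout_v2(value: str) -> int:
    """Version 2 (hypothetical): same signature, but interpreted as milliseconds."""
    return int(value) * 1000

def differential_test(inputs):
    """Flag inputs on which the two versions disagree (a semantic conflict)."""
    return [x for x in inputs if parse_timeout_v1(x) != parse_timeout_v2(x)]

print(differential_test(["0", "30"]))   # ['30'] -> only a non-trivial argument exposes the conflict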