Mitigation of vulnerabilities and incompatibility in open-source ecosystem

Bibliographic Details
Main Author: Zhang, Lyuye
Other Authors: Liu Yang
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/181586
Institution: Nanyang Technological University
Description
Summary: The rapid development of the open-source software (OSS) ecosystem improves the efficiency of software development by providing third-party libraries (TPLs) that spare developers from reinventing the wheel. However, the use of TPLs sometimes introduces vulnerabilities, as TPLs may contain security loopholes that attackers can exploit. On the security side of OSS, researchers have made substantial progress in studying, detecting, evaluating, and fixing vulnerabilities within the OSS ecosystem and TPLs. For software projects that rely on TPLs, Software Composition Analysis (SCA) has been proposed to detect the TPLs in use and the potential vulnerabilities associated with them. By reporting these vulnerabilities to users, SCA serves as an inspector of the TPLs used within software projects. SCA tools are also responsible for suggesting ways to mitigate the security risks, called remediation, such as adjusting the versions of the open-source libraries. We take Maven, a popular package manager for Java projects, as an example to illustrate the incompatibility and vulnerability issues within individual projects and across the entire ecosystem. Having explored the Maven ecosystem from micro and macro perspectives, we propose a series of tools to address these issues. I then turn to another ecosystem, Golang, to study the life cycles of vulnerabilities there. For an individual Maven project, upgrading the versions of its dependencies helps reduce the risk of vulnerabilities. However, the dependencies adjusted by remediation inevitably introduce incompatibility risks. Modern SCA tools fail to consider the full range of compatibility risks when adjusting library versions. Some incompatibilities cause syntactic issues because the signatures of symbols or methods are missing. Others cause behavioral differences that lead to abnormal execution or even crashes.
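The difference between the two kinds of incompatibility can be made concrete with a minimal sketch. Everything in it is hypothetical: the functions `split_v1`, `split_v2`, and `join_v2` stand in for two releases of an imaginary library and are not taken from the thesis or from any real TPL. In Java a removed parameter would already fail at compile time; in Python the same mismatch surfaces as a `TypeError` at the call site, while the behavioral change passes silently.

```python
# Hypothetical library API in two versions; all names are illustrative only.

def split_v1(text, sep=","):
    """Version 1: empty fields are dropped."""
    return [part for part in text.split(sep) if part]


def split_v2(text, sep=","):
    """Version 2: same signature, but empty fields are now kept."""
    return text.split(sep)


def join_v2(parts):
    """Version 2 of another API: the `sep` parameter was removed."""
    return ",".join(parts)


def client_count_fields(split, line):
    """Client code written against the API; only its result can reveal a change."""
    return len(split(line))


def syntactic_break_detected():
    """A signature mismatch is visible as soon as the old call is attempted."""
    try:
        join_v2(["a", "b"], sep=";")  # call written for the older signature
    except TypeError:
        return True
    return False


line = "a,,b"
before = client_count_fields(split_v1, line)  # result under version 1
after = client_count_fields(split_v2, line)   # result under version 2

# The signature of split() is unchanged, so no API checker complains,
# yet the client observes a different field count after the upgrade.
semantic_break = before != after
```

A signature-based checker would flag `join_v2` but not `split_v2`, which is why the behavioral case needs either a test that happens to cover it or some form of static analysis.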
The syntactic issues can be detected by compilation and existing API checking tools. However, semantic breaking issues caused by inconsistent behaviors can only be revealed by tests, which are limited by their coverage. To bridge the gap, I propose a Semantic Breaking Issue Detector (Sembid) to statically detect semantic issues across upgrades in exposed APIs. The detector can efficiently identify the APIs affected by semantic breaking between two versions of an individual open-source library. Besides compatibility issues, OSS can be susceptible to software vulnerabilities. SCA tools have been developed to detect and remediate such vulnerabilities. Nonetheless, modern SCA tools only provide individual suggestions on how to resolve each vulnerability, without a holistic solution that globally optimizes all dependencies. Upgrading to secure versions is not always straightforward, considering that versions are affected by various vulnerabilities and that upgrades may introduce new threats, such as incompatibility risks. To achieve global optimization and ensure a usable dependency graph, we propose Compatible Remediation of Third-party libraries (Coral), which accurately eliminates vulnerabilities without introducing incompatibility issues into benign dependencies. Coral takes an initial dependency graph generated by the Maven client tool and returns secure and compatible dependency configurations. With Sembid integrated, Coral can detect all types of incompatibilities to further enhance the usability of remediation within individual projects. For the entire Maven ecosystem, beyond individual libraries or projects, remediating vulnerabilities presents another set of challenges that lie beyond the scope of SCA. From a study of vulnerability propagation and evolution in the Maven ecosystem, I found that vulnerabilities can persist in the ecosystem even after patched versions are released.
This issue persists within the Maven ecosystem due to its inherent design limitations, and it will not be resolved simply through the ecosystem’s natural evolution. Hence, I conducted an empirical study to quantitatively evaluate the prevalence and root causes of persistent vulnerabilities. It turned out that inflexible fixed dependency version specifications block the propagation of patches along dependency paths. Accordingly, I proposed a tool called Ranger to restore version ranges from inflexible fixed versions and facilitate the automatic remediation of vulnerabilities in the ecosystem. The evaluation shows that Ranger could automatically remediate 90% of vulnerabilities within downstream libraries while ensuring compatibility. Besides legacy package managers like Maven, a newer package manager, Go Module, introduces brand-new mechanisms to resolve issues that occur in legacy ecosystems. However, these benefits come at a price. As a pioneering decentralized package manager, Go Module has implemented several new features, such as allowing commits, instead of traditional versions, to be pinned as the dependency version specification, and version indexing systems for libraries released in a decentralized manner. These new implementations introduce potential lags in pushing patches for vulnerabilities to downstream users, referred to as fixing lags. I conducted an empirical study to evaluate how Golang's mechanisms affect the life cycles of vulnerabilities in the Golang ecosystem. I developed an algorithm to quantitatively model the life cycles of vulnerabilities. After locating the lagged vulnerabilities, I also asked the library maintainers about the reasons for the lags. The study and inquiries yielded a number of findings and insights.
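The observation about fixed versions blocking patch propagation in Maven can be illustrated with a small sketch. It is not Ranger's algorithm, which the summary does not describe. It only shows, under simplifying assumptions, why a pinned version never picks up a patched release while a range does. The range notation follows Maven's bracket syntax (`[` and `]` inclusive, `(` and `)` exclusive). Treating a bare version as an exact pin, comparing only numeric dotted versions, and the helper names `parse`, `satisfies`, and `resolve` are all assumptions made for illustration, as are the version numbers.

```python
def parse(version):
    """Turn a purely numeric dotted version such as "1.2.4" into (1, 2, 4)."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, spec):
    """Return True if `version` is admitted by `spec`.

    `spec` is a bare version ("1.2.3", treated here as an exact pin) or a
    bracketed range such as "[1.2.3,2.0.0)" or "[1.0,)".
    """
    v = parse(version)
    if spec[0] not in "[(":
        return v == parse(spec)
    bounds = spec[1:-1].split(",")
    if len(bounds) == 1:  # "[1.2.3]" admits exactly one version
        return v == parse(bounds[0])
    lower, upper = bounds
    if lower:
        lo = parse(lower)
        if v < lo or (v == lo and spec[0] == "("):
            return False
    if upper:
        hi = parse(upper)
        if v > hi or (v == hi and spec[-1] == ")"):
            return False
    return True


def resolve(available, spec):
    """Pick the highest available version admitted by `spec`, or None."""
    candidates = [v for v in available if satisfies(v, spec)]
    return max(candidates, key=parse) if candidates else None


# Hypothetical upstream releases: 1.2.3 is vulnerable, 1.2.4 is the patch.
available = ["1.2.3", "1.2.4", "2.0.0"]

pinned = resolve(available, "1.2.3")          # stays on the vulnerable release
ranged = resolve(available, "[1.2.3,2.0.0)")  # picks up the patch automatically
```

With the pin, every downstream project keeps resolving to the vulnerable release until its maintainer edits the specification by hand. With the range, the patched release is selected as soon as it is published, and the exclusive upper bound keeps out the next major version.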