Bibliography | Chow, Yiu Wai: Bimodal taint analysis for detecting unusual parameter-sink flows. University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Master Thesis No. 79 (2022). 44 pages, English.
Abstract | Finding vulnerabilities is a crucial activity, and automated techniques for this purpose are in high demand. For example, the Node Package Manager (npm) offers a massive number of software packages, which are installed and used by millions of developers each day. Because of the dense network of dependencies between npm packages, vulnerabilities in individual packages may easily affect a wide range of software. Taint analysis is a powerful tool to detect such vulnerabilities. However, it is challenging to clearly define a problematic flow. A possible way to identify problematic flows is to incorporate natural language information, such as code conventions and informal knowledge, into the analysis. For example, a user might not find it surprising that a parameter named cmd of a function named execCommand is open to command injection. Thus, this flow is likely unproblematic, as the user will not pass untrusted data to cmd. In contrast, a user might not expect a parameter named value of a function named staticSetConfig to be vulnerable to command injection. Thus, this flow is likely problematic: the user might pass untrusted data to value, since the natural language information from the parameter and function names suggests a different security context. To effectively exploit the implicit information in code, we introduce a bimodal taint analysis tool, Fluffy. The first modality is code: Fluffy uses a mining analysis implemented in CodeQL to find examples of flows from parameters to vulnerable sinks. The second modality is natural language: Fluffy uses a machine learning model that, based on a corpus of such examples, learns how to distinguish unexpected flows from expected flows using natural language information. We instantiate four neural models, offering different trade-offs between the manual effort required and the accuracy of predictions. In our evaluation, Fluffy is able to achieve an F1 score of 0.85 or more on four common vulnerability types. In addition, Fluffy is able to flag eleven previously unknown vulnerabilities in real-life projects, of which six are confirmed.
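The contrast between an expected and an unexpected parameter-sink flow described in the abstract can be illustrated with a minimal TypeScript sketch. Only the names execCommand, cmd, staticSetConfig, and value come from the abstract; the function bodies, the `Sink` type, and `recordingSink` are hypothetical and are not taken from the thesis or from any real npm package. The sink merely records the command string instead of running it, standing in for a real command-execution API.

```typescript
// Hypothetical stand-in for a command-execution sink (for example, something
// like Node's child_process.exec). It only records the command string, so the
// sketch is safe to run.
type Sink = (command: string) => void;

const executed: string[] = [];
const recordingSink: Sink = (command) => {
  executed.push(command);
};

// Expected flow: the names "execCommand" and "cmd" already tell the caller
// that the argument is run as a command.
function execCommand(cmd: string, sink: Sink = recordingSink): void {
  sink(cmd);
}

// Unexpected flow: the names "staticSetConfig" and "value" suggest plain
// configuration data, yet the parameter reaches the same kind of sink.
// The command template below is an assumption made for illustration.
function staticSetConfig(value: string, sink: Sink = recordingSink): void {
  sink(`git config user.name ${value}`);
}

execCommand("ls -la");
// A caller who trusts the name may pass unsanitized input here.
staticSetConfig("alice; rm -rf /tmp/demo");

console.log(executed);
```

Both functions pass a parameter to the sink, so a purely code-based taint analysis would report both flows. Under the abstract's argument, only the second is likely to be problematic, because its names give the caller no hint that the argument ends up in a command.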