Master Thesis MSTR-2020-23

Bibliography: Ghouri, M Fahad: Learning to Profile: Finding Optimization Opportunities through Machine Learning.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Master Thesis No. 23 (2020).
57 pages, English.

Software profilers are widely used in the testing and commissioning of application programs designed for broad areas of use such as industry and research. Most software requirements not only group together the functional details of the intended design but also specify the required performance in terms of execution time. Standard software profilers estimate execution time dynamically, meaning that the code must be fully executable and a test environment must be set up in order to run it. Executing code is not always as simple a step as it may seem. Complex programs need complex test environments to make execution possible. Additionally, the environment must be able to generate different events and scenarios so that different branches of the program under test can be evaluated with a performance analysis tool. If the execution time of software can be estimated without actually executing the code, this can help developers and testers identify code patterns, data structures, or integration practices that might introduce performance bugs into the system. In this thesis, we present an approach engineered to help testers and developers analyze the execution time performance of software code. The approach is distinctive in that it uses a learning mechanism based on modern machine learning tools to achieve this. Additionally, it tries to compensate for the inherent limitations of classical software profilers by providing a static performance analysis tool, which can help in environments where executing the code for performance bug detection is expensive. We believe such an addition to performance profiling tools can help pinpoint practices that are hard to detect with standard pattern matching approaches based on already known vulnerabilities.
We present a data generation setup for collecting the data needed to train our learning model. We also implement and evaluate our approach, which profiles the performance of code tokens and predicts their execution time classes. The shortcomings and possible future optimizations are also highlighted as part of this thesis.
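
As a rough illustration of what one labelled training sample in such a setup might look like, the following minimal sketch turns a snippet into a token sequence and assigns it an execution time class from a measured run time. It is not the thesis's implementation: the choice of Python, the class names, the thresholds in CLASS_BOUNDS, and the helper names code_tokens, time_class, and labelled_sample are all assumptions made for this example.

```python
import io
import timeit
import tokenize

# Hypothetical class boundaries in seconds (not taken from the thesis):
# fast < 1e-5 <= medium < 1e-3 <= slow
CLASS_BOUNDS = [1e-5, 1e-3]
CLASS_NAMES = ["fast", "medium", "slow"]

# Token types that carry layout rather than program content.
_SKIPPED = {
    tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
    tokenize.DEDENT, tokenize.COMMENT, tokenize.ENDMARKER,
}


def code_tokens(source):
    """Return the lexical tokens of a Python snippet as a list of strings."""
    reader = io.StringIO(source).readline
    return [tok.string for tok in tokenize.generate_tokens(reader)
            if tok.type not in _SKIPPED]


def time_class(seconds):
    """Map a measured run time to one of the hypothetical classes."""
    for bound, name in zip(CLASS_BOUNDS, CLASS_NAMES):
        if seconds < bound:
            return name
    return CLASS_NAMES[-1]


def labelled_sample(source, runs=100):
    """Execute the snippet `runs` times and pair its tokens with a time class."""
    per_run = timeit.timeit(source, number=runs) / runs
    return code_tokens(source), time_class(per_run)
```

A model trained on many such (tokens, class) pairs could then be asked to predict the class of unseen code from its tokens alone, which is the static use case the abstract describes. Measured times depend on the machine, so the label produced by labelled_sample is not reproducible across environments.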

Department(s): University of Stuttgart, Institute of Software Technology, Software Lab - Program Analysis
Supervisor(s): Pradel, Prof. Michael
Entry date: December 16, 2020