Bachelor Thesis BCLR-2022-19

Bibliography: Smponias, Philipp: A Systematic Categorization and Comparison of Approaches and Tools for the Quality Assurance of Jupyter Notebooks.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Bachelor Thesis No. 19 (2022).
49 pages, English.

Jupyter Notebooks are a widely used medium for exploring data, for teaching, and as a starting point for realizing one’s ideas. As a medium, they are strongly reminiscent of the Literate Programming concept, in which documentation and code sit close together and can be written in the same document at the same time. [Knu84] However, Jupyter Notebooks have been shown to suffer from quality problems: an analysis of several thousand Jupyter Notebooks on GitHub showed that very few are directly reproducible, while others are poorly documented or simply buggy. [PMBF21] To limit this problem, the goal of this work is to identify tools and procedures that can ensure or increase the quality of Jupyter Notebooks. These are classified, and their installation and user-friendliness are examined. For this purpose, a systematic approach was chosen, aiming at a multivocal rapid review that draws not only on white literature but also on grey literature. A Thinking Aloud Study was conducted to assess usability. The results show that tools for quality assurance do exist, but not all of them are published: some are described in the literature as prototypes and have not been released. Another finding is that some of the available tools either do not work or have insufficient documentation, making them almost unusable. Standardized methods on how best to proceed are kept very generic and differ very little from the approach used in software development.

Department(s): University of Stuttgart, Institute of Software Technology, Empirical Software Engineering
Supervisor(s): Wagner, Prof. Stefan; Bogner, Dr. Justus
Entry date: October 21, 2022