Master's Thesis MSTR-2024-42

Bibliographic data
Merkel, Manuel: Shaping the future : the transformative potential of AI in computer science VET programs.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering and Information Technology, Master's Thesis No. 42 (2024).
128 pages, English.
Abstract

The recent surge in generative artificial intelligence (GenAI) has the potential to transform a range of sectors, including software engineering and education. Because these tools are powerful and largely freely accessible, their potential integration into vocational education and training (VET) in computer science is widely discussed, with possibly substantial implications for the landscape of learning and programming. As GenAI tools such as OpenAI's ChatGPT are increasingly utilised in software engineering, it becomes imperative to understand the impact of these technologies on the next generation of software developers. This study was therefore designed to examine the transformative impact of ChatGPT (GPT-4o model) on computer science VET programmes, with a specific focus on the influence of GenAI on vocational students who are becoming experts in software development. The study employed a twofold methodological approach. First, web scraping and data mining from LeetCode were used to compare the software quality produced by LeetCode users with that generated by GPT-4o. Second, a quasi-experiment was conducted with 27 vocational students to ascertain whether ChatGPT facilitates or impedes the students' software development process. The study addresses three key research questions: (1) whether GPT-4o produces software of superior quality to that produced by humans, (2) how ChatGPT assists vocational students in completing software engineering tasks, and (3) how vocational students interact with ChatGPT and what challenges they face. To answer the first research question, data were gathered on 2,321 LeetCode coding problems.
A total of 57,238 validated user solutions were collected, and 2,086 valid solutions were generated by GPT-4o. In total, 958,574 lines of code (LOC) were analysed through the SonarQube and LeetCode APIs to evaluate four quality metrics: (1) code quality (number of code smells per LOC), (2) code understandability (cognitive complexity score per LOC), (3) time behaviour (runtime rank), and (4) resource utilisation (memory usage rank). The findings indicate that GPT-4o does not present a considerable impediment to code quality, understandability, or runtime when generating code on a limited scale. Notably, the generated code even exhibits significantly lower values on all three of these metrics than the user-written code. Contrary to expectations, however, the generated code did not show significantly better memory usage than the user code. Furthermore, GPT-4o was shown to have difficulty generalising to problems that were not included in the training data set. Following this comparison of the generated code with that of LeetCode users, the use of ChatGPT by 27 vocational students was investigated in a quasi-experimental design. For this purpose, a code base with a shopping cart theme was created, in which the participants were asked to complete four typical software engineering tasks within 120 minutes.
The following properties were examined: (1) effectiveness (proportion of successfully completed tasks), (2) comprehension score for code and task (proportion of correctly answered questions), (3) cognitive load (dimensions of the NASA TLX), (4) code quality (number of code smells per LOC), (5) code understandability (cognitive complexity score per LOC), (6) application area of GenAI in the solution of software engineering tasks, and (7) perceived challenges in the application of GenAI. The results are two-sided. On the one hand, increased effectiveness, improved software quality and, in some cases, lower cognitive load were observed among vocational students who used ChatGPT. On the other hand, these results are tempered by a significant educational pitfall: participants who used ChatGPT to solve tasks achieved an equivalent or lower comprehension score for code and task, which undermines the positive outcomes. Furthermore, it was shown that merely generating solutions without adapting the generated information may already be an established usage strategy among vocational students. Overuse of ChatGPT may come at the cost of deeper knowledge about the task, the code, and the context of the code. The powerful ChatGPT tool has considerable potential to exert a positive influence on vocational students, yet inadequate use, as demonstrated herein, may lead to a reduction in educational quality. Further scientific research is therefore required to gain a deeper understanding of the impact of ChatGPT on the educational landscape.

Department(s): University of Stuttgart, Institute of Software Technology, Empirical Software Engineering
Supervisor(s): Wagner, Prof. Stefan; Dörpinghaus, Dr. Jens; Munoz Baron, Marvin
Entry date: 27 November 2024