Master Thesis MSTR-2019-76

Bibliography: Muck, Michael: User-Defined Visual Shaping of Document Collections.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Master Thesis No. 76 (2019).
73 pages, English.
Abstract

In the internet age, a large amount of information is available, and the major task is to provide only the information that is relevant to the user. By choosing among different sources and using customization options such as changing a filter in a news feed, a user can filter the flood of information. But what if the available sources or predefined categories do not fit? This thesis describes an approach that adds an abstraction layer between the user and the information, allowing a higher degree of individualization. The approach first shapes a given text corpus into user-defined topics, making it easier for a user to find documents of interest to her. It combines the creativity and common knowledge of the human mind with the computing power of machines. With an understanding of topics that is oriented more toward interests than toward word distributions, and with a new ensemble of unsupervised and supervised machine learning algorithms, it sets itself apart from state-of-the-art assisted and unassisted topic modeling systems. During the interactive topic modeling process, the user is placed in a feedback loop that helps her find and correct mistakes and assess the quality of the modeling output. The modeling output consists of the document-specific probabilities of each topic. It is used in a user-adjustable recommendation system, in a diagram-based exploration of the document collection, and in the creation of compound queries. The results and findings are evaluated quantitatively and discussed. The analysis aims to reveal the limitations as well as the high potential of the presented concept. Finally, the thesis draws a conclusion and offers a starting point for further studies, so that research in this field can be followed up.
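
The abstract states that the modeling output consists of document-specific topic probabilities, which feed an adjustable recommendation system and compound queries. The following minimal Python sketch illustrates one way such an output could be used. It is not taken from the thesis: the document names, topic names, probabilities, and the functions `recommend` and `compound_query` are all hypothetical, and the weighted-sum ranking and threshold-based query are assumptions chosen for illustration.

```python
# Illustrative sketch only; all names and numbers are assumptions, not from the thesis.
from typing import Dict, List

# Hypothetical modeling output: per-document topic probabilities (each row sums to 1).
doc_topics: Dict[str, Dict[str, float]] = {
    "doc_a": {"sports": 0.7, "politics": 0.2, "science": 0.1},
    "doc_b": {"sports": 0.1, "politics": 0.6, "science": 0.3},
    "doc_c": {"sports": 0.2, "politics": 0.1, "science": 0.7},
}


def recommend(weights: Dict[str, float]) -> List[str]:
    """Rank documents by the user-weighted sum of their topic probabilities."""
    def score(doc: str) -> float:
        return sum(weights.get(topic, 0.0) * p for topic, p in doc_topics[doc].items())
    return sorted(doc_topics, key=score, reverse=True)


def compound_query(min_probs: Dict[str, float]) -> List[str]:
    """Return documents that meet every per-topic minimum probability (logical AND)."""
    return [
        doc
        for doc, probs in doc_topics.items()
        if all(probs.get(topic, 0.0) >= minimum for topic, minimum in min_probs.items())
    ]


# A user who cares mostly about science and somewhat about politics.
ranking = recommend({"science": 1.0, "politics": 0.5})
# Documents that are substantially about politics AND at least partly about science.
matches = compound_query({"politics": 0.5, "science": 0.25})
```

In this sketch, adjusting the recommendation amounts to changing the topic weights, and a compound query combines several topic conditions. The actual system described in the thesis may work differently.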

Department(s): University of Stuttgart, Institute of Visualisation and Interactive Systems, Visualisation and Interactive Systems
Supervisor(s): Ertl, Prof. Thomas; Knabben, Moritz; Knittel, Johannes
Entry date: February 19, 2020