Master's thesis MSTR-2021-108

Bibliographic data
Satkunarajan, Jena: Visual analysis of news stories using neural language models.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering and Information Technology, Master's thesis No. 108 (2021).
115 pages, English.
Abstract

With the introduction of computers of varying sizes into the everyday life of the majority of the world population, we have seen a rapid increase in the amount of textual content produced and distributed across the digitized globe. Among the vast amounts of text found on the internet, news articles are of particular interest to journalists, scientists and other groups interested in the events that capture public attention. As hundreds or even thousands of online news providers report on topics of greater and lesser importance, it becomes almost impossible to gather the valuable knowledge provided by all these sources. Gaining an overview of general events, or learning about specific topics, thus becomes the task of identifying the novel information in an ocean of recurring, duplicate and rewritten stories. This thesis presents a combined approach to interactively visualise the novel content and the evolution of topics in news story corpora. A prototype framework is developed that uses GPT-2, a transformer-based neural language model, to assess the novelty of textual content. Building on the resulting novelty scores, the text of each article is visually highlighted to emphasise its novel parts. The novel article content is presented in multiple views that provide increasing levels of aggregation as the underlying article data grows in size. A term weighting scheme that incorporates the novelty scores yields document vectors, which are used to model the topics of the article corpus over time. The resulting time-dependent topic clusters are presented in a multi-layered visualisation that provides multiple perspectives on the evolution of topics over time. The different visualisations and functionalities are combined into an interactive framework with multiple coordinated views.

Full text and other links
Full text
Department(s): University of Stuttgart, Institute for Visualization and Interactive Systems, Visualization and Interactive Systems
Supervisor(s): Ertl, Prof. Thomas; Knittel, Johannes
Entry date: 28 October 2022