Bachelor's Thesis BCLR-2020-08

Roman, Alin: Generating Tweets Conditioned by Emotion and Topic.
Universität Stuttgart, Faculty of Computer Science, Electrical Engineering and Information Technology, Bachelor's Thesis No. 8 (2020).
81 pages, English.

Social media platforms such as Twitter have become an important means of communication and have created new opportunities for neural language generation. We focus on generating text conditioned on an emotion and a topic passed as input. The main purpose of this thesis is to answer the following two research questions: "Is it possible to distinguish the generated tweets from the real ones?" and "Will the topic and emotion actually be perceived correctly?". We collect a corpus of tweets using different search queries. The tweets are classified by topic using specific hashtags and terms. Emotion classification of the tweets is performed after an analysis of different classifier models. We also present different approaches to building the language model and evaluate several generated samples. Our results show that the system is capable of generating sequences conditioned on emotion and topic, although only in a limited share of cases: on average, 30% of the evaluated sequences matched the requested emotion and topic. For short sequences, the system can mislead readers who try to tell the real tweet from the generated sequence. Finally, we describe several suggestions for improving the performance of the language models.

Department(s): Universität Stuttgart, Institut für Maschinelle Sprachverarbeitung
Supervisor(s): Padó, Prof. Sebastian; Klinger, Dr. Roman
Entry date: June 9, 2020