Master Thesis MSTR-2022-06

Bibliography	Sabbatino, Valentino: Emotion Analysis for Nonsense Words. University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Master Thesis No. 6 (2022). 137 pages, English.
Abstract	Beyond its meaning and the components that convey that meaning, a word carries a variety of associations, including the emotion it evokes. Different words convey emotions to varying degrees. The question that arises is whether the perceived intensity of an emotion depends on the word being a real word that we know and associate with meaning and emotion, or whether other properties affect it. To address this question, we focus on nonsense words and investigate the relationship between nonsense words and emotions at the sublexical level, i.e., at both the character and phoneme levels. We conducted a crowdsourcing study and created a dataset of 340 words, comprising 272 nonsense words and 68 real words, annotated for the intensity of joy, sadness, anger, disgust, fear, and surprise. For annotation we use Best-Worst Scaling (BWS), a technique that improves annotation consistency and provides reliable, fine-grained scores for the intensity of emotions. We found that people associate a certain degree of emotion with these nonsense words and that patterns can be identified at the phoneme level that might affect emotion intensity. We developed several regression models based on a CNN-BiLSTM architecture and conducted experiments to determine whether computational models can adequately predict the emotion intensity of a nonsense word for a given emotion, whether the models also learn the patterns we found, and what other patterns they appear to learn. We train and test models on nonsense words and investigate whether they generalize to real words, and whether models trained on real words (words from the NRC-EIL) also make good predictions for nonsense words as well as real words. Furthermore, we use these models to investigate whether the phoneme representation of words leads to better performance than considering only the characters of a word.
Our results show that the CNN-BiLSTM regression models predict emotion intensities sufficiently well. The models trained on the NRC-EIL data with character rather than phoneme inputs perform particularly well on both our nonsense word test set and the NRC-EIL test set. The better performance of the models using character inputs suggests that our models are better at identifying features of a word that are relevant to determining emotion intensity at the character level than at the phoneme level. This also indicates that the appearance of (nonsense) words has a greater impact on the evaluation of emotion intensity than their assumed pronunciation.
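The Best-Worst Scaling scores mentioned in the abstract can be illustrated with the common counting procedure, in which an item's score is the fraction of times it was chosen as "best" (highest intensity) minus the fraction of times it was chosen as "worst" (lowest intensity). The following is a minimal sketch under that assumption; the abstract does not state the exact aggregation used in the thesis, and the function name `bws_scores`, the tuple layout, and the example words are hypothetical.

```python
from collections import Counter


def bws_scores(annotations):
    """Aggregate Best-Worst Scaling judgments by simple counting.

    annotations: iterable of (items, best, worst), where `items` is the
    tuple of words shown together and `best`/`worst` are the words judged
    to have the highest and lowest emotion intensity.
    Returns a dict mapping each word to a score in [-1, 1].
    """
    appearances = Counter()
    best_counts = Counter()
    worst_counts = Counter()

    for items, best, worst in annotations:
        if best not in items or worst not in items:
            raise ValueError("best and worst must be among the shown items")
        appearances.update(items)
        best_counts[best] += 1
        worst_counts[worst] += 1

    # score = (#times best - #times worst) / #times shown
    return {
        word: (best_counts[word] - worst_counts[word]) / appearances[word]
        for word in appearances
    }


# Hypothetical judgments for one emotion (invented words, not thesis data).
annotations = [
    (("bloop", "krazt", "milu", "grosh"), "bloop", "grosh"),
    (("bloop", "krazt", "milu", "grosh"), "milu", "grosh"),
    (("bloop", "krazt", "milu", "grosh"), "bloop", "krazt"),
]
scores = bws_scores(annotations)
for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {score:+.2f}")
```

Scores from this procedure lie in [-1, 1]; a linear rescaling to [0, 1] is a possible post-processing step, which is likewise an assumption here rather than something the abstract specifies.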
Department(s)	University of Stuttgart, Institute for Natural Language Processing
Supervisor(s)	Klinger, PD Dr. Roman; Schweitzer, Dr. Antje
Entry date	April 28, 2022

Publ. Computer Science