Bachelor's thesis BCLR-2023-111

Bibliographic data
Lautenschlager, Jonathan: Detection of non-recorded word senses.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering and Information Technology, Bachelor's thesis No. 111 (2023).
43 pages, English.
Abstract

Dictionaries record the senses of words at a certain point in time. If a word gains a new sense or loses an old one in a speaker community, its dictionary entry may become outdated. This thesis investigates systems that discover missing dictionary entries in modern English and Swedish dictionaries by comparing usages of a target word from reference corpora with the dictionary entry for that word. The basic task is to decide whether a word usage is covered by any sense in the dictionary entry of the target word. For this, we use a pre-trained Word-in-Context embedder, which allows us to model the task in a few-shot scenario. Additionally, we use human annotations to tune and evaluate our models. Compared to a random sample from a corpus, our model significantly increases the number of retrieved word usages that are not covered by the dictionary.
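
The coverage decision described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes that each usage and each dictionary sense has already been turned into a vector by some embedder, and that a usage counts as covered when its cosine similarity to at least one sense vector reaches a threshold. The function names `cosine` and `is_covered`, the threshold value 0.5, and the toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def is_covered(usage_vec, sense_vecs, threshold=0.5):
    """Return True if the usage is similar enough to at least one recorded sense.

    The threshold is an illustrative assumption; in practice it would be
    tuned on annotated data.
    """
    return any(cosine(usage_vec, s) >= threshold for s in sense_vecs)


# Toy stand-ins for embeddings of two recorded senses of one target word.
senses = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

close_usage = [0.9, 0.1, 0.0]   # near the first sense
novel_usage = [0.0, 0.0, 1.0]   # orthogonal to both senses

print(is_covered(close_usage, senses))
print(is_covered(novel_usage, senses))
```

Usages for which `is_covered` returns False would be the candidates for non-recorded senses under these assumptions.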

Full text and other links
Full text
Department(s): University of Stuttgart, Institute for Natural Language Processing
Supervisor(s): Schulte im Walde, Prof. Sabine; Schlechtweg, Dr. Dominik; Hengchen, Dr. Simon
Entry date: 16 October 2024