|Zendler, Ulrich: How word-embedding methods improve information extraction and can be used for multilingual approaches. |
Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik, Master's thesis No. 102 (2018).
111 pages, English.
Expanding entity sets and extracting relations are key tasks in natural language processing (NLP) that have been addressed with a variety of approaches. Recent successful attempts all use word embeddings such as those presented by Mikolov et al. While most work concentrates on improving these tasks in general, without considering a specific domain, it is of interest how to achieve even higher precision by focusing on a specific domain and optimizing the methods for a single purpose. This thesis therefore suggests methods and adjustments that optimize existing proposals for entity set expansion for the domain of drugs. While this is its main purpose, the thesis also presents a novel idea for improving the precision of relation extraction using word embeddings, which could be combined with existing successful relation extraction methods. Finally, it tackles another key concern of many international companies by presenting a multilingual information extraction system (IES) that is capable of preprocessing text in multiple languages, expanding entity sets independently of the language used, and extracting relations from the texts.
|Department(s)||Universität Stuttgart, Institut für Maschinelle Sprachverarbeitung|
|Supervisor(s)||Padó, Prof. Sebastian; He, Dr. Yifan|
|Entry date||June 19, 2019|