Master Thesis MSTR-2018-102

Bibliography
Zendler, Ulrich: How word-embedding methods improve information extraction and can be used for multilingual approaches.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Master Thesis No. 102 (2018).
111 pages, English.
Abstract

Expanding entity sets and extracting relations are key tasks in natural language processing (NLP), and they are tackled by a variety of approaches. Recent successful attempts all use word embeddings such as those presented by Mikolov et al. While most work concentrates on improving these tasks in general, without considering a specific domain, it is of interest how to achieve even higher precision by focusing on a specific domain and optimizing the methods towards a single purpose. Therefore, this thesis suggests methods and adjustments that optimize existing proposals for entity set expansion for the domain of drugs. While this is the main purpose of the thesis, it also presents a novel idea for improving the precision of relation extraction by using word embeddings, which could be combined with existing successful relation extraction methods. Finally, it tackles another key concern of many international companies by presenting a multilingual information extraction system (IES) that is capable of preprocessing text in multiple languages, expanding entity sets independently of the language used, and extracting relations from the texts.

Full text and other links
Full text
Department(s)
University of Stuttgart, Institute for Natural Language Processing
Supervisor(s)
Padó, Prof. Sebastian; He, Dr. Yifan
Entry date
June 19, 2019
   Publ. Computer Science