Bachelor Thesis BCLR-2017-94

Bibliography: Xing, Shu: Open Information Extraction for Relation Identification of Biomedical Entities.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Bachelor Thesis No. 94 (2017).
71 pages, English.
CR-Schema: I.2.7 (Natural Language Processing)
I.5.4 (Pattern Recognition Applications)
J.3 (Life and Medical Sciences)
Abstract

Relation extraction (RE) plays a significant role in generating knowledge from biomedical literature. Yet most current state-of-the-art automatic approaches are based on (semi-)supervised machine learning techniques that model the RE task as a classification problem. Such approaches are usually restricted by domain-specific training data and pre-defined relation types. As the emphasis of biomedical research shifts from individual domains to whole systems, the novel RE paradigm Open Information Extraction (OpenIE), in which relations of any type can be identified, is gaining more attention. This thesis studies how to apply OpenIE to the identification of relations in the biomedical domain and how well it performs. For this purpose, we built an automated pipeline that combines three existing named entity recognition (NER) tools and an OpenIE system with two filters, and evaluated it on a dataset from MEDLINE. In this experiment, we focused on relations between chemicals, diseases, and genes/proteins. The results show that this OpenIE-based pipeline can extract relations without restrictions on relation type, and that it achieves high precision (76.92%), but the F-score (12.66%) remains low because of low recall. In conclusion, approaches based on the OpenIE paradigm can be further improved, especially in terms of recall. It is possible to achieve higher accuracy by improving the performance of current OpenIE systems as well as of biomedical NER tools.
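
The abstract reports precision and F-score but not the recall value itself. Assuming the F-score is the balanced F1 measure (an assumption; the abstract does not say which F-measure was used), the implied recall can be recovered from the two reported figures. The sketch below is illustrative only, and the function names are hypothetical, not taken from the thesis.

```python
def implied_recall(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, given precision P and F1."""
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("no valid recall: F1 must be below 2 * precision")
    return f1 * precision / denominator


def f1_score(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract (assumed here to be precision and F1).
precision = 0.7692
f1 = 0.1266

recall = implied_recall(precision, f1)
print(f"implied recall: {recall:.2%}")
```

Under that assumption the implied recall is roughly 6.9%, which matches the abstract's statement that the low F-score is caused by low recall and not by low precision.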

Full text and other links: PDF (4187015 Bytes)
Access to students' publications is restricted to the faculty due to current privacy regulations.
Department(s): University of Stuttgart, Institute for Natural Language Processing
Supervisor(s): Padó, Prof. Sebastian; Klinger, Dr. Roman
Entry date: December 3, 2018