Diploma Thesis DIP-3399

Bibliography: Schnabel, Tobias: Towards Robust Cross-Domain Domain Adaptation for Part-of-Speech Tagging.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Diploma Thesis No. 3399 (2013).
75 pages, English.
CR-Schema: I.2.7 (Natural Language Processing)

Most natural language processing systems suffer a substantial loss in performance when the data they are tested on differs significantly from the data they were trained on. Systems for part-of-speech (POS) tagging, for example, are typically trained on newspaper text but are often applied to text from other domains, such as medical text. Domain adaptation (DA) techniques seek to improve such systems so that they achieve consistently good performance, independent of the domains at hand.

We investigate the robustness of domain adaptation representations and methods across target domains using part-of-speech tagging as a case study. We find that there is no single representation and method that works equally well for all target domains. In particular, there are large differences between target domains that are more similar to the source domain and those that are less similar.

Full text and other links: PDF (827421 Bytes)
Department(s): University of Stuttgart, Institute for Natural Language Processing
Supervisor(s): Prof. Dr. Hinrich Schütze
Entry date: May 2, 2013