Diploma Thesis DIP-3399

Bibliographic data
Schnabel, Tobias: Towards Robust Cross-Domain Domain Adaptation for Part-of-Speech Tagging.
Universität Stuttgart, Faculty of Computer Science, Electrical Engineering and Information Technology, Diploma Thesis No. 3399 (2013).
75 pages, in English.
CR classification: I.2.7 (Natural Language Processing)
Abstract

Most natural language processing systems suffer a substantial loss in performance when the data they are tested on differs significantly from the data they were trained on. Systems for part-of-speech (POS) tagging, for example, are typically trained on newspaper text but are often applied to text from other domains, such as medical text. Domain adaptation (DA) techniques seek to improve such systems so that they achieve consistently good performance, independent of the domains at hand.

We investigate the robustness of domain adaptation representations and methods across target domains using part-of-speech tagging as a case study. We find that there is no single representation and method that works equally well for all target domains. In particular, there are large differences between target domains that are more similar to the source domain and those that are less similar.

Full text and other links
PDF (827421 bytes)
Department(s): Universität Stuttgart, Institut für Maschinelle Sprachverarbeitung
Supervisor: Prof. Dr. Hinrich Schütze
Entry date: May 2, 2013