Article in Proceedings INPROC-2008-150

Bibliography: Kassner, Laura; Nastase, Vivi; Strube, Michael: Acquiring a Taxonomy from the German Wikipedia.
In: Nicoletta Calzolari (ed.): Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08).
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology.
pp. 1-4, English.
European Language Resources Association (ELRA), May 2008.
ISBN: 2-9517408-4-0.
Article in Proceedings (Conference Paper).
Corporation: Conference on Language Resources and Evaluation (LREC)
CR-Schema: I.2.4 (Knowledge Representation Formalisms and Methods)
I.2.7 (Natural Language Processing)
Keywords: taxonomy; ontology; taxonomy generation; ontology generation; semantic network; Wikipedia; WordNet; GermaNet; multilinguality
Abstract

This paper presents the process of acquiring a large, domain-independent taxonomy from the German Wikipedia. We build upon a previously implemented platform that extracts a semantic network and taxonomy from the English version of Wikipedia. We describe two accomplishments of our work: a semantic network for the German language in which isa links are identified and annotated, and an expansion of the platform for easy adaptation to a new language. We identify the platform's strengths and shortcomings, which stem from the scarcity of free processing resources for languages other than English. We show that the taxonomy induction process is highly reliable: evaluated against GermaNet, the German version of WordNet, the resulting resource shows an accuracy of 83.34%.

Full text and other links
PDF (355462 Bytes)
LREC-Proceedings
Contact: laura.kassner@gsame.uni-stuttgart.de
Department(s): University of Stuttgart, Institute of Parallel and Distributed Systems, Applications of Parallel and Distributed Systems
Entry date: April 8, 2013