Article in Proceedings INPROC-2009-29

Kaiser, Fabian; Schwarz, Holger; Jakob, Mihály: Using Wikipedia-based conceptual contexts to calculate document similarity.
In: ICDS2009: Proceedings of the 3rd International Conference on Digital Society.
Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik.
pp. 322-327, English.
Cancun, Mexico: IEEE Computer Society, February 2009.
Article in proceedings (conference contribution).
CR classification: H.3 (Information Storage and Retrieval)
H.3.3 (Information Search and Retrieval)

Rating the similarity of two or more text documents is an essential task in information retrieval. For example, document similarity can be used to rank search engine results or to cluster documents by topic. A major challenge in calculating document similarity is that two documents can share a topic, or even mean the same thing, while using different wording to describe the content. A sophisticated algorithm therefore cannot operate directly on the texts but has to find a more abstract representation that captures their meaning. In this paper, we propose a novel approach for calculating the similarity of text documents. It builds on conceptual contexts that are derived from the content and structure of the Wikipedia hypertext corpus.
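
The wording problem described in the abstract can be illustrated with a minimal sketch. This is not the algorithm from the paper: the table TOY_CONCEPTS is a hypothetical, hand-written stand-in for concepts that the paper derives from Wikipedia, and cosine similarity over simple count vectors is an assumed measure chosen only for illustration. The two example documents share no words, so a purely word-based comparison finds nothing in common, while a comparison over mapped concepts treats them as equivalent.

```python
import math
from collections import Counter

# Hypothetical term-to-concept table, invented for this illustration only.
# The paper derives conceptual contexts from Wikipedia; this dict is a stand-in.
TOY_CONCEPTS = {
    "car": "Automobile",
    "automobile": "Automobile",
    "doctor": "Physician",
    "physician": "Physician",
    "fixes": "Maintenance",
    "repairs": "Maintenance",
}


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def word_vector(text: str) -> Counter:
    """Represent a text by its raw word counts."""
    return Counter(text.lower().split())


def concept_vector(text: str) -> Counter:
    """Represent a text by counts of the concepts its words map to."""
    words = text.lower().split()
    return Counter(TOY_CONCEPTS[w] for w in words if w in TOY_CONCEPTS)


doc_a = "doctor fixes car"
doc_b = "physician repairs automobile"

# No word occurs in both documents, so the word-level similarity is zero.
word_sim = cosine(word_vector(doc_a), word_vector(doc_b))

# Both documents map to the same three concepts, so this similarity is maximal.
concept_sim = cosine(concept_vector(doc_a), concept_vector(doc_b))

print(f"word-level similarity:    {word_sim:.2f}")
print(f"concept-level similarity: {concept_sim:.2f}")
```

A hand-written lookup table obviously does not scale; it only shows why a more abstract representation can recognize related documents that a direct word comparison misses.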

Full text and other links: IEEE Xplore
Department(s): Universität Stuttgart, Institut für Parallele und Verteilte Systeme, Anwendersoftware
Entry date: March 9, 2009