Article in Proceedings INPROC-2014-49

Gröger, Christoph; Schwarz, Holger; Mitschang, Bernhard: The Deep Data Warehouse. Link-based Integration and Enrichment of Warehouse Data and Unstructured Content.
In: Proceedings of the 18th IEEE International Enterprise Distributed Object Computing Conference (EDOC), 01-05 September, 2014, Ulm, Germany.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering and Information Technology.
IEEE, September 1, 2014.
Article in proceedings (conference paper).
CR classification: H.2.7 (Database Administration)

Data warehouses are at the core of enterprise IT and enable the efficient storage and analysis of structured data. In addition, unstructured content, e.g., emails and documents, constitutes more than half of all enterprise data and contains a lot of implicit knowledge about warehouse entities. Thus, holistic analytics require the integration of structured warehouse data and unstructured content to generate novel insights. These insights can also be used to enrich the integrated data and to create a new basis for further analytics. Existing integration approaches only support a limited range of analytical applications and require costly adaptation of the warehouse schema. In this paper, we present the Deep Data Warehouse (DeepDWH), a novel type of data warehouse based on the flexible integration and enrichment of warehouse data and unstructured content, addressing the variety challenge of Big Data. It relies on information-rich instance-level links between warehouse elements and content items, which are represented in a graph-oriented structure. Neither adaptations of the existing warehouse nor the design of an overall federated schema are required. We design a conceptual linking model and develop a logical schema for links based on a property graph. As a proof of concept, we present a prototypical implementation of the DeepDWH including a link store based on a graph database.
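
To illustrate the idea of instance-level links held in a property graph, the following is a minimal in-memory sketch. It is not the schema or prototype from the paper: the class `LinkStore`, its methods, the node kinds, and the example identifiers and properties are all hypothetical and chosen only for illustration. It shows that warehouse elements and content items can be connected by property-carrying links without changing either side.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    """A directed link carrying arbitrary key/value properties (hypothetical)."""
    source: str
    target: str
    properties: dict = field(default_factory=dict)


class LinkStore:
    """Toy property-graph link store; an assumption, not the paper's design."""

    def __init__(self):
        self.nodes = {}   # node id -> property dict
        self.links = []   # list of Link objects

    def add_node(self, node_id, kind, **properties):
        # 'kind' distinguishes warehouse elements from content items.
        self.nodes[node_id] = {"kind": kind, **properties}

    def add_link(self, source, target, **properties):
        # Links only reference existing nodes; the linked systems stay untouched.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be registered as nodes")
        self.links.append(Link(source, target, properties))

    def linked_content(self, element_id):
        # Return ids of all content items linked from a warehouse element.
        return [
            link.target
            for link in self.links
            if link.source == element_id
            and self.nodes[link.target]["kind"] == "content_item"
        ]


store = LinkStore()
store.add_node("dwh:product/42", "warehouse_element", table="product")
store.add_node("doc:email/7", "content_item", media_type="email")
store.add_link("dwh:product/42", "doc:email/7", relation="mentioned_in")
print(store.linked_content("dwh:product/42"))
```

Because the links live in a separate store, a new link type or property can be added without altering the warehouse schema, which is the flexibility the abstract emphasizes.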

Department(s): University of Stuttgart, Institute of Parallel and Distributed Systems, Applications of Parallel and Distributed Systems (Anwendersoftware)
Entry date: July 7, 2014