Article in Proceedings INPROC-2024-05

Bibliographic data
Schneider, Jan; Gröger, Christoph; Lutsch, Arnold: The Data Platform Evolution: From Data Warehouses over Data Lakes to Lakehouses.
In: Schwarz, Holger (ed.): Proceedings of the 34th GI-Workshop on Foundations of Databases (Grundlagen von Datenbanken), Hirsau, Germany.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering and Information Technology.
CEUR Workshop Proceedings; 3714, pp. 67-71, English.
CEUR Workshop Proceedings, July 2, 2024.
ISSN: 1613-0073.
Article in proceedings (workshop contribution).
Corporate body: Gesellschaft für Informatik
CR classification: H.3.4 (Information Storage and Retrieval Systems and Software)
H.4.2 (Information Systems Applications Types of Systems)
Keywords: Lakehouse; Data Warehouse; Data Lake; Data Management; Data Analytics
Abstract

The continuously increasing availability of data and the growing maturity of data-driven analysis techniques have encouraged enterprises to collect and analyze huge amounts of business-relevant data in order to exploit it for competitive advantage. To facilitate these processes, various platforms for analytical data management have been developed: while data warehouses have traditionally been used by business analysts for reporting and OLAP, data lakes emerged as an alternative concept that also supports advanced analytics. Because these two common types of data platforms have rather contrary characteristics and target different user groups and analytical approaches, enterprises usually need to employ both of them, resulting in complex, error-prone, and costly architectures. To address these issues, recent efforts combine features of data warehouses and data lakes into so-called lakehouses, which aim to serve all kinds of analytics from a single data platform. This paper provides an overview of the evolution of analytical data platforms from data warehouses via data lakes to lakehouses and elaborates on the vision and characteristics of the latter. Furthermore, it addresses the question of which aspects common data lakes currently lack that prevent them from transitioning to lakehouses.

Full text and other links: PDF
Department(s): University of Stuttgart, Institute for Parallel and Distributed Systems, Anwendersoftware (Application Software)
Project(s): Architectures & Technologies for Data Platforms
Entry date: August 30, 2024