Article in Proceedings INPROC-2023-06

Bibliographic data
Schneider, Jan; Gröger, Christoph; Lutsch, Arnold; Schwarz, Holger; Mitschang, Bernhard: Assessing the Lakehouse: Analysis, Requirements and Definition.
In: Filipe, Joaquim (ed.); Smialek, Michal (ed.); Brodsky, Alexander (ed.); Hammoudi, Slimane (ed.): Proceedings of the 25th International Conference on Enterprise Information Systems, ICEIS 2023, Volume 1, Prague, Czech Republic, April 24-26, 2023.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering and Information Technology.
pp. 44-56, English.
Prague: SciTePress, May 23, 2023.
ISBN: 978-989-758-648-4; ISSN: 2184-4992; DOI: 10.5220/0011840500003467.
Article in proceedings (conference contribution).
CR classification: H.2.4 (Database Management Systems)
H.2.7 (Database Administration)
H.2.8 (Database Applications)
Keywords: Lakehouse; Data Warehouse; Data Lake; Data Management; Data Analytics
Abstract

The digital transformation opens new opportunities for enterprises to optimize their business processes by applying data-driven analysis techniques. For storing and organizing the required huge amounts of data, different types of data platforms have been employed in the past, with data warehouses and data lakes being the most prominent ones. Since they possess rather contrary characteristics and address different types of analytics, companies typically utilize both of them, leading to complex architectures with replicated data and slow analytical processes. To counter these issues, vendors have recently been making efforts to break the boundaries and to combine features of both worlds into integrated data platforms. Such systems are commonly called lakehouses and promise to simplify enterprise analytics architectures by serving all kinds of analytical workloads from a single platform. However, it remains unclear how lakehouses can be characterized, since existing definitions focus almost arbitrarily on individual architectural or functional aspects and are often driven by marketing. In this paper, we assess prevalent definitions for lakehouses and finally propose a new definition, from which several technical requirements for lakehouses are derived. We apply these requirements to several popular data management tools, such as Delta Lake, Snowflake, and Dremio, in order to evaluate whether they enable the construction of lakehouses.

Full text and other links
Publication
DOI
Contact: jan.schneider@ipvs.uni-stuttgart.de
Department(s): University of Stuttgart, Institute for Parallel and Distributed Systems, Application Software
Project(s): Data Platform Architectures & Technologies
Entry date: September 8, 2023