Article in Proceedings INPROC-2019-14

Bibliography: Giebler, Corinna; Gröger, Christoph; Hoos, Eva; Schwarz, Holger: Leveraging the Data Lake - Current State and Challenges.
In: Proceedings of the 21st International Conference on Big Data Analytics and Knowledge Discovery (DaWaK'19).
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology.
pp. 1-10, German.
Springer Nature, August 26, 2019.
Article in Proceedings (Conference Paper).
CR-Schema: H.2.4 (Database Management Systems)
H.2.8 (Database Applications)
Keywords: Data Lakes, State of the Art, Challenges
Abstract

The digital transformation leads to massive amounts of heterogeneous data challenging traditional data warehouse solutions in enterprises. In order to exploit these complex data for competitive advantages, the data lake recently emerged as a concept for more flexible and powerful data analytics. However, existing literature on data lakes is rather vague and incomplete, and the various realization approaches that have been proposed neither cover all aspects of data lakes nor do they provide a comprehensive design and realization strategy. Hence, enterprises face multiple challenges when building data lakes. To address these shortcomings, we investigate existing data lake literature and discuss various design and realization aspects for data lakes, such as governance or data models. Based on these insights, we identify challenges and research gaps concerning (1) data lake architecture, (2) data lake governance, and (3) a comprehensive strategy to realize data lakes. These challenges still need to be addressed to successfully leverage the data lake in practice.

Contact: Send an e-mail to Corinna.Giebler@ipvs.uni-stuttgart.de
Department(s): University of Stuttgart, Institute of Parallel and Distributed Systems, Applications of Parallel and Distributed Systems
Entry date: July 4, 2019