Article in Proceedings INPROC-2020-45

Bibliographic data
Eichler, Rebecca; Giebler, Corinna; Gröger, Christoph; Schwarz, Holger; Mitschang, Bernhard: HANDLE - A Generic Metadata Model for Data Lakes.
In: Song, Min (ed.); Song, Il-Yeol (ed.); Kotsis, Gabriele (ed.); Tjoa, A Min (ed.); Khalil, Ismail (ed.): Big Data Analytics and Knowledge Discovery.
Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik.
Lecture Notes in Computer Science; 12393, pp. 73-88, English.
Springer Nature Switzerland AG, September 11, 2020.
DOI: https://doi.org/10.1007/978-3-030-59065-9_7.
Article in Proceedings (Conference Paper).
CR classification: H.2 (Database Management)
Keywords: Metadata management; Metadata model; Data lake
Abstract

The substantial increase in generated data induced the development of new concepts such as the data lake. A data lake is a large storage repository designed to enable flexible extraction of the data's value. A key aspect of exploiting data value in data lakes is the collection and management of metadata. To store and handle the metadata, a generic metadata model is required that can reflect metadata of any potential metadata management use case, e.g., data versioning or data lineage. However, an evaluation of existent metadata models yields that none so far are sufficiently generic. In this work, we present HANDLE, a generic metadata model for data lakes, which supports the flexible integration of metadata, data lake zones, metadata on various granular levels, and any metadata categorization. With these capabilities HANDLE enables comprehensive metadata management in data lakes. We show HANDLE's feasibility through the application to an exemplary access-use-case and a prototypical implementation. A comparison with existent models yields that HANDLE can reflect the same information and provides additional capabilities needed for metadata management in data lakes.

Full text and other links
PDF (2631933 Bytes)
Department(s): Universität Stuttgart, Institut für Parallele und Verteilte Systeme, Anwendersoftware
Project(s): MetaMan
Entry date: September 28, 2020