Bachelorarbeit BCLR-2017-42

Luz, Maximilian: Subspace-optimal data mining on spatially adaptive sparse grids.
Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik, Bachelorarbeit (2017).
143 Seiten, englisch.
CR-Klassif.F.2.1 (Numerical Algorithms and Problems)
G.1.0 (Numerical Analysis General)
G.1.2 (Numerical Analysis Approximation)

Continued improvements in technology lead to an ever-growing amount of data generated, for example, by scientific measurements and simulations. Data-mining is required to gain useful knowledge from this data, however, can be challenging especially due to the size and dimensionality of these problems. The use of regular grids for such applications is often limited by the curse of dimensionality, a phrase used to describe an exponential dependency of the computational complexity of a problem on the dimensionality of this problem. For many higher-dimensional problems, e.g. with 28 dimensions, regular grids cannot be used to compute results with the desired accuracy in a reasonable amount of time, even if the memory required to store and process them is available. With spatially adaptive sparse grids, this problem can be overcome, as they lessen the influence of the dimensionality on the size of the grid, furthermore, they have been successfully applied for many tasks, including regression on large data sets. However, the currently preferred and in practice highly performant streaming-algorithm for regression on spatially adaptive sparse grids employs many unnecessary operations to effectively utilize modern parallel computer architectures, such as graphics processing units (GPUs). In this thesis, we show that the implementation of a by computational complexity more promising subspace-linear algorithm on the GPU is able to out-perform the currently preferred streaming-algorithm on many scenarios, even though the this algorithm does not utilize modern architectures as well as the streaming-algorithm. Furthermore, we explore the construction of a new algorithm by combining both, streaming- and subspace-linear algorithm, which aims to process each subgrid of the grid with the algorithm deemed most efficient for its structure. We evaluated both of our algorithms against the highly optimized implementation of the streaming-algorithm provided in the SG++ framework, and could indeed show speed-ups for both algorithms, depending on the experiments.

Volltext und
andere Links
PDF (8827096 Bytes)
Abteilung(en)Universität Stuttgart, Institut für Parallele und Verteilte Systeme, Simulation großer Systeme
BetreuerPflüger, Jun.-Prof. Dirk; Pfander, David
Eingabedatum28. September 2018
   Publ. Informatik