Bachelor Thesis BCLR-2015-04

Bibliography: Franke, Max: Sparse grid datamining with huge datasets.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Bachelor Thesis (2015).
73 pages, English.
CR-Schema: H.2.8 (Database Applications)
Abstract

Due to the falling cost of disk space and the prevalence of sensor equipment everywhere, the scientific world is flooded with huge amounts of data. To benefit from these data, data mining algorithms are used to evaluate them. As conventional data mining methods scale at least linearly with problem size and exponentially with the input dimension, the computing power required to mine these data poses a great problem. Very few real-world reference datasets exist for testing data mining algorithms. Using an existing toolkit for data mining on sparse grids, the goal of this thesis is to generate one or more real-world reference datasets for data mining purposes. To this end, multiple weather and photovoltaic datasets were used. It was possible to learn 6-dimensional datasets with 1.2 million data points and obtain a very good prediction of photovoltaic power. Thus, a dataset was obtained to test regression on. For classification, a 9-dimensional dataset with 200,000 data points was generated; its results, however, were not particularly good, with a 41% hit rate over 4 classes. Here, further processing of the data will be necessary.

Full text and other links: PDF (4146227 bytes)
Department(s): University of Stuttgart, Institute of Parallel and Distributed Systems, Simulation of Large Systems
Supervisor(s): Pflüger, Jun.-Prof. Dirk; Pfander, David
Entry date: September 25, 2018