Article in Proceedings INPROC-2017-37

Bibliographic data
Peherstorfer, Benjamin; Pflüger, Dirk; Bungartz, Hans-Joachim: Density Estimation with Adaptive Sparse Grids for Large Data Sets.
In: Proceedings of the 2014 SIAM International Conference on Data Mining.
Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik.
pp. 443-451, English.
SIAM, January 2017.
DOI: 10.1137/1.9781611973440.51.
Article in proceedings (conference contribution).
CR classification: I.2 (Artificial Intelligence)
I.6 (Simulation and Modeling)
Keywords: sparse grids; density estimation; big data
Abstract

Nonparametric density estimation is a fundamental problem of statistics and data mining. Even though kernel density estimation is the most widely used method, its performance depends strongly on the choice of the kernel bandwidth, and it can become computationally expensive for large data sets. We present an adaptive sparse-grid-based density estimation method which discretizes the estimated density function on basis functions centered at grid points rather than on kernels centered at the data points. Thus, the cost of evaluating the estimated density function is independent of the number of data points. We give details on how to estimate density functions on sparse grids and develop a cross-validation technique for the parameter selection. We show numerical results to confirm that our sparse-grid-based method is well suited for large data sets, and, finally, employ our method for the classification of astronomical objects to demonstrate that it is competitive with current kernel-based density estimation approaches with respect to classification accuracy and runtime.
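
The contrast between grid-centered basis functions and data-centered kernels can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one dimension, a regular (neither sparse nor adaptive) grid on [0, 1], hat basis functions, a least-squares fit with a simple ridge term `lam`, and no cross-validation. All function names (`hat_basis`, `fit_grid_density`, `eval_grid_density`, `eval_kde`) are hypothetical.

```python
import numpy as np

def hat_basis(x, centers, h):
    """Evaluate hat functions of half-width h centered at `centers`.
    Returns an array of shape (len(x), len(centers))."""
    x = np.atleast_1d(x)
    return np.maximum(0.0, 1.0 - np.abs(x[:, None] - centers[None, :]) / h)

def fit_grid_density(data, level=5, lam=1e-5):
    """Fit coefficients on a regular 1D grid over [0, 1].
    Assumption: solve (R + lam * I) alpha = b, a simplified regularizer."""
    n = 2**level - 1                      # number of interior grid points
    h = 1.0 / (n + 1)                     # grid spacing
    centers = h * np.arange(1, n + 1)
    # Mass matrix R_ij = integral of phi_i * phi_j (tridiagonal for hats)
    R = (2 * h / 3) * np.eye(n) + (h / 6) * (np.eye(n, k=1) + np.eye(n, k=-1))
    # Right-hand side: mean of each basis function over the data.
    # This is the only step that touches the data points.
    b = hat_basis(data, centers, h).mean(axis=0)
    alpha = np.linalg.solve(R + lam * np.eye(n), b)
    return centers, h, alpha

def eval_grid_density(x, centers, h, alpha):
    """Cost per query depends on the number of grid points, not on the data size."""
    return hat_basis(x, centers, h) @ alpha

def eval_kde(x, data, bandwidth):
    """Gaussian kernel density estimate; cost per query grows with len(data)."""
    x = np.atleast_1d(x)
    z = (x[:, None] - data[None, :]) / bandwidth
    norm = len(data) * bandwidth * np.sqrt(2 * np.pi)
    return np.exp(-0.5 * z**2).sum(axis=1) / norm

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = np.clip(rng.normal(0.5, 0.1, 100_000), 0.0, 1.0)
    centers, h, alpha = fit_grid_density(data)
    queries = np.array([0.3, 0.5, 0.7])
    print("grid estimate:", eval_grid_density(queries, centers, h, alpha))
    print("KDE estimate: ", eval_kde(queries, data[:2000], bandwidth=0.02))
```

In this sketch the data enter only through the right-hand side `b`, which is computed once; every later query works with the 31 coefficients in `alpha`, whereas `eval_kde` revisits every data point for each query.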

Department(s): Universität Stuttgart, Institut für Parallele und Verteilte Systeme, Simulation großer Systeme
Entry date: July 3, 2017