Article in Proceedings INPROC-2008-148

Bibliography: Bungartz, Hans-Joachim; Pflüger, Dirk; Zimmer, Stefan: Adaptive Sparse Grid Techniques for Data Mining.
In: Bock, H.G. (ed.); Kostina, E. (ed.); Hoang, X.P. (ed.); Rannacher, R. (ed.): Modelling, Simulation and Optimization of Complex Processes 2006.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology.
pp. 121-130, English.
Berlin, Heidelberg: Springer-Verlag, July 2008.
ISBN: 978-3540794080.
Article in Proceedings (Conference Paper).
Corporation: Third International Conference on High Performance Scientific Computing, March 6-10, 2006
CR-Schema: G.1.2 (Numerical Analysis Approximation)
H.2.8 (Database Applications)
Abstract

It was shown in [GaGT01] that the task of classification in data mining can be tackled by employing ansatz functions associated with grid points in the (often high-dimensional) feature space rather than using data-centered ansatz functions. To cope with the curse of dimensionality, sparse grids have been used. Based on this approach, we propose an efficient finite-element-like discretization technique for classification instead of the combination technique used in [GaGT01]. The main goal of our method is to make use of adaptivity to further reduce the number of grid points needed. Employing adaptivity in classification is reasonable, as the target function contains smooth regions as well as rough ones. Regarding implementation, we present an algorithm for the fast multiplication of the vector of unknowns with the coefficient matrix. We give an example of the adaptive selection of grid points and show that special care has to be taken regarding the boundary values, as adaptive techniques commonly used for solving PDEs are not optimal here. Results for some typical classification tasks, including a problem from the UCI repository, are presented.

[GaGT01] J. Garcke, M. Griebel, and M. Thess. Data Mining with Sparse Grids. Computing 67(3), 2001, pp. 225-253.
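
The abstract's remark that sparse grids help with the curse of dimensionality can be illustrated by counting grid points. The following is a minimal sketch, not taken from the paper: it assumes regular (non-adaptive) grids without boundary points and the standard hierarchical subspace decomposition, and the function names are hypothetical.

```python
from itertools import product


def full_grid_points(level: int, dim: int) -> int:
    """Interior points of a full grid with mesh width 2**-level per dimension."""
    return (2 ** level - 1) ** dim


def sparse_grid_points(level: int, dim: int) -> int:
    """Interior points of a regular sparse grid (assumption: no boundary points).

    Sums the sizes of the hierarchical subspaces W_l over all level vectors l
    with l_i >= 1 and |l|_1 <= level + dim - 1. Each W_l contributes
    prod(2**(l_i - 1)) grid points.
    """
    total = 0
    for l in product(range(1, level + 1), repeat=dim):
        if sum(l) <= level + dim - 1:
            size = 1
            for li in l:
                size *= 2 ** (li - 1)
            total += size
    return total


if __name__ == "__main__":
    # Compare both counts at a fixed level as the dimension grows.
    for dim in (1, 2, 3, 4):
        print(dim, full_grid_points(5, dim), sparse_grid_points(5, dim))
```

In one dimension both counts coincide; as the dimension grows, the full grid count increases exponentially while the sparse grid count increases far more slowly. The adaptive refinement proposed in the paper would reduce the count further, which this sketch does not model.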

Department(s): University of Stuttgart, Institute of Parallel and Distributed Systems, Simulation of Large Systems
Entry date: November 30, 2011