Article in Proceedings INPROC-2020-32

Bibliography: Hirsch, Vitali; Reimann, Peter; Mitschang, Bernhard: Exploiting Domain Knowledge to Address Multi-Class Imbalance and a Heterogeneous Feature Space in Classification Tasks for Manufacturing Data.
In: Balazinska, Magdalena (ed.); Zhou, Xiaofang (ed.): Proceedings of the 46th International Conference on Very Large Data Bases (VLDB).
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology.
Proceedings of the VLDB Endowment; 13(12), English.
ACM Digital Library, August 2020.
Article in Proceedings (Conference Paper).
CR-Schema: H.2.8 (Database Applications)
Abstract

Classification techniques are increasingly adopted for quality control in manufacturing, e.g., to help domain experts identify the cause of quality issues in defective products. However, real-world data often pose a set of analytical challenges that reduce classification performance. The major challenges are a high degree of multi-class imbalance within the data and a heterogeneous feature space that arises from the variety of underlying products. This paper considers such a challenging use case in the area of End-of-Line testing, i.e., the final functional test of complex products. Existing solutions for classification or data pre-processing address individual analytical challenges only in isolation. We propose a novel classification system that explicitly addresses both challenges, multi-class imbalance and a heterogeneous feature space, together. As its main contribution, this system exploits domain knowledge to systematically prepare the training data. Based on an experimental evaluation on real-world data, we show that our classification system outperforms any other classification technique in terms of accuracy. Furthermore, we can reduce the amount of rework required to solve a quality issue of a product.

Department(s): University of Stuttgart, Institute of Parallel and Distributed Systems, Applications of Parallel and Distributed Systems
Project(s): GSaME-NFG
Entry date: June 23, 2020