Article in Proceedings INPROC-2022-03

Bibliography: Spieß, Marco; Reimann, Peter; Weber, Christian; Mitschang, Bernhard: Analysis of Incremental Learning and Windowing to handle Combined Dataset Shifts on Binary Classification for Product Failure Prediction.
In: Proceedings of the 24th International Conference on Enterprise Information Systems (ICEIS 2022).
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology.
English.
SciTePress, April 2022.
Article in Proceedings (Conference Paper).
CR-Schema: H.2.8 (Database Applications)
Keywords: Binary Classification; Dataset Shift; Incremental Learning; Product Failure Prediction; Windowing.
Abstract

Dataset Shifts (DSS) are known to cause poor predictive performance in supervised machine learning tasks. We present a challenging binary classification task for a real-world use case of product failure prediction. The target is to predict whether a product, e.g., a truck, may fail during the warranty period. However, building a satisfactory classifier is difficult, because the characteristics of the underlying training data entail two kinds of DSS. First, the distribution of product configurations may change over time, leading to a covariate shift. Second, products gradually fail at different points in time, so that the labels in the training data may change, which may cause a concept shift. Further, both DSS show a trade-off relationship, i.e., addressing one of them may have a negative impact on the other. We discuss the results of an experimental study that investigates how different approaches to addressing DSS perform when they are faced with both a covariate and a concept shift. Thereby, we prove that existing approaches, e.g., incremental learning and windowing, especially suffer from the trade-off between both DSS. Nevertheless, we come up with a solution for a data-driven classifier that yields better results than a baseline solution that does not address DSS.

Department(s): University of Stuttgart, Institute of Parallel and Distributed Systems, Applications of Parallel and Distributed Systems
Project(s): GSaME-NFG
Entry date: March 23, 2022