Analysis and Evaluation of Concept Drift in Machine Learning

  • Anđela Ristivojević, Banca Intesa ad Beograd; Faculty of Organizational Sciences, University of Belgrade
Keywords: concept drift, data drift, methods and algorithms for detecting concept drift, personal/cash loan production

Abstract

Machine learning models often encounter changes in data distributions, which can degrade their performance and reliability. These changes are typically driven by environmental events and shifts in user preferences. In real-world environments such changes are often abrupt, and models fail to adapt to them effectively; this phenomenon is known as concept drift. Detecting, monitoring, and addressing concept drift can significantly improve model stability, and detection techniques enable continuous analysis and model adaptation.

This paper provides a comprehensive definition of concept drift, distinguishing it from data drift, target drift, and prediction drift. It also differentiates clearly among the types of concept drift.
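One common way to state these distinctions is in terms of which distribution changes between two time points. The notation below is an assumption made here for illustration, not taken from the paper: $X$ denotes the input features, $y$ the target, $\hat{y}$ the model's prediction, and $P_t$ the distribution at time $t$.

```latex
\begin{align*}
\text{data drift:}       \quad & P_t(X) \neq P_{t+1}(X) \\
\text{target drift:}     \quad & P_t(y) \neq P_{t+1}(y) \\
\text{prediction drift:} \quad & P_t(\hat{y}) \neq P_{t+1}(\hat{y}) \\
\text{concept drift:}    \quad & P_t(y \mid X) \neq P_{t+1}(y \mid X)
\end{align*}
```

Under this reading, concept drift is a change in the relationship between inputs and target, so it can occur even when the input distribution itself looks stable.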

The research was conducted on datasets from Banca Intesa ad Beograd, prepared to reflect significant events of the past four years and thereby simulate the dynamic environment in which the bank operates. The study addresses the binary classification problem of predicting cash loan production, cash loans being one of the bank's most important products. It applies selected open-source methods for detecting data, target, prediction, and concept drift, chosen to suit the binary classification problem and the machine learning model used.
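The abstract does not name the detection methods used. As a minimal sketch of what a data drift check on a single numeric feature can look like, the example below computes the Population Stability Index (PSI), a measure widely used in credit risk monitoring. PSI is chosen here only for illustration and is not confirmed as one of the paper's methods. The function name `psi`, the equal-width binning, the `eps` floor, and the synthetic Gaussian data are all assumptions.

```python
import math
import random


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature.

    Hypothetical helper: bins are equal-width over the reference range.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature

    def bin_shares(values):
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the reference range land in the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    ref_shares = bin_shares(reference)
    cur_shares = bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_shares, cur_shares))


# Synthetic data (an assumption): a reference period, a period drawn from the
# same distribution, and a period whose mean has shifted by one standard deviation.
random.seed(0)
reference = [random.gauss(0.0, 1.0) for _ in range(5000)]
same = [random.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [random.gauss(1.0, 1.0) for _ in range(5000)]

psi_same = psi(reference, same)
psi_shifted = psi(reference, shifted)
print(f"PSI, same distribution:    {psi_same:.4f}")
print(f"PSI, shifted distribution: {psi_shifted:.4f}")
```

A frequently quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, though these cut-offs are conventions rather than statistical guarantees. The same function could be applied to predicted scores to monitor prediction drift. Detecting concept drift additionally requires the true labels, for example by tracking model error over time.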

The study aims to determine whether concept drift is present for the given dataset and machine learning model. Detecting all types of changes enables a better understanding of the model's lifecycle and the time frame in which the machine learning model operates. Finally, the paper proposes approaches for addressing the detected changes.

Published
2025-02-25
Section
Information engineering