A significant challenge in maintaining real-world machine learning models is deciding when to retrain or update them as the underlying data evolves continuously and unpredictably.
This decision is difficult because available information is limited, the nature of future distribution shifts is unknown, and a cost ratio between retraining and degraded performance must be specified.
Existing methods do not provide a comprehensive solution to this retraining problem, as they fail to account for cost trade-offs, data scarcity, and other key practical considerations.
The proposed uncertainty-based method continually forecasts the evolution of a bounded model-performance metric and, in classification experiments on 7 datasets, outperforms existing baselines.
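To make the cost trade-off concrete, the following is a minimal sketch of an uncertainty-aware retraining decision, not the paper's actual method: it assumes a Monte Carlo forecast of a bounded performance metric (e.g., accuracy in [0, 1]) and a user-specified cost ratio; the names `decide_retrain`, `forecast_samples`, and `cost_ratio` are hypothetical.

```python
# Illustrative sketch only; the paper's forecasting model and decision rule
# are not reproduced here. Assumes Monte Carlo samples of a bounded metric.
import numpy as np

def decide_retrain(forecast_samples: np.ndarray,
                   baseline_performance: float,
                   cost_ratio: float) -> bool:
    """Return True if the expected performance loss of keeping the stale
    model exceeds the (normalized) cost of retraining.

    forecast_samples: samples of the forecast bounded metric (e.g., accuracy)
        over the next deployment window, shape (n_samples,).
    baseline_performance: metric expected immediately after retraining.
    cost_ratio: retraining cost expressed in units of lost performance.
    """
    # Expected shortfall relative to a freshly retrained model, averaged over
    # the forecast's uncertainty; clipped at 0 so improvement incurs no cost.
    shortfall = np.clip(baseline_performance - forecast_samples, 0.0, None)
    return float(np.mean(shortfall)) > cost_ratio

# Example: an uncertain forecast of accuracy drifting downward.
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.78, scale=0.03, size=1000).clip(0.0, 1.0)
print(decide_retrain(samples, baseline_performance=0.85, cost_ratio=0.05))
```

The key design point the sketch illustrates is that the decision depends on the full forecast distribution rather than a point estimate, so forecast uncertainty directly influences when retraining is triggered.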