Deploying machine learning models in safety-critical domains is challenging when ground-truth labels for user data are unavailable, since reliable model performance on that data cannot be verified directly.
A novel framework called the suitability filter is proposed to detect performance deterioration by using suitability signals that indicate potential prediction errors.
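As an illustration of what such signals might look like, the sketch below derives a few per-sample quantities from a classifier's softmax outputs. The specific choices (maximum confidence, top-1/top-2 margin, predictive entropy) and the function name `suitability_signals` are assumptions for illustration, not necessarily the signals defined by the framework.

```python
# Illustrative per-sample suitability signals derived from softmax outputs.
# The specific signals below are assumptions for illustration only.
import numpy as np


def suitability_signals(probs: np.ndarray) -> np.ndarray:
    """Map an (n_samples, n_classes) softmax matrix to per-sample signals
    that tend to correlate with prediction errors."""
    sorted_p = np.sort(probs, axis=1)
    confidence = sorted_p[:, -1]                  # maximum class probability
    margin = sorted_p[:, -1] - sorted_p[:, -2]    # top-1 minus top-2 probability
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)  # predictive entropy
    return np.column_stack([confidence, margin, entropy])
```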
The suitability filter compares classifier accuracy on unlabeled user data with accuracy on a labeled test dataset, checking that any degradation stays within a predefined margin.
By aggregating suitability signals and applying statistical hypothesis testing, the framework quantifies the uncertainty of its suitability decisions and reliably detects performance deviations caused by covariate shift.
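A minimal end-to-end sketch of this aggregation-and-test step is given below. It assumes per-sample signals (e.g., from a function like `suitability_signals` above) are aggregated with a logistic model that predicts correctness, and that the margin check is realized as a one-sided Welch non-inferiority t-test; the function name `suitability_decision` and the parameters `margin` and `alpha` are hypothetical, not the paper's exact procedure.

```python
# Illustrative suitability-filter decision: estimate per-sample correctness on
# both datasets from suitability signals, then test whether accuracy on user
# data falls more than `margin` below labeled test accuracy. The logistic
# aggregation model and the Welch t-test are illustrative assumptions.
import numpy as np
from scipy import stats
from sklearn.linear_model import LogisticRegression


def suitability_decision(signals_test: np.ndarray, correct_test: np.ndarray,
                         signals_user: np.ndarray,
                         margin: float = 0.05, alpha: float = 0.05) -> bool:
    """Return True if user-data accuracy is unlikely to trail labeled-test
    accuracy by more than `margin` (one-sided non-inferiority test)."""
    # 1) Aggregate signals: learn, on labeled test data, how the signals map
    #    to the probability that a prediction is correct.
    aggregator = LogisticRegression(max_iter=1000)
    aggregator.fit(signals_test, correct_test)
    est_user = aggregator.predict_proba(signals_user)[:, 1]
    est_test = aggregator.predict_proba(signals_test)[:, 1]

    # 2) Hypothesis test with margin:
    #    H0: mean(est_user) <= mean(est_test) - margin  (model unsuitable)
    #    H1: mean(est_user) >  mean(est_test) - margin  (degradation within margin)
    result = stats.ttest_ind(est_user, est_test - margin,
                             equal_var=False, alternative="greater")
    return result.pvalue < alpha
```

In this sketch, fitting the signal-to-correctness model once on labeled test data and applying it to both datasets keeps the two accuracy estimates comparable, so the test flags degradation attributable to the shift in inputs rather than to differences in how accuracy is estimated.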