Researchers propose a fair machine learning algorithm to model interpretable differences between observed and desired human decision-making.
The algorithm aims to reduce disparities in a downstream outcome impacted by human decision, termed representational disparities.
A neural network is used to learn interpretable representational disparities, which could be corrected by nudges to human decision, mitigating outcome disparities.
The approach is validated using real-world datasets like German Credit, Adult, and Heritage Health.