Imbalanced regression occurs when continuous target variables have skewed distributions, creating sparse regions that are difficult for machine learning models to predict accurately.
Existing approaches often rely on arbitrary thresholds to categorize samples as rare or frequent, ignoring the continuous nature of target distributions.
To address these limitations, the proposed approach, LDAO (Local Distribution-based Adaptive Oversampling), captures the global distribution structure by decomposing the dataset into a mixture of local distributions. Each local distribution is modeled independently, and the results are then merged into a balanced training set.
In extensive evaluations, LDAO outperforms state-of-the-art oversampling methods on both frequent and rare target values, demonstrating its effectiveness for imbalanced regression.
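
The decompose, model, and merge idea described above can be illustrated with a minimal sketch. This is not the LDAO implementation: the abstract does not specify how local distributions are found or modeled, so the sketch assumes k-means clustering on the standardized joint (features, target) space and a smoothed bootstrap (equivalent to sampling from a Gaussian kernel density estimate) within each cluster. The names `kmeans`, `local_oversample`, `k`, and `bandwidth` are hypothetical and chosen only for illustration.

```python
import numpy as np


def kmeans(Z, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of Z."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels


def local_oversample(X, y, k=3, bandwidth=0.1, seed=0):
    """Oversample each local cluster up to the size of the largest one.

    Assumption: clusters stand in for the "local distributions", and a
    smoothed bootstrap stands in for the per-distribution model.
    """
    rng = np.random.default_rng(seed)
    # Work in the joint space so that clusters reflect both X and y.
    Z = np.column_stack([X, y]).astype(float)
    mu, sigma = Z.mean(axis=0), Z.std(axis=0) + 1e-12
    labels = kmeans((Z - mu) / sigma, k, seed=seed)

    target = np.bincount(labels, minlength=k).max()
    parts = [Z]  # original samples are always kept
    for j in range(k):
        members = Z[labels == j]
        n_new = target - len(members)
        if len(members) == 0 or n_new == 0:
            continue
        # Resample cluster members and add Gaussian noise scaled by the
        # cluster's per-column standard deviation.
        base = members[rng.integers(len(members), size=n_new)]
        noise = rng.normal(size=base.shape) * bandwidth * members.std(axis=0)
        parts.append(base + noise)

    Z_bal = np.vstack(parts)
    return Z_bal[:, :-1], Z_bal[:, -1]
```

Because every cluster is brought up to the size of the largest one, no threshold separating rare from frequent targets is needed in this sketch; sparse regions of the joint space simply receive more synthetic samples. The returned arrays contain the original samples first, followed by the synthetic ones.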