techminis

A naukri.com initiative

Arxiv

Challenges learning from imbalanced data using tree-based models: Prevalence estimates systematically depend on hyperparameters and can be upwardly biased

  • Imbalanced binary classification problems are common in many fields of study.
  • Subsampling the majority class to create a balanced training dataset can bias the model's predictions, because the training data no longer match the class proportions of new data.
  • Calibrating a random forest to correct for this bias can have unintended negative consequences, including upwardly biased prevalence estimates.
  • A random forest's prevalence estimates depend on both the number of predictors considered at each split and the sampling rate used, which reveals unexpected biases.
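
To make the calibration step concrete, here is a minimal sketch of one standard analytical correction for majority-class undersampling. The summary does not say which formula the paper uses, so the formula, the function names, and the parameter `beta` (the fraction of majority-class rows kept for training) are all assumptions for illustration. If negatives are kept with probability `beta`, a probability `p_s` learned on the subsampled data maps back to the original scale as `beta * p_s / (beta * p_s - p_s + 1)`.

```python
def undersampled_to_original(p_s: float, beta: float) -> float:
    """Map a probability learned on undersampled data back to the original scale.

    p_s  : predicted probability of the minority (positive) class from a model
           trained on data where the majority class was subsampled.
    beta : fraction of majority-class rows kept for training (assumed known).
    """
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must be in (0, 1]")
    if not 0.0 <= p_s <= 1.0:
        raise ValueError("p_s must be in [0, 1]")
    # Inverts p_s = p / (p + beta * (1 - p)), the prior shift caused by
    # keeping each negative example with probability beta.
    return beta * p_s / (beta * p_s - p_s + 1.0)


def prevalence_estimate(probs, beta):
    """Estimate prevalence as the mean of the corrected probabilities."""
    corrected = [undersampled_to_original(p, beta) for p in probs]
    return sum(corrected) / len(corrected)


# Hypothetical usage: a raw score of 0.5 from a model trained with 10% of the
# majority class corresponds to roughly 0.091 on the original scale.
example = undersampled_to_original(0.5, beta=0.1)
```

The mapping is exact only when the model's raw probabilities are themselves well calibrated on the subsampled data. The summary above reports that random forests can violate this, so averaging corrected probabilities as in `prevalence_estimate` may still overstate prevalence, with the size of the error depending on the number of predictors per split and the sampling rate.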
