Machine learning models are not always stable, so we have to evaluate how stable a model's performance actually is.
Cross-Validation is a technique used to assess how well a machine learning model performs on unseen data.
Cross-Validation is a resampling technique whose fundamental idea is to split the dataset into two parts: training data and test data.
One of the major advantages of the Hold-Out method is that it is computationally inexpensive compared to other cross-validation techniques, since the model is trained only once.
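As a minimal sketch of the Hold-Out method, here is one way to do the split with scikit-learn's train_test_split (the iris dataset, the 80/20 split, and the logistic regression model are illustrative choices):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Hold out 20% of the data as a test set (the split ratio is illustrative)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The model is trained only once, on the training portion
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("Hold-out test accuracy:", model.score(X_test, y_test))
```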
The Leave-One-Out Cross-Validation (LOOCV) method gives an approximately unbiased estimate of the test error, but the estimate has high variance, since the n fitted models are trained on nearly identical data; it is also computationally expensive, because the model must be fit n times to test every observation.
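A short sketch of LOOCV using scikit-learn's LeaveOneOut splitter (the dataset and model are again illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

X, y = load_iris(return_X_y=True)

# Fit the model n times, each time leaving out exactly one observation
loo = LeaveOneOut()
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=loo)

# Each fold scores a single held-out sample; the mean is the LOOCV estimate
print("LOOCV accuracy estimate:", scores.mean())
```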
In K-Fold Cross-Validation, the mean of the errors across all k iterations is taken as the CV test error estimate.
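A minimal sketch of this averaging, written out as an explicit loop over the folds (the synthetic data, k = 5, and MSE as the error metric are assumptions for illustration):

```python
import numpy as np
from sklearn.datasets import make_regression  # synthetic data for illustration
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=0)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
fold_errors = []
for train_idx, test_idx in kf.split(X):
    # Train on k-1 folds, measure the error on the held-out fold
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    preds = model.predict(X[test_idx])
    fold_errors.append(mean_squared_error(y[test_idx], preds))

# The CV test error estimate is the mean of the per-fold errors
print("CV MSE estimate:", np.mean(fold_errors))
```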
Stratified K-Fold CV creates the K folds while preserving the percentage of samples for each class.
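A brief sketch showing that each stratified fold keeps roughly the same class proportions as the full dataset (the breast cancer dataset and k = 5 are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold

X, y = load_breast_cancer(return_X_y=True)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(skf.split(X, y)):
    # Each fold preserves (approximately) the overall class proportions
    pos_rate = y[test_idx].mean()
    print(f"Fold {fold}: positive-class share = {pos_rate:.2f}")
```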
Cross-validation is a powerful technique to ensure that your machine learning model generalizes well to unseen data.
The sklearn.model_selection.cross_val_score function further simplifies this process, making it easier to implement cross-validation and obtain consistent evaluation metrics for your models.
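For example, a single call can split, fit, and score the model across all folds (the dataset, model, and cv=5 are illustrative choices):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# cross_val_score handles the splitting, fitting, and scoring in one call
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("Per-fold accuracy:", scores)
print("Mean CV accuracy:", scores.mean())
```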
K-Fold Cross-Validation is typically performed with k = 5 or k = 10, as these values have been empirically shown to yield test error estimates that suffer neither from excessively high bias nor from very high variance.