themis

R package

Extra Recipes Steps for imbalanced data

Published

January 13, 2020


A dataset with an uneven number of cases in each class is said to be unbalanced. Many models perform poorly on unbalanced datasets. A dataset can be balanced by increasing the number of minority cases using SMOTE, BorderlineSMOTE, or ADASYN, or by decreasing the number of majority cases using NearMiss or Tomek link removal.
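
As a rough illustration of the two approaches, the sketch below applies a few of these methods as recipe steps. The data frame `toy` and its columns `x1`, `x2`, and `class` are made up for this example, and each step is left at its default settings; treat it as a minimal sketch rather than a recommended workflow.

```r
library(recipes)
library(themis)

set.seed(1234)

# Hypothetical toy data: 90 majority cases and 10 minority cases
toy <- data.frame(
  x1 = c(rnorm(90), rnorm(10, mean = 2)),
  x2 = c(rnorm(90), rnorm(10, mean = 2)),
  class = factor(rep(c("majority", "minority"), times = c(90, 10)))
)

# Over-sampling: SMOTE adds synthetic minority cases
rec_over <- recipe(class ~ ., data = toy) %>%
  step_smote(class)

# Under-sampling: NearMiss drops majority cases
rec_under <- recipe(class ~ ., data = toy) %>%
  step_nearmiss(class)

# Cleaning: remove cases that form Tomek links
rec_tomek <- recipe(class ~ ., data = toy) %>%
  step_tomek(class)

# The sampling steps act on the training data only, so inspect the
# result by baking the prepped recipe with new_data = NULL
over    <- bake(prep(rec_over), new_data = NULL)
under   <- bake(prep(rec_under), new_data = NULL)
cleaned <- bake(prep(rec_tomek), new_data = NULL)

table(toy$class)
table(over$class)
table(under$class)
table(cleaned$class)
```

BorderlineSMOTE and ADASYN follow the same pattern through `step_bsmote()` and `step_adasyn()`. All of these methods work on distances between cases, so the predictors need to be numeric and free of missing values before the step is applied.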