RUSBoost: Improving classification performance when training data is skewed (Paper)
2008 | Proceedings - International Conference on Pattern Recognition | Citations: 295
Imbalanced Data Classification Techniques; Machine Learning and Data Classification; Data Mining Algorithms and Applications
Abstract
Constructing classification models using skewed training data can be a challenging task. We present RUSBoost, a new algorithm for alleviating the problem of class imbalance. RUSBoost combines data sampling and boosting, providing a simple and efficient method for improving classification performance when training data is imbalanced. In addition to performing favorably when compared to SMOTEBoost (another hybrid sampling/boosting algorithm), RUSBoost is computationally less expensive than SMOTEBoost and results in significantly shorter model training times. This combination of simplicity, speed and performance makes RUSBoost an excellent technique for learning from imbalanced data.
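To make the idea of combining data sampling with boosting concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not specify the boosting variant, the weak learner, or the target class ratio, so the sketch assumes the plain binary AdaBoost weight update, a one-feature decision stump, and a 1:1 undersampling ratio. The names `train_stump`, `rusboost_fit`, and `rusboost_predict` are hypothetical.

```python
import math
import random


def train_stump(X, y, w):
    """Weighted decision stump: pick the (feature, cut, sign) with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        vals = sorted({x[j] for x in X})
        # Candidate cuts are midpoints between consecutive distinct values.
        cuts = [(a + b) / 2 for a, b in zip(vals, vals[1:])] or vals
        for t in cuts:
            for sign in (1, -1):
                err = sum(wi for x, yi, wi in zip(X, y, w)
                          if (sign if x[j] > t else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    _, j, t, sign = best
    return lambda x: sign if x[j] > t else -sign


def rusboost_fit(X, y, n_rounds=10, seed=0):
    """Boosting with random undersampling in every round.

    Assumptions: labels are in {-1, +1}, +1 is the minority class, and the
    majority class is undersampled to a 1:1 ratio.
    """
    rng = random.Random(seed)
    n = len(X)
    w = [1.0 / n] * n
    minority = [i for i in range(n) if y[i] == 1]
    majority = [i for i in range(n) if y[i] == -1]
    ensemble = []
    for _ in range(n_rounds):
        # Random undersampling: keep every minority example, draw an
        # equally sized random subset of the majority class.
        keep = minority + rng.sample(majority, min(len(minority), len(majority)))
        ws = [w[i] for i in keep]
        total = sum(ws)
        h = train_stump([X[i] for i in keep], [y[i] for i in keep],
                        [wi / total for wi in ws])
        # Evaluate the weak learner on the full training set, not the subsample.
        preds = [h(x) for x in X]
        err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        # AdaBoost update: raise weights of misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
        ensemble.append((alpha, h))
    return ensemble


def rusboost_predict(ensemble, x):
    """Sign of the alpha-weighted vote of all weak learners."""
    score = sum(alpha * h(x) for alpha, h in ensemble)
    return 1 if score > 0 else -1
```

Each weak learner is trained on a small balanced subsample, while the boosting weights are still maintained over the full training set. Training on fewer examples per round is consistent with the shorter training times the abstract reports relative to SMOTEBoost, which (as its name suggests) relies on SMOTE oversampling rather than undersampling.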