An Empirical Study of Machine Learning Robustness and Scalability for Imbalanced Tabular Clinical Data in Emergency and Critical Care 文章

ArXiv CS.CV2026-05-27NEWSen作者: Yusuf Brima, Marcellin Atemkeng

摘要

arXiv:2512.21602v3 Announce Type: replace-cross Abstract: Every year, millions of patients pass through emergency departments and intensive care units, where clinicians must make high-stakes decisions under time pressure and uncertainty. Machine learning could support prediction of deterioration, triage, and rare critical outcomes, but clinical data are often severely imbalanced, biasing models toward majority classes and reducing predictive performance. Developing robust and efficient models for imbalanced clinical tabular data therefore remains an important challenge. We evaluated six model families on imbalanced tabular data from the MIMIC-IV-ED and eICU databases: Decision Tree, Random Forest, XGBoost, TabNet, TabICL, and TabPFN v2.6. Trainable models were optimized using Bayesian hyperparameter tuning, while foundation models were evaluated in their pretrained inference regime without task-specific reweighting.