Explainable Machine Learning Models for Mortality Risk Prediction of Crimean-Congo Hemorrhagic Fever in Iraq

Tiba Zaki Abdulhameed; Rabia Al Mamlook; Haider Ali Hantoosh; Hasnaa Imad Al-Shaikhli; Yasir Younis Majeed; Suhad A. Yousif; Tasnim Gharaibeh

doi:10.22401/

Authors

Tiba Zaki Abdulhameed Department of Computer Science, College of Sciences, Al-Nahrain University, Baghdad, Iraq.
Rabia Al Mamlook Department of Business Administration, Trine University, IN, USA.
Haider Ali Hantoosh Public Health Department, Thi-Qar Directorate of Health, Thi-Qar, Iraq.
Hasnaa Imad Al-Shaikhli Department of Computer Science, College of Sciences, Al-Nahrain University, Baghdad, Iraq.
Yasir Younis Majeed Epidemiology, Ministry of Health, Baghdad, Iraq.
Suhad A. Yousif Department of Computer Science, College of Sciences, Al-Nahrain University, Baghdad, Iraq.
Tasnim Gharaibeh Department of Computer Science, Kalamazoo College, Kalamazoo, MI, USA.

DOI:

https://doi.org/10.22401/

Keywords:

Data Analysis, Prediction Accuracy, Outbreak, Feature Importance Analysis, Explainable AI

Abstract

In mid-2022, Iraq experienced a large outbreak of Crimean-Congo hemorrhagic fever (CCHF) that resulted in high mortality. The outbreak began in Thi-Qar province and subsequently spread to other provinces. This research analyzes data collected from Thi-Qar province to investigate the key factors influencing patient mortality risk. To this end, a real-world dataset (HemoIraq24) was collected and analyzed statistically, and explainable patient-outcome prediction models were then developed using several machine learning (ML) algorithms: Decision Trees, Random Forests, Logistic Regression, Gradient Boosting, and K-Nearest Neighbors. The factors contributing most to the predicted outcome were identified using feature importance analysis and SHAP. In addition, a web-based application built on the best-performing ML model was developed to assist healthcare providers in clinical decision-making. The highest accuracy achieved by a baseline prediction model is 89%. Further feature engineering guided by feature importance analysis and SHAP improved prediction accuracy by 3% and the F1 score by up to 8%. The main factor contributing to patient outcome was found to be the number of days spent in the hospital, which suggests that the care provided in hospitals is effective and able to cope with the outbreak. The dataset is available to support future research at: HemoIraq24 Dataset.
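The abstract reports gains in both accuracy and F1 score, and the two can move by different amounts when one outcome class is much rarer than the other. The following is a minimal sketch of how the two metrics are computed from predicted and true labels. The function name `accuracy_and_f1` and the toy label lists are hypothetical illustrations; they are not taken from HemoIraq24 and do not reproduce the paper's results.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1 of the positive class) for two equal-length label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0  # F1 = 2TP / (2TP + FP + FN)
    return accuracy, f1


# Hypothetical toy labels (1 = died, 0 = survived); NOT real patient data.
y_true = [1] * 10 + [0] * 40
# Assumed "baseline" predictions: 4 deaths caught, 6 missed, 1 false alarm.
baseline_pred = [1] * 4 + [0] * 6 + [1] * 1 + [0] * 39
# Assumed "improved" predictions: 6 deaths caught, 4 missed, 1 false alarm.
improved_pred = [1] * 6 + [0] * 4 + [1] * 1 + [0] * 39

base_acc, base_f1 = accuracy_and_f1(y_true, baseline_pred)
new_acc, new_f1 = accuracy_and_f1(y_true, improved_pred)
print(f"baseline: accuracy={base_acc:.3f}, F1={base_f1:.3f}")
print(f"improved: accuracy={new_acc:.3f}, F1={new_f1:.3f}")
```

In this toy case, catching two more positive cases raises accuracy only slightly while F1 rises by a larger margin, which is one reason a study of an imbalanced outcome may report both metrics.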

