Machine Learning for Health Insurance Prediction in Nigeria
Abstract
Health insurance coverage remains critical to healthcare accessibility, particularly in developing nations such as Nigeria. This paper predicts the likelihood of medical insurance coverage among individuals in Nigeria using four prominent machine learning techniques: Logistic Regression, Random Forest, Decision Tree, and Support Vector Machine classifiers. The dataset comprises demographic information, socioeconomic factors, and health-related variables collected from a diverse sample across Nigeria. Four models are trained and evaluated. Logistic Regression is widely accepted for its simplicity and interpretability. Random Forest is a robust ensemble learning algorithm capable of capturing complex relationships within the data. The Decision Tree model is simple to understand and visualize, and the Support Vector Machine model is known for strong classification performance. The performance metrics used to rate the predictive capabilities of the models are Accuracy, Precision, Sensitivity, F-Score, and the area under the Receiver Operating Characteristic curve (AUC-ROC). Additionally, a feature importance analysis is conducted to identify the dominant factors contributing to the prediction of medical insurance coverage in Nigeria. The results give insight into the effectiveness of each machine learning model for forecasting medical insurance coverage, and the key determinants identified can assist policymakers and healthcare stakeholders in devising targeted strategies to improve healthcare access and affordability for the Nigerian people.
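The abstract names its evaluation metrics but does not define them. The minimal sketch below uses the standard textbook definitions for a binary outcome (1 = insured, 0 = uninsured). The function names, labels, scores, and the 0.5 decision threshold are illustrative assumptions, not values or code from the paper.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, sensitivity (recall) and F-score for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # insured, predicted insured
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # uninsured, predicted uninsured
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # uninsured, predicted insured
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # insured, predicted uninsured

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f_score = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f_score": f_score,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores, made up for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Accuracy, precision, sensitivity, and F-score depend on the chosen threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds.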
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.