International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE)
A monthly peer-reviewed and refereed journal
ISSN Online 2278-1021 | ISSN Print 2319-5940 | Since 2012
Volume 14, Issue 12, December 2025

Optimized Ensemble Regression with Explainable AI for Interpretable Healthcare Cost Prediction

Md. Shahidur Rahman Saklain, Antar Sarker, Md. Sadiq Iqbal

DOI: 10.17148/IJARCCE.2025.141249

Abstract: Accurate prediction of healthcare insurance costs supports cost management, policy design, and healthcare planning. This study compares several machine learning (ML) algorithms for forecasting healthcare insurance expenditures and identifies the most suitable model for reliable cost estimation. A publicly available dataset with demographic and lifestyle attributes (age, sex, body mass index (BMI), number of children, smoking status, and region) was used. Six regression models were implemented and compared: Linear Regression (LR), Support Vector Regression (SVR), Random Forest Regressor (RFR), XGBoost Regressor (XGBR), LightGBM (LGBM), and Gradient Boosted Regression (GBR). GBR outperformed the other models, achieving the lowest mean squared error (MSE = 18,153,562.14) and mean absolute error (MAE = 2,270.97), along with the highest coefficient of determination (R² = 0.87), peak signal-to-noise ratio (PSNR = 22.97), and signal-to-noise ratio (SNR = 9.97). Cross-validation supported its robustness, with the tenth fold achieving an R² of 0.91. To make the model interpretable, explainable artificial intelligence (XAI) tools such as SHAP and LIME were applied to the final GBR model; they indicated that “region” and “smoker” were the most influential factors affecting insurance costs. The findings indicate that GBR, combined with explainable AI techniques, offers a robust, transparent, and reliable approach to predicting healthcare insurance costs. Future work will focus on integrating more advanced explainability frameworks and real-world healthcare datasets to further improve reliability and applicability.
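
As a point of reference for the error metrics quoted in the abstract, the sketch below computes MSE, MAE, and R² from their standard definitions. It is an illustration only and not the authors' evaluation code: the function name `regression_metrics` and the sample values are hypothetical, and PSNR and SNR are left out because the abstract does not state how they were defined for regression outputs.

```python
from statistics import fmean


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE and R^2 for paired actual and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Residuals: actual minus predicted.
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mse = fmean([e * e for e in errors])   # mean squared error
    mae = fmean([abs(e) for e in errors])  # mean absolute error

    # R^2 = 1 - SS_res / SS_tot, where SS_tot is measured around the mean
    # of the actual values. It is undefined if all actual values are equal.
    mean_true = fmean(y_true)
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"mse": mse, "mae": mae, "r2": r2}


# Made-up illustrative values, not data from the study.
actual = [1000.0, 2000.0, 3000.0, 4000.0]
predicted = [1100.0, 1900.0, 3200.0, 3800.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

On these made-up values the function returns an MSE of 25,000, an MAE of 150, and an R² of 0.98. Because MSE is expressed in squared units, the reported MSE of 18,153,562.14 corresponds to a root mean squared error of roughly 4,261 in the units of the cost variable, which is easier to compare with the reported MAE of 2,270.97.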

Keywords: Healthcare insurance cost prediction; machine learning; explainable artificial intelligence (XAI); regression models; gradient boosting

How to Cite:

[1] Md. Shahidur Rahman Saklain, Antar Sarker, Md. Sadiq Iqbal, “Optimized Ensemble Regression with Explainable AI for Interpretable Healthcare Cost Prediction,” International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2025.141249