Stacking Ensemble Learning Approach for Non-Alcoholic Fatty Liver Disease Identification: Leveraging Explainable Machine Learning for Enhanced Prediction Models
Abstract
Historically, fatty liver was linked mainly to heavy drinking. Over the last 20 years, however, non-alcoholic fatty liver disease (NAFLD), which affects people who do not consume alcohol, has attracted considerable attention. Fatty liver diseases are now the leading cause of liver disease in industrialized nations. Fatty liver has traditionally been defined as a hepatic fat content of more than 5% of liver weight. Several medical factors, including medications, poor diet, and infections, may lead to fatty infiltration of the liver. Current scientific understanding, however, attributes fatty liver in most individuals either to overweight or obesity or to excessive alcohol consumption. This research proposes a stacked ensemble approach that detects NAFLD efficiently and achieves a classification accuracy of 95.9%. It also compares the proposed method with other basic and boosting machine learning approaches. To make machine learning more trustworthy and reliable for NAFLD screening and diagnosis, we apply explainable AI methods to the ensemble model to identify the features and patterns that most influence its NAFLD predictions.
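As a rough illustration of the kind of pipeline the abstract describes, the following is a minimal sketch of a stacked ensemble followed by a model-agnostic explanation step, written with scikit-learn. It is not the authors' implementation: the synthetic data, the two base learners, the logistic-regression meta-learner, and the use of permutation importance as the explainability method are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption); this is NOT the NAFLD dataset.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical base learners; the abstract does not name the ones used.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
]

# Stacking: the meta-learner is fitted on cross-validated (out-of-fold)
# predictions of the base learners rather than on the raw features.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)
accuracy = stack.score(X_test, y_test)

# Permutation importance (assumed explanation method): shuffle one feature
# at a time on held-out data and measure the resulting drop in score.
result = permutation_importance(
    stack, X_test, y_test, n_repeats=5, random_state=0
)
ranking = result.importances_mean.argsort()[::-1]

print(f"held-out accuracy: {accuracy:.3f}")
for idx in ranking:
    print(f"feature {idx}: mean importance {result.importances_mean[idx]:.4f}")
```

The features with the largest mean importance are the ones whose shuffling degrades held-out accuracy most, which is one simple way to rank the inputs that drive an ensemble's predictions.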
Issue: Articles in Press
Section: Articles
Keywords: Machine learning; Ensemble learning; Stacking models; Explainable AI; Interpretable machine learning; Feature importance.
Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.