Tehran University of Medical Sciences

Science Communicator Platform

Machine Learning Prediction of Metabolic-Associated Fatty Liver Disease in Type 2 Diabetes: Emphasizing Data Imputation and Feature Selection



Authors: Khosravi Z; Barzinpour F; Rabizadeh S; Nakhjavani M; Esteghamati A

Source: PLOS ONE; Published: 2026


Abstract

Metabolic-associated fatty liver disease (MAFLD) is common among patients with type 2 diabetes (T2DM). The coexistence of these conditions increases the risk of MAFLD progression and of diabetes complications. Detecting MAFLD early is challenging because its initial stages are asymptomatic. In this study, we aimed to develop a machine learning model to predict MAFLD in T2DM patients. We conducted a cross-sectional study of 3,654 Iranian T2DM patients using their demographic and laboratory data. Data preprocessing included evaluating several imputation methods against missingness simulated in a complete subset of the dataset. Four feature selection methods were then applied across eight machine learning models to identify the most effective predictive model. The XGBoost classifier without feature selection achieved the best performance, with an accuracy of 80.6% and an area under the receiver operating characteristic curve (AUC) of 88.9%. Features such as alanine aminotransferase (ALT), platelet count (PLT), and vitamin D (VitD) notably influenced predictive performance.

© 2026 Khosravi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
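The imputation evaluation described in the abstract (hiding known values in a complete subset, imputing them, and scoring each method against the hidden truth) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the abstract does not name the imputation methods compared, so column mean and column median stand in for them; values are hidden completely at random at a 20% rate; the error metric is RMSE; and the two-column data set is synthetic, not the study data. All function names are hypothetical.

```python
import math
import random
import statistics


def mask_completely_at_random(rows, rate, rng):
    """Copy `rows`, hide each entry with probability `rate`, and record where."""
    masked = [list(row) for row in rows]
    hidden = []
    for i, row in enumerate(masked):
        for j in range(len(row)):
            if rng.random() < rate:
                row[j] = None
                hidden.append((i, j))
    return masked, hidden


def impute_by_column(rows, summary):
    """Fill each None with `summary` (e.g. mean) of that column's observed values."""
    fills = []
    for j in range(len(rows[0])):
        observed = [row[j] for row in rows if row[j] is not None]
        fills.append(summary(observed))
    return [[fills[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]


def rmse_on_hidden(truth, imputed, hidden):
    """Root-mean-square error measured only on the entries that were hidden."""
    squared = [(truth[i][j] - imputed[i][j]) ** 2 for i, j in hidden]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic "complete subset": one right-skewed column and one roughly
# symmetric column. These are made-up numbers, not patient measurements.
rng = random.Random(0)
complete = [[rng.lognormvariate(3.0, 0.6), rng.gauss(250.0, 60.0)]
            for _ in range(200)]

masked, hidden = mask_completely_at_random(complete, 0.2, rng)

# Stand-in candidate methods (assumed; the study's actual methods are not listed).
candidates = {"mean": statistics.fmean, "median": statistics.median}
scores = {name: rmse_on_hidden(complete, impute_by_column(masked, fn), hidden)
          for name, fn in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: RMSE on hidden entries = {score:.2f}")
print("lowest error:", best)
```

Because the true values behind every hidden entry are known, the methods can be ranked objectively before any of them is applied to the records that are genuinely incomplete. In practice the RMSE would likely be computed per feature or on standardized values, since features measured on different scales otherwise dominate a pooled error.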