A HYBRID MECHANISTIC-MACHINE LEARNING MODEL FOR PREDICTING RECURRENT VENOUS THROMBOEMBOLISM

Bannoud, MA; Martins, TD; Montalvão, SadL; Annichino-Bizzacchi, JM; Filho, RM; Maciel, MRW

doi:10.1016/j.htct.2025.104998

Hematology, Transfusion and Cell Therapy

ISSN: 2531-1379

Hematology, Transfusion and Cell Therapy is a quarterly scientific publication of the Associação Brasileira de Hematologia, Hemoterapia e Terapia Celular (ABHH), Associazione Italo-Brasiliana di Ematologia (AIBE), Eurasian Hematology Oncology Group (EHOG), and Sociedade Brasileira de Oncologia Pediátrica (SOBOPE).

Hematology, Transfusion and Cell Therapy publishes original articles, review articles, and case reports on a range of topics in hematology and hemotherapy. Until 2017, the journal was published under the title Revista Brasileira de Hematologia e Hemoterapia.

ISSN print: 2531-1379
ISSN online: 2531-1387

Published by Elsevier Editora Ltda, Rio de Janeiro, Brazil.

Indexed in:

Scopus, Medline, Directory of Open Access Journals (DOAJ), PubMed Central (PMC), Emerging Sources Citation Index (ESCI), SCImago Journal Rank (SJR), SNIP

Recurrent Venous Thromboembolism (RVTE) remains a significant clinical challenge due to its high morbidity and the limited predictive performance of current risk scores. Two primary approaches have been explored for RVTE prediction: (i) Machine Learning (ML) models, which utilize diverse clinical features but often lack physiological interpretability, and (ii) Mechanistic models of thrombus formation, which offer physiological insight but rarely integrate routine clinical or hematological data.

Objectives

This study presents a hybrid modeling framework that integrates Artificial Neural Networks (ANNs), Ordinary Differential Equations (ODEs), and explainable Artificial Intelligence (XAI) to improve both the accuracy and the interpretability of RVTE risk prediction. By combining clinical and hematological data with patient-specific kinetic parameters derived from a mechanistic model of the coagulation cascade, along with thrombin generation dynamics and clinical outcomes, the approach bridges data-driven learning and physiological understanding.

Material and methods

Data from 235 patients with a first episode of Venous Thromboembolism (VTE) were used, with 164 for model training and 71 for external validation. A hybrid model was developed by integrating a Multilayer Perceptron (MLP) with a mechanistic system of ODEs simulating the coagulation cascade. The MLP mapped 39 clinical and hematological features to eight sensitive kinetic parameters, which were then used to simulate patient-specific thrombin generation and compute the Endogenous Thrombin Potential (ETP). ETP values were used in a classification stage to predict RVTE risk. A total of 192 model configurations, combining different data preprocessing strategies and eight Metaheuristic Optimization Algorithms (MOAs), were evaluated to identify the best-performing approach.
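
The chain described above (features to kinetic parameters, parameters to a simulated thrombin curve, curve to ETP) can be illustrated with a deliberately small sketch. It is not the authors' model: the two-state ODE, the three input features, the two rate constants, all weights, and the names mlp_to_rates, simulate_thrombin, and etp are hypothetical stand-ins for the 39-feature MLP and the eight-parameter coagulation cascade model in the study. ETP is taken here as the area under the thrombin-time curve.

```python
import math

def mlp_to_rates(features, w_hidden, w_out):
    """One-hidden-layer perceptron; softplus keeps the output rates positive.

    ASSUMPTION: toy dimensions and weights, not the study's 39 -> 8 mapping.
    """
    hidden = [math.tanh(sum(w * x for w, x in zip(row, features)))
              for row in w_hidden]
    raw = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    return [math.log1p(math.exp(r)) for r in raw]

def simulate_thrombin(k_act, k_inh, p0=1400.0, t_end=60.0, dt=0.01):
    """Integrate a toy two-state model with classical RK4.

    dP/dt   = -k_act * P               (prothrombin consumed)
    dThr/dt =  k_act * P - k_inh * Thr (thrombin formed, then inhibited)
    ASSUMPTION: concentrations in nM, time in minutes, p0 illustrative.
    """
    def deriv(p, thr):
        return -k_act * p, k_act * p - k_inh * thr

    n_steps = int(round(t_end / dt))
    p, thr = p0, 0.0
    times, thrombin = [0.0], [0.0]
    for i in range(1, n_steps + 1):
        a1, b1 = deriv(p, thr)
        a2, b2 = deriv(p + 0.5 * dt * a1, thr + 0.5 * dt * b1)
        a3, b3 = deriv(p + 0.5 * dt * a2, thr + 0.5 * dt * b2)
        a4, b4 = deriv(p + dt * a3, thr + dt * b3)
        p += dt * (a1 + 2 * a2 + 2 * a3 + a4) / 6.0
        thr += dt * (b1 + 2 * b2 + 2 * b3 + b4) / 6.0
        times.append(i * dt)
        thrombin.append(thr)
    return times, thrombin

def etp(times, thrombin):
    """Area under the thrombin curve (trapezoidal rule), in nM*min."""
    return sum(0.5 * (thrombin[i] + thrombin[i - 1]) * (times[i] - times[i - 1])
               for i in range(1, len(times)))

# Hypothetical patient: three standardized features and made-up weights.
features = [0.5, -1.2, 0.3]
w_hidden = [[0.4, -0.1, 0.2], [-0.3, 0.5, 0.1]]
w_out = [[0.6, -0.2], [0.1, 0.3]]

k_act, k_inh = mlp_to_rates(features, w_hidden, w_out)
times, thrombin = simulate_thrombin(k_act, k_inh)
patient_etp = etp(times, thrombin)
```

For this toy system the area under the thrombin curve over a long horizon approaches p0 / k_inh, which gives a quick sanity check on the integrator. In the study the network weights were tuned by metaheuristic optimizers; that step is omitted here.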

Results

The top-performing model, named ANN-ODE-GWOb, was optimized using the Grey Wolf Optimizer (GWO) with standardized inputs and eight sensitive kinetic parameters. It achieved an Area Under the receiver operating characteristic Curve (AUC) of 0.93, sensitivity of 0.97, and specificity of 0.90 on the full dataset. An ETP threshold of 1,927 nM·min was identified for RVTE classification. On the independent validation set, the model demonstrated strong generalization, with an AUC of 0.896, perfect sensitivity (1.00), and specificity of 0.80. Variables such as age, thrombosis location, residual thrombus, body mass index, anticoagulant duration, sex, thrombosis cause, and presence of the G20210A mutation significantly influenced patient-specific kinetic parameters. Their impact on thrombin dynamics and RVTE risk was consistent with established clinical knowledge.
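
As a rough illustration of how a single ETP cut-off turns simulated values into a risk class and how sensitivity and specificity follow from it, consider the sketch below. Only the threshold of 1,927 nM·min comes from the abstract. The direction of the rule (values at or above the threshold flagged as recurrence) is an assumption, since the abstract does not state it, and the ETP values, outcomes, and function names are hypothetical.

```python
ETP_THRESHOLD = 1927.0  # nM*min, threshold reported in the abstract

def predict_rvte(etp_value, threshold=ETP_THRESHOLD):
    """ASSUMPTION: ETP at or above the threshold is classed as recurrence."""
    return etp_value >= threshold

def sensitivity_specificity(etp_values, outcomes, threshold=ETP_THRESHOLD):
    """Return (sensitivity, specificity) for paired ETP values and outcomes."""
    tp = fn = tn = fp = 0
    for value, had_rvte in zip(etp_values, outcomes):
        predicted = predict_rvte(value, threshold)
        if had_rvte and predicted:
            tp += 1
        elif had_rvte:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Hypothetical cohort of six patients (not data from the study).
etp_values = [2100.0, 1950.0, 1800.0, 1500.0, 2000.0, 1700.0]
outcomes = [True, True, True, False, False, False]

sens, spec = sensitivity_specificity(etp_values, outcomes)
```

Sensitivity is the fraction of patients with recurrence who are flagged, and specificity is the fraction without recurrence who are not flagged. Sweeping the threshold over all observed ETP values and plotting the resulting pairs traces the ROC curve whose area is the AUC.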

Discussion and Conclusion

The proposed model outperformed existing clinical scores and ML-based models for RVTE prediction. Critically, it enabled interpretation of how key variables, including antithrombin activity, sex, thrombus location, cancer, diabetes, D-dimer, and age, influence thrombin generation and RVTE risk. This hybrid ML and mechanistic framework advances RVTE risk prediction by coupling physiological insight with data-driven accuracy. It highlights the potential of integrating ML, ODE-based modeling, and XAI to connect clinical data, thrombin dynamics, and thrombotic outcomes.
