Vol. 46. Issue S4.
HEMO 2024
Pages S259-S260 (October 2024)
REVEALING HIDDEN PATTERNS: HOW UNSUPERVISED MACHINE LEARNING AND MCA PREDICT SURVIVAL IN NODAL PERIPHERAL T-CELL LYMPHOMA PATIENTS
CO Reichert (a,b,c), G Carneiro (a), HF Culler (a,b), FA Freitas (a,b), VG Rocha (a,b), CA Murga-Zamalloa (d), LAPC Lage (a,b), R Olimpio (c), J Pereira (a,d)
a Faculdade de Medicina da Universidade de São Paulo (FMUSP), São Paulo, Brazil
b Hospital das Clínicas da Faculdade de Medicina da Universidade de São Paulo (HCFMUSP), São Paulo, Brazil
c MBA Data Science e Analytics, Universidade de São Paulo (USP), Piracicaba, Brazil
d University of Illinois, Chicago, United States

Unsupervised machine learning techniques are used to uncover patterns in, and relationships among, the variables in a database. Multiple Correspondence Analysis (MCA), an extension of correspondence analysis, can be used to examine associations among categorical variables and their categories. In this study, we evaluated the relationships among the clinical-demographic variables of 154 patients with nodal peripheral T-cell lymphoma (PTCL) to understand how these variables relate to the unfavorable clinical outcome of death.

Methodology

MCA was used to reduce the dimensionality of the real-world database to two dimensions (1 and 2), each of which was subsequently categorized into two groups (group 1 and group 2). Dimension 1, comprising ECOG, IPI, treatment, overall response, remission, and bone marrow transplant, accounted for about 30% of the total observed variance. Survival analysis was then conducted to assess the association between these groups and overall survival (OS) and mortality rate.
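The dimensionality-reduction step can be illustrated with a minimal sketch, assuming MCA is computed on the indicator (one-hot) matrix. The abstract does not state which software or cut-off was used, so everything below is hypothetical: the toy `patients` records, the helper names, and the choice to split dimension 1 at zero to form the two groups.

```python
import math

# Hypothetical toy records; the study's 154-patient database is not reproduced here.
patients = [
    {"ECOG": "low", "IPI": "low", "response": "yes"},
    {"ECOG": "low", "IPI": "low", "response": "yes"},
    {"ECOG": "low", "IPI": "high", "response": "yes"},
    {"ECOG": "low", "IPI": "low", "response": "no"},
    {"ECOG": "high", "IPI": "high", "response": "no"},
    {"ECOG": "high", "IPI": "high", "response": "no"},
    {"ECOG": "high", "IPI": "low", "response": "no"},
    {"ECOG": "high", "IPI": "high", "response": "yes"},
]

def indicator_matrix(rows):
    """One-hot encode every (variable, category) pair."""
    cols = sorted({(var, val) for row in rows for var, val in row.items()})
    return [[1.0 if row[var] == val else 0.0 for var, val in cols] for row in rows]

def standardized_residuals(Z):
    """Correspondence-analysis residuals (P - r c^T) / sqrt(r c^T)."""
    total = sum(sum(row) for row in Z)
    row_mass = [sum(row) / total for row in Z]
    col_mass = [sum(col) / total for col in zip(*Z)]
    S = [[(z / total - r * c) / math.sqrt(r * c) for z, c in zip(row, col_mass)]
         for row, r in zip(Z, row_mass)]
    return S, row_mass

def top_eigenpairs(A, k, iters=500):
    """Power iteration with deflation for a symmetric positive semidefinite matrix."""
    A = [row[:] for row in A]
    n = len(A)
    pairs = []
    for d in range(k):
        v = [math.sin(i + 1.0 + d) for i in range(n)]  # deterministic start vector
        lam = 0.0
        for _ in range(iters):
            w = [sum(a * x for a, x in zip(row, v)) for row in A]
            lam = math.sqrt(sum(x * x for x in w))
            if lam < 1e-12:
                break
            v = [x / lam for x in w]
        pairs.append((lam, v))
        for i in range(n):  # remove the component just found
            for j in range(n):
                A[i][j] -= lam * v[i] * v[j]
    return pairs

def row_coordinates(S, row_mass, vectors):
    """Principal coordinates of the rows (patients) on each retained dimension."""
    return [[sum(s * v for s, v in zip(row, vec)) / math.sqrt(r) for vec in vectors]
            for row, r in zip(S, row_mass)]

Z = indicator_matrix(patients)
S, row_mass = standardized_residuals(Z)
columns = list(zip(*S))
gram = [[sum(a * b for a, b in zip(cj, ck)) for ck in columns] for cj in columns]
pairs = top_eigenpairs(gram, k=2)  # eigenvalues are the principal inertias
total_inertia = sum(gram[j][j] for j in range(len(gram)))
shares = [lam / total_inertia for lam, _ in pairs]
coords = row_coordinates(S, row_mass, [vec for _, vec in pairs])

# Assumption: patients are split into two groups by the sign of dimension 1.
# The sign of an eigenvector is arbitrary, so the group labels are too.
groups = [1 if c[0] < 0 else 2 for c in coords]
print("share of inertia per dimension:", [round(s, 3) for s in shares])
print("group per patient:", groups)
```

In practice a dedicated MCA implementation would normally replace the hand-written power iteration; it is used here only to keep the sketch free of external dependencies.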

Results

The dimension 1 categories, group 1 and group 2, were associated with OS and mortality rate: group 1 had an OS of 5.5 months (95% CI: 2.70–8.40), whereas group 2 had an OS of 277.20 months (95% CI: 91.21–463.12). The mortality rate was 87% (n = 67) in group 1 and 36% (n = 28) in group 2 (p < 0.001). The risk of death in group 1 was 11.11 times that in group 2 (95% CI: 9.75–18.30; β = 2.41; p < 0.001).
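The arithmetic linking these figures can be checked with a short sketch. Two points are assumptions rather than statements from the abstract: a group size of 77 patients each (inferred from 67/0.87 and 28/0.36, which also sums to the 154 patients studied), and the use of a Pearson chi-square test for the mortality comparison, since the test behind p < 0.001 is not named.

```python
import math

# Hazard ratio implied by the reported Cox coefficient (beta = 2.41).
beta = 2.41
hazard_ratio = math.exp(beta)

# Assumption: 77 patients per group, inferred from the reported counts and percentages.
group_size = 77
deaths = {"group 1": 67, "group 2": 28}
rates = {g: d / group_size for g, d in deaths.items()}

# Pearson chi-square (no continuity correction) on the 2x2 table of deaths vs survivors.
# Assumption: the abstract does not say which test produced p < 0.001.
a, b = deaths["group 1"], group_size - deaths["group 1"]
c, d = deaths["group 2"], group_size - deaths["group 2"]
n = a + b + c + d
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 degree of freedom

print(f"exp(beta) = {hazard_ratio:.2f}")
print({g: f"{r:.0%}" for g, r in rates.items()})
print(f"chi-square = {chi2:.2f}, p = {p_value:.2e}")
```

Exponentiating the rounded coefficient gives a hazard ratio of about 11.13, which agrees with the reported 11.11 up to rounding of β.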

Discussion

These findings indicate a significant disparity in survival outcomes based on the clinical-demographic profiles of the patients. Group 1, characterized by poorer clinical indicators, exhibited substantially shorter OS and higher mortality rates. The ability of MCA to effectively reduce dimensionality while preserving the clinical relevance of the variables underscores its utility in identifying key prognostic factors.

Conclusion

The MCA technique was effective in reducing the dimensionality of the database while preserving the clinical characteristics of the variables well enough for use in Cox regression. This approach can aid in the identification of high-risk patient groups and inform treatment strategies aimed at improving clinical outcomes.
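How a binary MCA-derived group enters a Cox regression can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis: the follow-up `data` are invented, the function names are hypothetical, and the model has a single covariate fitted by Newton-Raphson on the Cox partial likelihood.

```python
import math

# Hypothetical follow-up data: (months, event, group), with event = 1 for death
# and group = 1 for the high-risk group. These are not the study's data.
data = [
    (2, 1, 1), (3, 1, 1), (5, 1, 1), (7, 1, 1), (11, 0, 1),
    (4, 1, 0), (8, 0, 0), (12, 1, 0), (15, 0, 0), (20, 0, 0),
]

def score_and_information(beta, data):
    """First derivative and negative second derivative of the Cox partial log-likelihood."""
    score, info = 0.0, 0.0
    for t_i, event_i, x_i in data:
        if not event_i:
            continue  # censored patients contribute only through the risk sets
        risk = [x for t, _, x in data if t >= t_i]  # patients still at risk at t_i
        s0 = sum(math.exp(beta * x) for x in risk)
        s1 = sum(x * math.exp(beta * x) for x in risk)
        s2 = sum(x * x * math.exp(beta * x) for x in risk)
        mean = s1 / s0
        score += x_i - mean
        info += s2 / s0 - mean ** 2
    return score, info

def fit_cox(data, tol=1e-10, max_iter=50):
    """Newton-Raphson estimate of beta and its standard error."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_information(beta, data)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = score_and_information(beta, data)
    return beta, math.sqrt(1.0 / info)

beta_hat, se = fit_cox(data)
hr = math.exp(beta_hat)
ci_low, ci_high = math.exp(beta_hat - 1.96 * se), math.exp(beta_hat + 1.96 * se)
print(f"beta = {beta_hat:.3f}, HR = {hr:.2f}, 95% CI: {ci_low:.2f}-{ci_high:.2f}")
```

The hazard ratio is the exponential of the fitted coefficient, which is how a coefficient such as β = 2.41 translates into a relative risk of death between the two groups.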

Hematology, Transfusion and Cell Therapy