Hämostaseologie 2021; 41(S 01): S1
DOI: 10.1055/s-0041-1728079
Oral Communication
Cancer-associated Thrombophilia

Development and validation of machine learning predictive models applying for cancer-associated deep vein thrombosis: A 1035-sample retrospective cohort study

S Jin 1, D Qin 1, BS Liang 2, LC Zhang 1, XX Wei 1, YJ Wang 1, B Zhuang 1, T Zhang 3, ZP Yang 4, YW Cao 1, SL Jin 1, P Yang 1, B Jiang 5, BQ Rao 4, HP Shi 4, Q Lu 1

1   Division of Medical & Surgical Nursing, School of Nursing, Peking University, Beijing
2   Department of Biostatistics, School of Public Health, Peking University, Beijing
3   Division of Medical & Surgical Nursing, School of Public Health, Peking University, Beijing
4   Department of Gastrointestinal Surgery, Beijing Shijitan Hospital, Capital Medical University/The 9th Clinical Medical College, Peking University, Beijing
5   Department of Medical Oncology, Beijing Shijitan Hospital, Capital Medical University/The 9th Clinical Medical College, Peking University, Beijing

Objective This study aimed to develop machine learning (ML) models for predicting cancer-associated deep vein thrombosis (DVT) and to compare their performance with that of the widely used Khorana Score, with and without D-Dimer.
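For orientation, the Khorana Score used as the comparator is a simple additive score. The sketch below follows the item definitions as commonly published for the original score. The function name, argument names, and cancer-site labels are illustrative assumptions, not taken from this abstract. The abstract does not state how D-Dimer was combined with the score, so that variant is not shown.

```python
# Cancer-site groupings as commonly published for the Khorana Score.
# The string labels are illustrative assumptions.
VERY_HIGH_RISK_SITES = {"stomach", "pancreas"}
HIGH_RISK_SITES = {"lung", "lymphoma", "gynecologic", "bladder", "testicular"}


def khorana_score(site, platelets_e9_per_l, hemoglobin_g_dl, uses_esa,
                  leukocytes_e9_per_l, bmi):
    """Return the additive Khorana Score (0 to 6) for one patient."""
    score = 0
    site = site.lower()
    if site in VERY_HIGH_RISK_SITES:
        score += 2
    elif site in HIGH_RISK_SITES:
        score += 1
    if platelets_e9_per_l >= 350:            # platelet count >= 350 x 10^9/L
        score += 1
    if hemoglobin_g_dl < 10 or uses_esa:     # anemia or erythropoiesis-stimulating agent
        score += 1
    if leukocytes_e9_per_l > 11:             # leukocyte count > 11 x 10^9/L
        score += 1
    if bmi >= 35:                            # body mass index >= 35 kg/m^2
        score += 1
    return score


# Hypothetical patients, for illustration only.
print(khorana_score("pancreas", 400, 9.5, False, 12, 36))
print(khorana_score("breast", 200, 13.0, False, 6, 24))
```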

Material and Methods We retrospectively extracted data on 1035 consecutive cancer patients from a tertiary hospital. Univariable analysis and Lasso regression were used to select important predictors. Model training (training set, 652/725) and hyperparameter tuning (testing set, 73/725) were performed on 70 % (725/1035) of the data using ten-fold cross-validation. The remaining 30 % (310/1035) of the data were used to compare performance on six indicators: the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, accuracy, the Brier Score, and the calibration curve. The comparison covered five ML models, namely linear discriminant analysis (LDA), logistic regression (LR), classification tree (CT), random forest (RF, an ensemble algorithm), and support vector machine (SVM), as well as the Khorana Score, each with and without D-Dimer.
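The numeric indicators named above can be computed directly from held-out labels and predicted probabilities. The following is a minimal sketch, not the authors' code: the function names, the 0.5 classification threshold, and the toy data are assumptions made for illustration. The calibration curve is omitted.

```python
def auc(y_true, y_score):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def confusion_metrics(y_true, y_prob, threshold=0.5):
    """Sensitivity, specificity, and accuracy at an assumed threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


# Toy data for illustration only; 1 = DVT, 0 = no DVT.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.7, 0.3, 0.1, 0.5]

print(auc(y_true, y_prob))
print(brier_score(y_true, y_prob))
print(confusion_metrics(y_true, y_prob))
```

Lower Brier Scores indicate better-calibrated probabilities, whereas higher values are better for the other four quantities.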

Results The proportion of patients with cancer-associated DVT in this study was 22.3 % (231/1035). The five most important predictors were D-Dimer, age, Charlson Comorbidity Index (CCI), length of stay (LOS), and previous VTE history. Five ML models were developed and validated. In the validation set, LDA (AUC: 0.756 and 0.773 for the non-D-Dimer and D-Dimer models, respectively) and LR (AUC: 0.752 and 0.772) performed best, followed by RF (AUC: 0.638 and 0.660), the Khorana Score (AUC: 0.604 and 0.642), CT (AUC: 0.604 and 0.638), and SVM (AUC: 0.593 and 0.665).
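The reported AUCs can be tabulated to make the effect of D-Dimer explicit. The sketch below uses only the values given in this abstract; the dictionary and function names are illustrative.

```python
# AUCs reported in the abstract: (without D-Dimer, with D-Dimer).
reported_auc = {
    "LDA": (0.756, 0.773),
    "LR": (0.752, 0.772),
    "RF": (0.638, 0.660),
    "Khorana Score": (0.604, 0.642),
    "CT": (0.604, 0.638),
    "SVM": (0.593, 0.665),
}


def rank_models(aucs, with_d_dimer):
    """Model names sorted from highest to lowest AUC."""
    idx = 1 if with_d_dimer else 0
    return sorted(aucs, key=lambda name: aucs[name][idx], reverse=True)


def d_dimer_gain(aucs):
    """Change in AUC when D-Dimer is added, rounded to three decimals."""
    return {name: round(w - wo, 3) for name, (wo, w) in aucs.items()}


gains = d_dimer_gain(reported_auc)
for name in rank_models(reported_auc, with_d_dimer=True):
    without, with_dd = reported_auc[name]
    print(f"{name:14s} {without:.3f} -> {with_dd:.3f} (gain {gains[name]:+.3f})")
```

This ranking uses AUC only. The study compared models on six indicators, so an ordering by AUC alone need not match its overall assessment of each model.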

Conclusion This study developed and validated ML predictive models for cancer-associated DVT. Adding D-Dimer improved the performance of all models. LDA and LR outperformed the Khorana Score, but CT, RF, and SVM did not surpass it. A nomogram and a web calculator were built to present the recommended model, the D-Dimer LR model, which helps narrow the gap between model development and clinical application. The web calculator can be found at https://webcalculatorofcancerassociateddvt.shinyapps.io/dynnomapp/. In the future, the performance and clinical application of ML models for cancer-associated DVT might be improved further.



Publication History

Article published online:
18 June 2021

© 2021. Thieme. All rights reserved.

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany