The Journal of Practical Medicine ›› 2024, Vol. 40 ›› Issue (6): 844-849. doi: 10.3969/j.issn.1006-5725.2024.06.019

• Medical Examination and Clinical Diagnosis •

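Establishing a prediction model for lymph node metastasis in gastric cancer from routine laboratory indicators using machine learning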

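Jianliang YAN1,2, Zeyu XIE3, Rongrong JING1, Ming CUI1

  1. Department of Laboratory Medicine, Affiliated Hospital of Nantong University (Nantong, Jiangsu 226006)
  2. Medical School of Nantong University (Nantong, Jiangsu 226006)
  3. Business School, Hohai University (Nanjing 211100)
  • Received: 2023-09-22 Online: 2024-03-25 Published: 2024-04-08
  • Corresponding author: Ming CUI E-mail: wscm163@163.com
  • Funding:
    China Postdoctoral Science Foundation (2020M681688); Nantong Municipal Basic Science Research and Social Livelihood Science and Technology Mandatory Project (MS12021001)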

Abstract:

Objective To establish a prediction model for lymph node metastasis (LNM) in gastric cancer based on routine laboratory indicators, using machine learning algorithms. Methods Data from 741 gastric cancer patients treated at the Affiliated Hospital of Nantong University between January 2020 and January 2022 were collected for model training and testing, and data from 102 gastric cancer patients seen between January and October 2023 were collected for model validation. The XGBoost algorithm was used to calculate indicator importance and to select a set of important indicators from 66 candidates. Five machine learning algorithms (K-Nearest Neighbor, Support Vector Machine, Multilayer Perceptron, Random Forest, and AdaBoost) were trained and compared, and the stability and predictive ability of the model were further assessed on the validation set. Results A set of 9 routine laboratory indicators was selected and used to train the gastric cancer LNM prediction model, named V9. In the comparative experiments, AdaBoost, a boosting-based algorithm, performed best, with area under the curve, F1 score, accuracy, sensitivity, and specificity all between 0.833 and 0.968. Prediction accuracy on the validation set was 94.12%. Conclusion V9 is a gastric cancer LNM prediction model with value as an aid to clinical diagnosis. It can accurately assess patient risk and provide a basis for clinical decision-making.

Key words: gastric cancer, lymph node metastasis, routine laboratory indicators, machine learning
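
The abstract reports five evaluation metrics without defining them. As a minimal sketch (not the authors' code), the standard definitions can be written in plain Python. The names `Confusion`, `metrics`, and `auc` are hypothetical, and the confusion-matrix counts are invented for illustration: they are chosen only so that 96 of 102 validation patients are classified correctly, which is the only whole-number count that rounds to the reported 94.12% accuracy.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts from a binary LNM classifier (positive = metastasis present)."""
    tp: int  # metastasis present, predicted present
    fp: int  # metastasis absent, predicted present
    tn: int  # metastasis absent, predicted absent
    fn: int  # metastasis present, predicted absent


def metrics(c: Confusion) -> dict:
    """Standard definitions; assumes every denominator is non-zero."""
    sensitivity = c.tp / (c.tp + c.fn)   # true positive rate (recall)
    specificity = c.tn / (c.tn + c.fp)   # true negative rate
    precision = c.tp / (c.tp + c.fp)     # positive predictive value
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


def auc(scores, labels):
    """Area under the ROC curve as a pairwise ranking probability.

    Equals the chance that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count half).
    Assumes labels are 0/1 and that both classes are present.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical split of the 102 validation patients (NOT taken from the paper).
example = Confusion(tp=50, fp=3, tn=46, fn=3)
print(round(metrics(example)["accuracy"] * 100, 2))  # 94.12
```

Accuracy alone does not show how the errors divide between missed metastases and false alarms, which is why sensitivity and specificity are reported alongside it.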

CLC number: