Guangxi Medical Journal (广西医学)

Objective To establish entropy minimization LesionMix (EMLM), a semi-supervised method for segmenting lung cancer in CT images, based on LesionMix data augmentation and an entropy minimization loss.

Methods First, the LesionMix data augmentation method was proposed: lesion information was extracted from a small number of annotated CT images and reused, improving the utilization efficiency of the annotated data. Second, a two-stage semi-supervised training strategy was proposed. In the first stage, LesionMix data augmentation was used to enable the model to quickly learn lesion features from the small amount of annotated data. In the second stage, an entropy minimization loss function was used so that the model fit the real data distribution, improving its segmentation performance. Finally, the segmentation performance of EMLM was evaluated on the LIDC-IDRI dataset through comparative and ablation experiments.

Results In the comparative experiments, at annotation ratios of 30% and 10%, the Dice similarity coefficient (DSC) of EMLM was higher than that of all six current state-of-the-art semi-supervised segmentation methods (the URPC, UAMT, RD, MT, AEM, and CPS models); at an annotation ratio of 50%, the DSC of EMLM was higher than that of the MT, RD, CPS, and UAMT models (P<0.05). In the ablation experiments, the DSC of the Baseline model combined with the full EMLM method was higher than that of the Baseline model alone or the Baseline model combined only with the entropy minimization loss (P<0.05), whereas it did not differ significantly from that of the Baseline model combined only with LesionMix data augmentation (P>0.05).

Conclusion For lung cancer lesion segmentation, EMLM can effectively reduce the dependence on annotated data while achieving good segmentation results. LesionMix data augmentation and the entropy minimization loss together reuse lung cancer lesions, improving the utilization efficiency of the annotations, and better fit the real data distribution, yielding better segmentation results and thereby effectively improving the model's ability to segment lung cancer lesions.
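
The three quantities named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation, because the abstract does not give formulas or code. The function names `dice_coefficient`, `entropy_loss`, and `paste_lesion` are hypothetical. `dice_coefficient` uses the standard definition DSC = 2|A ∩ B| / (|A| + |B|), and `entropy_loss` is the usual mean Shannon entropy of per-pixel class probabilities. `paste_lesion` is only an assumed stand-in for lesion reuse (copying masked pixels to the same coordinates in another image); how LesionMix actually extracts and places lesions is not specified here.

```python
import math


def dice_coefficient(pred, target, eps=1e-7):
    """Standard DSC = 2|A ∩ B| / (|A| + |B|) for two flat binary masks."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # eps keeps the ratio defined when both masks are empty.
    return (2.0 * intersection + eps) / (total + eps)


def entropy_loss(pixel_probs, eps=1e-7):
    """Mean Shannon entropy over pixels.

    Each entry of pixel_probs is a list of class probabilities summing to 1.
    Minimizing this value pushes predictions on unlabeled pixels toward
    confident (low-entropy) outputs.
    """
    per_pixel = [
        -sum(p * math.log(max(p, eps)) for p in probs)
        for probs in pixel_probs
    ]
    return sum(per_pixel) / len(per_pixel)


def paste_lesion(src_img, src_mask, dst_img, dst_mask):
    """Hypothetical lesion reuse: copy masked source pixels into the
    destination image at the same coordinates and merge the two masks.
    This is an assumption for illustration, not the LesionMix procedure."""
    mixed_img = [
        [s if m else d for s, m, d in zip(s_row, m_row, d_row)]
        for s_row, m_row, d_row in zip(src_img, src_mask, dst_img)
    ]
    mixed_mask = [
        [1 if (m or dm) else 0 for m, dm in zip(m_row, dm_row)]
        for m_row, dm_row in zip(src_mask, dst_mask)
    ]
    return mixed_img, mixed_mask


if __name__ == "__main__":
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
    print(entropy_loss([[0.5, 0.5], [0.9, 0.1]]))
    print(paste_lesion([[9, 9], [9, 9]], [[0, 1], [0, 0]],
                       [[0, 0], [0, 0]], [[0, 0], [1, 0]]))
```

Under these assumptions, a two-stage schedule like the one described would train first on lesion-augmented labeled images with a supervised loss, then add the entropy term on unlabeled images. A real pipeline would work on tensors rather than nested lists.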