Evaluation of a Mammography-based Deep Learning Model for Breast Cancer Risk Prediction in a Triennial Screening Program.

Rothwell J.W.D.; Rogers P.; Payne N.R.; Huang Y.; Kaggie J.D.; Hickman S.E.; Kilburn-Toppin F.; Kasmai B.; Juette A.; Gilbert F.J.

Authors

Rothwell J.W.D.

Rogers P.

Payne N.R.

Huang Y.

Kaggie J.D.

Hickman S.E.

Kilburn-Toppin F.

Kasmai B.

Juette A.

Gilbert F.J.

Check for full-text access

http://libkey.io/10.1148/radiol.250391

Issue Date

2025

Type

Article

Abstract

Background: Deep learning risk algorithms for personalized breast cancer screening outperform traditional methods in retrospective evaluations, but triennial screening assessments are lacking.

Purpose: To evaluate the predictive ability of 3-year risk scores generated by a deep learning algorithm (Mirai) to identify women who developed interval cancers (ICs) in the UK breast screening program, which invites women aged 50-70 years for triennial mammography.

Materials and Methods: For this retrospective study, Mirai processed digital screening mammograms with negative results collected from a 3-year cohort (January 2014 to December 2016) across two sites and two primary mammography systems. Exclusions included screen-detected cancers (baseline and next round), implants, and nonstandard views. The reference standard was no cancer diagnosis within 40 months of negative screening, confirmed histopathologically. The primary objective was predicting ICs at 1-, 2-, and 3-year time points after baseline screening. Secondary objectives were assessing predictions across age quartiles and Breast Imaging Reporting and Data System (BI-RADS) breast densities. Areas under the receiver operating characteristic curve (AUCs) and true positives (ICs) were calculated across operating thresholds. Risk score distributions were compared with the Mann-Whitney U test, and AUCs were compared with the DeLong test.

Results: Analysis included 134 217 examinations from the same number of women (mean age, 59.1 years ± 7.9 [SD]), including 524 ICs. There was no evidence of performance differences among 1-, 2-, and 3-year IC predictions (P ≥ .63), age quartiles (P ≥ .73), or breast densities (P ≥ .99). Overall AUCs were 0.72 (95% CI: 0.65, 0.78), 0.67 (95% CI: 0.64, 0.70), and 0.67 (95% CI: 0.65, 0.70) for 1-, 2-, and 3-year IC predictions, respectively. C indexes for age quartiles were 0.67 (95% CI: 0.62, 0.71) for age younger than 52 years, 0.70 (95% CI: 0.65, 0.75) for age 52-58 years, 0.71 (95% CI: 0.67, 0.75) for age 59-65 years, and 0.71 (95% CI: 0.67, 0.75) for age of 66 years and older. C indexes for BI-RADS categories a, b, c, and d were 0.70 (95% CI: 0.62, 0.78), 0.69 (95% CI: 0.65, 0.73), 0.68 (95% CI: 0.64, 0.71), and 0.67 (95% CI: 0.62, 0.73), respectively. Three-year risk scores retrospectively predicted 3.6% (19 of 524), 14.5% (76 of 524), 26.1% (137 of 524), and 42.4% (222 of 524) of ICs for women assigned the highest 1%, 5%, 10%, and 20% of scores.

Conclusion: Mirai could identify women for more frequent screening or additional imaging, detecting ICs earlier.

© RSNA, 2025. Supplemental material is available for this article. See also the editorial by Philpotts in this issue.
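The threshold results in the abstract reduce to simple arithmetic, which the following minimal Python sketch reproduces from the reported counts. It is illustrative only and is not the study's analysis code. The "lift" column is an added assumption, not a figure from the article: it treats each top-k% group as containing exactly k% of the women, so that selecting women at random would be expected to capture about k% of ICs. The names `capture_summary`, `TOTAL_ICS`, and `ICS_CAPTURED` are hypothetical.

```python
# Counts reported in the abstract: interval cancers (ICs) among women whose
# 3-year risk score fell in the highest 1%, 5%, 10%, and 20% of scores.
TOTAL_ICS = 524
ICS_CAPTURED = {1: 19, 5: 76, 10: 137, 20: 222}  # top-k% -> ICs in that group


def capture_summary(total, captured):
    """Return (top_pct, n_ics, share_pct, lift) for each threshold.

    share_pct: percentage of all ICs that fall in the top-k% group.
    lift: share_pct / k, i.e. enrichment over random selection
          (assumption: the group holds exactly k% of the women).
    """
    rows = []
    for pct, n in sorted(captured.items()):
        share = 100.0 * n / total
        lift = share / pct
        rows.append((pct, n, round(share, 1), round(lift, 2)))
    return rows


for pct, n, share, lift in capture_summary(TOTAL_ICS, ICS_CAPTURED):
    print(f"top {pct:>2}%: {n:>3}/{TOTAL_ICS} ICs = {share:.1f}% (lift {lift:.2f}x)")
```

Under that assumption, the enrichment narrows as the threshold widens, from roughly 3.6 times the random expectation in the top 1% to roughly 2.1 times in the top 20%.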

Journal

Radiology

Volume

317

URI

https://hdl.handle.net/20.500.14753/509

Collections

Cancer
