Advancing Multimodal AI for Breast Cancer Detection, Interpretation, and Risk Profiling

Overview

We are developing multimodal AI systems to better predict and detect breast cancer.

Breast cancer remains the most prevalent cancer among women worldwide and a leading cause of cancer-related mortality. Early detection through imaging-based screening has dramatically improved outcomes, yet the process remains limited by human variability, false positives, and the challenge of identifying subtle, high-risk features across modalities such as mammography and ultrasound. Our research aims to address these limitations through the development of multimodal artificial intelligence systems that enhance both cancer detection and risk prediction.

To accommodate the unique challenges of high-resolution breast imaging, we have designed hierarchical and weakly supervised AI models capable of learning from image-level labels while producing lesion-level interpretability maps. These systems matched radiologist performance in detecting malignant findings on mammography and ultrasound while enabling a significant reduction in false positives during reader studies. Beyond supervised training, we advanced self-supervised and multiple-instance learning (MIL) methods to improve instance-level representations when only coarse labels are available, a common scenario in clinical datasets.
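As a rough illustration of the multiple-instance setting described above, the sketch below turns per-instance scores (for example, scores for image patches or individual images in an exam) into a single bag-level score that can be trained against a coarse label. The function names and the two pooling rules (max and top-k mean) are illustrative assumptions, not the specific aggregation used in our published models.

```python
from typing import Sequence


def bag_probability_max(instance_probs: Sequence[float]) -> float:
    """Max pooling: the bag is as suspicious as its most suspicious instance."""
    if not instance_probs:
        raise ValueError("a bag needs at least one instance")
    return max(instance_probs)


def bag_probability_top_k(instance_probs: Sequence[float], k: int) -> float:
    """Top-k mean pooling: average the k highest instance scores.

    k = 1 reduces to max pooling; k = len(instance_probs) is plain averaging.
    """
    if not instance_probs:
        raise ValueError("a bag needs at least one instance")
    if k < 1:
        raise ValueError("k must be at least 1")
    top = sorted(instance_probs, reverse=True)[:k]
    return sum(top) / len(top)


# Hypothetical instance scores for one bag (e.g., four patches of one image).
scores = [0.1, 0.9, 0.2, 0.5]
bag_max = bag_probability_max(scores)
bag_top2 = bag_probability_top_k(scores, k=2)
```

Because only the bag-level score is compared with the label, the instance scores are learned indirectly, which is what allows them to be read as a localization signal afterwards.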

Our models integrate multi-view and multimodal information, combining mammography, ultrasound, and density features to enhance predictive accuracy and generalizability across millions of clinical images. This line of research not only advances technical understanding of AI architectures for medical imaging but also provides clinically actionable tools that improve diagnostic consistency, reduce unnecessary biopsies, and enable data-driven breast cancer risk profiling, paving the way to precision screening and personalized prevention strategies.

Keywords

Breast Cancer
AI
Mammography

**Figure 1.** (a) Ultrasound (US) images were pre-processed to extract the breast laterality (i.e., left or right breast) and to include only the part of the image which shows the breast (cropping out the image periphery which typically contains textual metadata about the patient and US acquisition technique). (b) For each breast, we assigned a cancer label using the recorded pathology reports for the respective patient within a time interval ranging from 30 days before to 120 days after the US examination. We applied additional filtering on the internal test set to ensure that cancers in positive exams are visible in the US images and negative exams have at least one cancer-negative follow-up. (c) The AI system processes all US images acquired from one breast to compute probabilistic predictions for the presence of malignant lesions. The AI system also generates saliency maps that indicate the informative regions in each image. (d) We evaluated the system on an internal test set (AUROC: 0.976, 95% CI: 0.972, 0.980, n = 79,156 breasts) and an external test set (AUROC: 0.927, 95% CI: 0.907, 0.959, n = 780 images). (e) In a reader study consisting of 663 exams (n = 1024 breasts), we showed that the AI system can improve the specificity and positive predictive value (PPV) for 10 attending radiologists while maintaining the same level of sensitivity and negative predictive value (NPV).
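The labeling rule in panel (b) can be sketched as a simple date-window check. The `PathologyReport` record, the `breast_label` function, and the choice to treat both window endpoints as inclusive are assumptions made for illustration; the sketch also omits the additional test-set filtering described in the caption.

```python
from datetime import date, timedelta
from typing import Iterable, NamedTuple


class PathologyReport(NamedTuple):
    """Hypothetical minimal pathology record for one breast."""
    laterality: str      # "L" or "R"
    report_date: date
    malignant: bool


def breast_label(
    exam_date: date,
    laterality: str,
    reports: Iterable[PathologyReport],
    days_before: int = 30,
    days_after: int = 120,
) -> int:
    """Return 1 if a malignant pathology report for this breast falls within
    [exam_date - days_before, exam_date + days_after], otherwise 0.

    Endpoints are treated as inclusive (an assumption).
    """
    start = exam_date - timedelta(days=days_before)
    end = exam_date + timedelta(days=days_after)
    for report in reports:
        if (
            report.laterality == laterality
            and report.malignant
            and start <= report.report_date <= end
        ):
            return 1
    return 0


# Hypothetical example: one malignant left-breast report 30 days after the exam.
exam = date(2020, 1, 31)
reports = [PathologyReport("L", date(2020, 3, 1), True)]
left_label = breast_label(exam, "L", reports)
right_label = breast_label(exam, "R", reports)
```

Labels are assigned per breast rather than per exam, so the same examination can contribute one positive and one negative example.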

**Figure 2.** Overall architecture of the Globally-Aware Multiple Instance Classifier (GMIC). The model first employs a computationally efficient global module to generate a saliency map that captures global context and highlights regions of interest (ROIs) on the mammogram that may correspond to breast cancers. A local module then processes only these ROIs to extract fine-grained spatial details. Finally, a fusion module integrates both global and local information to produce the final cancer diagnosis. The entire model is trained end-to-end using only image-level binary labels indicating the presence of breast cancer, yet it can accurately localize suspicious regions.
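To make the hand-off between the global and local modules more concrete, the sketch below picks a fixed number of rectangular ROIs from a saliency map using a greedy rule: take the window with the highest total saliency, mask it out, and repeat. The function name `select_rois`, the list-of-lists saliency representation, and the masking rule are illustrative assumptions; the caption does not specify the selection procedure, and the local and fusion modules are not modeled here.

```python
from typing import List, Tuple

Grid = List[List[float]]


def select_rois(
    saliency: Grid, roi_h: int, roi_w: int, num_rois: int
) -> List[Tuple[int, int]]:
    """Greedily choose top-left corners of ROI windows with the highest
    summed saliency. Assumes non-negative saliency values.
    """
    n_rows, n_cols = len(saliency), len(saliency[0])
    if roi_h > n_rows or roi_w > n_cols:
        raise ValueError("ROI window is larger than the saliency map")

    sal = [row[:] for row in saliency]  # work on a copy; the input is untouched
    rois: List[Tuple[int, int]] = []
    for _ in range(num_rois):
        best_score, best_pos = -1.0, (0, 0)
        for r in range(n_rows - roi_h + 1):
            for c in range(n_cols - roi_w + 1):
                score = sum(
                    sal[i][j]
                    for i in range(r, r + roi_h)
                    for j in range(c, c + roi_w)
                )
                if score > best_score:
                    best_score, best_pos = score, (r, c)
        rois.append(best_pos)
        # Zero out the chosen window so later picks favor new regions.
        r, c = best_pos
        for i in range(r, r + roi_h):
            for j in range(c, c + roi_w):
                sal[i][j] = 0.0
    return rois


# Hypothetical 4x4 saliency map with a strong central response.
saliency_map = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.8, 0.0],
    [0.0, 0.7, 0.6, 0.0],
    [0.5, 0.0, 0.0, 0.0],
]
rois = select_rois(saliency_map, roi_h=2, roi_w=2, num_rois=2)
```

In a full pipeline, each selected corner would be mapped back to full-resolution image coordinates so that the local module sees only those crops, which is where the computational savings come from.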

Project Team

External Collaborators

Krzysztof J. Geras, PhD (Project Lead), NYU Langone Health
Laura Heacock, MD, NYU Langone Health
Alana A. Lewin, MD, NYU Langone Health
Chang Yu, PhD, NYU Langone Health
Yanqi Xu, NYU Center for Data Science

Publications

Shen Y, Shamout FE, Oliver JR, et al. Artificial intelligence system reduces false-positive findings in the interpretation of breast ultrasound exams. Nat Commun. 2021;12(1):5645. Published 2021 Sep 24. doi:10.1038/s41467-021-26023-2
Shen Y, Wu N, Phang J, et al. An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization. Med Image Anal. 2021;68:101908. doi:10.1016/j.media.2020.101908
Shen Y, Wu N, Phang J, et al. Globally-Aware Multiple Instance Classifier for Breast Cancer Screening. Mach Learn Med Imaging. 2019;11861:18-26. doi:10.1007/978-3-030-32692-0_3
Shen Y, Park J, Yeung F, Goldberg E, Heacock L, Shamout F, Geras KJ. Leveraging transformers to improve breast cancer classification and risk assessment with multi-modal and longitudinal data. arXiv. Published November 6, 2023. doi:10.48550/arXiv.2311.03217
Xu Y, Shen Y, Fernandez-Granda C, Heacock L, Geras KJ. Understanding differences in applying DETR to natural and medical images. Mach Learn Biomed Imaging. 2025;3:152-170. doi:10.59275/j.melba.2025-g137
Liu K, Zhu W, Shen Y, Liu S, Razavian N, Geras KJ, Fernandez-Granda C. Multiple instance learning via iterative self-paced supervised contrastive learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023:3355-3365.
Wu N, Huang Z, Shen Y, et al. Reducing False-Positive Biopsies using Deep Neural Networks that Utilize both Local and Global Image Context of Screening Mammograms. J Digit Imaging. 2021;34(6):1414-1423. doi:10.1007/s10278-021-00530-6
Schaffter T, Buist DSM, Lee CI, et al. Evaluation of Combined Artificial Intelligence and Radiologist Assessment to Interpret Screening Mammograms. JAMA Netw Open. 2020;3(3):e200265. Published 2020 Mar 2. doi:10.1001/jamanetworkopen.2020.0265
Wu N, Phang J, Park J, et al. Deep Neural Networks Improve Radiologists’ Performance in Breast Cancer Screening. IEEE Trans Med Imaging. 2020;39(4):1184-1194. doi:10.1109/TMI.2019.2945514
Wu N, Geras KJ, Shen Y, Su J, Kim SG, Kim E, Wolfson S, Moy L, Cho K. Breast density classification with deep convolutional neural networks. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE; 2018:6682-6686.
Chen Y, Yang H, Pan H, Siddiqui F, Verdone A, Zhang Q, Chopra S, Zhao C, Shen Y. BURExtract-Llama: An LLM for clinical concept extraction in breast ultrasound reports. In: Proceedings of the 1st International Workshop on Multimedia Computing for Health and Medicine. 2024:53-58.
Liu K, Shen Y, Wu N, Chłędowski J, Fernandez-Granda C, Geras KJ. Weakly-supervised High-resolution Segmentation of Mammography Images for Breast Cancer Diagnosis. Proc Mach Learn Res. 2021;143:268-285.
Zeng KG, Dutt T, Witowski J, et al. Improving Information Extraction from Pathology Reports using Named Entity Recognition. Preprint. Res Sq. 2023;rs.3.rs-3035772. Published 2023 Jul 3. doi:10.21203/rs.3.rs-3035772/v1

Acknowledgements

We acknowledge support from the following grants: NIH 1R01EB036530, Milstein Pilot Project Award 2025, Manhasset Women’s Coalition Against Breast Cancer Research Fund 2024, Shifrin-Myer Breast Cancer Discover Award 2024.