Deep learning to estimate human epidermal growth factor receptor 2 status from hematoxylin and eosin-stained breast tissue images

Published in Journal of Pathology Informatics, 2019

Several therapeutically important mutations in cancers are economically detected using immunohistochemistry (IHC), which highlights the overexpression of specific antigens associated with the mutation. However, IHC panels can be imprecise and relatively expensive in low-income settings. On the other hand, although hematoxylin and eosin (H&E) staining, used to visualize general tissue morphology, is routine and low-cost, it does not highlight any specific antigen or mutation. Using the human epidermal growth factor receptor 2 (HER2) mutation in breast cancer as an example, we strengthen the case for cost-effective detection and screening of HER2 protein overexpression in H&E-stained tissue. We use computational methods that reliably detect subtle morphological changes associated with the overexpression of mutation-specific proteins directly from H&E images. Our pipeline achieved an area under the curve (AUC) of 0.82 (confidence interval [CI]: 0.65–0.98) on held-out cases and an AUC of 0.76 (CI: 0.61–0.89) on an independent dataset from TCGA. We also demonstrate the region-level correspondence of HER2 overexpression between a patient’s IHC and H&E serial sections. Our work strengthens the case for automatically quantifying the overexpression of mutation-specific proteins in H&E-stained digital pathology, and it highlights the importance of multi-stage machine learning pipelines for added robustness and interpretability.
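The abstract reports each AUC together with a confidence interval. The abstract does not say how those intervals were computed, so the following is only a minimal sketch of one common approach, a case-level percentile bootstrap, and not necessarily the authors' method. The function names, the 95% level, the number of resamples, and the example labels and scores are all hypothetical and chosen for illustration.

```python
import random

def auc(labels, scores):
    """AUC via the Mann-Whitney statistic: the fraction of (positive, negative)
    pairs in which the positive case scores higher (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Point estimate plus a percentile bootstrap CI (assumed method),
    resampling whole cases with replacement."""
    point = auc(labels, scores)  # also validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample with a single class has no defined AUC
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return point, lower, upper

# Hypothetical case-level data: 1 = HER2-positive by IHC, 0 = HER2-negative.
example_labels = [1, 1, 1, 1, 0, 0, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
point, lower, upper = bootstrap_auc_ci(example_labels, example_scores)
print(f"AUC = {point:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

With small test sets, an interval computed this way tends to be wide, which is consistent with the broad ranges quoted above.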

Citation

Anand, D., Kurian, N. C., Dhage, S., Kumar, N., Rane, S., Gann, P. H., & Sethi, A. (2020). Deep learning to estimate human epidermal growth factor receptor 2 status from hematoxylin and eosin-stained breast tissue images. Journal of Pathology Informatics, 11.

Paper Link