Genomic alterations are associated with response to aromatase inhibitor therapy for ER-positive postmenopausal ductal carcinoma in situ: (CALGB 40903, Alliance)
Breast Cancer Research volume 27, Article number: 26 (2025)
Abstract
Purpose
CALGB 40903 (Alliance) was a phase II single arm multicenter trial conducted in postmenopausal patients diagnosed with estrogen-receptor (ER) positive breast ductal carcinoma in situ (DCIS) without invasion. Patients were treated with the aromatase inhibitor (AI) letrozole for 6 months prior to surgery with change in magnetic resonance imaging (MRI) enhancement volume compared to baseline as the primary endpoint. In the current study, we performed sequence analysis of pre- and post-treatment specimens to determine gene expression and DNA copy number parameters associated with treatment and response.
Experimental design
Paraffin sections from pretreatment biopsies and post-treatment surgical specimens were evaluated for presence of DCIS. Proliferation based on KI67 staining was quantified by a study pathologist. Macrodissection of the DCIS components from thin sections was the source of RNA and DNA. Whole-transcriptome RNA and shallow whole-genome DNA sequencing were performed. PAM50 analysis to assign intrinsic subtypes with associated probability of class membership was performed. Differential gene expression comparing responders versus non-responders and pre- versus post-treatment specimens was performed using a two-tiered approach based on candidate genes and a whole genome survey with appropriate multiple testing corrections.
Results
Based on availability of specimens and presence of DCIS component, 29 patients (from the 70 who completed the treatment trial) were included in the final data set, including five who had a pathologic complete response (pCR). Response to treatment was qualified categorically based on a threshold of 10% KI67 in the post-treatment surgical specimen or pCR. Based on this criterion, six of the 29 DCIS were considered non-responders (> 10% KI67) and five subjects with pCR were assigned to the responder group. No standard clinical variables were associated with response. On the basis of gene expression analysis, 19 of the pre-treatment samples were classified as luminal A, all of which were classified as responders. PAM50 classification of the other ten pre-treatment samples included luminal B, HER2, basal, and normal-like, six of which were non-responders. PAM50 class membership shifted from baseline to post-treatment in eight cases, most often from luminal A to normal-like (five cases). Selected genes associated with estrogen receptor levels in invasive breast cancer were higher in AI responsive tumors. AI treatment resulted in reductions in estrogen and proliferation related genes.
Conclusions
Letrozole treatment produced an effective growth response, particularly in DCIS initially classified as luminal A. Study inclusion criteria of DCIS with at least 1% ER positive cells resulted in the inclusion of other subtypes that failed to respond. Treatment also induced both minor and major changes in intrinsic subtype based on PAM50 probabilities. Overall, these data indicate that response to AI treatment in ER(+) DCIS is variable and analogous to that observed in invasive breast cancers.
Translational relevance
Treatment for breast DCIS ranges from active surveillance to mastectomy, often combined with adjuvant endocrine therapy. The work presented here based on a unique neoadjuvant trial provides direct information on hormone therapy responsiveness of this disease and further couples the biology of invasive breast cancer to its non-obligate precursor.
Trial registration: ClinicalTrials.gov Identifier: NCT01439711
Introduction
Ductal carcinoma in situ (DCIS) is a proliferative condition of the breast epithelium, commonly diagnosed on the basis of microcalcifications detected by screening mammography. As it is one of the conditions that carries a substantially elevated risk for subsequent invasive cancer, recommendations for treatment include surgery, radiotherapy, and for DCIS that express hormone receptors, endocrine therapy. While pre-operative neoadjuvant endocrine therapy (NET) is not standard of care for DCIS, in the setting of invasive cancers, NET has been used to treat hormone receptor positive invasive cancers prior to surgery and results in increased breast conservation with up to a 20% complete pathologic response [1]. Pre-operative therapy in general provides unparalleled insight into the response of individual cancers. For endocrine therapy of invasive breast cancer, pre-operative response is also related to relapse-free survival. The pre-operative endocrine prognostic index (PEPI) score, developed and validated on data from two NET trials for invasive breast cancer, integrates a series of 4 variables obtained from the post-treatment surgical specimen [2, 3]. One of these variables is the proliferation rate measured by KI67 with cancers that retain a high percentage of cycling cells after endocrine therapy more likely to recur. This can be valuable knowledge for guiding the application of cytotoxic adjuvant therapy for cancers that respond poorly to endocrine therapy alone.
Cancer and Leukemia Group B (CALGB) 40903 was a phase II single arm multicenter trial conducted in postmenopausal patients diagnosed on core biopsy with estrogen receptor (ER) positive DCIS without invasion [4]. Patients were treated with the aromatase inhibitor (AI) letrozole for 6 months prior to surgery with the primary endpoint being change in magnetic resonance imaging (MRI) enhancement volume compared to baseline. In addition to imaging, tissue-based biomarkers were also examined from baseline biopsies and surgical specimens. Seventy women completed the intervention per the protocol and 9 (15%) of the subjects experienced a complete pathologic response, a rate that is comparable to invasive cancers treated with neoadjuvant endocrine therapy. Also similar to invasive cancer, immunohistochemical detection of ER, progesterone receptor (PR), and KI67 demonstrated significantly reduced levels after letrozole treatment. In the current study, we examined specimens from this trial to determine gene expression and copy number events that were associated with treatment and response.
Materials and methods
Subjects and tissue specimens
The present study was performed using demographic, clinical and genomic data from participants enrolled in CALGB 40903 (NCT01439711), a phase II open-label single-arm multicenter cooperative group trial conducted for postmenopausal patients diagnosed with ER-positive DCIS. All clinical endpoints and measures were finalized by February 2018. CALGB is now part of the Alliance for Clinical Trials in Oncology. The parent trial was reviewed and supported by the National Institutes of Health, Division of Cancer Prevention and was approved by the institutional review board at each participating site. Enrollment for this trial occurred between August 1, 2012 and February 1, 2016. Each participant signed an IRB-approved protocol-specific informed consent document in accordance with federal and institutional guidelines which included consent for secondary analysis. All eligible patients underwent baseline radiological assessment by mammogram and breast MRI followed by 6 months of preoperative therapy with a daily oral dose of 2.5 mg letrozole. Follow-up breast MRI was obtained after 3 and 6 months of treatment. Patients with radiologic progression at 3 months underwent surgery and discontinued letrozole treatment, while those without progression continued treatment for the remaining 3 months followed by definitive surgery. Surgery type was decided according to patient election with surgeon recommendations based on the National Comprehensive Cancer Network treatment guidelines for DCIS. Additional details about the clinical study design and results have been reported [4]. One of the study pathologists (A.H.) reviewed all samples and marked the region of interest (ROI) for macrodissection to isolate nucleic acids for the current study.
Proliferation analysis
The proliferation associated protein KI67 was detected on a representative section from the pre- and post-treatment samples at Duke University. The percentage of cells positive for KI67 was scored by a study pathologist (YYC) in three 20X fields containing DCIS and was averaged to arrive at the final proliferation index for the pre- and post-treatment specimens.
DNA and RNA sequencing
Genomic analyses were conducted at Duke University. Regions containing DCIS were macrodissected and DNA and RNA were extracted for sequencing using the AllPrep kit for FFPE tissue (Qiagen) according to the manufacturer’s instructions. DNA and RNA quantification was performed using a Qubit Fluorometer (Invitrogen). Low-pass whole-genome sequencing to measure copy number variation was performed on DNA sheared to ~ 500 bp after library preparation with the KAPA HyperPrep Kit (Roche). Libraries were pooled to equimolar concentrations and were sequenced on the Illumina NovaSeq 6000 S-Prime flow cell at 2.5 nM concentration producing 150 bp paired end reads.
RNA-seq using 150 bp paired-end reads from 10 ng of total RNA was performed to evaluate global expression patterns using the Sigma-Aldrich SeqPlex kit. Double-stranded cDNA was prepared using the SeqPlex RNA Amplification Kit (Sigma) through 25 cycles. After sonic fragmentation, cDNA was blunt ended, had an “A” added to the 3’ ends, and then ligated to Illumina sequencing adapters. Ligated fragments were amplified for 12 cycles using primers incorporating unique dual index tag and sequenced on the Illumina NovaSeq 6000 platform.
AI response classification
For the present analyses, study participants were divided into “responders” or “non-responders” using a threshold of 10% KI67 after treatment. This value was chosen based on the pre-operative endocrine prognostic index (PEPI) [2] that has been employed for assessing endocrine therapy response for invasive breast cancers. Patients who experienced a pathologic complete response (pCR) had no post-treatment DCIS remaining by definition and these were classified as responders.
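The classification rule described above can be sketched in a few lines. This is an illustrative sketch only: the function names, the input layout, and the handling of a value exactly equal to 10% are assumptions, since the text specifies only that > 10% KI67 defines non-response and that pCR cases are counted as responders.

```python
from statistics import mean
from typing import Optional, Sequence

KI67_THRESHOLD = 10.0  # percent; threshold stated in the text


def proliferation_index(field_percentages: Sequence[float]) -> float:
    """Average the KI67-positive percentages scored in the 20X fields."""
    return mean(field_percentages)


def classify_response(post_ki67: Optional[float], pcr: bool) -> str:
    """Return 'responder' or 'non-responder' for one case.

    pCR cases have no residual DCIS to score, so they are responders by definition.
    Assumption: a value exactly equal to the threshold counts as a responder.
    """
    if pcr:
        return "responder"
    if post_ki67 is None:
        raise ValueError("post-treatment KI67 is required when there is no pCR")
    return "non-responder" if post_ki67 > KI67_THRESHOLD else "responder"


# Hypothetical cases (not study data): (post-treatment field scores, pCR flag)
cases = {
    "case_1": ([2.0, 4.0, 6.0], False),
    "case_2": ([20.0, 25.0, 30.0], False),
    "case_3": (None, True),
}
labels = {
    case_id: classify_response(
        proliferation_index(fields) if fields is not None else None, pcr
    )
    for case_id, (fields, pcr) in cases.items()
}
print(labels)
```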
Statistics and bioinformatics considerations
Clinical variables: Clinical variables were tested for association with KI67 response using the Mann–Whitney U test and the chi-square test for continuous and categorical variables, respectively. For categorical variables, “unknown” or “indeterminate” values were excluded. For nuclear grade, grade 3 was tested versus grades 1 or 2 combined.
RNA-seq: The quality of the raw sequencing reads, stored in demultiplexed FASTQ files, was assessed and reported using FastQC (v0.12.1) and MultiQC (v1.14) [5]. Quantitative data from these reports were tabulated for review. Sequences with adapter contamination and low-quality sequences were cleaned using Trimmomatic (v0.39) [6]. The sequencing quality was reassessed following trimming.
The raw sequencing reads were aligned to the reference genome using the STAR (v2.7.10b) [7] aligner and mapped to genomic features, including genes and exons, using STAR’s built-in module. The human reference sequence (GRCh38) and annotation GTF file (hg38) were obtained from GENCODE (v38.13) [8, 9]. The read-level mapping quality was evaluated through STAR output, including fraction of reads mapped to gene regions, ambiguous regions, non-feature regions, or multiple loci. Likewise, the base-level mapping quality was assessed through CollectRnaSeqMetrics from Picard Toolkits (v3.0.0: http://broadinstitute.github.io/picard). This metric counts the number of bases mapped to coding, UTR, intergenic, intronic, and ribosomal regions. A detailed report of the quality control metrics is included as part of the Supplementary Material.
Sequence counts were normalized using a variance-stabilizing transformation. Differential expression analyses were performed within the framework of a negative binomial model using R (v4.2.2) [10] and its extension package, DESeq2 (v1.38.3) [11]. To test for changes in pre- vs. post-treatment expression, a patient identifier was included as a covariate and the parameter minReplicatesForReplace was set to “Inf” to prevent outlier replacements. To minimize spurious results from low-expression genes, genes without a minimum of five normalized counts in at least five samples were prefiltered from the test of differential expression between pre- and post-treatment. This prefiltering was relaxed to exclude genes without a minimum of five normalized counts in at least three samples to test differential expression between responder and non-responder patients, as the design was less complex. For both tests, log2 fold change (LFC) shrinkage using a normal prior distribution was applied to correct estimates for genes with low counts and high dispersion.
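The prefiltering rule can be illustrated with a minimal sketch. It is not the DESeq2 workflow itself (which was run in R); the function name, the dictionary layout, and the toy counts below are hypothetical and serve only to show how the two thresholds (five versus three samples) change which genes are retained.

```python
from typing import Dict, List


def prefilter_genes(
    norm_counts: Dict[str, List[float]],
    min_count: float = 5.0,
    min_samples: int = 5,
) -> Dict[str, List[float]]:
    """Keep genes with at least `min_count` normalized counts in at least `min_samples` samples."""
    return {
        gene: counts
        for gene, counts in norm_counts.items()
        if sum(1 for c in counts if c >= min_count) >= min_samples
    }


# Hypothetical normalized counts for three invented genes across six samples
toy_counts = {
    "GENE_A": [10.0, 8.0, 7.0, 6.0, 5.0, 0.0],  # 5 samples reach the count threshold
    "GENE_B": [9.0, 9.0, 9.0, 0.0, 0.0, 0.0],   # 3 samples reach the count threshold
    "GENE_C": [1.0, 2.0, 0.0, 0.0, 4.0, 3.0],   # no sample reaches the count threshold
}

# Stricter filter (pre- versus post-treatment comparison)
kept_paired = prefilter_genes(toy_counts, min_count=5.0, min_samples=5)
# Relaxed filter (responder versus non-responder comparison)
kept_response = prefilter_genes(toy_counts, min_count=5.0, min_samples=3)
print(sorted(kept_paired), sorted(kept_response))
```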
P values for analyses using an a priori defined set of candidate genes were adjusted for family-wise error rate (FWER) using a Bonferroni correction based on the number of genes and number of comparisons (i.e., two comparisons: pre- versus post-treatment and responder versus non-responder) in the analyses. P values for genome-wide analyses were adjusted for false discovery rate (FDR) using a Benjamini–Hochberg procedure. Gene set enrichment analyses (GSEA) were conducted based on the functionally annotated Reactome [12, 13] database and performed using the R package fgsea (v1.24.0) [14]. The package was run using default parameters, including limiting the pathways tested to only those with 15 to 500 genes. P values for pathway analyses were adjusted for FDR using a Benjamini–Hochberg procedure.
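The two adjustment schemes can be sketched as follows. This is a plain-Python illustration of the standard Bonferroni and Benjamini–Hochberg procedures, not the code used in the study; the function names and the `n_comparisons` argument (standing in for the two comparisons mentioned above) are assumptions, and the p values are invented.

```python
from typing import List, Sequence


def bonferroni(pvalues: Sequence[float], n_comparisons: int = 1) -> List[float]:
    """FWER adjustment: multiply each p value by (number of genes x number of comparisons)."""
    m = len(pvalues) * n_comparisons
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """FDR adjustment: p * n / rank, made monotone from the largest p value downward."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p values (not study results)
candidate_p = [0.001, 0.5]
genome_p = [0.01, 0.04, 0.03, 0.005]

print(bonferroni(candidate_p, n_comparisons=2))
print(benjamini_hochberg(genome_p))
```

Bonferroni controls the chance of any false positive and suits a short candidate list, whereas Benjamini–Hochberg tolerates a controlled fraction of false discoveries and retains more power across a genome-wide survey.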
As reported by Gao et al. [15], differences in tissue sample size influence the rate of formalin fixation, resulting in a pattern of gene expression differences. Pre-treatment samples were all core needle biopsy specimens, whereas post-treatment specimens were the larger surgical excisional biopsies. Genes that have been reported as sensitive to fixation timing are indicated with an asterisk in the pre- versus post-treatment DE results and were excluded from GSEA.
To examine the correlation of top genome-wide hits associated with response to letrozole with expression of ESR1 in the TCGA (Firehose Legacy) dataset, the TCGA data were accessed using cBioPortal [16,17,18] on 9/12/2023; TCGA data were generated by the TCGA Research Network (https://www.cancer.gov/tcga).
PAM50: PAM50 is a widely used 50-gene signature that classifies invasive breast cancer into five molecular intrinsic subtypes: luminal A, luminal B, HER2-enriched, basal-like, and normal-like. Each of the five molecular subtypes varies by their biological properties and prognoses [19, 20].
The PAM50 classification was conducted based on normalized gene counts using the variance-stabilization transformation implemented by the DESeq2 package. Molecular subtypes of pre-treatment samples and post-treatment samples were classified separately using the genefu package (v2.30.0) [21] from Bioconductor. To illustrate switches in PAM50 classification between pre- and post-treatment samples, a Sankey diagram was generated in R using the package ggsankey v0.0.99999 [22]. Subtype and subtype switches were based on the highest probability (derived from the genefu package) of class membership (Supplemental Table S1).
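The highest-probability assignment and the tallying of subtype switches can be sketched as below. This does not call genefu or ggsankey; it assumes the class-membership probabilities have already been computed, and the sample identifiers, subtype labels, and probabilities are hypothetical.

```python
from collections import Counter
from typing import Dict

Probabilities = Dict[str, float]  # subtype name -> probability of class membership


def assign_subtype(probs: Probabilities) -> str:
    """Return the subtype with the highest membership probability."""
    return max(probs, key=probs.get)


def count_switches(
    pre: Dict[str, Probabilities], post: Dict[str, Probabilities]
) -> Counter:
    """Count (pre-treatment subtype, post-treatment subtype) pairs for samples present in both sets."""
    flows: Counter = Counter()
    for sample_id in sorted(pre.keys() & post.keys()):
        pair = (assign_subtype(pre[sample_id]), assign_subtype(post[sample_id]))
        flows[pair] += 1
    return flows


# Hypothetical probabilities for three invented samples (not study data)
pre = {
    "S1": {"LumA": 0.55, "Normal": 0.40, "LumB": 0.05},
    "S2": {"LumA": 0.90, "Normal": 0.05, "LumB": 0.05},
    "S3": {"Her2": 0.70, "Basal": 0.20, "LumB": 0.10},  # no post-treatment sample
}
post = {
    "S1": {"LumA": 0.35, "Normal": 0.60, "LumB": 0.05},
    "S2": {"LumA": 0.85, "Normal": 0.10, "LumB": 0.05},
}

flows = count_switches(pre, post)
switched = {pair: n for pair, n in flows.items() if pair[0] != pair[1]}
print(flows, switched)
```

The `flows` counts are the kind of source-to-target tallies a Sankey diagram displays; a case such as the invented "S1" shows how a modest change in probabilities can flip the assigned class.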
DNA-seq: The raw sequences were aligned to the human hg38 reference genome using the BWA-MEM algorithm (v0.7.17) [23]. The reference genome was obtained from GATK bundle v0 (https://gatk.broadinstitute.org). The aligned BAM files were preprocessed by removing duplicates and performing position recalibration using picard-tools (v3.0.0; http://broadinstitute.github.io/picard/faq.html) and GATK (v4.4.0.0) [24]. Copy number variation (CNV) was called using CODEX2 (v1.3.0) [25] for all available pre-treatment samples. Welch’s two-sample t-test was used to analyze the association between KI67 score and amplification vs. non-amplification for selected genes. P values for the Welch’s two-sample t-test were adjusted for family-wise error rate (FWER) using a Bonferroni procedure. The Mantel–Haenszel chi-squared test with continuity correction was used to test for association of amplification in any of the selected genes with KI67 response status.
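For readers unfamiliar with Welch’s test, the statistic and its Welch–Satterthwaite degrees of freedom can be computed as in the sketch below. The function name and the KI67 values are hypothetical; converting the statistic to a p value requires the t-distribution CDF, which is deliberately left out because the Python standard library does not provide it.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for two independent samples.

    Unlike Student's t-test, the two groups are not assumed to share a variance.
    """
    va = variance(a) / len(a)  # squared standard error of the mean, group a
    vb = variance(b) / len(b)  # squared standard error of the mean, group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical KI67 scores (percent) for amplified vs. non-amplified cases; not study data
amplified = [1.0, 2.0, 3.0, 4.0, 5.0]
non_amplified = [2.0, 4.0, 6.0, 8.0, 10.0]

t_stat, dof = welch_t(amplified, non_amplified)
print(round(t_stat, 4), round(dof, 4))
```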
Results
Of the 70 patients who completed letrozole treatment per protocol, 59 had baseline RNA samples and 26 had matching post-treatment samples available for the present study (CONSORT diagram, Fig. 1). Two samples were deemed to have insufficient DCIS to isolate RNA and DNA, and one sample was excluded due to low sequencing quality. Additionally, one pre-treatment sample was eliminated based on its PC1 outlier status from a principal components analysis of RNA-seq data (Supplementary Fig. S1). The final analysis population therefore consisted of 24 subjects with paired pre- and post-treatment RNA samples, and 5 subjects with pre-treatment samples who achieved pCR. Of these 29 subjects, 24 had pre-treatment DNA-seq samples passing quality checks, including three who achieved pCR.
The characteristics of the responding and nonresponding groups are summarized in Table 1. No standard clinical variables were associated with response, although there was a trend showing higher proportion of high-grade DCIS in the non-responder group (p = 0.06).
PAM50 classification
We chose PAM50 as the most common expression-based categorization system for breast cancer. Classification was conducted using all available RNA-seq samples passing quality control, classifying 55 pre-treatment samples and 26 post-treatment samples separately. Individual assignments with associated probabilities are provided for the final analysis population in Supplementary Table S1 (Excel Supp Table file). This table also contains information on ER, PR, grade, and imaging parameters, including the MRI response category (the original clinical endpoint of the study). Figure 2A is a spaghetti plot showing the KI67 response (pre- and post-treatment where available) color coded by the pretreatment PAM50 assignments, including the five cases that had a pCR. The first notable aspect of this comparison is that two of these cancers were categorized as basal with associated probabilities > 0.6 (one of which was a pCR). One of the entry criteria for this treatment trial was expression of either ER or PR in at least 1% of the tumor cells assessed by immunohistochemistry and pathologist review. One of the cases classified as basal had low levels of ER (1% positivity) and no demonstrable PR staining, which is consistent with the intrinsic subtype assignments. This case had persistently high proliferation after treatment and was categorized in this study as a non-responder. The other case classified as basal by PAM50 had high ER expression, relatively low PR expression, and experienced a pCR. Of the remaining non-responders, three were HER2 and two were luminal B. Conversely, of the remaining 22 responders, 19 were classified as luminal A (p < 0.001, Fisher’s exact test for association of KI67 response with luminal A versus other subtypes). Of the five cases that experienced a pathologic complete response, three were classified as luminal A, one as normal-subtype, and one as basal.
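The association between luminal A status and KI67 response can be checked with a short sketch of a two-sided Fisher’s exact test. The 2 × 2 counts are taken from the text (19 luminal A cases, all responders; ten other cases, six of them non-responders); the function itself is an illustrative stdlib implementation, not the software used in the study.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same margins
    that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # small tolerance for float ties
    )
    return min(1.0, total)


# Rows: luminal A, other subtypes; columns: responder, non-responder
p_value = fisher_exact_two_sided(19, 0, 4, 6)
print(f"{p_value:.2e}")  # about 4.4e-04, consistent with the reported p < 0.001
```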
Fig. 2. (A) Proliferation rates (KI67) of the matched pre- and post-letrozole-treated DCIS specimens, color-coded by PAM50 subtypes. The five cases with a pathologic complete response (pCR) are shown in the pre-treatment column (triangles). There were 24 cases with matched pre- and post-treatment samples (circles). Cases that exhibited a subtype switch upon treatment are indicated by the change in color of the post-treatment dots. (B) Sankey diagram of PAM50 classifications of the 24 matched pre- and post-letrozole-treated DCIS specimens.
We next evaluated whether there was any change in intrinsic subtype after treatment. We highlight any individual switch (based on the class assignment probabilities provided in Supplementary Table S1) in subtype in Fig. 2A and summarized over the cases in the Sankey diagram (Fig. 2B). Overall, eight cases demonstrated a shift in subtype assignment after treatment. In particular, five of the responsive luminal A cases were classified as “normal” subtype after treatment. In four of these cases, the pre-treatment assignments also carried a substantial probability of being a “normal” subtype. Of the cases that failed to respond to letrozole and experienced a shift in subtypes, one HER2 switched to basal and one luminal B switched to “normal” subtype.
Gene expression changes resulting from treatment
Suppression of estrogen signaling by the aromatase inhibitor, particularly in the responsive cases, should be reflected by substantially altered gene expression. We compared pre- versus post-treatment expression across the study set irrespective of intrinsic subtype or KI67 response in two steps: (1) using an a priori defined set of 76 candidate genes (Supplementary Table S2, Excel file) including the PAM50 as cardinal elements of breast cancer fundamental processes and an additional 26 genes implicated in endocrine response and (2) a genome-wide analysis. Following prefiltering, 74 of the 76 candidate genes remained to be tested for differential expression (Supplementary Material: RNAseq-DE.html). P values for each differential expression analysis were adjusted for multiple testing; FWER-adjusted for the candidate gene set analysis, and FDR-adjusted for the genome-wide analysis. Top genes in each case are presented in Fig. 3 and tabulated with adjusted significance values in Supplementary Tables S3 and S4. As expected, down-regulated genes were largely associated with proliferation and the cell cycle (e.g., MKI67, ANLN, CCNE1, and CENPF) or estrogen signaling (e.g., PGR, SLC39A6). The only candidate gene with substantially higher expression after treatment was KRT5, a basal cytokeratin. Genome-wide analysis largely confirmed these findings with most of the top down-regulated genes associated with proliferation or estrogen responsiveness. The most prominent upregulated genome-wide hits are components of the immediate early (IE) response (FOS, FOSB, DUSP1, and MAFB). Previous work by Gao et al. demonstrated that there is a coordinate increase in the expression of a series of genes, including these canonical early response genes, related to fixation time, i.e., the rapid fixation of core needle biopsies (pretreatment) compared to slower penetration of the fixative into a surgical specimen (post treatment) [15], leading to a systematic issue in comparing these matched specimens. These genes remain in the box plot (Fig. 3, denoted with asterisks) but all significant genes identified by Gao et al. were eliminated from pathway analyses. MMP7, a matrix metalloproteinase not known to be part of the IE response, also showed increased levels after AI treatment. Significant pathways derived from GSEA (Reactome gene sets) lend further support to the nature of the change of the transcriptional program after treatment, particularly as it relates to proliferation and the cell cycle (Supplementary Table S5). Of the top pathways, “Extracellular Matrix Organization” was the only pathway not associated with cell division, with MMP7 as one of the component genes. Spaghetti plots of several canonical genes indicate the response of individual tumors (Supplementary Fig. S2) and full listings of the genome-wide pre- versus post-treatment differential expression and GSEA results are provided as Supplementary Tables S9 and S10 (Excel file).
Fig. 3. Top genes from candidate gene and genome-wide analyses, ranked by p value for differential expression after letrozole treatment. Gene lists and pathways with p values are reported in Supplementary Tables S2, S3, and S4. PGR was both a candidate gene and a top-ranking gene in the genome-wide analysis. Genes indicated with an asterisk have been found to be sensitive to the timing of tissue fixation (Gao, et al., 2018), which differed in pre- and post-treatment samples.
Gene expression associated with KI67 reduction
Categorizing the DCIS patients into responders and non-responders based on the 10% KI67 level in the post-treatment sample (including the five pCR cases as responders), we explored gene expression associated with this phenotype. For this analysis, we used the pretreatment samples to identify baseline characteristics of responsive/non-responsive DCIS. Again, two types of analyses were performed: (1) the a priori set of 76 (54 passing QC) candidate genes and (2) a genome-wide analysis. P values for each differential expression analysis were adjusted for multiple testing; FWER adjusted for the candidate gene set analysis, and FDR adjusted for the genome-wide analysis. Only one gene in the a priori list, ESR1 (FWER adjusted p = 0.0171), was significantly associated after correction for multiple testing. Other estrogen-related genes (including PGR, NAT1, BCL2, MAPT, GATA3, and IL6ST) were observed to be higher in responding tumors, whereas the proliferation associated CENPF was observed to be higher in non-responsive DCIS (Fig. 4, Supplementary Table S6). Elevated ERBB2 expression trended with non-responders; however, two of the responsive DCIS also expressed high levels of this gene (Fig. 4). For the genome-wide analysis, a number of transcripts survived false discovery correction (Fig. 4, Supplementary Table S7, GSEA Supplementary Table S8) and most of these were highly correlated with estrogen receptor expression in invasive breast cancers (TCGA correlations shown in Supplementary Fig. S3). Top coding genes not correlated with ESR1 include CXCL9 and LLGL2, both with higher levels in the non-responding tumors. Full lists of the genome-wide responders vs. non-responders differential expression and GSEA results are provided as Supplementary Tables S11 and S12 (Excel file).
Copy number changes and response
Low-pass whole-genome sequencing was performed on DNA extracted from the same macrodissected material from which the RNA was derived. For the 24 pre-treatment samples included in the DNA-seq analysis, mean coverage of the genome ranged from 0.05 to 0.78X with a median of 0.52X. Given the complexity of copy number changes (variable intervals, admixtures of normal and DCIS cells), we chose to focus our analysis on prior findings of three gene amplification events associated with neoadjuvant endocrine resistance in invasive breast cancers: ERBB2, CCND1 and FGFR1 [26, 27]. Grouping cases that were amplified at these loci versus all others (diploid or loss), we observed no correlation between KI67 response and copy number gains containing any of these three genes individually or taken together (i.e., amplification at any of the three loci versus all others) (Supplementary Fig. S4).
Discussion
Neoadjuvant or pre-operative therapy provides the most direct opportunity to analyze tumor response in vivo. Breast DCIS is typically excised before treatment but the current study is based on samples collected in a unique clinical trial of 6 months of pre-operative therapy with the aromatase inhibitor letrozole for postmenopausal women presenting with hormone receptor positive DCIS. We performed RNA-seq analysis and low-pass whole-genome sequencing on the pretreatment biopsies and post-treatment surgical specimens to identify parameters related to response to estrogen deprivation.
The most clinically accepted measure of response for invasive cancer in the neoadjuvant setting, particularly for chemotherapy, is whether the treatment results in a pCR, as this is a powerful predictor of long-term outcome. Endocrine therapy, which is perhaps more cytostatic than cytotoxic, produces a pCR less often and therefore alternative measures of effectiveness or response have been proposed. One of these is the rate of tumor cell proliferation measured after or during therapy, typically evaluated by immunostaining for the KI67 antigen. In a neoadjuvant trial comparing aromatase inhibitors (ACOSOG Z1031; NCT00265759), invasive cancers that exhibited > 10% proliferating cells after treatment were considered nonresponsive to the hormone therapy and were switched to chemotherapy [3]. The DCIS treatment trial from which the specimens for this study were derived used the change in MRI enhancement volume at six months of treatment compared to baseline as the primary response endpoint [4]. For the current study, we chose post-treatment proliferation as a direct pharmacodynamic indicator of response to therapy. We grouped cases that achieved a pCR together with those that had < 10% KI67 positive cells after treatment into the “responder” category. Based on these criteria, six cases were considered “non-responders”.
RNA-seq analysis of these specimens, both before and after treatment, provides a detailed picture of the DCIS lesions related to hormone deprivation. We applied the gene expression classifier developed for invasive breast cancers consisting of 50 genes (PAM50) to categorize the lesions into intrinsic subtypes [19]. It is an open question as to whether PAM50 accurately classifies DCIS, particularly whether there is a completely comparable basal subtype and whether the classifier “forces” cancers into a pre-defined ratio of subtypes based on the distribution of invasive cancers [28]. Nonetheless, this approach does provide a gene-expression-based classification that revolves around receptor status and other key biologic facets that are present in both pre-invasive and invasive breast cancers. Entry criteria for the clinical trial did not intentionally exclude possible HER2, basal, or normal subtype since DCIS with as few as 1% estrogen or progesterone receptor positive tumor cells were eligible. PAM50 subtypes did closely track with response to letrozole in this study. All of the DCIS classified as luminal A responded to the treatment whereas the non-responders were a mixture of HER2, luminal B, and basal subtypes. This is consistent with the distribution of subtypes with respect to KI67 response observed in invasive cancers treated with aromatase inhibitors [1], strengthening the link between invasive and pre-invasive cancers and their shared biology.
We also performed RNA-seq and PAM50 analysis on the residual DCIS after treatment. In both responders and non-responders, we noted some classification shifting. Most commonly, we observed that a number of luminal A responders ended up being classified as normal subtypes after treatment. The majority of these cases had a substantial probability of assignment as normal-subtype before treatment and retained luminal A probability after treatment. This shift could be related to fewer DCIS epithelia in the post-treatment specimen or to a reduced ER-related gene expression signature after hormone deprivation. There were several notable subtype shifts in the non-responders, including a luminal B reclassified as “normal” and a HER2 that was classified as basal after treatment. The mixed probabilities of these cases prior to treatment could relate to pre-existing tumor heterogeneity and selection during the course of the neoadjuvant treatment.
Specific gene expression and DNA amplification events that correlated with response were also explored. High levels of SCUBE2, part of the Oncotype DX estrogen response panel of genes, were associated with response, consistent with an intact hormone signaling pathway. Other genes (BMPR1B, INPP5, TMEM26) that correlate with ESR1 expression in invasive cancers were also top hits related to response. The top protein coding genes elevated in the non-responders were CXCL9 and LLGL2. Prior studies in invasive breast cancer have shown that low levels of LLGL2 are associated with clinical outcome after tamoxifen therapy [29, 30] and our current results support this finding with respect to DCIS and estrogen deprivation. No apparent links between CXCL9 expression and hormone response have been described. HER2 expression and/or amplification has been associated with hormone therapy resistance and we do note a non-significant trend of higher levels of ERBB2 and the co-expressed GRB7 in non-responding tumors. However, two of the responders (one luminal A and one HER2) also had high levels of ERBB2, indicating that this is not an entirely dominant molecular signal. DNA amplification events in invasive breast cancer, including ERBB2, have been associated with reduced response to neoadjuvant hormonal therapy [26]. We analyzed these specifically in our DCIS data set and did not find an association with response to letrozole.
Comparison of pre- versus post-treatment gene expression was also revealing. As expected, proliferation and estrogen-associated genes dominate the list of genes that are down-regulated after treatment. Of note, two genes associated with more basal characteristics, KRT5 and MMP7, demonstrated elevated expression levels after treatment, suggestive of treatment-induced cellular selection. We also confirm prior work from Gao et al. regarding systematic changes in early response genes related to the differential fixation time of the small core biopsies versus the larger surgical specimens [15].
Limitations of the study include the relatively small subset of samples from the original clinical trial that was available for this study, which may bias the data set toward larger tumors, and the age of the clinical specimens at the time of molecular analyses. To mitigate the second limitation, we removed specimens that failed RNA-seq QC metrics. Strengths of the study include the unique clinical trial on which the work is based and the prospective, uniform collection of all samples within that trial.
In summary, the results from this study indicate that the response of DCIS to neoadjuvant aromatase inhibitor treatment shows a number of similarities to that of invasive cancer. A luminal phenotype at baseline was strongly associated with KI67 suppression with AI. Our results continue to highlight that an intact estrogen signaling pathway is the primary determinant of response and resistance and that genomic subtyping may identify patients most likely to benefit from AI treatment. Studies are ongoing to determine whether endocrine therapy alone may be sufficient treatment for the most endocrine-sensitive DCIS and whether genomic predictors could help stratify treatment strategy.
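As a rough check on the strength of the association between baseline subtype and response, the counts reported in the abstract can be arranged as a 2×2 table: 19 luminal A tumors (all responders) and 10 tumors of other subtypes (4 responders, 6 non-responders). The sketch below applies a two-sided Fisher exact test to that table using only the Python standard library. The choice of test and the function name `fisher_exact_two_sided` are assumptions made for illustration, and this is not necessarily the test used in the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def weight(k):
        # Unnormalised hypergeometric weight of the table whose top-left cell is k.
        return comb(row1, k) * comb(row2, col1 - k)

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    observed = weight(a)
    # Sum over all tables that are no more probable than the observed one.
    extreme = sum(
        weight(k) for k in range(k_min, k_max + 1) if weight(k) <= observed
    )
    return extreme / comb(n, col1)


# Rows: baseline PAM50 (luminal A, other); columns: (responder, non-responder).
p_value = fisher_exact_two_sided(19, 0, 4, 6)
print(f"two-sided Fisher exact p = {p_value:.2e}")
```

Integer weights are compared before normalisation, so the "at least as extreme" comparison is not affected by floating-point rounding.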
Code and data availability
The scripts to reproduce the preprocessing of sequencing data and downstream analysis reported in this manuscript are accessible through a public source code repository (https://gitlab.oit.duke.edu/dcibioinformatics/pubs/calgb40903). The rendered HTML reports for the analysis of the study RNA- and DNA-sequencing data, and the TCGA data are available in the code repository. The final version of the codebase is also available through Zenodo: DOI: https://doi.org/10.5281/zenodo.14225948. The source clinical and genomic data are currently available from the corresponding author by request. The process of depositing these data into the database of Genotypes and Phenotypes (dbGaP) is in progress. Data quality was ensured by review of data by study statisticians and investigators.
References
Ellis MJ, Suman VJ, Hoog J, Lin L, Snider J, Prat A, Parker JS, Luo J, DeSchryver K, Allred DC, et al. Randomized phase II neoadjuvant comparison between letrozole, anastrozole, and exemestane for postmenopausal women with estrogen receptor-rich stage 2 to 3 breast cancer: clinical and biomarker outcomes and predictive value of the baseline PAM50-based intrinsic subtype–ACOSOG Z1031. J Clin Oncol. 2011;29(17):2342–9.
Ellis MJ, Tao Y, Luo J, A’Hern R, Evans DB, Bhatnagar AS, Chaudri Ross HA, von Kameke A, Miller WR, Smith I, et al. Outcome prediction for estrogen receptor-positive breast cancer based on postneoadjuvant endocrine therapy tumor characteristics. J Natl Cancer Inst. 2008;100(19):1380–8.
Ellis MJ, Suman VJ, Hoog J, Goncalves R, Sanati S, Creighton CJ, DeSchryver K, Crouch E, Brink A, Watson M, et al. Ki67 proliferation index as a tool for chemotherapy decisions during and after neoadjuvant aromatase inhibitor treatment of breast cancer: results from the American college of surgeons oncology group Z1031 trial (Alliance). J Clin Oncol. 2017;35(10):1061–9.
Hwang ES, Hyslop T, Hendrix LH, Duong S, Bedrosian I, Price E, Caudle A, Hieken T, Guenther J, Hudis CA, et al. Phase II single-arm study of preoperative letrozole for estrogen receptor-positive postmenopausal ductal carcinoma in situ: CALGB 40903 (Alliance). J Clin Oncol. 2020;38(12):1284–92.
Ewels P, Magnusson M, Lundin S, Kaller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016;32(19):3047–8.
Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20.
Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29(1):15–21.
Harrow J, Denoeud F, Frankish A, Reymond A, Chen CK, Chrast J, Lagarde J, Gilbert JG, Storey R, Swarbreck D, et al. GENCODE: producing a reference annotation for ENCODE. Genome Biol. 2006;7(S4):1–9.
Harrow J, Frankish A, Gonzalez JM, Tapanari E, Diekhans M, Kokocinski F, Aken BL, Barrell D, Zadissa A, Searle S, et al. GENCODE: the reference human genome annotation for The ENCODE project. Genome Res. 2012;22(9):1760–74.
R Core Team. R: A Language and Environment for Statistical Computing. 2022. https://www.R-project.org/
Love M, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550.
Croft D, Mundo AF, Haw R, Milacic M, Weiser J, Wu G, Caudy M, Garapati P, Gillespie M, Kamdar MR, et al. The Reactome pathway knowledgebase. Nucleic Acids Res. 2014;42:D472-477.
Fabregat A, Jupe S, Matthews L, Sidiropoulos K, Gillespie M, Garapati P, Haw R, Jassal B, Korninger F, May B, et al. The reactome pathway knowledgebase. Nucleic Acids Res. 2018;46(D1):D649–55.
Korotkevich G, Sukhov V, Budin N, Shpak B, Artyomov MN, Sergushichev A. Fast gene set enrichment analysis. bioRxiv. 2021:060012.
Gao Q, López-Knowles E, Cheang MC, Morden J, Ribas R, Sidhu K, Evans D, Martins V, Dodson A, Skene A, Holcombe C, et al. Major impact of sampling methodology on gene expression in estrogen receptor-positive breast cancer. JNCI Cancer Spectr. 2018;2(2):005.
Cerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, Aksoy BA, Jacobsen A, Byrne CJ, Heuer ML, Larsson E, et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2012;2(5):401–4.
de Bruijn I, Kundra R, Mastrogiacomo B, Tran TN, Sikina L, Mazor T, Li X, Ochoa A, Zhao G, Lai B, et al. Analysis and visualization of longitudinal genomic and clinical data from the AACR project GENIE biopharma collaborative in cBioPortal. Cancer Res. 2023;83(23):3861–7.
Gao J, Aksoy BA, Dogrusoz U, Dresdner G, Gross B, Sumer SO, Sun Y, Jacobsen A, Sinha R, Larsson E, et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal. 2013;6(269):11.
Chia SK, Bramwell VH, Tu D, Shepherd LE, Jiang S, Vickery T, Mardis E, Leung S, Ung K, Pritchard KI, et al. A 50-gene intrinsic subtype classifier for prognosis and prediction of benefit from adjuvant tamoxifen. Clin Cancer Res. 2012;18(16):4465–72.
Parker JS, Mullins M, Cheang MC, Leung S, Voduc D, Vickery T, Davies S, Fauron C, He X, Hu Z, et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. J Clin Oncol. 2009;27(8):1160–7.
Gendoo DM, Ratanasirigulchai N, Schroder MS, Pare L, Parker JS, Prat A, Haibe-Kains B. Genefu: an R/Bioconductor package for computation of gene expression-based signatures in breast cancer. Bioinformatics. 2016;32(7):1097–9.
Sjoberg D. ggsankey: A package that makes it easy to implement sankey, alluvial and sankey bump plots in ggplot2. R package version 0.0.99999. 2024. https://github.com/davidsjoberg/ggsankey.
Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. 2013.
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20(9):1297–303.
Jiang Y, Nathanson KL, Zhang NR. CODEX2: full-spectrum copy number variation detection by high-throughput DNA sequencing. bioRxiv. 2017. https://doi.org/10.1101/211698.
Giltnane JM, Hutchinson KE, Stricker TP, Formisano L, Young CD, Estrada MV, Nixon MJ, Du L, Sanchez V, Ericsson PG, et al. Genomic profiling of ER(+) breast cancers after short-term estrogen suppression reveals alterations associated with endocrine resistance. Sci Transl Med. 2017;9(402):eaai7993.
Arpino G, Green SJ, Allred DC, Lew D, Martino S, Osborne CK, Elledge RM. HER-2 amplification, HER-1 expression, and tamoxifen response in estrogen receptor-positive metastatic breast cancer: a southwest oncology group study. Clin Cancer Res. 2004;10(17):5670–6.
Bergholtz H, Lien TG, Swanson DM, Frigessi A, Daidone MG, Tost J, Wärnberg F, Sørlie T. Contrasting DCIS and invasive breast cancer by subtype suggests basal-like DCIS as distinct lesions. NPJ Breast Cancer. 2020;6:26.
Hisada T, Kondo N, Wanifuchi-Endo Y, Osaga S, Fujita T, Asano T, Uemoto Y, Nishikawa S, Katagiri Y, Terada M, et al. Co-expression effect of LLGL2 and SLC7A5 to predict prognosis in ERalpha-positive breast cancer. Sci Rep. 2022;12(1):16515.
Saito Y, Li L, Coyaud E, Luna A, Sander C, Raught B, Asara JM, Brown M, Muthuswamy SK. LLGL2 rescues nutrient stress by promoting leucine uptake in ER(+) breast cancer. Nature. 2019;569(7755):275–9.
Funding
Research reported in this publication was supported by the National Cancer Institute of the National Institutes of Health under Award Numbers U10CA180821, U10CA180882, and U24CA196171 (to the Alliance for Clinical Trials in Oncology), UG1CA232760, UG1CA233253, and UG1CA233329 (https://acknowledgments.alliancefound.org), The Breast Cancer Research Foundation, and a P30 Cancer Center Support Grant (P30CA014236). This work used a high-performance computing facility partially supported by grants 2016-IDG-1013 (“HARDAC+: Reproducible HPC for Next-generation Genomics”) and 2020-IIG-2109 (“HARDAC-M: Enabling memory-intensive computation for genomics”) from the North Carolina Biotechnology Center. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Author information
Contributions
JM, KO, and ESH planned the study design and analysis. DZ, TH, and LS performed the nucleotide extraction and gene sequencing. YYC and AH performed the pathology review and scoring of IHC staining. ESH, TH, IB, and JG contributed to patient recruitment and clinical sample collection. EP performed breast imaging review. JS, YD, ML, ABS, and KO performed the statistical analysis. ESH obtained funding for the work and was PI of the parent clinical trial. All authors reviewed and approved the final manuscript. Supported by the National Cancer Institute of the National Institutes of Health under Award Nos. U10CA180821 and U10CA180882 (to the Alliance for Clinical Trials in Oncology), and in part by funds from The Breast Cancer Research Foundation (to E.S.H.). Clinical Trial information: NCT01439711.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Marks, J.R., Zhang, D., Hardman, T. et al. Genomic alterations are associated with response to aromatase inhibitor therapy for ER-positive postmenopausal ductal carcinoma in situ: (CALGB 40903, Alliance). Breast Cancer Res 27, 26 (2025). https://doi.org/10.1186/s13058-025-01963-5
DOI: https://doi.org/10.1186/s13058-025-01963-5