Introduction: Consensus Molecular Subtypes (CMS) are a well-established molecular stratification framework for colorectal cancer (CRC). Existing methods for CMS classification either require a representative input cohort for preprocessing (CMScaller) or have difficulty generalizing to metastatic samples (CMSclassifier). We developed Tempus CMS to overcome these limitations. Our method normalizes gene expression data from input samples to a static reference cohort, enabling single-sample classification of both primary and metastatic tumors. We evaluated the performance of this classifier on a large, de-identified CRC cohort comprising 8,489 samples from primary and metastatic sites.
Methods: To normalize input data, our method shifts and scales each gene expression value by the mean and standard deviation of that gene's expression in the reference cohort (n=2,787 primary and metastatic CRC). CMS calls are then generated via nearest template prediction, similar to CMScaller (Eide et al., 2017). The analysis cohort was selected from clinical biopsies taken within 30 days of primary or metastatic diagnosis, excluding reference cohort samples. CMS calls were assessed by comparing subtype prevalence to reported rates and by testing for known enrichments of pathways and genomic markers.
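The two steps described above, per-gene normalization against a fixed reference followed by nearest template prediction, can be illustrated with a minimal Python sketch. This is not the Tempus CMS implementation: the function names, the toy gene and subtype labels, the use of the population standard deviation, the binary templates, and the cosine similarity score are all assumptions made for illustration, and the sketch omits whatever significance testing produces no-calls.

```python
import math
from statistics import mean, pstdev


def fit_reference(reference):
    """Per-gene (mean, standard deviation) from a static reference cohort.

    `reference` maps gene -> list of expression values (hypothetical layout).
    Population standard deviation is an assumption; the abstract does not say.
    """
    return {gene: (mean(vals), pstdev(vals)) for gene, vals in reference.items()}


def normalize(sample, ref_stats):
    """Shift and scale one sample's genes by the reference mean and SD."""
    z = {}
    for gene, value in sample.items():
        if gene not in ref_stats:
            continue  # gene absent from the reference: cannot be normalized
        mu, sd = ref_stats[gene]
        if sd > 0:  # skip genes with no variation in the reference
            z[gene] = (value - mu) / sd
    return z


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_template(z, templates):
    """Return the subtype whose binary marker template is closest to the sample.

    `templates` maps subtype -> set of marker genes (hypothetical markers).
    """
    genes = sorted({g for markers in templates.values() for g in markers if g in z})
    vec = [z[g] for g in genes]
    scores = {
        subtype: cosine(vec, [1.0 if g in markers else 0.0 for g in genes])
        for subtype, markers in templates.items()
    }
    return max(scores, key=scores.get), scores


# Toy data only: gene names, values, and subtype labels are made up.
reference = {"G1": [1.0, 3.0], "G2": [2.0, 6.0], "G3": [0.0, 4.0]}
templates = {"X": {"G1"}, "Y": {"G2", "G3"}}
ref_stats = fit_reference(reference)

z = normalize({"G1": 4.0, "G2": 4.0, "G3": 0.0}, ref_stats)
call, scores = nearest_template(z, templates)
```

Because the reference statistics are computed once and then held fixed, each new sample can be normalized and classified on its own, without assembling a representative input cohort.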
Results: Subtype prevalences for lower gastrointestinal (GI) biopsies and resections (n=5,090) were comparable to reported rates for primary CRC tumors: CMS1 – 12%, CMS2 – 28%, CMS3 – 19%, and CMS4 – 33%, with 7.6% no calls. Biopsies from metastatic tissue showed site-specific changes in prevalence: CMS2 increased to 43% in liver samples (n=1,715), and CMS4 was identified in 49% of samples from metastatic sites excluding the liver (n=1,684). In addition, we found that tumor purity affected prevalence: in lower GI tumors with ≥ 50% tumor purity, rates of CMS2 and CMS4 were 34% and 26%, respectively, while in tumors with purity between 20% and 50% the rates were 20% CMS2 and 41% CMS4. Across all tissue groups, pathway enrichment showed significant associations, including TGF-β signaling in CMS4 (adj P ≤ 4.4e-15) and cell differentiation in CMS3 (adj P ≤ 3.0e-27). Expected enrichments of CMS markers were also observed: BRAF mutations and microsatellite instability in CMS1 and KRAS mutations in CMS3. In liver tissue, KRAS mutation rates were higher in CMS1 (57%) and CMS3 (83%) than in GI tissue (35% and 67%, respectively).
Conclusions: In this study, we developed an enhanced, single-sample CMS-calling algorithm optimized for primary and metastatic sites. Applying this algorithm to a large cohort recapitulated known biology, which suggests that the tool can be used to support clinical studies requiring robust molecular stratification.