Journal of Clinical Oncology, Tempus-authored – Background: Tumors of unknown origin account for approximately 5% of newly diagnosed cancers and are difficult to treat without establishing the tissue type from which they derive. Establishing tumor origin guides standard-of-care treatment under several NCCN targeted therapy guidelines. Because gene expression profiles are tissue-specific, classification models based on RNA expression offer a promising approach to identifying the likely primary cancer site in tumors of unknown origin.
Methods: In this study, we developed a transcriptome-based cancer type classifier trained on over 10,000 tissue samples annotated by pathologists and sequenced for RNA expression to identify conserved patterns of expression characteristic of 30 tumor types across primary and metastatic tissue sites. The classifier probabilistically ranks cancer of origin.
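As a rough illustration of what "probabilistically ranks" can mean in practice, the sketch below converts per-class scores into probabilities and sorts them. It is not the classifier described in this abstract: the softmax step, the function name `rank_cancer_types`, and the example scores are all assumptions made for illustration, since the abstract does not specify the model family or how its probabilities are produced.

```python
import math

def rank_cancer_types(scores):
    """Turn raw per-class scores into probabilities and rank them.

    `scores` maps a cancer-type label to an arbitrary real-valued score.
    Softmax is an assumed normalization, not the study's stated method.
    """
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    probs = {label: e / total for label, e in exps.items()}
    # Most probable cancer type first.
    return sorted(probs.items(), key=lambda item: item[1], reverse=True)

# Hypothetical scores for one sample (made up, not study data).
example_scores = {"colorectal": 2.1, "breast": 0.3, "lung": 1.2}
ranking = rank_cancer_types(example_scores)
for label, prob in ranking:
    print(f"{label}: {prob:.2f}")
```

Returning the full ranked list rather than a single label lets a reader weigh the second and third candidates when the top probability is low.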
Results: The accuracy of the most probable cancer prediction was 85% overall: 88% in primary tumors and 77% in metastatic tumors. The three cancer types with the highest accuracy were colorectal (metastatic: 93%, primary: 99%), breast (95%, 96%), and lung (87%, 94%). Classifier performance was lower in low-purity metastatic tumors, where the surrounding normal tissue obscures the tumor transcriptional profile, though the classifier still achieved 71% accuracy on metastatic tumors with less than 50% purity.
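The accuracy figures above are top-1 accuracies reported overall and separately for primary and metastatic tumors. A minimal sketch of that kind of stratified calculation follows; the function name `accuracy_by_group`, the record layout, and the toy records are hypothetical and are not data from the study.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Compute top-1 accuracy overall and per group.

    Each record is (group, true_label, predicted_label), where the
    predicted label is the classifier's most probable cancer type.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, true_label, predicted_label in records:
        correct = int(true_label == predicted_label)
        for key in (group, "overall"):
            totals[key] += 1
            hits[key] += correct
    return {key: hits[key] / totals[key] for key in totals}

# Toy records for illustration only (not study data).
toy_records = [
    ("primary", "breast", "breast"),
    ("primary", "lung", "lung"),
    ("primary", "colorectal", "colorectal"),
    ("primary", "lung", "breast"),
    ("metastatic", "colorectal", "colorectal"),
    ("metastatic", "breast", "lung"),
]
result = accuracy_by_group(toy_records)
print(result)
```

Note that the overall figure is weighted by group size, so it need not sit midway between the per-group accuracies.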
Conclusions: We present a novel method to probabilistically predict tumor type for cancers of unknown origin using RNA-Seq. Our method achieves robust classification that is applicable to primary and metastatic tumors and demonstrates the value of utilizing RNA-Seq to aid cancer diagnosis and treatment decisions.
Authors: Jackson Michuda, Catherine Igartua, Joshua SK Bell, Tim Taxter, Raphael Pelossof, and Kevin White