Project 3: Compute a small-cell neuroendocrine score to predict cancer aggressiveness across lung cancers
Small cell lung cancer is a very aggressive type of lung cancer with stem cell characteristics. Transdifferentiation (the transformation of one cancer type into another) toward a small cell phenotype is known to be a major route to acquiring resistance to therapy. A study (Balanis et al. Cancer Cell 2019; see graphical abstract below) has shown that tumors from other cancer types can undergo such transdifferentiation, and that the degree of small-cellness predicts important clinical characteristics such as survival.
The project aims to check whether some rare lung cancers also acquire such a small cell phenotype. To do so, it computes a transcriptomic small cell score for each tumor, tests whether tumors with high scores exist in these tumor types, and tests whether the score correlates with survival and with known molecular groups.
- weights of genes to compute small cell score (Varimax PCA Loadings of PC1.v from Table S1 of Balanis et al. 2019, https://ars.els-cdn.com/content/image/1-s2.0-S153561081930296X-mmc2.xlsx )
- transcriptomic data for neuroendocrine neoplasms (https://github.com/IARCbioinfo/DRMetrics/tree/NextJournalH/data, file read_counts_all.txt.zip)
Scripting in R
- download the datasets
- normalize the gene expression following the steps in https://nextjournal.com/rarecancersgenomics/a-molecular-map-of-lung-neuroendocrine-neoplasms/ (until data pre-processing)
- create a function to compute the small cell score of a tumor transcriptome
- for each sample, extract the list of genes with non-zero weights from Balanis et al. Table S1
- multiply the expression of each gene by its weight and sum the products to obtain a score
- apply the function to all neuroendocrine neoplasms
- compare the small cell scores (visualisations and statistical tests) across clinical characteristics (histopathological groups, age, sex) and molecular characteristics (molecular groups)
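
The scoring steps above could be sketched in R roughly as follows. This is a minimal illustration, not the method of Balanis et al.: the function name `small_cell_score`, the layout of `expr` (genes in rows, samples in columns) and the toy gene names and values are all assumptions made for the example.

```r
# Hypothetical helper: weighted sum of expression over genes with non-zero weights.
# expr:    numeric matrix of normalized expression, genes in rows (rownames = gene
#          identifiers), samples in columns (assumed layout).
# weights: named numeric vector of gene weights (names = gene identifiers).
small_cell_score <- function(expr, weights) {
  # keep only genes with a non-missing, non-zero weight
  weights <- weights[!is.na(weights) & weights != 0]
  # genes present in both the expression matrix and the weight vector
  shared <- intersect(rownames(expr), names(weights))
  if (length(shared) == 0) {
    stop("no genes shared between expression data and weights")
  }
  # per-sample weighted sum: sum over genes of weight * expression
  drop(crossprod(expr[shared, , drop = FALSE], weights[shared]))
}

# Toy data, made up for illustration only
expr <- matrix(c(1, 2, 3,
                 0, 1, 4,
                 5, 5, 5),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("GENE_A", "GENE_B", "GENE_C"),
                               c("s1", "s2", "s3")))
weights <- c(GENE_A = 0.5, GENE_B = -1, GENE_D = 2, GENE_C = 0)

scores <- small_cell_score(expr, weights)
print(scores)
```

The resulting named vector has one score per sample, so it can be joined to the clinical table and compared across groups with base R tools such as `boxplot` and `kruskal.test`.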
- combining different datasets (e.g., gene names from Balanis et al. and the neuroendocrine neoplasm transcriptomic data will probably not match perfectly)
- data interpretation
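
For the gene name mismatch mentioned above, one possible starting point is to harmonise identifiers and then report what fails to match. The sketch below is only an assumption about what the mismatches might look like (stray whitespace, letter case, Ensembl-style version suffixes); `clean_ids` is a hypothetical helper and all identifiers are made up.

```r
# Hypothetical helper: harmonise gene identifiers before matching.
clean_ids <- function(ids) {
  ids <- toupper(trimws(ids))                # remove stray whitespace, unify case
  is_ensembl <- grepl("^ENSG", ids)          # Ensembl-style gene identifiers
  # drop a trailing version suffix such as ".3" from Ensembl-style identifiers only
  ids[is_ensembl] <- sub("\\.[0-9]+$", "", ids[is_ensembl])
  ids
}

# Made-up identifiers for illustration only
weight_ids <- c("GENE_A", " gene_b", "GENE_C", "ENSG00000000001.3")
expr_ids   <- c("GENE_A", "GENE_B", "ENSG00000000001.7", "GENE_X")

matched   <- intersect(clean_ids(weight_ids), clean_ids(expr_ids))
unmatched <- setdiff(clean_ids(weight_ids), clean_ids(expr_ids))

# fraction of weighted genes found in the expression data
matched_fraction <- length(matched) / length(weight_ids)
print(unmatched)
```

Inspecting `unmatched` before scoring shows how many weighted genes are lost; if the two sources use different identifier types (symbols versus Ensembl IDs), a proper annotation table would be needed instead of string cleaning.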
- SCLC score paper https://www.sciencedirect.com/science/article/pii/S153561081930296X
[email protected] (Nicolas Alcala)