A repository with computational code and outputs to accompany the manuscript Waryah et al. (in preparation), Synthetic epigenetic reprograming of breast cancer hybrid epithelial to mesenchymal states using the CRISPR/dCas9 platform.
Analysis of the RNA-seq and DNAme data was performed by @jcursons and @MomenehForoutan. Analysis of the ChIP-seq and ATAC-seq data was performed by Dr Christian Pflueger. Please see the script folder for further details.
For information on the associated code please contact:
- Dr Joe Cursons (joseph.cursons (at) monash.edu)
- Dr Momeneh (Sepideh) Foroutan (momeneh.foroutan (at) monash.edu)
- Dr Christian Pflueger (christian.pflueger (at) uwa.edu.au)
- Dr Liam Fearnley (fearnley.l (at) wehi.edu.au)
- Dr Ramyar Molania (molania.r (at) wehi.edu.au)
For further information on the manuscript or project please contact:
- Dr Charlene Waryah (charlene.waryah (at) perkins.org.au)
- A/Prof. Pilar Blancafort (pilar.blancafort (at) uwa.edu.au)
The scientific manuscript associated with this repository has been submitted for peer review.
Data generated for this project will be uploaded to GEO over the coming weeks.
ENSEMBL reference for RNA-seq analysis:
A folder containing intermediate output files used in this study.
- 20180518_dCas_pipe_out.txt: off-target gene predictions for gRNAs used in this study; determined with dsNickFury using the Azimuth and Elevation on-/off-target scoring algorithms. For further information please refer to the script and the methods section of the associated manuscript.
- Folder containing differential expression results from the RNA-seq.
- Generated by script/rnaseq_-_tximport-voom-limma.RScript
- Uses data from preproc/salmon as well as the ENSEMBL reference file (GRCh38 v89) listed above
- Please contact Joe Cursons for further information on this analysis.
- Folder containing the salmon pre-processed data for the SUM159 and MDA-MB-231 RNA-seq data (Figures NN & GSENNNN).
- Folder containing the pre-processed data for the SUM159 differential expression data (Figures NN & GSENNNNNN).
- Folder containing the subset of TCGA-BRCA data used for this paper (Figures NN).
For further details please see:
- Comprehensive Molecular Portraits of Invasive Lobular Breast Cancer
- Giovanni Ciriello, Michael L. Gatza et al. (2015). Cell.
- Comprehensive molecular portraits of human breast tumours
- TCGA Network (2012). Nature
- Folder containing scripts and functions used for data analysis and figure generation.
- rnaseq_figures.py :: a python script to produce Fig. 5 from the manuscript
- this script has a non-standard dependency for label positions; please see the adjustText documentation for further information
- this script uses several large public data sets that cannot be included in this repository due to size constraints; processed subsets are included, and links to the original files are listed inside the relevant functions within this script and in the 'Data availability/Public data' section above.
- Folder containing scripts and functions used for data analysis
- Unless otherwise stated please contact Assoc. Prof. Pilar Blancafort or Dr Charlene Waryah for further information
- Data subsets used in the generation of Figure 6
- Please contact Sepideh Foroutan for further information
These scripts rely on some external dependencies.
The Python-based implementation of singscore (PySingscore) is used for gene set scoring in this manuscript. Installation instructions are available on the Wiki: https://github.com/DavisLaboratory/PySingscore/wiki/Tutorial
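For readers unfamiliar with rank-based gene set scoring, the general idea can be sketched in plain Python as below. This is a simplified illustration only: it does not use or reproduce the PySingscore API, it handles a single up-regulated gene set in one sample, and the function names (`average_ranks`, `rank_score`) and the expression values are hypothetical. Please use PySingscore itself to reproduce the results in the manuscript.

```python
def average_ranks(values):
    """Return 1-based ascending ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_score(expression, up_genes, centre=True):
    """Score one sample against an up-regulated gene set.

    expression: dict mapping gene name -> expression value for one sample
    up_genes:   iterable of gene names expected to be highly expressed
    """
    genes = list(expression)
    ranks = dict(zip(genes, average_ranks([expression[g] for g in genes])))
    present = [g for g in up_genes if g in ranks]
    n_total, n_set = len(genes), len(present)
    if n_set == 0 or n_set == n_total:
        raise ValueError("gene set must be a non-empty proper subset of the sample genes")

    mean_rank = sum(ranks[g] for g in present) / n_set
    # Smallest and largest mean rank achievable by a set of this size
    lowest = (n_set + 1) / 2
    highest = (2 * n_total - n_set + 1) / 2
    score = (mean_rank - lowest) / (highest - lowest)
    # Optionally centre the normalised score around zero
    return score - 0.5 if centre else score


if __name__ == "__main__":
    # Made-up expression values for a single sample, for illustration only
    sample = {"CDH1": 2.0, "EPCAM": 1.2, "VIM": 9.5, "ZEB1": 7.1, "GAPDH": 8.0}
    print(rank_score(sample, ["VIM", "ZEB1"]))
```

With centring enabled, a score close to 0.5 indicates that the gene set sits at the top of the sample's expression ranking, and a score close to -0.5 indicates that it sits at the bottom.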