Directory Structure (RNA)
sprokopec edited this page May 15, 2024 · 2 revisions
PROJECT
├── STAR
│   ├── star_bam_config.yaml
│   ├── date_PROJECTNAME_rnaseqc_output.tsv
│   ├── logs
│   ├── RNASeQC
│   ├── SMP-001
│   │   ├── SMP-001-T_sorted_markdup.bam
│   │   └── SMP-001-T
│   │       └── Aligned.toTranscriptome.out.bam, Aligned.sortedByCoord.out.bam, Chimeric.out.junction
│   └── SMP-002
│       ├── SMP-002-T1
│       └── SMP-002-T2
├── STAR-Fusion
│   ├── date_PROJECTNAME_star-fusion_for_cbioportal.tsv
│   ├── date_PROJECTNAME_star-fusion_output.tsv
│   ├── logs
│   ├── SMP-001
│   │   └── SMP-001-T
│   │       └── star-fusion.fusion_predictions.abridged.tsv
│   └── SMP-002
│       ├── SMP-002-T1
│       └── SMP-002-T2
├── GATK
│   ├── gatk_bam_config.yaml
│   ├── logs
│   ├── SMP-001
│   │   └── SMP-001-T_realigned_recalibrated.bam
│   └── SMP-002
│       ├── SMP-002-T1_realigned_recalibrated.bam
│       └── SMP-002-T2_realigned_recalibrated.bam
└── logs
    └── run_RNA_pipeline_timestamp
        ├── pughlab_rna_pipeline__run_star
        ├── pughlab_rna_pipeline__run_star_fusion
        ├── pughlab_rna_pipeline__run_rsem
        └── pughlab_rna_pipeline__run_gatk
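
Downstream scripts often need to locate the per-sample BAMs in this layout. The following is a minimal sketch, not part of the pipeline, that assumes the GATK directory follows the pattern shown above (`GATK/<patient>/<sample>_realigned_recalibrated.bam`). The function name `find_recalibrated_bams` is hypothetical.

```python
from pathlib import Path


def find_recalibrated_bams(project_dir):
    """Map sample ID -> BAM path.

    Assumes the layout GATK/<patient>/<sample>_realigned_recalibrated.bam,
    as drawn in the tree above; this helper is illustrative only.
    """
    suffix = "_realigned_recalibrated.bam"
    bams = {}
    for bam in sorted(Path(project_dir).glob(f"GATK/*/*{suffix}")):
        # Strip the fixed suffix to recover the sample ID (e.g. SMP-002-T1).
        sample_id = bam.name[: -len(suffix)]
        bams[sample_id] = bam
    return bams
```

Calling `find_recalibrated_bams("PROJECT")` would return one entry per sample; directories without a matching BAM, such as `GATK/logs`, are simply skipped.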
- star.pl
  - runs collect_rnaseqc_output.R to collect RNA-SeQC metrics from all processed samples
  - output includes:
    - DATE_projectname_rnaseqc_output.tsv (QC metrics)
    - DATE_projectname_rnaseqc_Pearson_correlations.tsv (sample-sample correlations)
- rsem.pl
  - runs collect_rsem_output.R to collect expression data from all processed samples
  - output includes gene/isoform x sample matrices:
    - DATE_projectname_gene_expression_TPM.tsv
    - DATE_projectname_mRNA_expression_TPM_for_cbioportal.tsv (RNA expression values in the format required by cBioPortal. NOT CN/ploidy adjusted!)
    - DATE_projectname_mRNA_TPM_zscores_for_cbioportal.tsv (RNA expression z-scores in the format required by cBioPortal. NOT CN/ploidy adjusted!)
    - DATE_projectname_rsem_expression_results.RData
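
A gene x sample matrix such as the TPM file can be loaded with the standard library alone. This is a hedged sketch: it assumes a tab-delimited file with a header row of sample IDs and the gene identifier in the first column, which this page does not confirm. The function name `read_matrix` is hypothetical.

```python
import csv


def read_matrix(path):
    """Read a tab-delimited feature x sample matrix into {feature: {sample: value}}.

    Assumptions (not confirmed by the pipeline docs): the first row holds
    sample IDs and the first column holds the feature identifier.
    Non-numeric cells (e.g. "NA") are stored as None.
    """
    def to_number(text):
        try:
            return float(text)
        except ValueError:
            return None

    matrix = {}
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        samples = next(reader)[1:]  # skip the feature-ID column header
        for row in reader:
            if not row:
                continue
            matrix[row[0]] = {s: to_number(v) for s, v in zip(samples, row[1:])}
    return matrix
```

If the real files carry extra annotation columns before the sample columns, the slice offsets would need adjusting.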
- star_fusion.pl
  - runs collect_star-fusion_output.R to collect fusions from all processed samples
  - output includes:
    - DATE_projectname_star-fusion_output_long.tsv (concatenated output)
    - DATE_projectname_star-fusion_output_wide.tsv (fusion x sample matrix)
    - DATE_projectname_star-fusion_for_cbioportal.tsv (SVs in the format required by cBioPortal)
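
The relationship between the long (concatenated) and wide (fusion x sample) outputs can be illustrated with a small pivot. This sketch is an assumption-laden illustration, not the logic of collect_star-fusion_output.R: the column names `Sample` and `FusionName` are hypothetical, and the cells here are simple presence/absence flags, whereas the real wide file may hold other values.

```python
def long_to_wide(records, sample_key="Sample", fusion_key="FusionName"):
    """Pivot long-format fusion calls into a fusion x sample 0/1 matrix.

    `records` is a list of dicts, one per fusion call. The key names are
    hypothetical placeholders, not the pipeline's actual column names.
    """
    samples = sorted({r[sample_key] for r in records})
    fusions = sorted({r[fusion_key] for r in records})
    called = {(r[fusion_key], r[sample_key]) for r in records}
    # One row per fusion, one column per sample; 1 if the fusion was called.
    return {f: {s: int((f, s) in called) for s in samples} for f in fusions}
```

The same pivot applies to the long and wide outputs listed for the other fusion callers on this page.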
- arriba.pl
  - runs collect_arriba_output.R to collect fusions from all processed samples
  - output includes:
    - DATE_projectname_arriba_output_long.tsv (concatenated output)
    - DATE_projectname_arriba_output_wide.tsv (fusion x sample matrix)
    - DATE_projectname_arriba_for_cbioportal.tsv (SVs in the format required by cBioPortal)
    - DATE_projectname_arriba_viral_counts.tsv (species x sample matrix)
- fusioncatcher.pl
  - runs collect_fusioncatcher_output.R to collect fusions from all processed samples
  - output includes:
    - DATE_projectname_fusioncatcher_output_long.tsv (concatenated output)
    - DATE_projectname_fusioncatcher_output_wide.tsv (fusion x sample matrix)
    - DATE_projectname_fusioncatcher_for_cbioportal.tsv (SVs in the format required by cBioPortal)
    - DATE_projectname_fusioncatcher_viral_counts.tsv (species x sample matrix)
- haplotype_caller.pl
  - runs collect_snv_output.R to collect high-confidence SNV calls from all processed samples
  - output includes:
    - DATE_projectname_variant_by_patient.tsv (SNV [chr/pos/ref/alt/gene] x sample matrix)
    - DATE_projectname_gene_by_patient.tsv (gene x sample matrix)
    - DATE_projectname_mutations_for_cbioportal.tsv (SNV and INDEL calls in the format required by cBioPortal)
- pughlab_pipeline_auto_report.pl
  - final Report.pdf
  - plots directory containing:
    - QC summary plots and concerns for manual review (qc_concerns.tex)
    - expression landscape plots; SNV summary plots; viral summary plots; fusion summary
    - detailed methods (methods.tex)