Output specification

All output filenames keep the prefix of the corresponding input filename. For example, if you start from REP1.fastq.gz and REP2.fastq.gz, the alignment logs for the two replicates are named REP1.flagstat.qc and REP2.flagstat.qc, respectively.

The final HTML report (qc.html) and QC JSON (qc.json) files do not have any prefix.
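The naming rule above can be sketched in a few lines of Python. This is only an illustration: the helper name `output_name` and the list of FASTQ extensions it strips are assumptions, not the pipeline's actual code.

```python
from pathlib import PurePath

# Hypothetical list of input extensions to strip; the pipeline's real rule may differ.
FASTQ_EXTENSIONS = (".fastq.gz", ".fq.gz", ".fastq", ".fq")


def output_name(input_fastq: str, suffix: str) -> str:
    """Keep the prefix of an input FASTQ filename and append an output suffix."""
    name = PurePath(input_fastq).name  # drop any leading directories
    for ext in FASTQ_EXTENSIONS:
        if name.endswith(ext):
            name = name[: -len(ext)]
            break
    return name + suffix


for fastq in ("REP1.fastq.gz", "REP2.fastq.gz"):
    print(fastq, "->", output_name(fastq, ".flagstat.qc"))
```

For the two inputs in the example, this yields REP1.flagstat.qc and REP2.flagstat.qc.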

  1. DNAnexus: Outputs are stored in the specified output directory without any subdirectories.

  2. Cromwell: Cromwell stores the outputs of each task under cromwell-executions/[WORKFLOW_ID]/call-[TASK_NAME]/shard-[IDX]. For all tasks except the two peak-calling tasks idr (irreproducible discovery rate) and overlap (naive overlapping peaks), [IDX] is the zero-based index of a replicate. For idr and overlap, [IDX] is the zero-based index into the list of all possible pairs of replicates. For example, with 3 replicates the possible pairs are [(rep1,rep2), (rep1,rep3), (rep2,rep3)], so call-idr/shard-2 is the output directory for the pair of replicates 2 and 3.
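To see which replicate pair a given idr or overlap shard belongs to, the pair list can be enumerated as below. This is a minimal sketch that assumes the pipeline orders pairs exactly as in the example above, which is also the order produced by `itertools.combinations`; the function name `replicate_pairs` is hypothetical.

```python
from itertools import combinations


def replicate_pairs(num_replicates: int) -> list[tuple[int, int]]:
    """List all replicate pairs (1-based replicate numbers) in shard order."""
    return list(combinations(range(1, num_replicates + 1), 2))


# The shard index is the zero-based position of the pair in the list.
for idx, (a, b) in enumerate(replicate_pairs(3)):
    print(f"call-idr/shard-{idx}: rep{a} vs rep{b}")
```

With 3 replicates, shard-0 is (rep1, rep2), shard-1 is (rep1, rep3), and shard-2 is (rep2, rep3).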

Duplicate output files can appear in the execution/ and execution/glob-*/ directories. A file in the latter (execution/glob-*/) is a symbolic link to the actual output file in the former. On a Google Cloud Storage bucket (gs://) there is no execution/ directory, and the files in glob-*/ are the actual outputs.
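When collecting outputs from a local Cromwell run, the symbolic links can be folded into the files they point to so that each output is counted once. The following is a rough sketch for a local filesystem only (it does not apply to gs:// buckets); the function name `unique_outputs` is hypothetical.

```python
from pathlib import Path


def unique_outputs(root: str, pattern: str) -> list[Path]:
    """Find files matching pattern under root, counting symlinked copies once."""
    resolved = set()
    for path in Path(root).rglob(pattern):
        if path.is_file():
            # resolve() follows symbolic links, so a link in glob-*/ and its
            # target in execution/ map to the same resolved path.
            resolved.add(path.resolve())
    return sorted(resolved)
```

For example, `unique_outputs("cromwell-executions", "*.flagstat.qc")` would return each flagstat log once, even if it appears in both execution/ and execution/glob-*/.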

| task | filename | description |
| --- | --- | --- |
| trim_adapter | `*.trim.fastq.gz` | Adapter-trimmed FASTQ |
| trim_adapter | `merge_fastqs_R?_*.fastq.gz` | Merged and adapter-trimmed FASTQ |
| bowtie2 | `*.bam` | Raw BAM |
| bowtie2 | `*.bai` | BAI for raw BAM |
| bowtie2 | `*.align.log` | Bowtie2 log for mapping |
| bowtie2 | `*.flagstat.qc` | Samtools flagstat log for raw BAM |
| filter | `*.nodup.bam` | Filtered/deduped BAM |
| filter | `*.nodup.flagstat.qc` | Samtools flagstat log for filtered/deduped BAM |
| filter | `*.dup.qc` | Picard/sambamba markdup log |
| filter | `*.pbc.qc` | PBC QC log |
| bam2ta | `*.tagAlign.gz` | TAG-ALIGN generated from filtered BAM |
| bam2ta | `*.N.tagAlign.gz` | Subsampled (N reads) TAG-ALIGN generated from filtered BAM |
| bam2ta | `*.tn5.tagAlign.gz` | TN5-shifted TAG-ALIGN |
| spr | `*.pr1.tagAlign.gz` | 1st pseudo-replicated TAG-ALIGN |
| spr | `*.pr2.tagAlign.gz` | 2nd pseudo-replicated TAG-ALIGN |
| pool_ta | `*.tagAlign.gz` | Pooled TAG-ALIGN from all replicates |
| xcor | `*.cc.plot.pdf` | Cross-correlation plot PDF |
| xcor | `*.cc.plot.png` | Cross-correlation plot PNG |
| xcor | `*.cc.qc` | Cross-correlation analysis score log |
| xcor | `*.cc.fraglen.txt` | Estimated fragment length |
| macs2 | `*.narrowPeak.gz` | NARROWPEAK |
| macs2 | `*.bfilt.narrowPeak.gz` | Blacklist-filtered NARROWPEAK |
| macs2 | `*` | Blacklist-filtered NARROWPEAK in BigBed format |
| macs2 | `*.frip.qc` | Fraction of reads (TAG-ALIGN) in peaks (NARROWPEAK) |
| macs2_signal_track | `*.pval.signal.bigwig` | p-value signal BIGWIG |
| macs2_signal_track | `*.fc.signal.bigwig` | Fold-enrichment signal BIGWIG |
| count_signal_track | `*.positive.bigwig` | Count signal (+ strand) BIGWIG |
| count_signal_track | `*.negative.bigwig` | Count signal (- strand) BIGWIG |
| idr | `*.*Peak.gz` | IDR NARROWPEAK |
| idr | `*.bfilt.*Peak.gz` | Blacklist-filtered IDR NARROWPEAK |
| idr | `*.bfilt.*` | Blacklist-filtered IDR NARROWPEAK in BigBed format |
| idr | `*.txt.png` | IDR plot PNG |
| idr | `*.txt.gz` | Unthresholded IDR output |
| idr | `*.log` | IDR STDOUT log |
| idr | `*.frip.qc` | Fraction of reads (TAG-ALIGN) in peaks (IDR NARROWPEAK) |
| overlap | `*.*Peak.gz` | Overlapping NARROWPEAK |
| overlap | `*.bfilt.*Peak.gz` | Blacklist-filtered overlapping NARROWPEAK |
| overlap | `*.bfilt.*` | Blacklist-filtered overlapping NARROWPEAK in BigBed format |
| overlap | `*.frip.qc` | Fraction of reads (TAG-ALIGN) in peaks (overlapping NARROWPEAK) |
| reproducibility | `*.reproducibility.qc` | Reproducibility QC log |
| reproducibility | `optimal_peak.gz` | Optimal final peak file |
| reproducibility | | Optimal final peak file in BigBed format |
| reproducibility | `conservative_peak.gz` | Conservative final peak file |
| reproducibility | | Conservative final peak file in BigBed format |
| qc_report | `qc.html` | Final HTML QC report |
| qc_report | `qc.json` | Final QC JSON |