Modify so it only takes one reads file and at most one mapped reads file
Richard C. Burhans committed Dec 4, 2024
1 parent fe00444 commit 7acbb67
Showing 4 changed files with 32 additions and 47 deletions.
70 changes: 32 additions & 38 deletions tools/halfdeep/halfdeep.xml
@@ -9,88 +9,82 @@
## Set up the directory structure expected by bam_depth.sh and halfdeep.sh
## See: https://github.com/makovalab-psu/HalfDeep?tab=readme-ov-file#expected-directory-layout
##
#import re
mkdir -p reads halfdeep/ref/mapped_reads &&
##
## reference
##
ln -s '$ref' 'ref.$ref.ext' &&
touch ref.idx &&
#if not $mapped_reads
minimap2 -x map-pb -d ref.idx 'ref.$ref.ext' &&
#else
touch ref.idx &&
#end if
##
## reads
##
#set $reads_dir = "reads"
#set $mapped_reads_dir = "halfdeep/ref/mapped_reads"
mkdir -p '$reads_dir' '$mapped_reads_dir' &&
#for $read in $reads
#set $read_base = re.sub('[^\w\-\s]', '_', str($read.element_identifier))
ln -s '$read' '$reads_dir/${read_base}.$read.ext' &&
echo '$reads_dir/${read_base}.$read.ext' >> input.fofn &&
##
## mapped reads
##
#for $mapped_read in $mapped_reads
ln -s '$mapped_read' "$mapped_reads_dir/${read_base}.bam" &&
ln -s "${read_base}.bam" "$mapped_reads_dir/${read_base}.sort.bam" &&
ln -s '$mapped_read.metadata.bam_index' "$mapped_reads_dir/${read_base}.sort.bam.bai" &&
#end for
#end for
#import re
#set $reads_base = re.sub('[^\w\-\s]', '_', str($reads.element_identifier))
ln -s '$reads' 'reads/${reads_base}.$reads.ext' &&
echo 'reads/${reads_base}.$reads.ext' >> input.fofn &&
##
## mapped reads
##
#if $mapped_reads
ln -s '$mapped_reads' 'halfdeep/ref/mapped_reads/${reads_base}.bam' &&
ln -s '${reads_base}.bam' 'halfdeep/ref/mapped_reads/${reads_base}.sort.bam' &&
ln -s '$mapped_reads.metadata.bam_index' 'halfdeep/ref/mapped_reads/${reads_base}.sort.bam.bai' &&
#end if
##
## run bam_depth.sh
##
#for $line_number in range(1, len($reads) + 1)
bam_depth.sh 'ref.$ref.ext' $line_number &&
#end for
bam_depth.sh 'ref.$ref.ext' 1 &&
##
## run halfdeep.sh
##
halfdeep.sh 'ref.$ref.ext'
]]></command>
<inputs>
<param name="ref" type="data" format="fasta,fasta.gz" label="Genome Assembly" help="A Genome Assembly in FASTA format."/>
<param name="reads" type="data" format="fastqsanger,fastqsanger.bz2,fastqsanger.gz" multiple="true" label="Sequencing Reads" help="Sequencing Reads for the Genome Assembly in FASTQ format."/>
<param name="mapped_reads" type="data" format="bam" multiple="true" label="Aligned Reads" help="Alignments of the Sequencing Reads to the Genome Assembly in BAM format."/>
<param name="reads" type="data" format="fastqsanger,fastqsanger.gz" label="Sequencing Reads" help="Sequencing Reads for the Genome Assembly in FASTQ format."/>
<param name="mapped_reads" type="data" format="bam" value="" optional="true" label="Aligned Reads" help="Alignments of the Sequencing Reads to the Genome Assembly in BAM format."/>
</inputs>
<outputs>
<data name="scaffold_len" format="tabular" from_work_dir="halfdeep/ref/scaffold_lengths.dat" label="Scaffold lengths for ${on_string}"/>
<data name="depth_dat" format="tabular.gz" from_work_dir="halfdeep/ref/depth.dat.gz" label="Depth for ${on_string}"/>
<data name="pct_cmds" format="text" from_work_dir="halfdeep/ref/percentile_commands.sh" label="Percentile to value for ${on_string}"/>
<data name="halfdeep_dat" format="bed" from_work_dir="halfdeep/ref/halfdeep.dat" label="HalfDeep on ${on_string}"/>
</outputs>
<tests>
<test expect_num_outputs="4">
<test expect_num_outputs="1">
<param name="ref" value="ref.fasta.gz" ftype="fasta.gz"/>
<param name="reads" value="reads.fasta.gz" ftype="fasta.gz"/>
<param name="mapped_reads" value="mapped_reads.bam" ftype="bam"/>
<output name="scaffold_len" file="scaffold_lengths.tabular" ftype="tabular"/>
<output name="depth_dat" file="depth.tabular.gz" ftype="tabular.gz"/>
<output name="pct_cmds" file="percentile.txt" ftype="text"/>
<output name="halfdeep_dat" file="halfdeep.bed" ftype="bed"/>
</test>
<test expect_num_outputs="1">
<param name="ref" value="ref.fasta.gz" ftype="fasta.gz"/>
<param name="reads" value="reads.fasta.gz" ftype="fasta.gz"/>
<output name="halfdeep_dat" file="halfdeep.bed" ftype="bed"/>
</test>
</tests>
<help><![CDATA[
HalfDeep identifies genomic regions with half-depth coverage based on sequencing read mappings. These regions may reveal insights into heterogametic sex chromosomes, haplotype-specific variation, or potential assembly errors such as heterotypic duplications.
Given the following three inputs:
Given the following inputs:
1. A genome assembly in FASTA format.
2. Reads in FASTQ format.
3. Mapped reads in BAM format
3. Mapped reads in BAM format (optional)
HalfDeep automates the following tasks:
1. Mapping reads and merging individual mapping files.
2. Calculating per-base read depth.
3. Smoothing read coverage using a defined window with genodsp.
4. Determining the percentile of read coverage.
5. Identifying genomic regions with half-depth coverage based on a specified percentile threshold (e.g., 40–60%) and exporting them in BED file forma
5. Identifying genomic regions with half-depth coverage based on a specified percentile threshold (e.g., 40–60%) and exporting them in BED file format
HalfDeep produces the following outputs:
HalfDeep produces the following output:
1. Scaffold lengths: A tabular file containing the name and legth of each sequence in the genome assembly.
2. Depths: A tabular file containing the read depts.
3. A tabular file containing the name and legth of each sequence in the genome assembly: stuff
4. HalfDeep: BED file containina regions of the genome assembly that are "covered at half depth"
1. HalfDeep: BED file containing regions of the genome assembly that are "covered at half depth"
]]></help>
<expand macro="citations"/>
</tool>
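
Note on the file-naming logic in the command block above: the Cheetah template derives a filesystem-safe base name from the dataset's element identifier with `re.sub('[^\w\-\s]', '_', ...)`, then reuses that base name for the reads symlink and for the optional BAM symlinks. The sketch below reproduces that logic in plain Python so it can be checked outside Galaxy. The function names `sanitize_identifier` and `planned_links` are hypothetical helpers introduced here for illustration and are not part of the tool; the sketch only computes the paths and does not create any links.

```python
import re


def sanitize_identifier(identifier: str) -> str:
    """Replace every character that is not a word character, hyphen, or whitespace."""
    # Same pattern as the Cheetah template; whitespace is kept, so the
    # resulting paths still need to be quoted in the shell command.
    return re.sub(r"[^\w\-\s]", "_", str(identifier))


def planned_links(identifier: str, ext: str, has_bam: bool) -> list[str]:
    """Hypothetical helper: list the symlink paths the command block would create."""
    base = sanitize_identifier(identifier)
    links = [f"reads/{base}.{ext}"]
    if has_bam:
        # Assumption: mirrors the three BAM-related `ln -s` targets in the template.
        mapped = "halfdeep/ref/mapped_reads"
        links += [
            f"{mapped}/{base}.bam",
            f"{mapped}/{base}.sort.bam",
            f"{mapped}/{base}.sort.bam.bai",
        ]
    return links


if __name__ == "__main__":
    print(sanitize_identifier("sample 1: run/2.fastq"))
    for path in planned_links("hifi.reads", "fastqsanger.gz", has_bam=True):
        print(path)
```

Because dots are replaced as well, an identifier such as `hifi.reads` becomes `hifi_reads`, and the only dots left in the final path come from the appended extension. In Python 3, `\w` also matches non-ASCII letters, so such characters pass through unchanged.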
Binary file removed tools/halfdeep/test-data/depth.tabular.gz
Binary file not shown.
6 changes: 0 additions & 6 deletions tools/halfdeep/test-data/percentile.txt

This file was deleted.

3 changes: 0 additions & 3 deletions tools/halfdeep/test-data/scaffold_lengths.tabular

This file was deleted.
