FEAT + BUG: cutadapt; allowing disabling demultiplexing and fix par_quality_cutoff_r2 (#69)

* FEAT: Disable cutadapt demultiplexing by default

* Cutadapt: fix --par_quality_cutoff_r2
DriesSchaumont authored Jul 1, 2024
1 parent 3e08b59 commit 1f076bd
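
The `--par_quality_cutoff_r2` fix relies on Bash's `${var:+word}` expansion, which emits the option only when the parameter is set and non-empty, combined with cutadapt's `-Q` option for the R2 quality cutoff. A minimal sketch of that pattern follows; `build_q_args` is a hypothetical helper used only for illustration and is not part of the component.

```shell
#!/bin/bash

# Hypothetical helper (not in the repository): builds the optional R2
# quality-cutoff arguments using the same expansion pattern as script.sh.
build_q_args() {
  local cutoff="$1"
  # ${cutoff:+...} expands to nothing when cutoff is unset or empty,
  # and to the two words `-Q <cutoff>` otherwise.
  local args=(${cutoff:+-Q "$cutoff"})
  printf '%s\n' "${args[*]}"
}

build_q_args 20      # prints: -Q 20
build_q_args 15,20   # prints: -Q 15,20
build_q_args ""      # prints an empty line: no option is passed
```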
Showing 4 changed files with 54 additions and 5 deletions.
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,10 @@

* `pear`: fix component not exiting with the correct exitcode when PEAR fails.

* `cutadapt`: fix `--par_quality_cutoff_r2` argument.

* `cutadapt`: demultiplexing is now disabled by default. It can be re-enabled by using `demultiplex_mode`.

# biobox 0.1.0

## BREAKING CHANGES
@@ -12,6 +16,7 @@
Viash 0.9.0 in order to avoid issues with the current default separator `:` unintentionally
splitting up certain file paths.


## NEW FEATURES

* `arriba`: Detect gene fusions from RNA-seq data (PR #1).
18 changes: 18 additions & 0 deletions src/cutadapt/config.vsh.yaml
@@ -240,6 +240,24 @@ argument_groups:
Check both the read and its reverse complement for adapter
matches. If match is on reverse-complemented version,
output that one.
####################################################################
- name: "Demultiplexing options"
arguments:
- name: "--demultiplex_mode"
type: string
choices: ["single", "unique_dual", "combinatorial_dual"]
required: false
description: |
Enable demultiplexing and set the mode for it.
With mode 'unique_dual', adapters from the first and second read are used,
and the indexes from the reads are only used in pairs. This implies
--pair_adapters.
Enabling mode 'combinatorial_dual' allows all combinations of the sets of indexes
on R1 and R2. It is necessary to write each read pair to an output
file depending on the adapters found on both R1 and R2.
Mode 'single', uses indexes or barcodes located at the 5'
end of the R1 read (single).
####################################################################
- name: Read modifications
31 changes: 26 additions & 5 deletions src/cutadapt/script.sh
@@ -127,7 +127,7 @@ mod_args=$(echo \
 ${par_cut_r2:+--cut_r2 "${par_cut_r2}"} \
 ${par_nextseq_trim:+--nextseq-trim "${par_nextseq_trim}"} \
 ${par_quality_cutoff:+--quality-cutoff "${par_quality_cutoff}"} \
-${par_quality_cutoff_r2:+--quality-cutoff_r2 "${par_quality_cutoff_r2}"} \
+${par_quality_cutoff_r2:+-Q "${par_quality_cutoff_r2}"} \
 ${par_quality_base:+--quality-base "${par_quality_base}"} \
 ${par_poly_a:+--poly-a} \
 ${par_length:+--length "${par_length}"} \
@@ -196,14 +196,35 @@ else
 ext="fasta"
 fi
 
-if [ $mode = "se" ]; then
+demultiplex_mode="$par_demultiplex_mode"
+if [[ $mode == "se" ]]; then
+if [[ "$demultiplex_mode" == "unique_dual" ]] || [[ "$demultiplex_mode" == "combinatorial_dual" ]]; then
+echo "Demultiplexing dual indexes is not possible with single-end data."
+exit 1
+fi
+prefix="trimmed_"
+if [[ ! -z "$demultiplex_mode" ]]; then
+prefix="{name}_"
+fi
 output_args=$(echo \
---output "$output_dir/{name}_001.$ext" \
+--output "$output_dir/${prefix}001.$ext" \
 )
 else
+demultiplex_indicator_r1='{name}_'
+demultiplex_indicator_r2=$demultiplex_indicator_r1
+if [[ "$demultiplex_mode" == "combinatorial_dual" ]]; then
+demultiplex_indicator_r1='{name1}_{name2}_'
+demultiplex_indicator_r2='{name1}_{name2}_'
+fi
+prefix_r1="trimmed_"
+prefix_r2="trimmed_"
+if [[ ! -z "$demultiplex_mode" ]]; then
+prefix_r1=$demultiplex_indicator_r1
+prefix_r2=$demultiplex_indicator_r2
+fi
 output_args=$(echo \
---output "$output_dir/{name}_R1_001.$ext" \
---paired-output "$output_dir/{name}_R2_001.$ext" \
+--output "$output_dir/${prefix_r1}R1_001.$ext" \
+--paired-output "$output_dir/${prefix_r2}R2_001.$ext" \
 )
 fi
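
The naming rules introduced in this hunk can be summarised as a small standalone function. This is a sketch under stated assumptions, not code from the repository: `output_templates` and its positional arguments are hypothetical, and it prints the file name templates instead of building cutadapt arguments.

```shell
#!/bin/bash

# Hypothetical helper (not in script.sh): prints the output file name
# template(s) for a read mode ("se" or "pe") and an optional demultiplex mode.
output_templates() {
  local mode="$1" demultiplex_mode="$2" ext="${3:-fastq}"
  local prefix="trimmed_"   # default when demultiplexing is disabled

  if [[ "$mode" == "se" ]]; then
    if [[ "$demultiplex_mode" == "unique_dual" || "$demultiplex_mode" == "combinatorial_dual" ]]; then
      echo "Demultiplexing dual indexes is not possible with single-end data." >&2
      return 1
    fi
    if [[ -n "$demultiplex_mode" ]]; then
      prefix="{name}_"
    fi
    echo "${prefix}001.$ext"
  else
    if [[ "$demultiplex_mode" == "combinatorial_dual" ]]; then
      prefix="{name1}_{name2}_"   # one file per R1/R2 adapter combination
    elif [[ -n "$demultiplex_mode" ]]; then
      prefix="{name}_"
    fi
    echo "${prefix}R1_001.$ext"
    echo "${prefix}R2_001.$ext"
  fi
}

output_templates se ""            # prints: trimmed_001.fastq
output_templates se single fasta  # prints: {name}_001.fasta
```

The `{name}`, `{name1}` and `{name2}` placeholders are left literal here; cutadapt substitutes adapter names for them when it writes demultiplexed output.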

5 changes: 5 additions & 0 deletions src/cutadapt/test.sh
@@ -1,6 +1,7 @@
#!/bin/bash

set -e
set -eo pipefail

#############################################
# helper functions
@@ -57,6 +58,7 @@ EOF
--adapter ADAPTER \
--input example.fa \
--fasta \
--demultiplex_mode single \
--no_match_adapter_wildcards \
--json

@@ -101,6 +103,7 @@ EOF
--output "out_test1/*.fasta" \
--adapter ADAPTER \
--input example.fa \
--demultiplex_mode single \
--fasta \
--no_match_adapter_wildcards \
--json
@@ -160,6 +163,7 @@ EOF
--adapter AAAAA \
--adapter_fasta adapters1.fasta \
--adapter_fasta adapters2.fasta \
--demultiplex_mode single \
--input example.fa \
--fasta \
--json
@@ -224,6 +228,7 @@ EOF
--input example_R1.fastq \
--input_r2 example_R2.fastq \
--quality_cutoff 20 \
--demultiplex_mode unique_dual \
--json \
---cpus 1
