Rsem-calculate-expression #93

Merged
merged 25 commits into from
Sep 18, 2024
Changes from 19 commits
25 commits
38f586b
initial commit dedup
emmarousseau Apr 11, 2024
271108c
Merge branch 'viash-hub:main' into main
emmarousseau Apr 11, 2024
2c26968
Revert "initial commit dedup"
emmarousseau Apr 11, 2024
5ea8c78
Merge branch 'viash-hub:main' into main
emmarousseau Apr 13, 2024
897cd89
Merge branch 'viash-hub:main' into main
emmarousseau May 5, 2024
ea0383c
Merge branch 'viash-hub:main' into main
emmarousseau May 23, 2024
44b3fcc
Merge branch 'viash-hub:main' into main
emmarousseau Jul 1, 2024
6cc4f94
Merge branch 'viash-hub:main' into main
emmarousseau Jul 2, 2024
c9613d1
Merge branch 'viash-hub:main' into main
emmarousseau Jul 8, 2024
1679c59
Merge branch 'viash-hub:main' into main
emmarousseau Jul 9, 2024
dc275da
three rsem components initial commit
emmarousseau Jul 18, 2024
9745e4a
update container setup
emmarousseau Jul 23, 2024
9f9602a
Simplified container configuration
emmarousseau Jul 24, 2024
84fe94d
temporarily remove version recording from config
emmarousseau Jul 24, 2024
7a8772f
Complete config file
emmarousseau Aug 11, 2024
aef65a1
add tests and complete config file
emmarousseau Aug 12, 2024
2918fb4
change test dataset
emmarousseau Aug 20, 2024
c584b93
functional test, adjustements to scripts
emmarousseau Aug 21, 2024
e055c97
Update changelog
emmarousseau Aug 21, 2024
d23ee7b
Simplified test data and help.txt contents
emmarousseau Aug 25, 2024
bd1075e
suggested changes, typos
emmarousseau Sep 5, 2024
de23085
simplify, get rid of test_data folder
emmarousseau Sep 18, 2024
601518c
Update CHANGELOG.md
rcannood Sep 18, 2024
cf46beb
Merge branch 'main' into rsem
rcannood Sep 18, 2024
41598f0
Update CHANGELOG.md
rcannood Sep 18, 2024
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -77,6 +77,8 @@
* `bedtools`:
- `bedtools_getfasta`: extract sequences from a FASTA file for each of the
intervals defined in a BED/GFF/VCF file (PR #59).
* `rsem`:
- `rsem_calculate_expression`: Calculate expression levels (PR #93).

## MINOR CHANGES

481 changes: 481 additions & 0 deletions src/rsem/rsem_calculate_expression/config.vsh.yaml

Large diffs are not rendered by default.


This seems to be empty

Empty file.
103 changes: 103 additions & 0 deletions src/rsem/rsem_calculate_expression/script.sh
@@ -0,0 +1,103 @@
#!/bin/bash

## VIASH START
## VIASH END

set -eo pipefail

function clean_up {
  rm -rf "$tmpdir"
}
trap clean_up EXIT

tmpdir=$(mktemp -d "$meta_temp_dir/$meta_functionality_name-XXXXXXXX")

if [ "$par_strandedness" == 'forward' ]; then
  strandedness='--strandedness forward'
elif [ "$par_strandedness" == 'reverse' ]; then
  strandedness='--strandedness reverse'
else
  strandedness=''
fi

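# Split the semicolon-separated input list into an array.
IFS=";" read -ra input <<< "$par_input"

# RSEM expects the reference prefix, i.e. the path of the .grp file without its extension.
INDEX=$(find -L "$meta_resources_dir/$par_index" -name "*.grp" | sed 's/\.grp$//')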

unset_if_false=( par_paired par_quiet par_no_bam_output par_sampling_for_bam par_no_qualities
par_alignments par_bowtie2 par_star par_hisat2_hca par_append_names
par_single_cell_prior par_calc_pme par_calc_ci par_phred64_quals
par_solexa_quals par_star_gzipped_read_file par_star_bzipped_read_file
par_star_output_genome_bam par_estimate_rspd par_keep_intermediate_files
par_time par_run_pRSEM par_cap_stacked_chipseq_reads par_sort_bam_by_read_name )

# Unset boolean parameters that are "false" so that ${par:+flag} expands to nothing.
for par in "${unset_if_false[@]}"; do
  test_val="${!par}"
  [[ "$test_val" == "false" ]] && unset "$par"
done

rsem-calculate-expression \
${par_quiet:+-q} \
${par_no_bam_output:+--no-bam-output} \
${par_sampling_for_bam:+--sampling-for-bam} \
${par_no_qualities:+--no-qualities} \
${par_alignments:+--alignments} \
${par_bowtie2:+--bowtie2} \
${par_star:+--star} \
${par_hisat2_hca:+--hisat2-hca} \
${par_append_names:+--append-names} \
${par_single_cell_prior:+--single-cell-prior} \
${par_calc_pme:+--calc-pme} \
${par_calc_ci:+--calc-ci} \
${par_phred64_quals:+--phred64-quals} \
${par_solexa_quals:+--solexa-quals} \
${par_star_gzipped_read_file:+--star-gzipped-read-file} \
${par_star_bzipped_read_file:+--star-bzipped-read-file} \
${par_star_output_genome_bam:+--star-output-genome-bam} \
${par_estimate_rspd:+--estimate-rspd} \
${par_keep_intermediate_files:+--keep-intermediate-files} \
${par_time:+--time} \
${par_run_pRSEM:+--run-pRSEM} \
${par_cap_stacked_chipseq_reads:+--cap-stacked-chipseq-reads} \
${par_sort_bam_by_read_name:+--sort-bam-by-read-name} \
${par_counts_gene:+--counts-gene "$par_counts_gene"} \
${par_counts_transcripts:+--counts-transcripts "$par_counts_transcripts"} \
${par_stat:+--stat "$par_stat"} \
${par_bam_star:+--bam-star "$par_bam_star"} \
${par_bam_genome:+--bam-genome "$par_bam_genome"} \
${par_bam_transcript:+--bam-transcript "$par_bam_transcript"} \
${par_fai:+--fai "$par_fai"} \
${par_seed:+--seed "$par_seed"} \
${par_seed_length:+--seed-length "$par_seed_length"} \
${par_bowtie_n:+--bowtie-n "$par_bowtie_n"} \
${par_bowtie_e:+--bowtie-e "$par_bowtie_e"} \
${par_bowtie_m:+--bowtie-m "$par_bowtie_m"} \
${par_bowtie_chunkmbs:+--bowtie-chunkmbs "$par_bowtie_chunkmbs"} \
${par_bowtie2_mismatch_rate:+--bowtie2-mismatch-rate "$par_bowtie2_mismatch_rate"} \
${par_bowtie2_k:+--bowtie2-k "$par_bowtie2_k"} \
${par_bowtie2_sensitivity_level:+--bowtie2-sensitivity-level "$par_bowtie2_sensitivity_level"} \
${par_tag:+--tag "$par_tag"} \
${par_fragment_length_min:+--fragment-length-min "$par_fragment_length_min"} \
${par_fragment_length_max:+--fragment-length-max "$par_fragment_length_max"} \
${par_fragment_length_mean:+--fragment-length-mean "$par_fragment_length_mean"} \
${par_fragment_length_sd:+--fragment-length-sd "$par_fragment_length_sd"} \
${par_num_rspd_bins:+--num-rspd-bins "$par_num_rspd_bins"} \
${par_gibbs_burnin:+--gibbs-burnin "$par_gibbs_burnin"} \
${par_gibbs_number_of_samples:+--gibbs-number-of-samples "$par_gibbs_number_of_samples"} \
${par_gibbs_sampling_gap:+--gibbs-sampling-gap "$par_gibbs_sampling_gap"} \
${par_ci_credibility_level:+--ci-credibility-level "$par_ci_credibility_level"} \
${par_ci_number_of_samples_per_count_vector:+--ci-number-of-samples-per-count-vector "$par_ci_number_of_samples_per_count_vector"} \
${par_temporary_folder:+--temporary-folder "$par_temporary_folder"} \
${par_chipseq_peak_file:+--chipseq-peak-file "$par_chipseq_peak_file"} \
${par_chipseq_target_read_files:+--chipseq-target-read-files "$par_chipseq_target_read_files"} \
${par_chipseq_control_read_files:+--chipseq-control-read-files "$par_chipseq_control_read_files"} \
${par_chipseq_read_files_multi_targets:+--chipseq-read-files-multi-targets "$par_chipseq_read_files_multi_targets"} \
${par_chipseq_bed_files_multi_targets:+--chipseq-bed-files-multi-targets "$par_chipseq_bed_files_multi_targets"} \
${par_n_max_stacked_chipseq_reads:+--n-max-stacked-chipseq-reads "$par_n_max_stacked_chipseq_reads"} \
${par_partition_model:+--partition-model "$par_partition_model"} \
$strandedness \
${par_paired:+--paired-end} \
"${input[@]}" \
"$INDEX" \
"$par_id"
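
The script above relies on two Bash idioms: unsetting boolean parameters whose value is the string `false`, and then using `${var:+...}` so that only set, non-empty parameters contribute a flag. The following is a minimal, self-contained sketch of that pattern and not the component's actual code. The `par_*` values are hypothetical stand-ins for what viash would inject, and collecting the arguments in an `args` array is an illustrative alternative to the long backslash-continued command line.

```shell
#!/bin/bash

# Hypothetical stand-ins for the values viash would inject.
par_quiet="true"
par_star="false"
par_seed="1"
par_tag=""
par_input="sample A_1.fastq.gz;sample A_2.fastq.gz"

# Booleans arrive as the strings "true"/"false"; unset the false ones so that
# ${var:+...} expands to nothing for them.
for par in par_quiet par_star; do
  if [[ "${!par}" == "false" ]]; then
    unset "$par"
  fi
done

# Split the semicolon-separated input list into an array.
IFS=";" read -ra input <<< "$par_input"

# Collect the arguments in an array so that values containing spaces stay
# single arguments. Unset or empty parameters contribute nothing.
args=(
  ${par_quiet:+-q}
  ${par_star:+--star}
  ${par_seed:+--seed "$par_seed"}
  ${par_tag:+--tag "$par_tag"}
  "${input[@]}"
)

# Print one argument per line to make the word boundaries visible.
printf '%s\n' "${args[@]}"
```

In a real script the array would be passed on as `rsem-calculate-expression "${args[@]}"`. The array form is a design choice made for this sketch: it keeps file names with spaces intact, which unquoted expansions do not.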

39 changes: 39 additions & 0 deletions src/rsem/rsem_calculate_expression/test.sh
@@ -0,0 +1,39 @@
#!/bin/bash

echo ">>> Testing $meta_executable"

test_dir="${meta_resources_dir}/test_data"

wget https://raw.githubusercontent.com/nf-core/test-datasets/rnaseq3/reference/rsem.tar.gz
gunzip -k rsem.tar.gz
tar -xf rsem.tar

mv "$test_dir/rsem" "$meta_resources_dir"

echo ">>> Test 1: Paired-end reads using STAR to align reads"
"$meta_executable" \
--star \
--star_gzipped_read_file \
--paired \
--input "$test_dir/SRR6357070_1.fastq.gz;$test_dir/SRR6357070_2.fastq.gz" \
--index rsem \
--id WT_REP1 \
--seed 1 \
--quiet

echo ">>> Checking whether output exists"
[ ! -f "WT_REP1.genes.results" ] && echo "Gene level expression counts file does not exist!" && exit 1
[ ! -s "WT_REP1.genes.results" ] && echo "Gene level expression counts file is empty!" && exit 1
[ ! -f "WT_REP1.isoforms.results" ] && echo "Transcript level expression counts file does not exist!" && exit 1
[ ! -s "WT_REP1.isoforms.results" ] && echo "Transcript level expression counts file is empty!" && exit 1
[ ! -d "WT_REP1.stat" ] && echo "Stats directory does not exist!" && exit 1

echo ">>> Check whether output is correct"
diff "$test_dir/ref.genes.results" WT_REP1.genes.results || { echo "Gene level expression counts file is incorrect!"; exit 1; }
diff "$test_dir/ref.isoforms.results" WT_REP1.isoforms.results || { echo "Transcript level expression counts file is incorrect!"; exit 1; }
diff "$test_dir/ref.cnt" WT_REP1.stat/WT_REP1.cnt || { echo "Stats file is incorrect!"; exit 1; }

#####################################################################################################

echo "All tests succeeded!"
exit 0
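
The existence and non-emptiness checks in the test are repeated once per output file. One possible way to factor them out is a small helper function. The sketch below is only an illustration: `assert_file_not_empty` is a hypothetical name that is not part of this PR, and the demonstration runs on throwaway files rather than on real RSEM output.

```shell
#!/bin/bash

# Hypothetical helper (not part of the PR): fail with a message unless the
# given path is a regular, non-empty file.
assert_file_not_empty() {
  local path="$1" label="$2"
  if [ ! -f "$path" ]; then
    echo "$label does not exist!" >&2
    return 1
  fi
  if [ ! -s "$path" ]; then
    echo "$label is empty!" >&2
    return 1
  fi
}

# Self-contained demonstration on throwaway files.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
printf 'gene_id\tTPM\n' > "$demo_dir/filled.results"
: > "$demo_dir/empty.results"

assert_file_not_empty "$demo_dir/filled.results" "Filled file" && echo "filled: ok"
assert_file_not_empty "$demo_dir/empty.results" "Empty file" || echo "empty: rejected"
assert_file_not_empty "$demo_dir/missing.results" "Missing file" || echo "missing: rejected"
```

In the test script, a call such as `assert_file_not_empty "WT_REP1.genes.results" "Gene level expression counts file" || exit 1` would then stand in for each pair of `-f` and `-s` checks.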
emmarousseau marked this conversation as resolved.
Binary file not shown.
Binary file not shown.
7 changes: 7 additions & 0 deletions src/rsem/rsem_calculate_expression/test_data/ref.cnt
@@ -0,0 +1,7 @@
26 54 0 80
49 5 5
59 3
0 26
1 49
2 5
Inf 0
126 changes: 126 additions & 0 deletions src/rsem/rsem_calculate_expression/test_data/ref.genes.results
@@ -0,0 +1,126 @@
gene_id transcript_id(s) length effective_length expected_count TPM FPKM
Gfp_transgene Gfp_transgene 729.00 556.81 0.00 0.00 0.00
HRA1 HRA1 564.00 391.81 0.00 0.00 0.00
YAL001C YAL001C 3483.00 3310.81 0.00 0.00 0.00
YAL002W YAL002W 3825.00 3652.81 0.00 0.00 0.00
YAL003W YAL003W 621.00 448.81 0.00 0.00 0.00
YAL004W YAL004W 648.00 475.81 0.00 0.52 0.27
YAL005C YAL005C 1929.00 1756.81 8.00 162878.02 85918.88
YAL007C YAL007C 648.00 475.81 1.00 75173.23 39654.22
YAL008W YAL008W 597.00 424.81 0.00 0.00 0.00
YAL009W YAL009W 780.00 607.81 0.00 0.00 0.00
YAL010C YAL010C 1482.00 1309.81 0.00 0.00 0.00
YAL011W YAL011W 1878.00 1705.81 0.00 0.00 0.00
YAL012W YAL012W 1185.00 1012.81 0.00 0.00 0.00
YAL013W YAL013W 1218.00 1045.81 0.00 0.00 0.00
YAL014C YAL014C 768.00 595.81 0.00 0.00 0.00
YAL015C YAL015C 1200.00 1027.81 0.00 0.00 0.00
YAL016C-A YAL016C-A 315.00 142.91 0.00 0.00 0.00
YAL016C-B YAL016C-B 186.00 30.47 0.00 0.00 0.00
YAL016W YAL016W 1908.00 1735.81 0.00 0.00 0.00
YAL017W YAL017W 4071.00 3898.81 0.00 0.00 0.00
YAL018C YAL018C 978.00 805.81 0.00 0.00 0.00
YAL019W YAL019W 3396.00 3223.81 2.00 22190.06 11705.35
YAL019W-A YAL019W-A 570.00 397.81 0.00 0.00 0.00
YAL020C YAL020C 1002.00 829.81 0.00 0.00 0.00
YAL021C YAL021C 2514.00 2341.81 0.00 0.00 0.00
YAL022C YAL022C 1554.00 1381.81 1.00 25885.06 13654.49
YAL023C YAL023C 2280.00 2107.81 0.00 0.00 0.00
YAL024C YAL024C 4308.00 4135.81 0.00 0.00 0.00
YAL025C YAL025C 921.00 748.81 0.00 0.00 0.00
YAL026C YAL026C 4068.00 3895.81 1.00 9181.21 4843.13
YAL026C-A YAL026C-A 438.00 265.81 0.00 0.00 0.00
YAL027W YAL027W 786.00 613.81 0.00 0.00 0.00
YAL028W YAL028W 1587.00 1414.81 0.00 0.00 0.00
YAL029C YAL029C 4416.00 4243.81 0.00 0.00 0.00
YAL030W YAL030W 354.00 181.81 0.00 0.00 0.00
YAL031C YAL031C 2283.00 2110.81 0.00 0.00 0.00
YAL031W-A YAL031W-A 309.00 137.04 0.00 0.00 0.00
YAL032C YAL032C 1140.00 967.81 0.00 0.00 0.00
YAL033W YAL033W 522.00 349.81 0.00 0.00 0.00
YAL034C YAL034C 1242.00 1069.81 0.00 0.00 0.00
YAL034C-B YAL034C-B 354.00 181.81 0.00 0.00 0.00
YAL034W-A YAL034W-A 870.00 697.81 0.00 0.00 0.00
YAL035W YAL035W 3009.00 2836.81 1.00 12608.62 6651.10
YAL036C YAL036C 1110.00 937.81 0.00 0.00 0.00
YAL037C-A YAL037C-A 93.00 0.00 0.00 0.00 0.00
YAL037C-B YAL037C-B 975.00 802.81 0.00 0.00 0.00
YAL037W YAL037W 804.00 631.81 0.00 0.00 0.00
YAL038W YAL038W 1503.00 1330.81 6.00 161262.27 85066.56
YAL039C YAL039C 810.00 637.81 0.00 0.00 0.00
YAL040C YAL040C 1743.00 1570.81 0.00 0.00 0.00
YAL041W YAL041W 2565.00 2392.81 0.00 0.00 0.00
YAL042C-A YAL042C-A 378.00 205.81 0.00 0.00 0.00
YAL042W YAL042W 1248.00 1075.81 0.00 0.00 0.00
YAL043C YAL043C 2358.00 2185.81 0.00 0.00 0.00
YAL044C YAL044C 513.00 340.81 0.00 0.00 0.00
YAL044W-A YAL044W-A 333.00 160.81 0.00 0.00 0.00
YAL045C YAL045C 309.00 137.04 0.00 0.00 0.00
YAL046C YAL046C 357.00 184.81 0.00 0.00 0.00
YAL047C YAL047C 1869.00 1696.81 0.00 0.00 0.00
YAL047W-A YAL047W-A 330.00 157.81 0.00 0.00 0.00
YAL048C YAL048C 1989.00 1816.81 0.00 0.00 0.00
YAL049C YAL049C 741.00 568.81 0.00 0.00 0.00
YAL051W YAL051W 3144.00 2971.81 0.00 0.00 0.00
YAL053W YAL053W 2352.00 2179.81 0.00 0.00 0.00
YAL054C YAL054C 2142.00 1969.81 0.00 0.00 0.00
YAL055W YAL055W 543.00 370.81 0.00 0.00 0.00
YAL056C-A YAL056C-A 351.00 178.81 0.00 0.00 0.00
YAL056W YAL056W 2643.00 2470.81 0.00 0.00 0.00
YAL058W YAL058W 1509.00 1336.81 0.00 0.00 0.00
YAL059C-A YAL059C-A 423.00 250.81 0.00 0.00 0.00
YAL059W YAL059W 639.00 466.81 0.00 0.00 0.00
YAL060W YAL060W 1149.00 976.81 0.00 0.00 0.00
YAL061W YAL061W 1254.00 1081.81 0.00 0.00 0.00
YAL062W YAL062W 1374.00 1201.81 0.00 0.00 0.00
YAL063C YAL063C 3969.00 3796.81 0.00 0.00 0.00
YAL063C-A YAL063C-A 291.00 119.72 0.00 0.00 0.00
YAL064C-A YAL064C-A 381.00 208.81 0.00 0.00 0.00
YAL064W YAL064W 285.00 113.94 0.00 0.00 0.00
YAL064W-B YAL064W-B 381.00 208.81 0.00 0.00 0.00
YAL065C YAL065C 387.00 214.81 0.00 0.00 0.00
YAL066W YAL066W 309.00 137.04 0.00 0.00 0.00
YAL067C YAL067C 1782.00 1609.81 0.00 0.00 0.00
YAL067W-A YAL067W-A 228.00 63.08 0.00 0.00 0.00
YAL068C YAL068C 363.00 190.81 0.00 0.00 0.00
YAL068W-A YAL068W-A 255.00 86.02 0.00 0.00 0.00
YAL069W YAL069W 315.00 142.91 0.00 0.00 0.00
YAR002C-A YAR002C-A 660.00 487.81 0.00 0.00 0.00
YAR002W YAR002W 1620.00 1447.81 0.00 0.00 0.00
YAR003W YAR003W 1281.00 1108.81 0.00 0.00 0.00
YAR007C YAR007C 1866.00 1693.81 0.00 0.00 0.00
YAR008W YAR008W 828.00 655.81 0.00 0.00 0.00
YAR009C YAR009C 3591.00 3418.81 24.00 251092.71 132452.52
YAR010C YAR010C 1323.00 1150.81 9.00 279728.29 147557.92
YAR014C YAR014C 2130.00 1957.81 0.00 0.00 0.00
YAR015W YAR015W 921.00 748.81 0.00 0.00 0.00
YAR018C YAR018C 1308.00 1135.81 0.00 0.00 0.00
YAR019C YAR019C 2925.00 2752.81 0.00 0.00 0.00
YAR019W-A YAR019W-A 333.00 160.81 0.00 0.00 0.00
YAR020C YAR020C 168.00 18.92 0.00 0.00 0.00
YAR023C YAR023C 540.00 367.81 0.00 0.00 0.00
YAR027W YAR027W 708.00 535.81 0.00 0.00 0.00
YAR028W YAR028W 705.00 532.81 0.00 0.00 0.00
YAR029W YAR029W 225.00 60.64 0.00 0.00 0.00
YAR030C YAR030C 342.00 169.81 0.00 0.00 0.00
YAR031W YAR031W 897.00 724.81 0.00 0.00 0.00
YAR033W YAR033W 705.00 532.81 0.00 0.00 0.00
YAR035C-A YAR035C-A 81.00 0.00 0.00 0.00 0.00
YAR035W YAR035W 2064.00 1891.81 0.00 0.00 0.00
YAR042W YAR042W 3567.00 3394.81 0.00 0.00 0.00
YAR047C YAR047C 321.00 148.81 0.00 0.00 0.00
YAR050W YAR050W 4614.00 4441.81 0.00 0.00 0.00
YAR053W YAR053W 297.00 125.49 0.00 0.00 0.00
YAR060C YAR060C 336.00 163.81 0.00 0.00 0.00
YAR061W YAR061W 204.00 43.64 0.00 0.00 0.00
YAR062W YAR062W 597.00 424.81 0.00 0.00 0.00
YAR064W YAR064W 300.00 128.38 0.00 0.00 0.00
YAR066W YAR066W 612.00 439.81 0.00 0.00 0.00
YAR068W YAR068W 486.00 313.81 0.00 0.00 0.00
YAR069C YAR069C 294.00 122.60 0.00 0.00 0.00
YAR070C YAR070C 300.00 128.38 0.00 0.00 0.00
snR18 snR18 102.00 0.00 0.00 0.00 0.00
tA(UGC)A tA(UGC)A 73.00 0.00 0.00 0.00 0.00
tL(CAA)A tL(CAA)A 82.00 0.00 0.00 0.00 0.00
tP(UGG)A tP(UGG)A 72.00 0.00 0.00 0.00 0.00
tS(AGA)A tS(AGA)A 82.00 0.00 0.00 0.00 0.00