Add Kallisto index #149
Merged
Changes from 16 of 17 commits

Commits (all by emmarousseau):
38f586b initial commit dedup
271108c Merge branch 'viash-hub:main' into main
2c26968 Revert "initial commit dedup"
5ea8c78 Merge branch 'viash-hub:main' into main
897cd89 Merge branch 'viash-hub:main' into main
ea0383c Merge branch 'viash-hub:main' into main
44b3fcc Merge branch 'viash-hub:main' into main
6cc4f94 Merge branch 'viash-hub:main' into main
c9613d1 Merge branch 'viash-hub:main' into main
1679c59 Merge branch 'viash-hub:main' into main
3af66f8 Merge branch 'viash-hub:main' into main
b7c4ecd Merge branch 'viash-hub:main' into kallisto_index
1177dc1 test data, complete config, help, changelog update
41960e3 check test output contents
f75d6db remove extra files and clean up test script
f6c2cdc Merge branch 'main' into kallisto_index
ece9dbb unset bool arguments, add missing arguments to script
@@ -0,0 +1,89 @@
name: kallisto_index
namespace: kallisto
description: |
  Build a Kallisto index for the transcriptome to use Kallisto in the mapping-based mode.
keywords: [kallisto, index]
links:
  homepage: https://pachterlab.github.io/kallisto/about
  documentation: https://pachterlab.github.io/kallisto/manual
  repository: https://github.com/pachterlab/kallisto
  issue_tracker: https://github.com/pachterlab/kallisto/issues
references:
  doi: https://doi.org/10.1038/nbt.3519
license: BSD 2-Clause License

argument_groups:
  - name: "Input"
    arguments:
      - name: "--input"
        type: file
        description: |
          Path to a FASTA-file containing the transcriptome sequences, either in plain text or
          compressed (.gz) format.
        required: true
      - name: "--d_list"
        type: file
        description: |
          Path to a FASTA-file containing sequences to mask from quantification.

  - name: "Output"
    arguments:
      - name: "--kallisto_index"
        type: file
        direction: output
        must_exist: false
Review comment (on must_exist: false): Is there a reason for this?
Reply: No, I think I meant to remove that.
        example: Kallisto_index

  - name: "Options"
    arguments:
      - name: "--kmer_size"
        type: integer
        description: |
          Kmer length passed to indexing step of pseudoaligners (default: '31').
        example: 31
      - name: "--make_unique"
        type: boolean_true
        description: |
          Replace repeated target names with unique names.
      - name: "--aa"
        type: boolean_true
        description: |
          Generate index from a FASTA-file containing amino acid sequences.
      - name: "--distinguish"
        type: boolean_true
        description: |
          Generate index where sequences are distinguished by the sequence names.
      - name: "--min_size"
        alternatives: ["-m"]
        type: integer
        description: |
          Length of minimizers (default: automatically chosen).
      - name: "--ec_max_size"
        alternatives: ["-e"]
        type: integer
        description: |
          Maximum number of targets in an equivalence class (default: no maximum).

resources:
  - type: bash_script
    path: script.sh

test_resources:
  - type: bash_script
    path: test.sh
  - path: test_data

engines:
  - type: docker
    image: ubuntu:22.04
    setup:
      - type: docker
        run: |
          apt-get update && \
          apt-get install -y --no-install-recommends wget && \
          wget --no-check-certificate https://github.com/pachterlab/kallisto/releases/download/v0.50.1/kallisto_linux-v0.50.1.tar.gz && \
          tar -xzf kallisto_linux-v0.50.1.tar.gz && \
          mv kallisto/kallisto /usr/local/bin/
runners:
  - type: executable
  - type: nextflow
@@ -0,0 +1,21 @@
```
kallisto index
```
kallisto 0.50.1
Builds a kallisto index

Usage: kallisto index [arguments] FASTA-files

Required argument:
-i, --index=STRING          Filename for the kallisto index to be constructed

Optional argument:
-k, --kmer-size=INT         k-mer (odd) length (default: 31, max value: 31)
-t, --threads=INT           Number of threads to use (default: 1)
-d, --d-list=STRING         Path to a FASTA-file containing sequences to mask from quantification
    --make-unique           Replace repeated target names with unique names
    --aa                    Generate index from a FASTA-file containing amino acid sequences
    --distinguish           Generate index where sequences are distinguished by the sequence name
-T, --tmp=STRING            Temporary directory (default: tmp)
-m, --min-size=INT          Length of minimizers (default: automatically chosen)
-e, --ec-max-size=INT       Maximum number of targets in an equivalence class (default: no maximum)

(Two review conversations in this file, one after --kmer-size and one after --tmp, were marked as resolved by tverbeiren.)
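
The help text above constrains --kmer-size to an odd value of at most 31. As an illustration only, that check could be factored into a small function; the name is_valid_kmer_size is hypothetical, and the lower bound of 1 is taken from the script in this PR rather than from the kallisto help.

```shell
#!/bin/bash
# Hypothetical helper (not part of this PR): returns 0 when the argument is an
# odd integer between 1 and 31, and 1 otherwise.
is_valid_kmer_size() {
  local k="$1"
  # Reject anything that is not a plain non-negative integer before doing arithmetic.
  [[ "$k" =~ ^[0-9]+$ ]] || return 1
  # Force base 10 so values such as "09" are not parsed as octal.
  (( 10#$k >= 1 && 10#$k <= 31 && 10#$k % 2 == 1 ))
}
```

A caller could then write `is_valid_kmer_size "$par_kmer_size" || exit 1`; unlike a bare `-lt` comparison, the regex guard also rejects non-numeric input with a clean failure.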
@@ -0,0 +1,25 @@
#!/bin/bash

## VIASH START
## VIASH END

set -eo pipefail

if [ -n "$par_kmer_size" ]; then
  if [[ "$par_kmer_size" -lt 1 || "$par_kmer_size" -gt 31 || $(( par_kmer_size % 2 )) -eq 0 ]]; then
    echo "Error: Kmer size must be an odd number between 1 and 31."
    exit 1
  fi
fi

kallisto index \
  -i "${par_kallisto_index}" \
  ${par_kmer_size:+--kmer-size $par_kmer_size} \
  ${par_make_unique:+--make-unique} \
  ${par_aa:+--aa} \
  ${par_distinguish:+--distinguish} \
  ${par_min_size:+--min-size $par_min_size} \
  ${par_ec_max_size:+--ec-max-size $par_ec_max_size} \
  ${par_d_list:+--d-list "${par_d_list}"} \
  "${par_input}"
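
One point worth noting about the ${par_...:+...} expansions in the script: they add a flag whenever the variable is non-empty. If Viash passes an unset boolean_true argument as the string "false" (an assumption here, though the later commit "unset bool arguments" suggests it), the three boolean flags would always be added. A minimal sketch of an alternative, which builds the argument list in an array and only reacts to the literal "true", might look like this; the function name build_index_args and the array index_args are hypothetical.

```shell
#!/bin/bash
# Sketch only, not the merged script. Assumption: boolean_true parameters
# arrive as "true" or "false", so only the literal "true" adds a flag.
build_index_args() {
  index_args=()  # global array, filled below
  [[ "$par_make_unique" == "true" ]] && index_args+=(--make-unique)
  [[ "$par_aa" == "true" ]] && index_args+=(--aa)
  [[ "$par_distinguish" == "true" ]] && index_args+=(--distinguish)
  # Optional valued arguments are added only when they are non-empty.
  [[ -n "$par_kmer_size" ]] && index_args+=(--kmer-size "$par_kmer_size")
  [[ -n "$par_min_size" ]] && index_args+=(--min-size "$par_min_size")
  [[ -n "$par_ec_max_size" ]] && index_args+=(--ec-max-size "$par_ec_max_size")
  [[ -n "$par_d_list" ]] && index_args+=(--d-list "$par_d_list")
  # Always succeed, so a skipped last test does not trip `set -e` in the caller.
  return 0
}

# Intended use (requires kallisto on the PATH):
#   build_index_args
#   kallisto index -i "$par_kallisto_index" "${index_args[@]}" "$par_input"
```

Using an array keeps paths with spaces intact without relying on unquoted expansions.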
@@ -0,0 +1,35 @@
#!/bin/bash

echo ">>>Test1: Testing $meta_functionality_name with non-default k-mer size"

"$meta_executable" \
  --input "$meta_resources_dir/test_data/transcriptome.fasta" \
  --kallisto_index Kallisto \
  --kmer_size 21


echo ">>> Checking whether output exists and is correct"
[ ! -f "Kallisto" ] && echo "Kallisto index does not exist!" && exit 1
[ ! -s "Kallisto" ] && echo "Kallisto index is empty!" && exit 1

kallisto inspect Kallisto 2> test.txt
grep "number of k-mers: 2,978" test.txt || { echo "The content of the index seems to be incorrect." && exit 1; }

################################################################################

echo ">>>Test2: Testing $meta_functionality_name with d_list argument"

"$meta_executable" \
  --input "$meta_resources_dir/test_data/transcriptome.fasta" \
  --kallisto_index Kallisto \
  --d_list "$meta_resources_dir/test_data/d_list.fasta"

echo ">>> Checking whether output exists and is correct"
[ ! -f "Kallisto" ] && echo "Kallisto index does not exist!" && exit 1
[ ! -s "Kallisto" ] && echo "Kallisto index is empty!" && exit 1

kallisto inspect Kallisto 2> test.txt
grep "number of k-mers: 3,056" test.txt || { echo "The content of the index seems to be incorrect." && exit 1; }

echo "All tests succeeded!"
exit 0
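
Both tests repeat the same grep against the captured output of kallisto inspect. As a possible tidy-up, the check could be wrapped in a helper; assert_kmer_count is a hypothetical name, and the helper assumes only that the report file contains a line with "number of k-mers: N", as the greps in the test script do.

```shell
#!/bin/bash
# Hypothetical helper (not part of this PR): fail when the report file does
# not mention the expected k-mer count.
assert_kmer_count() {
  local report="$1" expected="$2"
  # -F treats the pattern as a fixed string; -q suppresses the matched line.
  if ! grep -qF "number of k-mers: ${expected}" "$report"; then
    echo "Expected ${expected} k-mers in ${report}; the index content seems to be incorrect." >&2
    return 1
  fi
  return 0
}
```

In the test script this would be called as `assert_kmer_count test.txt "2,978" || exit 1` after the inspect step.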
@@ -0,0 +1,5 @@
>YAL067W-A CDS=1-228
ATGCCAATTATAGGGGTGCCGAGGTGCCTTATAAAACCCTTTTCTGTGCCTGTGACATTTCCTTTTTCGG
TCAAAAAGAATATCCGAATTTTAGATTTGGACCCTCGTACAGAAGCTTATTGTCTAAGCCTGAATTCAGT
CTGCTTTAAACGGCTTCCGCGGAGGAAATATTTCCATCTCTTGAATTCGTACAACATTAAACGTGTGTTG
GGAGTCGTATACTGTTAG
@@ -0,0 +1,23 @@
>YAL069W CDS=1-315
ATGATCGTAAATAACACACACGTGCTTACCCTACCACTTTATACCACCACCACATGCCATACTCACCCTC
ACTTGTATACTGATTTTACGTACGCACACGGATGCTACAGTATATACCATCTCAAACTTACCCTACTCTC
AGATTCCACTTCACTCCATGGCCCATCTCTCACTGAATCAGTACCAAATGCACTCACATCATTATGCACG
GCACTTGCCTCAGCGGTCTATACCCTGTGCCATTTACCCATAACGCCCATCATTATCCACATTTTGATAT
CTATATCTCATTCGGCGGTCCCAAATATTGTATAA
>YAL068W-A CDS=1-255
ATGCACGGCACTTGCCTCAGCGGTCTATACCCTGTGCCATTTACCCATAACGCCCATCATTATCCACATT
TTGATATCTATATCTCATTCGGCGGTCCCAAATATTGTATAACTGCCCTTAATACATACGTTATACCACT
TTTGCACCATATACTTACCACTCCATTTATATACACTTATGTCAATATTACAGAAAAATCCCCACAAAAA
TCACCTAAACATAAAAATATTCTACTTTTCAACAATAATACATAA
>YAL068C CDS=1-363
ATGGTCAAATTAACTTCAATCGCCGCTGGTGTCGCTGCCATCGCTGCTACTGCTTCTGCAACCACCACTC
TAGCTCAATCTGACGAAAGAGTCAACTTGGTGGAATTGGGTGTCTACGTCTCTGATATCAGAGCTCACTT
AGCCCAATACTACATGTTCCAAGCCGCCCACCCAACTGAAACCTACCCAGTCGAAGTTGCTGAAGCCGTT
TTCAACTACGGTGACTTCACCACCATGTTGACCGGTATTGCTCCAGACCAAGTGACCAGAATGATCACCG
GTGTTCCATGGTACTCCAGCAGATTAAAGCCAGCCATCTCCAGTGCTCTATCCAAGGACGGTATCTACAC
TATCGCAAACTAG
>YAL067W-A CDS=1-228
ATGCCAATTATAGGGGTGCCGAGGTGCCTTATAAAACCCTTTTCTGTGCCTGTGACATTTCCTTTTTCGG
TCAAAAAGAATATCCGAATTTTAGATTTGGACCCTCGTACAGAAGCTTATTGTCTAAGCCTGAATTCAGT
CTGCTTTAAACGGCTTCCGCGGAGGAAATATTTCCATCTCTTGAATTCGTACAACATTAAACGTGTGTTG
GGAGTCGTATACTGTTAG
Review comment: Why --kallisto-index instead of --index?