Workflow examples
1. You can run the whole pipeline with default settings in a single command, which processes all FASTA and FASTQ files in the working directory:
barapost-prober.py && barapost-local.py && barapost-binning.py
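The same chain can be driven from Python. The following is a minimal sketch, assuming the three scripts are executable and on PATH; the function name run_pipeline and its runner parameter are hypothetical helpers introduced here for illustration, not part of Barapost.

```python
import subprocess

# Script names exactly as they appear in the one-line command above.
PIPELINE = ["barapost-prober.py", "barapost-local.py", "barapost-binning.py"]


def run_pipeline(steps=PIPELINE, runner=subprocess.run):
    """Run the steps in order and stop at the first failure, like `&&` in the shell.

    Returns (failed_script, exit_code), or (None, 0) if every step succeeded.
    `runner` is injectable so the logic can be tried without Barapost installed.
    """
    for script in steps:
        result = runner([script])
        if result.returncode != 0:
            # A later step depends on the output of the earlier one, so stop here.
            return script, result.returncode
    return None, 0
```

Calling run_pipeline() from the directory that holds the FASTA and FASTQ files would then mirror the shell one-liner.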
2. You can try Barapost on the test dataset test_reads.fastq.gz in the examples directory (this file contains 100 reads):
Classify 10 reads (-b 10) with "barapost-prober.py". Two requests will be sent to the NCBI BLAST server, each containing 5 reads (-p 5). The search is restricted to Pseudomonas, Rhodococcus and Escherichia reference sequences (-g 286,1827,561, respectively):
barapost-prober.py test_reads.fastq.gz -b 10 -p 5 -g 286,1827,561 -o classif_dir
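The number of requests follows from the two options: 10 reads split into packets of 5 give 2 requests. A small sketch of that arithmetic is shown below. The function name count_requests is hypothetical, and the behaviour for a batch that does not divide evenly (one extra, smaller packet) is an assumption rather than something stated above.

```python
import math


def count_requests(reads_to_submit: int, packet_size: int) -> int:
    """Number of packets needed to send `reads_to_submit` reads.

    Assumption: reads are grouped into consecutive packets of `packet_size`
    and any leftover reads go into one final, smaller packet.
    """
    return math.ceil(reads_to_submit / packet_size)


print(count_requests(10, 5))  # 2, the "-b 10 -p 5" case from the example
```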
Download the reference genome sequences "discovered" by "barapost-prober.py", build a database on the local machine, and classify the remaining reads against this newly created database:
barapost-local.py test_reads.fastq.gz -r classif_dir
Sort the classified reads and place the binned files in the directory some_binned_reads:
barapost-binning.py test_reads.fastq.gz -r classif_dir -o some_binned_reads
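The three commands of this example share one detail that is easy to miss: the directory given to "barapost-prober.py" with -o is the same one passed to the other two scripts with -r. The sketch below only assembles and prints the command lines, using the flags shown above; the function name build_commands is hypothetical and nothing is executed.

```python
import shlex


def build_commands(reads, classif_dir, binned_dir):
    """Return the three command lines of the example as argument lists."""
    return [
        # Step 1: classify a few reads remotely, writing results to classif_dir.
        ["barapost-prober.py", reads, "-b", "10", "-p", "5",
         "-g", "286,1827,561", "-o", classif_dir],
        # Step 2: classify the rest locally, reading from the same classif_dir.
        ["barapost-local.py", reads, "-r", classif_dir],
        # Step 3: bin the reads according to the classification in classif_dir.
        ["barapost-binning.py", reads, "-r", classif_dir, "-o", binned_dir],
    ]


for argv in build_commands("test_reads.fastq.gz", "classif_dir", "some_binned_reads"):
    print(shlex.join(argv))
```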
3. Once the FAST5 file raw_signal.fast5 is basecalled and the resulting file reads.fastq is generated, the latter can be classified with "barapost-prober.py" and/or "barapost-local.py":
barapost-prober.py reads.fastq -o fastq_classification
barapost-local.py reads.fastq -r fastq_classification
Then the source FAST5 file can be binned according to the classification of the FASTQ file:
barapost-binning.py raw_signal.fast5 -r fastq_classification -o fast5_binned
4. Once the FAST5 files raw_signal<1...N>.fast5 are basecalled and the resulting files reads<1...M>.fastq are generated, the latter can be classified with "barapost-prober.py" and/or "barapost-local.py":
barapost-prober.py reads*.fastq -o fastq_classification
barapost-local.py reads*.fastq -r fastq_classification
Then we try to bin the source FAST5 data:
barapost-binning.py raw_signal*.fast5 -r fastq_classification -o fast5_binned
The process ends with an error message like this:
Read <read_ID> not found in TSV file containing taxonomic annotation.
Try running barapost-binning with '-u' (--untwist-fast5') flag.
We will follow this suggestion and run:
barapost-binning.py raw_signal*.fast5 -r fastq_classification -o fast5_binned -u
And now everything should be right. :)
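The "try, then retry with -u" steps above could be automated along these lines. This is only a sketch under stated assumptions: that "barapost-binning.py" exits with a non-zero code when it prints the hint, and that the hint appears on stdout or stderr. Neither is confirmed here, and the function name bin_fast5 is hypothetical.

```python
import subprocess


def bin_fast5(argv, runner=subprocess.run):
    """Run a barapost-binning.py command and retry once with -u if it asks for it.

    Assumptions (not confirmed by the text above): the script fails with a
    non-zero exit code, and the '--untwist-fast5' hint is in its captured output.
    """
    result = runner(argv, capture_output=True, text=True)
    output = (result.stdout or "") + (result.stderr or "")
    if result.returncode != 0 and "--untwist-fast5" in output and "-u" not in argv:
        # Follow the suggestion from the error message and rerun with -u.
        result = runner(argv + ["-u"], capture_output=True, text=True)
    return result


# Example call (not executed here). Wildcards are not expanded without a shell,
# so the FAST5 file names would have to be listed explicitly or via glob.glob:
# bin_fast5(["barapost-binning.py", *sorted(glob.glob("raw_signal*.fast5")),
#            "-r", "fastq_classification", "-o", "fast5_binned"])
```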