IonTorrent Analysis
The latest documentation on running the IonTorrent pipeline can be found in the pipeline's repository.
# change to the pipeline's directory
cd /ifs/data/molecpathlab/IonTorrent_reporter/pipeline
# check for runs
code/get_server_run_list.sh
# Make a Sample sheet
code/make_samplesheet.py -p <analysis_ID_1> -p <analysis_ID_2> -n <analysis_ID>
# Check your Sample sheet
cat samplesheets/<analysis_ID>.tsv
# download run data
code/run_samplesheet.py samplesheets/<analysis_ID>.tsv -d
# annotate VCF files
code/run_samplesheet.py samplesheets/<analysis_ID>.tsv -aq
# wait for jobs to finish...
# create reports & snapshots
code/run_samplesheet.py samplesheets/<analysis_ID>.tsv -pq
# manually review in CyberDuck
# mail the results if it looks good
code/mail_analysis_report.sh <analysis_ID_1> <analysis_ID_2>
First, change to the directory holding the IonTorrent analysis reporting pipeline
cd /ifs/data/molecpathlab/IonTorrent_reporter/pipeline
Check for the latest runs
code/get_server_run_list.sh
If you want to process several runs easily, you can make a sample sheet. Paired analyses go on the same line (tab-separated); unpaired analyses go on separate lines.
You can use the samplesheet creation script to make a tab-separated sample sheet:
code/make_samplesheet.py -p <analysis_ID_1> -p <analysis_ID_2> -n <analysis_ID>
You should check your sample sheet with the cat command to make sure it looks right:
$ cat samplesheets/<analysis_ID>.tsv
For example, to make a sample sheet for the pair of analyses Auto_user_SN2-271-IT17-19-1_327_355 and Auto_user_SN2-272-IT17-19-2_329_356, you can use this command:
$ code/make_samplesheet.py -p Auto_user_SN2-271-IT17-19-1_327_355 -p Auto_user_SN2-272-IT17-19-2_329_356 -n IT17-19
Single IDs:
[]
Paired IDs:
['Auto_user_SN2-271-IT17-19-1_327_355', 'Auto_user_SN2-272-IT17-19-2_329_356']
New samplesheet file:
samplesheets/IT17-19.tsv
To check the samplesheet samplesheets/IT17-19.tsv, you would run this:
$ cat samplesheets/IT17-19.tsv
Auto_user_SN2-271-IT17-19-1_327_355 Auto_user_SN2-272-IT17-19-2_329_356
To make a sample sheet that includes non-paired analysis runs, simply add the single analysis IDs without any flags:
$ code/make_samplesheet.py -p Auto_user_SN2-271-IT17-19-1_327_355 -p Auto_user_SN2-272-IT17-19-2_329_356 -n mixed_runs Auto_user_SN2-231-IT16-056-1_290_319 SN2-211-IT16-048-2_11-08-2016_300
Single IDs:
['Auto_user_SN2-231-IT16-056-1_290_319', 'SN2-211-IT16-048-2_11-08-2016_300']
Paired IDs:
['Auto_user_SN2-271-IT17-19-1_327_355', 'Auto_user_SN2-272-IT17-19-2_329_356']
New samplesheet file:
samplesheets/mixed_runs.tsv
$ cat samplesheets/mixed_runs.tsv
Auto_user_SN2-231-IT16-056-1_290_319
SN2-211-IT16-048-2_11-08-2016_300
Auto_user_SN2-271-IT17-19-1_327_355 Auto_user_SN2-272-IT17-19-2_329_356
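The layout described above (one analysis per line, with paired analysis IDs on the same tab-separated line) can be read back with a few lines of Python. This is only a sketch for checking a sheet by hand: parse_samplesheet is a hypothetical helper, not part of the pipeline, and it assumes each line holds either one ID or exactly two tab-separated IDs.

```python
def parse_samplesheet(text):
    """Split sample sheet text into single IDs and paired IDs.

    Hypothetical helper, not part of the pipeline. Assumes one analysis
    per line, with paired analysis IDs separated by a tab.
    """
    singles = []
    pairs = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("\t") if field.strip()]
        if not fields:
            continue  # skip blank lines
        if len(fields) == 1:
            singles.append(fields[0])
        elif len(fields) == 2:
            pairs.append(tuple(fields))
        else:
            raise ValueError("unexpected number of IDs on line: %r" % line)
    return singles, pairs


if __name__ == "__main__":
    example = (
        "Auto_user_SN2-231-IT16-056-1_290_319\n"
        "SN2-211-IT16-048-2_11-08-2016_300\n"
        "Auto_user_SN2-271-IT17-19-1_327_355\tAuto_user_SN2-272-IT17-19-2_329_356\n"
    )
    singles, pairs = parse_samplesheet(example)
    print("Single IDs: %s" % singles)
    print("Paired IDs: %s" % pairs)
```

A line with three or more IDs raises an error here rather than being silently accepted, since the sheet format described above only covers single and paired analyses.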
To run the sample sheet, you would next do the following, replacing samplesheets/IT17-18.tsv with the path to your sample sheet file:
- download the files for the analyses
code/run_samplesheet.py samplesheets/<analysis_ID>.tsv -d
- annotate the VCFs and make variant tables (submit job to the cluster)
code/run_samplesheet.py samplesheets/<analysis_ID>.tsv -aq
- generate paired-analysis snapshots and reports (submit job to the cluster)
code/run_samplesheet.py samplesheets/<analysis_ID>.tsv -pq
# download
code/run_samplesheet.py samplesheets/IT17-18.tsv -d
# annotate
code/run_samplesheet.py samplesheets/IT17-18.tsv -aq
# wait for jobs to finish...
# report
code/run_samplesheet.py samplesheets/IT17-18.tsv -pq
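The three invocations above differ only in their final flag, so the sequence can be sketched as a small Python helper that builds the command lines in run order. The script path and the -d, -aq, and -pq flags come from the steps above; build_commands itself is a hypothetical name, and the sketch only prints the commands rather than running them, because the report step has to wait until the annotation jobs on the cluster have finished.

```python
def build_commands(samplesheet):
    """Return the three pipeline invocations for a sample sheet, in run order.

    The script path and flags are taken from the documented steps; this
    helper is hypothetical and only builds argument lists.
    """
    script = "code/run_samplesheet.py"
    return [
        [script, samplesheet, "-d"],   # download the analysis files
        [script, samplesheet, "-aq"],  # annotate VCFs (cluster job)
        [script, samplesheet, "-pq"],  # reports and snapshots (cluster job)
    ]


if __name__ == "__main__":
    for command in build_commands("samplesheets/IT17-18.tsv"):
        print(" ".join(command))
```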
If you don't want to use a samplesheet to run the analysis, you can run each step yourself by calling the desired scripts, followed by the IDs of the analyses to be run.
The analysis report(s) should be manually reviewed; use a desktop program such as CyberDuck or WinSCP for this.
Currently, only the overview_report.html is used.
Some things to look for in the report:
- make sure that the number of variants present in the variant table matches the number of IGV snapshot entries (compare the table row numbers with the Table of Contents section numbers)
- make sure the BAM files loaded in each IGV snapshot match the given sample (read the small filename on the left side of the image)
- if a control is included on the lower panel of the IGV snapshot, make sure it is from the correct pair of analysis runs
If everything looks good, then mail the results:
code/mail_analysis_report.sh <analysis_ID_1> <analysis_ID_2>
code/mail_analysis_report.sh Auto_user_SN2-269-IT17-18-1_325_353 Auto_user_SN2-270-IT17-18-2_326_354
- File download steps are always run in the current terminal session and may take several minutes to complete. Output messages may not be visible immediately. As such, it is a good idea to run them in screen, or be sure that you do not terminate the current session or process while the download is running.
- The reporting pipeline currently requires Python 2.7. If you try to run it without using Python 2.7, you'll probably get errors like this:
Running pipeline with the following parameters:
Traceback (most recent call last):
File "code/run_samplesheet.py", line 37, in <module>
print "Samplesheet file: {:>29}".format(samplesheet_file)
ValueError: zero length field name in format
- To check if you have Python 2.7 loaded, run this command:
python --version
- For example:
# GOOD
$ python --version
Python 2.7.3
# BAD
$ python --version
Python 2.6.6
- If you don't have Python 2.7 loaded, run this command to set it to automatically load on login, then exit the Terminal, log back in, and check again:
echo 'module load python/2.7' >> ~/.bashrc
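The same check can be made from inside Python. The traceback shown above is what Python 2.6 does with format strings such as "{:>29}": automatically numbered fields are only supported from Python 2.7 onward. The sketch below is a hypothetical guard, not something the pipeline is known to include, and it uses %-style formatting so that its own message still prints on older interpreters.

```python
import sys


def is_python27(version_info):
    """Return True if the (major, minor, ...) version tuple is Python 2.7."""
    return tuple(version_info[:2]) == (2, 7)


if __name__ == "__main__":
    if is_python27(sys.version_info):
        print("OK: running Python 2.7")
    else:
        print("Python %d.%d found; the pipeline expects 2.7"
              % (sys.version_info[0], sys.version_info[1]))
```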
Full reporting pipeline documentation is found in the pipeline's repository.