Demultiplexing (Manual)
ProjectID="<put_the_ID_here>"
# NGS 580
/ifs/data/molecpathlab/scripts/demultiplex-NGS580-WES.sh "$ProjectID"
# OR
# Archer
/ifs/data/molecpathlab/scripts/demultiplex-archer.sh "$ProjectID"
# wait for jobs to finish...
# update the run index
/ifs/data/molecpathlab/scripts/sequencer_index.py
[Archer Sample Sheet Template] [NGS580 Sample Sheet Template]
Next-generation sequencing produces raw base call files (.bcl) for each lane in the flow cell. These need to be converted to FASTQ format (.fastq), with the reads for each sample separated. This process is called 'demultiplexing', and it can be run once sequencing has finished.
In order to perform demultiplexing, you will need to log into phoenix, the NYULMC HPC server. If you don't already have an HPC account, refer to the New User's Page.
The NextSeq run data is located in this directory on phoenix:
/ifs/data/molecpathlab/quicksilver
You can see which runs are present in that directory by using the ls -l command:
ls -l /ifs/data/molecpathlab/quicksilver
The output looks like this; entries in the right-most column are NextSeq run directories, or symbolic links to them (such as ArcherRun).
total 136
drwxrws--- 9 pinnej01 molecpathlab 549 Mar 20 20:33 170308_NB501073_0004_AHHFKYBGX2
lrw-rw---- 1 kellys04 molecpathlab 31 Mar 20 20:20 ArcherRun -> 170308_NB501073_0004_AHHFKYBGX2
drwxrws--- 18 kellys04 molecpathlab 576 Mar 6 13:17 NS17-01-35154137
drwxrws--- 3 pinnej01 molecpathlab 7761 Mar 28 15:35 NS17-02
A sample sheet file is required to demultiplex a run; it should have been prepared during sequencing setup, and is usually titled SampleSheet.csv. An example sample sheet file for an Archer run can be found here (right-click & Save As).
This sample sheet file needs to be present in the Data/Intensities/BaseCalls directory for the run, for example:
/ifs/data/molecpathlab/quicksilver/NS17-02/Data/Intensities/BaseCalls/SampleSheet.csv
To check if a sample sheet file is present, you can use the ls command again on the BaseCalls directory:
ls -l /ifs/data/molecpathlab/quicksilver/NS17-02/Data/Intensities/BaseCalls
If it's not already there, you should copy it to that location. For copying files from your desktop computer to phoenix, you might need a program like CyberDuck.
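The presence check can also be scripted. The following is a minimal sketch, not part of the lab's scripts: check_samplesheet and the RUNS_DIR variable are hypothetical names introduced here, and the directory layout is taken from the example path above.

```shell
# Hypothetical helper: report whether a run has its sample sheet in place.
# RUNS_DIR defaults to the quicksilver directory named on this page.
RUNS_DIR="${RUNS_DIR:-/ifs/data/molecpathlab/quicksilver}"

check_samplesheet() {
    # $1 is the run ID, e.g. NS17-02
    sheet="${RUNS_DIR}/$1/Data/Intensities/BaseCalls/SampleSheet.csv"
    if [ -f "$sheet" ]; then
        echo "found: $sheet"
    else
        echo "missing: $sheet" >&2
        return 1
    fi
}

# Example: check_samplesheet "NS17-02"
```

The function returns a non-zero status when the file is absent, so it can be used as a guard before starting a demultiplexing script.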
Use whichever of the following demultiplexing scripts matches the type of sequencing that was performed:
/ifs/data/molecpathlab/scripts/demultiplex-archer.sh
/ifs/data/molecpathlab/scripts/demultiplex-NGS580-WES.sh
- All demultiplexing scripts are run in the same fashion.
Once the sample sheet is in place in the run directory, you can start demultiplexing by calling the script and passing it the ID of the run to be processed. For example:
/ifs/data/molecpathlab/scripts/demultiplex-archer.sh "NS17-02"
This will submit a job to the compute cluster, which will run the demultiplexer program bcl2fastq, amongst other things. You can check the status of your qsub cluster jobs with the qstat command.
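Rather than re-running qstat by hand, the check can be put in a polling loop. This is a sketch under two assumptions that should be verified on phoenix: that qstat accepts -u <user>, and that it prints nothing when the user has no queued or running jobs. wait_for_jobs is a hypothetical name, and the loop waits on all of your jobs, not only the demultiplexing ones.

```shell
# Hypothetical helper: block until qstat lists no jobs for the current user.
# Assumes `qstat -u <user>` produces empty output when no jobs remain.
wait_for_jobs() {
    interval="${1:-60}"                # seconds between checks
    user="${USER:-$(id -un)}"
    while [ -n "$(qstat -u "$user")" ]; do
        sleep "$interval"
    done
    echo "no jobs remaining for $user"
}

# Example: wait_for_jobs 120
```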
Demultiplexed output always gets written to the Unaligned directory within a run, for example:
/ifs/data/molecpathlab/quicksilver/NS17-02/Data/Intensities/BaseCalls/Unaligned
After the run is finished, you should check it for quality and completion.
In the Data/Intensities/BaseCalls/Unaligned directory for the run you should find a file called Demultiplex_Stats.htm. Open this in your web browser, and it will show a table with demultiplexing stats for the samples in the run.
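A quick completeness check from the terminal might look like the sketch below. check_unaligned and RUNS_DIR are hypothetical names, and the '*.fastq*' pattern assumes output files carry a .fastq or .fastq.gz extension; it does not replace reading the stats table.

```shell
# Hypothetical helper: confirm the stats file exists and count FASTQ files.
RUNS_DIR="${RUNS_DIR:-/ifs/data/molecpathlab/quicksilver}"

check_unaligned() {
    # $1 is the run ID, e.g. NS17-02
    unaligned="${RUNS_DIR}/$1/Data/Intensities/BaseCalls/Unaligned"
    if [ ! -f "${unaligned}/Demultiplex_Stats.htm" ]; then
        echo "missing: ${unaligned}/Demultiplex_Stats.htm" >&2
        return 1
    fi
    # Count FASTQ files at any depth under Unaligned
    n=$(find "$unaligned" -type f -name '*.fastq*' | wc -l | tr -d ' ')
    echo "FASTQ files: $n"
}

# Example: check_unaligned "NS17-02"
```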
Also in this directory, you will find files with names like bcl2fastq.217.sh.o387612. These are log files for your demultiplexing qsub jobs. Each one contains program output messages, including error messages. If something goes wrong, always check these files first. You can open them in any text editor (such as Notepad or TextEdit), or print them to the terminal screen with the cat command.
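When there are many log files, printing only the end of each one is often enough to spot a failed job. The sketch below assumes the logs match the pattern bcl2fastq.*.o*, inferred from the single example name above; show_log_tails is a hypothetical helper.

```shell
# Hypothetical helper: print the last 5 lines of each demultiplexing log.
show_log_tails() {
    # $1 is the path to a run's Unaligned directory
    for log in "$1"/bcl2fastq.*.o*; do
        [ -f "$log" ] || continue      # skip if the pattern matched nothing
        echo "== $log"
        tail -n 5 "$log"
    done
}

# Example:
# show_log_tails /ifs/data/molecpathlab/quicksilver/NS17-02/Data/Intensities/BaseCalls/Unaligned
```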
An index of all runs that have been demultiplexed on phoenix is kept at /ifs/data/molecpathlab/quicksilver/run_index/index.csv
When you are done demultiplexing, you should re-build the index by running the following command:
/ifs/data/molecpathlab/scripts/sequencer_index.py