Demultiplexing (Manual)

Stephen Kelly edited this page May 25, 2017 · 1 revision

Synopsis

ProjectID="<put_the_ID_here>"

# NGS 580
/ifs/data/molecpathlab/scripts/demultiplex-NGS580-WES.sh "$ProjectID"

# OR

# Archer
/ifs/data/molecpathlab/scripts/demultiplex-archer.sh "$ProjectID"


# wait for jobs to finish...
# update the run index
/ifs/data/molecpathlab/scripts/sequencer_index.py

[Archer Sample Sheet Template] [NGS580 Sample Sheet Template]

Overview

Next-Gen Sequencing outputs raw base call files (.bcl) for each lane in the flow cell. These need to be converted into FASTQ format (.fastq) and the reads for each sample separated. This process is called 'demultiplexing', and can be run once sequencing has finished.

In order to perform demultiplexing, you will need to log into phoenix, the NYULMC HPC server. If you don't already have an HPC account, refer to the New User's Page.

The NextSeq run data is located in this directory on phoenix:

/ifs/data/molecpathlab/quicksilver

You can see which runs are present in that directory by using the ls -l command:

ls -l /ifs/data/molecpathlab/quicksilver

The output looks like this; items in the right-most column are directories that hold NextSeq runs.

total 136
drwxrws---  9 pinnej01 molecpathlab  549 Mar 20 20:33 170308_NB501073_0004_AHHFKYBGX2
lrw-rw----  1 kellys04 molecpathlab   31 Mar 20 20:20 ArcherRun -> 170308_NB501073_0004_AHHFKYBGX2
drwxrws--- 18 kellys04 molecpathlab  576 Mar  6 13:17 NS17-01-35154137
drwxrws---  3 pinnej01 molecpathlab 7761 Mar 28 15:35 NS17-02

Setup

A sample sheet file is required to demultiplex a run. It should have been prepared during sequencing setup and is usually named SampleSheet.csv. An example sample sheet file for an Archer run can be found here (right-click & Save As).

This sample sheet file needs to be present in the Data/Intensities/BaseCalls directory for the run, for example:

/ifs/data/molecpathlab/quicksilver/NS17-02/Data/Intensities/BaseCalls/SampleSheet.csv

To check whether a sample sheet file is present, you can use the ls command again on the BaseCalls directory:

ls -l /ifs/data/molecpathlab/quicksilver/NS17-02/Data/Intensities/BaseCalls

If it's not already there, copy it to that location. To copy files from your desktop computer to phoenix, you might need a program like CyberDuck.
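The presence check above can also be scripted. The following is a minimal sketch, not part of the lab's scripts: the function names and the RUN_ROOT override are made up for illustration, and only the directory layout is taken from this page.

```shell
#!/bin/bash
# Sketch: report whether SampleSheet.csv is in place for a run.
# RUN_ROOT and both function names are illustrative assumptions.
run_root="${RUN_ROOT:-/ifs/data/molecpathlab/quicksilver}"

samplesheet_path() {
    # Build the expected sample sheet path for a run ID such as "NS17-02"
    local run_id="$1"
    printf '%s/%s/Data/Intensities/BaseCalls/SampleSheet.csv\n' "$run_root" "$run_id"
}

has_samplesheet() {
    # Succeeds only if the file exists and is not empty
    [ -s "$(samplesheet_path "$1")" ]
}
```

For example, `has_samplesheet "NS17-02" || echo "missing: $(samplesheet_path "NS17-02")"` prints the expected location when the file is absent.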

Run

Use whichever of the following demultiplexing scripts matches the type of sequencing that was performed.

Archer Script

/ifs/data/molecpathlab/scripts/demultiplex-archer.sh

NGS 580 Exome Sequencing Script

/ifs/data/molecpathlab/scripts/demultiplex-NGS580-WES.sh

  • All demultiplexing scripts are run in the same fashion.

Usage

Once the sample sheet is in place in the run directory, start demultiplexing by calling the appropriate script and passing it the ID of the run to be processed. For example:

/ifs/data/molecpathlab/scripts/demultiplex-archer.sh "NS17-02"

This submits a job to the compute cluster that runs the demultiplexer program bcl2fastq, among other things. You can check the status of your qsub cluster jobs with the qstat command.
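If you would rather not re-run qstat by hand, its output can be counted in a loop. This is a rough sketch that assumes Grid Engine style qstat output (two header lines followed by one line per job); the function name is hypothetical, and the count covers all of your jobs, not only the demultiplexing ones.

```shell
# Sketch: count jobs listed in qstat output read from stdin.
# Assumes two header lines, then one line per job (Grid Engine style).
count_cluster_jobs() {
    awk 'NR > 2 && NF > 0 { n++ } END { print n + 0 }'
}

# Hypothetical usage: poll once a minute until none of your jobs remain.
# while [ "$(qstat -u "$USER" | count_cluster_jobs)" -gt 0 ]; do sleep 60; done
```

If your qstat prints a different header, adjust the `NR > 2` condition to match.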

Demultiplexed output always gets written to the Unaligned directory within a run, for example:

/ifs/data/molecpathlab/quicksilver/NS17-02/Data/Intensities/BaseCalls/Unaligned

Check Results

After the demultiplexing jobs have finished, check the output for quality and completeness.

In the Data/Intensities/BaseCalls/Unaligned directory for the run you should find a file called Demultiplex_Stats.htm. Open this in your web browser, and it will show a table with demultiplexing stats for the samples in the run.

Also in this directory, you will find files with names like bcl2fastq.217.sh.o387612. These are log files for your demultiplexing qsub jobs. Each one contains program output messages, including error messages. If something goes wrong, always check these files first. You can open them in any text editor (such as Notepad or TextEdit), or print them to the terminal screen with the cat command.
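When there are many log files, a quick scan can show which ones to read first. The sketch below assumes the log names follow the bcl2fastq.*.sh.o* pattern shown above; the function name is made up for illustration. A match is only a prompt to open the file, since some programs print harmless summary lines that contain the word "error".

```shell
# Sketch: list demultiplexing log files that mention "error" (any case).
# The function name is illustrative; pass it the run's Unaligned directory.
list_logs_with_errors() {
    local unaligned_dir="$1"
    # -i: ignore case, -l: print only the names of matching files
    grep -il 'error' "$unaligned_dir"/bcl2fastq.*.sh.o* 2>/dev/null
}
```

For example, `list_logs_with_errors /ifs/data/molecpathlab/quicksilver/NS17-02/Data/Intensities/BaseCalls/Unaligned` prints one file name per line, and you can then view each with cat.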

Update Index

An index of all runs that have been demultiplexed on phoenix is kept at /ifs/data/molecpathlab/quicksilver/run_index/index.csv

When you are done demultiplexing, you should re-build the index by running the following command:

/ifs/data/molecpathlab/scripts/sequencer_index.py
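To confirm that the rebuilt index picked up your run, you can search the index file for the run ID. This sketch assumes only that the run ID appears somewhere in index.csv as plain text; the column layout of the file is not described here, and the function name is hypothetical.

```shell
# Sketch: succeed if the run ID appears anywhere in the index file.
# Assumes the ID is stored as plain text; the CSV layout is not checked.
run_in_index() {
    local run_id="$1"
    local index_file="${2:-/ifs/data/molecpathlab/quicksilver/run_index/index.csv}"
    # -q: quiet, -F: treat the run ID as a fixed string, not a pattern
    grep -qF -- "$run_id" "$index_file"
}
```

For example, `run_in_index "NS17-02" && echo "indexed" || echo "not found"`.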