A collection of next-gen sequencing visualisation scripts. Click a script's
name to go to its subdirectory, which contains a detailed README.md
file with examples and instructions.
Please note that many of these scripts have been superseded by MultiQC (http://multiqc.info). If you're not familiar with it, we recommend having a look there first before using the tools below.
Most of these scripts are written in Python. Those within the stand_alone
directory are generally run on the command line. The rest can either be run
on the command line or imported as part of the ngi_visualizations package.
See below for instructions on how to use the Python package.
- Count Biotypes
    - Uses HTSeq to plot read overlaps with different feature biotype flags
- preseq Complexity Curves
- Subsampled Gene Observations
    - Group of scripts to plot the number of observed genes at varying sample subsampling proportions. Can give an impression of library complexity on a biological level.
- Qualimap Plots
    - Scripts to generate Coverage and Insert Size histograms. Both of these plots are already produced by Qualimap. These just look nicer in our reports and have some extra plotting options.
- snpEff Effect Type Plot
    - Script to create bar charts of SNP Effect counts, generated by snpEff.
- Gene Body Coverage
- Alignment Summaries
    - Two scripts to parse log files containing alignment stats from bowtie, bowtie 2 or tophat and generate overview HTML reports
- Bismark Summary Report
    - Script to parse lots of bismark output reports and generate a single HTML summary report.
- Bismark Coverage Analysis
    - Bismark Coverage Curves - Plots the proportion of cytosines meeting increasing coverage thresholds
    - Bismark Window Sizes - Plots the proportion of windows passing observation thresholds with increasing window sizes
See below for example outputs. Click an image to go to that script.
[Example output images: Count Biotypes, preseq Complexity Curves, Subsampled Gene Observations, Qualimap Plots, snpEff Effect Plots, Gene Body Coverage, FPKM Scatter Plot, Alignment Summaries, Bismark Summaries, Bismark Coverage Curves, Bismark Window Sizes]
To use a stand-alone script, see the README.md
file in that script's subdirectory.
To use the ngi_visualizations
package, download or clone the repository.
Then, to install the package, run:
python setup.py install
If you intend to make any changes to the package, swap install for develop; otherwise you will have to reinstall the package each time you change the source code.
Once installed, you can import the script from the relevant subdirectory. For instance, to use the Qualimap Insert Size histogram you would use:
from ngi_visualizations.qualimap import insert_size
The functions within this script are then available in that namespace. For instance, you could now generate the histograms by running:
insert_size.plot_insert_size_histogram(input_fn)
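The import-and-call pattern above can be wrapped in a small helper to process several files in one go. This is only a minimal sketch: it assumes the package is importable as ngi_visualizations (the spelling used in the introduction) and that plot_insert_size_histogram can be called with a single input filename, as in the line above. The helper names package_is_installed and plot_insert_sizes, and the file names in the usage comment, are hypothetical and not part of the package.

```python
import importlib.util


def package_is_installed(name="ngi_visualizations"):
    """Return True if a top-level package with this name can be imported."""
    return importlib.util.find_spec(name) is not None


def plot_insert_sizes(input_files, plot_func=None):
    """Call the insert size plotting function once per input file.

    plot_func is a hypothetical hook for swapping in another function;
    by default the function from the installed package is used.
    """
    if plot_func is None:
        if not package_is_installed():
            raise RuntimeError(
                "ngi_visualizations is not installed; run setup.py first"
            )
        # Assumed import path, taken from the example in this README
        from ngi_visualizations.qualimap import insert_size
        plot_func = insert_size.plot_insert_size_histogram

    plotted = []
    for input_fn in input_files:
        plot_func(input_fn)  # single-argument call, as shown above
        plotted.append(input_fn)
    return plotted


# Example usage (hypothetical file names):
# plot_insert_sizes(["sample_1_insert_size.txt", "sample_2_insert_size.txt"])
```

Checking for the package before importing gives a clearer error message if the install step was skipped. See the README.md in the relevant subdirectory for any extra plotting options the function accepts.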
If you would like to add a visualization script to this repository, please read the contributing notes first. These describe the steps required to add your script to the repository.
These scripts were written for use at the National Genomics Infrastructure at SciLifeLab in Stockholm, Sweden. For more information, please get in touch with Phil Ewels.