GitHub - Sondr11/cfDNA-end-selection: End selection in cell-free DNA enhances noninvasive prenatal testing and cancer diagnosis

End selection in cell-free DNA enhances noninvasive prenatal testing and cancer diagnosis

This repository contains the scripts and related files for Ju et al. Distributed under the CC BY-NC-ND 4.0 license for personal and academic usage only.

Required annotation files

Note that all annotation files should be built for the NCBI GRCh38 (hg38) reference genome.

Chromosome size information: an hg38.info file is provided under the anno directory;
Nucleosome tracks: these files are too large to upload to GitHub, so users need to build them with the following commands:

wget https://download.cncb.ac.cn/nucmap/organisms/v1/Homo_sapiens/byDataType/Nucleosome_peaks_DANPOS/Homo_sapiens.hsNuc0390101.nucleosome.DANPOSPeak.bed.gz
zcat Homo_sapiens.hsNuc0390101.nucleosome.DANPOSPeak.bed.gz | perl -lane '$c=($F[1]+$F[2])>>1; print join("\t", $F[0], $c-73, $c+1+73, $F[3])' | sort -k1,1 -k2,2n | gzip >hsNuc0390101.DANPOSPeak.sorted.bed.gz

wget https://download.cncb.ac.cn/nucmap/organisms/v1/Homo_sapiens/byDataType/Nucleosome_peaks_DANPOS/Homo_sapiens.hsNuc0260501.nucleosome.DANPOSPeak.bed.gz
zcat Homo_sapiens.hsNuc0260501.nucleosome.DANPOSPeak.bed.gz | perl -lane '$c=($F[1]+$F[2])>>1; print join("\t", $F[0], $c-73, $c+1+73, $F[3])' | sort -k1,1 -k2,2n | gzip >hsNuc0260501.DANPOSPeak.sorted.bed.gz

The files generated by these commands (hsNuc0390101.DANPOSPeak.sorted.bed.gz for the GM12878 cell line and hsNuc0260501.DANPOSPeak.sorted.bed.gz for the K562 cell line) can be used directly as inputs for the following scripts.

Noninvasive prenatal testing scripts under `NIPT` directory

The main program is nipt. You can run it without parameters to see the usage:

Usage: NIPT/nipt <nucleosome.track.bed[.gz]> <bed.list>

The "bed.list" file should contains 3 columns: sid /path/to/bed[.gz] category
The bed files could be gzipped, but must be Single-End data
The category should be either "control" or "testing" for all samples

There are 2 compulsory parameters:

nucleosome.track.bed contains the nucleosome track annotations;
SE.bed.list must contain 3 columns: sample ID, path to the bed file, and a category (either control or testing). An example file, example.bed.list.nipt, is provided.

The bed files can be plain text or gzipped. We provide a script, se_sam2bed.pl, for converting a bam file to bed format. The program writes 2 output files, Z-score.standard and Z-score.with.end.selection, each containing Z-scores for the samples labeled "testing".

Note that the current NIPT script only supports single-end data.
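As an illustration of the 3-column list described above, the following sketch writes a small list file and checks its layout before it is passed to nipt. The sample IDs and paths are hypothetical placeholders, and tab-separated columns are an assumption; the provided example.bed.list.nipt remains the reference for the exact format.

```shell
# Hypothetical sample IDs and paths; replace them with your own.
list=$(mktemp)
printf 'ctrl01\t/path/to/ctrl01.bed.gz\tcontrol\n'  >  "$list"
printf 'ctrl02\t/path/to/ctrl02.bed.gz\tcontrol\n'  >> "$list"
printf 'test01\t/path/to/test01.bed.gz\ttesting\n'  >> "$list"

# Count lines that do not have exactly 3 columns or whose
# category is neither "control" nor "testing".
bad=$(awk 'NF != 3 || ($3 != "control" && $3 != "testing") {n++} END {print n+0}' "$list")
echo "malformed lines: $bad"

# With real files in place, the list would then be used as documented:
#   NIPT/nipt <nucleosome.track.bed[.gz]> "$list"
```

A count of 0 means every line has three columns and a recognized category.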

Cancer diagnosis scripts under `Cancer.diagnosis` directory

The main program is calc.N-index. Call it without parameters to see the usage:

Usage: calc.N-index <genome.info> <nucleosome.center.bed[.gz]> <PE.bed.list> [extend=73] [thread=4] [autosome.only=y|n]

Written by Kun Sun ([email protected]). (c) Shenzhen Bay Laboratory.

This program is designed to calculate N-index for each sample in 'bed.list',
which should contain (at least) 2 columns: sampleID and /path/to/bed[.gz].
The current program only supports Paired-End data; gzipped bed file is supported.
Output: Sid Total Within N-index

There are 3 compulsory parameters:

genome.info contains the size information for each chromosome in the human genome;
nucleosome.track.bed (shown as nucleosome.center.bed in the usage message) contains the nucleosome track annotations;
PE.bed.list contains sample IDs and paths to the bed files in 2-column format. An example file, example.bed.list.cancer, is provided.

The bed files can be plain text or gzipped. We provide a script, pe_sam2bed.pl, for converting a bam file to bed format. The results are written to standard output in 4 columns: the 1st column is the sample ID from PE.bed.list and the 4th column is the N-index.

Note that the current cancer diagnosis script only supports paired-end data.
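The paired-end list can be prepared in the same way. This is a minimal sketch with hypothetical sample IDs and paths, again assuming tab-separated columns; it only checks that each line has at least the two columns mentioned in the usage message and prints the sample IDs that calc.N-index would report in its first output column.

```shell
# Hypothetical sample IDs and paths; replace them with your own.
list=$(mktemp)
printf 'patient01\t/path/to/patient01.pe.bed.gz\n' >  "$list"
printf 'healthy01\t/path/to/healthy01.pe.bed.gz\n' >> "$list"

# Count lines with fewer than the 2 required columns.
short=$(awk 'NF < 2 {n++} END {print n+0}' "$list")

# Collect the sample IDs (column 1) as a comma-separated string.
ids=$(cut -f1 "$list" | paste -sd, -)

echo "lines with fewer than 2 columns: $short"
echo "samples: $ids"
```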
