Project 1

Nicolas edited this page Dec 1, 2020 · 33 revisions

Project 1: Building a transcriptomic map of small intestine neuroendocrine tumors

Background

The panNENomics project aims to unveil the molecular pathways underlying the development of neuroendocrine neoplasms (NENs), an understudied group of tumors, from all body sites. Although we have recently integrated all transcriptomic studies of lung NENs (Gabriel, Mathian et al. Gigascience, 2020), a comprehensive molecular map spanning NENs from all body sites, including gastrointestinal NENs (Alvarez et al. Nat Genet 2018), has yet to be generated.

Data

Requirements

Basic understanding of the two practicals:

  • Q1 of practical 1 (launching a nextflow pipeline)
  • analysing data with PCA and UMAP

Steps

  1. download the data and convert it to FASTQ (see fastq-dump or fasterq-dump from the SRA Toolkit; https://github.com/ncbi/sra-tools/wiki/HowTo:-fasterq-dump)
  2. process the data with pipelines from the IARCbioinfo platform on GitHub to ensure smooth integration with the other data, using parameter files (see the Nextflow option -params-file)
  3. perform unsupervised analyses with R (dimensionality reduction with PCA and UMAP, clustering), and assess the distribution of specific neuroendocrine markers (NEUROD1, NEUROG3, CHGA, SYP, INSM1, HES6, DDC, UCHL1, NCAM1, CALCA, SSTR2).
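The download and pipeline-launch steps above can be sketched as a small Python helper that only builds the command lines, so they can be inspected before being submitted to the cluster. This is a minimal sketch under stated assumptions: the accession `SRR0000001`, the pipeline name `IARCbioinfo/RNAseq-nf`, and the parameter names in the params file (`input_folder`, `output_folder`, `cpu`, `mem`) are placeholders to check against the README of the pipeline actually used.

```python
import json
import shlex
import tempfile
from pathlib import Path


def fasterq_dump_cmd(accession, outdir, threads=4):
    """Build the fasterq-dump command that converts one SRA run to FASTQ."""
    return [
        "fasterq-dump", accession,
        "--split-files",            # one FASTQ file per mate for paired-end runs
        "--outdir", str(outdir),
        "--threads", str(threads),
    ]


def write_params_file(path, params):
    """Write pipeline parameters as JSON, a format accepted by -params-file."""
    path = Path(path)
    path.write_text(json.dumps(params, indent=2) + "\n")
    return path


def nextflow_cmd(pipeline, params_file):
    """Build the Nextflow launch command using a parameter file."""
    return ["nextflow", "run", pipeline, "-params-file", str(params_file)]


if __name__ == "__main__":
    workdir = Path(tempfile.mkdtemp())
    # Hypothetical parameter names: check the pipeline documentation.
    params = {
        "input_folder": str(workdir / "fastq"),
        "output_folder": str(workdir / "results"),
        "cpu": 8,
        "mem": 50,
    }
    params_file = write_params_file(workdir / "params.json", params)
    # Print the commands instead of running them (dry run).
    print(shlex.join(fasterq_dump_cmd("SRR0000001", workdir / "fastq", threads=8)))
    print(shlex.join(nextflow_cmd("IARCbioinfo/RNAseq-nf", params_file)))
```

Keeping the parameters in a file rather than on the command line makes it easier to rerun every batch with identical settings, which is the point of processing the data with the same pipelines as the other datasets.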

Expected difficulties

Storage (100+ RNA-seq samples, ~1 TB) and computation (STAR requires 40-50 GB of RAM, so an HPC cluster is required; processing times are long, so parallelization is key).

Tips:

  • download and process the data in small batches, prioritizing primary tumors
  • compressing (gzip) the FASTQ files from the SRA is an option if they are kept for a long time; removing intermediate files can also save space
  • adapt memory and CPU specifications to the cluster (see params files; also see the -qs and -bg Nextflow options)
  • it might be better to explore the processed read counts provided as a supplement to the dataset while the pipelines are running (available at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE98894)
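The batching tip can be made concrete with a short planning sketch. It assumes roughly 10 GB per sample (from ~1 TB for 100+ samples) and a factor of 2 for intermediate files; both numbers, the `"primary"` tissue label, and the sample tuples are illustrative assumptions, not properties of the dataset. The last helper shows where the -bg and -qs options would go on the Nextflow command line.

```python
def max_batch_size(quota_gb, per_sample_gb=10.0, overhead=2.0):
    """Number of samples that fit in the quota, allowing for intermediate files."""
    return max(1, int(quota_gb // (per_sample_gb * overhead)))


def plan_batches(samples, batch_size):
    """Split (accession, tissue) tuples into batches, primary tumors first."""
    # sorted() is stable, so the original order is kept within each group.
    ordered = sorted(samples, key=lambda s: s[1] != "primary")
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


def nextflow_batch_cmd(pipeline, params_file, queue_size):
    """Launch in the background (-bg) with a cap on parallel jobs (-qs)."""
    return ["nextflow", "-bg", "run", pipeline,
            "-params-file", str(params_file), "-qs", str(queue_size)]


if __name__ == "__main__":
    # Hypothetical accessions and tissue labels, for illustration only.
    samples = [("S1", "metastasis"), ("S2", "primary"),
               ("S3", "metastasis"), ("S4", "primary")]
    size = max_batch_size(quota_gb=40)
    for i, batch in enumerate(plan_batches(samples, size), start=1):
        print(f"batch {i}: {[acc for acc, _ in batch]}")
```

Deleting the FASTQ and intermediate files of a finished batch before downloading the next one keeps disk usage near the size of a single batch.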

Resources
