Home
In this course, you will learn the basics of medical genomics, with a special focus on cancer genomics, through lectures, practicals, and projects.
Fig 1. Schematic of computational analyses used in the Rare Cancers Genomics Initiative
- medical genomic concepts
- knowledge of resources (references, databases, workflow repositories)
- sequencing techniques
- basics of molecular biology
- basics of next-generation sequencing and chip sequencing
- R scripting
- basics of programming
Introduction: course objectives and organization
- Genomics: germline and somatic variation (SNVs, indels, structural variants, mutational signatures, cell-free DNA), resources (genome references, annotation, databases), sequencing strategies (whole-genome sequencing, whole-exome sequencing, arrays)
- Transcriptomics, multi-omics and beyond: heterogeneity and microenvironment, resources (tissue expression reference databases), sequencing strategies (bulk, single-cell), deconvolution, multi-omic integration, deep learning and integration with image analysis
- Metabolomics: overview of the metabolome and biomarkers, mass spectrometry, metabolomics data processing and analysis, metabolite identification and metabolic pathway analysis, resources (databases, workflow repositories)
- Epigenomics: chromatin and histone modification, resources (annotations and databases for tissue-specific profiles), ATAC-seq, bisulfite sequencing, methylation arrays, methylation quantification, peak calling, differentially methylated positions and regions, deconvolution and identification of cell types, inference of environmental risk factors
- TP1: Developing and deploying an open-source medical genomic bioinformatic workflow
- TP2: Performing a multi-omic analysis of cancer data with R
Several projects will be proposed, covering the processing (bioinformatic workflow development) and analysis of cancer data and reflecting the interests of researchers at the International Agency for Research on Cancer - WHO. Students will work in small groups (about 3-4 people) and meet weekly with their supervisor, in person or remotely.
The code used to perform the analyses will be annotated and handed in to the supervisor (e.g., R code for projects 1, 2, and 4, nextflow code for project 3). A final project presentation and debriefing session will be held at the end of the module (January 18 at 10 am), consisting of a 10-minute presentation by each group. Each group's grade is the average of:
- a project grade given by the supervisor, accounting for 50% of the final grade. It is based on the supervisor's assessment of how students addressed the project and related issues (focusing on the process rather than the end results).
- a presentation grade given by all supervisors, accounting for 50% of the final grade. It is based on the results from the project and the students' ability to clearly communicate them.
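As an illustration of the weighting above, here is a minimal Python sketch of how a group's final grade could be computed. The function name, the 0-20 scale in the example, and the choice to average the presentation grades across supervisors are assumptions made for illustration only; the course description only states the 50%/50% split.

```python
from statistics import mean


def final_group_grade(project_grade, presentation_grades):
    """Combine the two grade components with equal (50%/50%) weights.

    project_grade: single grade given by the project supervisor.
    presentation_grades: one grade per supervisor attending the presentation.
    """
    # Assumption: the presentation grade is the mean over all supervisors.
    presentation_grade = mean(presentation_grades)
    return 0.5 * project_grade + 0.5 * presentation_grade


# Hypothetical grades on a 0-20 scale.
print(final_group_grade(15.0, [14.0, 16.0, 18.0]))  # prints 15.5
```

Because both components carry the same weight, a strong presentation cannot fully offset a weak project process, and vice versa.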
Projects are the following:
- Project 1: Identifying tumor evolutionary trajectories from genomic data
- Project 2: Build a "sarcoma index" and assess its clinical relevance using RNA-seq data
- Project 3: Compare the transcriptomic profiles of adult and pediatric sarcomas using RNA-seq data
- Project 4: Implement a bioinformatic workflow to detect viral infections in tumors
- Project 5: Characterizing malignant pleural mesothelioma at single-cell resolution
- Project 6: Machine learning for metabolomics
- GATK Best practices
- nextflow: nf-core, IARCbioinfo
- snakemake: snakemake-workflows
- wdl: GATK
- jupyter notebook: jupyter
- binder: binder
[email protected] (Nicolas Alcala)