Home
In this course, you will learn the basics of medical genomics, with a special focus on cancer genomics, through lectures, practicals, and projects.
Fig 1. Schematic of computational analyses used in the Rare Cancers Genomics Initiative
- medical genomic concepts
- knowledge of resources (references, databases, workflow repositories)
- sequencing techniques
- basics of molecular biology
- basics of next-generation sequencing and chip sequencing
- R scripting
- basics of programming
Introduction: course objectives and organization
- Genomics: germline and somatic variation (SNVs, indels, structural variants, mutational signatures, cfDNA), resources (genome references, annotation, databases), sequencing strategies (whole-genome sequencing, whole-exome sequencing, arrays)
- Transcriptomics, multi-omics and beyond: heterogeneity and microenvironment, resources (tissue expression reference databases), sequencing strategies (bulk, single-cell), deconvolution, multi-omic integration, deep learning and integration with image analysis
- Epigenomics: chromatin and histone modification, resources (annotations and databases for tissue-specific profiles), ATAC-seq, bisulfite sequencing, methylation arrays, methylation quantification, peak calling, differentially methylated positions and regions, deconvolution and identification of cell types, inference of environmental risk factors
- Metabolomics: Overview of metabolome and biomarkers, mass-spectrometry, metabolomics data processing and analysis, metabolite identification and metabolic pathway analysis, resources (databases, workflow repositories)
- TP1: Developing and deploying an open-source medical genomic bioinformatic workflow
- TP2: Performing a multi-omic analysis of cancer data with R
Several projects will be proposed to process (bioinformatic workflow development) and analyze cancer data, related to the interests of researchers of the International Agency for Research on Cancer - WHO. Students will work in small groups (~3-4 people). Weekly meetings (in person or remotely) will take place with the supervisor.
The code used to perform the analyses will be annotated and submitted to the supervisor (e.g., R code or nextflow code, depending on the project). A final project presentation and debriefing session will be held at the end of the module, consisting of a 10-minute presentation by each group. Each group's final grade is the average of:
- a project grade given by the supervisor, accounting for 50% of the final grade. It is based on the supervisor's assessment of how students addressed the project and related issues (focusing on the process rather than the end results).
- a presentation grade given by all supervisors, accounting for 50% of the final grade. It is based on the results from the project and the students' ability to clearly communicate them.
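As an illustration of the weighting described above, here is a minimal sketch of how a group's final grade could be computed. The function name `final_grade` is hypothetical, and the page does not say how the grades from several supervisors are combined into one presentation grade; the sketch assumes a plain mean, and it makes no assumption about the grading scale.

```python
def final_grade(project_grade: float, presentation_grades: list[float]) -> float:
    """Combine the two components with equal (50%/50%) weights.

    Assumption (not stated in the course page): the presentation grade
    is the plain mean of the grades given by the individual supervisors.
    """
    if not presentation_grades:
        raise ValueError("at least one presentation grade is required")
    # Mean of the supervisors' presentation grades (assumed aggregation).
    presentation_grade = sum(presentation_grades) / len(presentation_grades)
    # Each component accounts for 50% of the final grade.
    return 0.5 * project_grade + 0.5 * presentation_grade


if __name__ == "__main__":
    # Hypothetical example values, on whatever scale the course uses.
    print(final_grade(16.0, [12.0, 14.0]))
```

With these example values, the project grade of 16 and a mean presentation grade of 13 each contribute half of the result.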
Projects are the following:
- Project 1: Genomic analysis of the hallmarks of cancer
- Project 2: Assessing the ability of the WHO classification of tumors to account for inter-patient molecular variation
- Project 3: TBD
- Project 4: Implement a bioinformatic workflow to reconstruct cell phylogenies from single-cell sequencing data
- Project 5: Machine learning for metabolomics
- GATK Best practices
- nextflow: nf-core, IARCbioinfo
- snakemake: snakemake-workflows
- wdl: GATK
- jupyter notebook: jupyter
- binder: binder
[email protected] (Nicolas Alcala)