Init #1

Open
8 of 27 tasks
hoelzer opened this issue Sep 22, 2023 · 5 comments
@hoelzer
Member

hoelzer commented Sep 22, 2023

This is the main issue for adding metagenomics content to the workshop material

The plan is that we have most of the material first in this repo and then we fork it for the GHPP SGS course end of September.

The rough plan is:

[image: rough plan for the workshop]

We can make a ToDo list here to keep track (and edit it). The person responsible is named at the end of each task:

General tasks

  • Make an OSF project to host example data; invite Andele and Hugues (Martin)
  • Collect example Zymo data set (from SOFO AS, the one on MinION w/o AS active) (Martin)
  • Prepare and test a standard kraken database for taxonomic read assignment (try to download the standard index from here) (Martin)
  • Prepare and test a user-defined kraken database holding only the Zymo species and leaving one out (Martin)
  • We need to check if it is at all possible to run all the tools on laptops...
  • Put the taxonomy folder for the NCBI kraken2 build on an external drive (40 GB)

Day 01 (Linux & intros)

  • Introduction talk about RKI MF1 and what we're doing, maybe adapting the one I have for FU? (Martin)
  • Introduction talk about metagenomics: Hugues on MetaSUB and Martin on India?
  • Get ready & Linux for Bioinformatics recap -- should be almost done, we can also start w/ a polleverywhere icebreaker (Martin)
  • The hands-on of this day should introduce the example data in the OSF and show how to create the necessary mamba environments and get the kraken db(s) (Martin)
  • Slide deck about Metagenome introduction workshop-nanopore-bioinformatics#11 highlighting the importance of correct metagenome analyses (the recent cancer microbiome Salzberg paper). Maybe we can show this at the end of Day 01 as a motivation also for the next days? (???)

Day 02 (ONT data & QC)

  • Should be almost ready from the MISSion course; double-check and adjust the hands-on part for the Zymo example data (Martin)
  • Let's also do mapping here, plus the download of the reference genomes, etc., so Day 03 can basically start directly with kraken2
  • Slide deck missing! Let's check if we have something ready or need to merge e.g. some ONT files/QC and mapping slides we already have

Day 03 (tax read classification)

  • Slide deck needed introducing the methods; it can basically follow this best practice Nature protocol paper (???)
    • We could start with a simple BLAST, to see what we hit?
    • And then briefly introduce mapping
    • Then transition to more efficient approaches using k-mers --> kraken2
  • Introduce tax read classification
  • Speak about taxonomic binning as well (and microbial source tracking).
  • Hands-on needs the download of kraken DBs, how to run it on example data, and maybe visualizing results via Krona and Pavian (???)
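
For the hands-on, a quick look at the kraken2 report before opening it in Krona or Pavian could be sketched as below. This is only a minimal sketch and not part of the workshop material: it assumes the standard six-column, tab-separated kraken2 report (percentage, reads in clade, reads assigned directly, rank code, NCBI taxid, indented name), and the names `parse_report` and `top_species` as well as the file name in the comment are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class ReportRow(NamedTuple):
    percent: float      # percentage of reads in the clade rooted at this taxon
    clade_reads: int    # reads in the clade rooted at this taxon
    direct_reads: int   # reads assigned directly to this taxon
    rank: str           # rank code, e.g. "U", "R", "D", "G", "S"
    taxid: int          # NCBI taxonomy ID
    name: str           # scientific name (indentation stripped)


def parse_report(lines: Iterable[str]) -> List[ReportRow]:
    """Parse a standard six-column kraken2 report (assumed format)."""
    rows = []
    for line in lines:
        if not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 6:
            raise ValueError(f"expected 6 columns, got {len(fields)}")
        rows.append(ReportRow(float(fields[0]), int(fields[1]), int(fields[2]),
                              fields[3], int(fields[4]), fields[5].strip()))
    return rows


def top_species(rows: Iterable[ReportRow], n: int = 10) -> List[ReportRow]:
    """Return the n species-level rows (rank code "S") with the most clade reads."""
    species = [r for r in rows if r.rank == "S"]
    return sorted(species, key=lambda r: r.clade_reads, reverse=True)[:n]


# Hypothetical usage, assuming a report written via kraken2's --report option:
# with open("zymo.kraken2.report.txt") as fh:
#     for row in top_species(parse_report(fh)):
#         print(row.name, row.clade_reads, row.percent)
```

Comparing the top species with the known Zymo composition gives a first sanity check of the database before any visualization.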

Day 04 (MAGs, binning, annotation)

  • Prepare a list of literature references for the day on binning and metagenome assemblies
  • Which tool for binning? --> good question; I would take something simple, but we can introduce the different binning approaches
  • Slide deck metagenome assembly+binning, using ANVIO comic pictures (Martin)
  • Also add information on QC of MAGs! CheckM, BUSCO maybe, ...
  • Slide deck bacteria gene annotation (???)
  • Rapid classification of MAG contigs via Sourmash and GTDB? --> could also be a bonus hands-on exercise
  • Prepare hands-on part

Day 05 (Misc)

  • Slide deck about misc stuff, interesting tools, ...
  • Introducing a fully automated pipeline? E.g. MUFFIN, paper

Good resources

@hoelzer
Member Author

hoelzer commented Sep 22, 2023

I made a new branch long-read-mags for all changes @huguesrichard Remember to pull & push regularly when working on the same branch ;)

I suggest that we adapt https://github.com/rki-mf1/workshop-nanopore-bioinformatics/blob/long-read-mags/day-nanopore/hands-on-mag.md from day-nanopore to do basically the same steps, but w/ some metagenomic long-read FASTQs as example data. I would keep the "normal" hands-on.md file for bacterial isolates in the folder day-nanopore and have just added a hands-on-mag.md next to it, where we can change the example data.

I also started an OSF repository where we can upload example data (unfortunately, I think the limit is 5 GB). It can be found here:

https://osf.io/fs53c/

Let me know your OSF account and I can also add you as a contributor to the OSF.

Please also see the main README https://github.com/rki-mf1/workshop-nanopore-bioinformatics/tree/long-read-mags, where I suggested two new days for the special metagenomics part.

I think for the first days we can mainly recycle what we have (linux, nanopore intro, qc, mapping, IGV, ....)

I also started two (empty) slide decks:

we can split it up more, but maybe it's also fine to have most of the things in one slide deck.

Finally, I started to look a bit into one Zymo mock log-distributed R9.4.1 sample we did in the context of the AS SOFO (8x bacteria, 2x fungi). Maybe we can down-sample it and then distribute it via the OSF as an example file for everyone to play with and for the hands-on. The participants can then use what they actually sequenced in the first week to re-do the steps on their own.
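
The down-sampling could be done roughly as sketched below. This is just one possible approach under stated assumptions, not what was actually run: it uses single-pass reservoir sampling, assumes plain four-line FASTQ records, and the function names `read_fastq` and `downsample`, the seed, and the file name in the comment are hypothetical.

```python
import random
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, ...]  # (header, sequence, separator, qualities)


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield four-line FASTQ records; blank lines are skipped."""
    stripped = (line.rstrip("\n") for line in lines if line.strip())
    while True:
        record = tuple(islice(stripped, 4))
        if not record:
            return
        if len(record) != 4 or not record[0].startswith("@"):
            raise ValueError("malformed FASTQ record")
        yield record


def downsample(records: Iterable[Record], n: int, seed: int = 42) -> List[Record]:
    """Reservoir sampling: keep n records uniformly at random in a single pass."""
    rng = random.Random(seed)
    reservoir: List[Record] = []
    for i, record in enumerate(records):
        if i < n:
            reservoir.append(record)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = record
    return reservoir


# Hypothetical usage on a gzipped read file:
# import gzip
# with gzip.open("zymo_log.fastq.gz", "rt") as fh:
#     subset = downsample(read_fastq(fh), n=50_000)
```

Since the mock community is log-distributed, uniform down-sampling keeps the proportions only on average, so the rarest organisms may drop out of a small subset; that is worth checking before uploading the file.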

I am also downloading a few kraken2 databases (standard with viruses, bacteria, and archaea; standard+fungi; and dereplicated GTDB) to see what we can classify with which DB. We will then probably also make our own small DB as an example, using e.g. genomes for the eight bacteria as reference.
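
A rough sketch of how such a small custom DB could be assembled, optionally leaving one species out as in the task list above. It only prepares the `kraken2-build` calls (`--download-taxonomy`, `--add-to-library`, `--build`) and does not run them. The function name `kraken2_build_commands`, the species keys, the FASTA paths, and the DB name are all hypothetical placeholders, and the sketch assumes that each FASTA header already carries an NCBI accession or a `kraken:taxid|...` tag so that kraken2 can place the sequences in the taxonomy.

```python
from typing import Dict, List, Optional


def kraken2_build_commands(db: str,
                           genomes: Dict[str, str],
                           leave_out: Optional[str] = None,
                           threads: int = 4) -> List[List[str]]:
    """Return the kraken2-build command lines for a small custom database.

    genomes maps a species label to a reference FASTA path; the species named
    in leave_out (if any) is not added to the library.
    """
    commands = [["kraken2-build", "--download-taxonomy", "--db", db]]
    for species, fasta in sorted(genomes.items()):
        if species == leave_out:
            continue
        commands.append(["kraken2-build", "--add-to-library", fasta, "--db", db])
    commands.append(["kraken2-build", "--build", "--threads", str(threads), "--db", db])
    return commands


# Hypothetical usage; paths and labels are placeholders, not real files:
# import subprocess
# genomes = {"species_a": "refs/species_a.fasta", "species_b": "refs/species_b.fasta"}
# for cmd in kraken2_build_commands("zymo_custom_db", genomes, leave_out="species_b"):
#     subprocess.run(cmd, check=True)
```

Reads from the left-out species should then end up unclassified or assigned to a relative, which makes a nice demonstration of how much the result depends on the database.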

@hoelzer
Member Author

hoelzer commented Sep 22, 2023

@huguesrichard slide deck for introducing ourselves and example metagenomics projects:

https://docs.google.com/presentation/d/1CmkZ28PHgQ8dVwmm-U82tjOR0sPEcOcTWkjQzWZd98o/edit?usp=sharing

  • MetaSUB
  • AquaDiva groundwater
  • India Ganges

We need to make sure that it's not getting too long ;)

Edit: maybe we move then our short intro slides to the general MF1 intro slides

@hoelzer
Member Author

hoelzer commented Sep 22, 2023

Hey @huguesrichard see what I just found :D

https://denbi-metagenomics-workshop.readthedocs.io/en/latest/index.html

(It's Illumina-based, but we might take a few things from here and adapt them for Day 04.)

@hoelzer
Member Author

hoelzer commented Sep 22, 2023

MF1 (Innovations) brief intro: https://docs.google.com/presentation/d/1piem2rL5s3ViyDOSpZDoqQ8k82iV4GC2irHBHo52qSM/edit?usp=sharing

It could maybe be enriched w/ the recent slides from Stephan about GCC MF1.

Should be fine, but I decided to rm this slide deck from google and keep it local. It's not important that students have that and there is more sensitive information, maybe. I have it in my cloud for now
