Adding bibliography for the human pathogen genomics domain page
bianchini88 committed Oct 22, 2024
1 parent 11fbf58 commit 9633ac2
Showing 2 changed files with 50 additions and 4 deletions.
44 changes: 43 additions & 1 deletion _bibliography/references.bib
@@ -1,4 +1,46 @@
@article{walsh2021Author,
@article{gut2021B1MG,
title = {{{B1MG D3}}.1 - {{Quality}} Metrics for Sequencing},
author = {Gut, Ivo and Estelles, Lucia and Cuppen, Edwin and Wirta, Valtteri and Hovig, Eivind and Volkert, Pim and Matthijs, Gert},
year = {2021},
month = jun,
publisher = {Zenodo},
doi = {10.5281/ZENODO.5018495},
urldate = {2024-10-22},
copyright = {Creative Commons Attribution 4.0 International, Open Access},
langid = {english},
}

@article{bush2020Evaluation,
title = {Evaluation of Methods for Detecting Human Reads in Microbial Sequencing Datasets},
author = {Bush, Stephen J. and Connor, Thomas R. and Peto, Tim E. A. and Crook, Derrick W. and Walker, A. Sarah},
year = {2020},
month = jul,
journal = {Microbial Genomics},
volume = {6},
number = {7},
issn = {2057-5858},
doi = {10.1099/mgen.0.000393},
urldate = {2024-10-22},
copyright = {http://creativecommons.org/licenses/by/4.0/},
langid = {english},
}

@article{stevens2020Ten,
title = {Ten Simple Rules for Annotating Sequencing Experiments},
author = {Stevens, Irene and Mukarram, Abdul Kadir and H{\"o}rtenhuber, Matthias and Meehan, Terrence F. and Rung, Johan and Daub, Carsten O.},
editor = {Markel, Scott},
year = {2020},
month = oct,
journal = {PLOS Computational Biology},
volume = {16},
number = {10},
issn = {1553-7358},
doi = {10.1371/journal.pcbi.1008260},
urldate = {2024-10-22},
langid = {english},
}

@article{walsh2021Author,
title = {Author {{Correction}}: {{DOME}}: Recommendations for Supervised Machine Learning Validation in Biology},
shorttitle = {Author {{Correction}}},
author = {Walsh, Ian and Fishman, Dmytro and {Garcia-Gasulla}, Dario and Titma, Tiina and Pollastri, Gianluca and {ELIXIR Machine Learning Focus Group} and Capriotti, Emidio and Casadio, Rita and {Capella-Gutierrez}, Salvador and Cirillo, Davide and Del Conte, Alessio and Dimopoulos, Alexandros C. and Del Angel, Victoria Dominguez and Dopazo, Joaquin and Fariselli, Piero and Fern{\'a}ndez, Jos{\'e} Maria and Huber, Florian and Kreshuk, Anna and Lenaerts, Tom and Martelli, Pier Luigi and Navarro, Arcadi and Broin, Pilib {\'O} and Pi{\~n}ero, Janet and Piovesan, Damiano and Reczko, Martin and Ronzano, Francesco and Satagopam, Venkata and Savojardo, Castrense and Spiwok, Vojtech and Tangaro, Marco Antonio and Tartari, Giacomo and Salgado, David and Valencia, Alfonso and Zambelli, Federico and Harrow, Jennifer and Psomopoulos, Fotis E. and Tosatto, Silvio C. E.},
10 changes: 7 additions & 3 deletions pages/your_domain/human_pathogen_genomics.md
@@ -62,7 +62,7 @@ While the objects of interest in this domain are pathogens, the data is usually

#### Sequencing experiments
* Good practices for genome experiments suggest that the documentation, at a minimum, should describe the design of the study or surveillance program, the collected specimens and how the samples were prepared, the experimental setup and protocols, and the analysis workflow.
* Adopt specific genomics and pathogen genomics recommendations such as [Ten simple rules for annotating sequencing experiments](https://doi.org/10.1371/journal.pcbi.1008260).
* Adopt specific genomics and pathogen genomics recommendations such as the ten simple rules for annotating sequencing experiments {% cite "stevens2020Ten" %}.
* Refer to the general guidance on providing [documentation and metadata](metadata_management) during your project.
* Adopt standards, conventions and robust protocols to maximise the reuse potential of the data in parallel initiatives and your future projects.
* The Genomic Standards Consortium (GSC) develops and maintains the {% tool "mixs" %} and the {% tool "migs-mims" %} set of core and extended descriptors for genomes and metagenomes with associated samples and their environment to guide scientists on how to capture the metadata essential for high-quality research.
@@ -95,7 +95,7 @@ While the objects of interest in this domain are pathogens, the data is usually
#### Filtering genomic reads corresponding to human DNA fragments

* Data files with reads produced by sequencing experiments sometimes contain fragments of the host organism’s DNA. When the host is a human research subject or patient, these fragments can be masked or removed to produce files that could potentially be handled with fewer restrictions. The approach chosen to mask the host-associated reads leads to different trade-offs. Make sure to include this as a factor in your risk assessment.
* Mapping to (human) host reference genomes [can inadvertently leave some host-associated reads unmasked](https://doi.org/10.1099%2Fmgen.0.000393).
* Mapping to (human) host reference genomes can inadvertently leave some host-associated reads unmasked {% cite "bush2020Evaluation" %}.
* Mapping to pathogen reference genomes can inadvertently mask some pathogen-associated reads and still leave some host-associated reads unmasked.
* [Removal of human reads from SARS-CoV-2 sequencing data \| Galaxy training](https://training.galaxyproject.org/training-material/topics/sequence-analysis/tutorials/human-reads-removal/tutorial.html)

@@ -111,7 +111,7 @@ While the objects of interest in this domain are pathogens, the data is usually
#### Generating genomic data
* Establish protocols and document the steps taken in the lab to process the sample and in the computational workflow to prepare the resulting data. Make sure to keep information from quality assurance procedures and strive to make your labwork and computational process as reproducible as possible.
* [High-Throughput Sequencing \| LifeScienceRDMLookUp](https://elixir.no/rdm-lookup/sequencing)
* The [Beyond 1 Million genomes project](https://b1mg-project.eu/) provides guidelines that cover the minimum [quality requirements](https://zenodo.org/record/5018495) for the generation of genome sequencing data.
* The [Beyond 1 Million genomes project](https://b1mg-project.eu/) provides guidelines that cover the minimum quality requirements {% cite "gut2021B1MG" %} for the generation of genome sequencing data.
* Data repositories generally have information about recommended [data file formats](data_publication) and [metadata](metadata_management).
* The {% tool "fair-cookbook" %} provides instructions on [validation of file formats](https://faircookbook.elixir-europe.org/content/recipes/interoperability/fastq-file-format-validators.html)
* A good place to look for scientific and technical information about data quality validation software tools for pathogenomics is {% tool "bio-tools" %}.
@@ -144,3 +144,7 @@ While the objects of interest in this domain are pathogens, the data is usually
* Investigate if there are [national resources](national_resources) or a [data brokering](data_brokering) organisation available to facilitate data sharing.
* {% tool "pathogens-portal" %} Data Hubs network for sensitive data.
* {% tool "covid-19-data-portal" %}

## Bibliography

{% bibliography --cited %}
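
Because this commit replaces inline links with `{% cite "..." %}` tags, each cited key has to exist in `_bibliography/references.bib` for the bibliography to render. The following is a minimal sketch of how that could be checked, not part of the commit itself. The function names and regular expressions are assumptions made for illustration, and the cite pattern only handles a single key per tag.

```python
import re

# Assumed pattern: a Liquid cite tag with one key, quoted or not,
# e.g. {% cite "bush2020Evaluation" %} or {%cite "stevens2020Ten" %}.
CITE_TAG = re.compile(r'\{%-?\s*cite\s+"?([^"\s%]+)"?\s*-?%\}')

# Assumed pattern: the opening of a BibTeX entry, e.g. @article{gut2021B1MG,
BIB_ENTRY = re.compile(r'@\w+\s*\{\s*([^,\s{}]+)\s*,')


def cited_keys(markdown_text: str) -> set[str]:
    """Return the citation keys used in cite tags in a Markdown page."""
    return set(CITE_TAG.findall(markdown_text))


def defined_keys(bibtex_text: str) -> set[str]:
    """Return the entry keys defined in a BibTeX file."""
    return set(BIB_ENTRY.findall(bibtex_text))


def missing_keys(markdown_text: str, bibtex_text: str) -> list[str]:
    """Return cited keys that have no matching BibTeX entry, sorted."""
    return sorted(cited_keys(markdown_text) - defined_keys(bibtex_text))


if __name__ == "__main__":
    # Hypothetical inputs standing in for the two files touched by the commit.
    page = (
        'Rules {%cite "stevens2020Ten" %}, reads {% cite "bush2020Evaluation" %}, '
        'quality {% cite "gut2021B1MG" %}.'
    )
    bib = (
        "@article{gut2021B1MG,\n  year = {2021},\n}\n"
        "@article{bush2020Evaluation,\n  year = {2020},\n}\n"
        "@article{stevens2020Ten,\n  year = {2020},\n}\n"
    )
    print(missing_keys(page, bib))
```

In practice the two strings would be read from `pages/your_domain/human_pathogen_genomics.md` and `_bibliography/references.bib`; an empty result means every cite tag resolves to an entry.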
