
Commit

Merge branch 'galaxyproject:main' into master
Delphine-L authored May 4, 2023
2 parents d543b81 + 682aa79 commit 167290f
Showing 32 changed files with 12,896 additions and 113 deletions.
4 changes: 4 additions & 0 deletions CONTRIBUTORS.yaml
@@ -1374,6 +1374,10 @@ Sofokli5:
orcid: 0000-0002-4833-4726
joined: 2022-05

sophia120199:
name: Sophia Hampe
joined: 2023-04

stephanierobin:
name: Stéphanie Robin
email: [email protected]
12 changes: 12 additions & 0 deletions SECURITY.md
@@ -0,0 +1,12 @@
# Security Policy

## Supported Versions

The GTN does not necessarily have a notion of 'supported versions' as our site is completely static and does not go through any sort of managed release cycle, unlike Galaxy.
All dynamic features implemented in our site are strictly client-side, which generally provides acceptable security guarantees and low risk of user-facing exploitation.

There are occasionally vulnerabilities in the tools we use to build the site; however, those vulnerabilities generally do not affect us or the output produced by Jekyll and related tools.

## Reporting a Vulnerability

Please report any vulnerabilities using [GitHub private reporting](https://github.com/galaxyproject/training-material/security/advisories/new). One of the maintainers will acknowledge your report within 5 business days (European).
2 changes: 1 addition & 1 deletion faqs/galaxy/interactive_tools_jupyter_launch.md
@@ -11,7 +11,7 @@ contributors: [annefou,shiltemann,nomadscientist]
>
> > <hands-on-title>Run JupyterLab</hands-on-title>
> >
> > 1. {% tool [Interactive Jupyter Notebook](interactive_tool_jupyter_notebook) %}:
> > 1. {% tool [Interactive Jupyter Notebook](interactive_tool_jupyter_notebook) %}. Note that on some Galaxies this is called {% tool [Interactive JupyTool and notebook](interactive_tool_jupyter_notebook) %}:
> > 2. Click Run Tool
> > 3. The tool will start running and will stay running permanently
> >
6 changes: 2 additions & 4 deletions learning-pathways/intro-to-galaxy-and-ecology.md
@@ -6,8 +6,7 @@ title: Introduction to Galaxy and Ecological data analysis
description: |
This learning path aims to teach you the basics of Galaxy and analysis of ecological data.
You will learn how to use Galaxy for analysis, and will be guided through the common
steps of biodiversity data analysis; download, check and filter GBIF data and analyze abundance data through modeling
sequences.
steps of biodiversity data analysis: download, check, filter and explore biodiversity data and analyze abundance data through modeling.
priority: 1

@@ -31,8 +30,7 @@ pathway:
topic: ecology
- name: gbif_cleaning
topic: ecology
- name: PAMPA-toolsuite-tutorial
topic: ecology


- section: "Module 3: Basics of Biodiversity abundance data analysis"
description: Working on abundance data, you often want to analyze it through modeling to compute and analyze biodiversity metrics.
4 changes: 2 additions & 2 deletions topics/admin/tutorials/ansible-galaxy/tutorial.md
@@ -1779,7 +1779,7 @@ Galaxy is now configured with an admin user, a database, and a place to store da
> > Mar 16 01:15:15 gat systemd[1]: galaxy-gunicorn.service: Consumed 3.381s CPU time.
> > ```
> >
> > Check your /srv/galaxy/config/galaxy.yml and ensure that it lines up exactly with what you expect.
> > Check your /srv/galaxy/config/galaxy.yml and ensure that it lines up exactly with what you expect. You might observe a warning that `Dynamic handlers are configured in Gravity but Galaxy is not configured to assign jobs to handlers dynamically`. We will address this [below](#job-configuration), and you can disregard it for now.
> {: .tip}
>
> 6. Some things to note:
@@ -1814,7 +1814,7 @@ With this we have:
- PostgreSQL running
- Galaxy running (managed by Gravity + systemd)

Although Gunicorn can server HTTP for us directly, a reverse proxy in front of Gunicorn can automatically compress selected content, and we can easily apply caching headers to specific types of content like CSS or images. It is also necessary if we want to serve multiple sites at once, e.g. with a group website at `/` and Galaxy at `/galaxy`. Lastly, it can provide authentication as well, as noted in the [External Authentication]({{ site.baseurl }}/topics/admin/tutorials/external-auth/tutorial.html) tutorial.
Although Gunicorn can serve HTTP for us directly, a reverse proxy in front of Gunicorn can automatically compress selected content, and we can easily apply caching headers to specific types of content like CSS or images. It is also necessary if we want to serve multiple sites at once, e.g. with a group website at `/` and Galaxy at `/galaxy`. Lastly, it can provide authentication as well, as noted in the [External Authentication]({{ site.baseurl }}/topics/admin/tutorials/external-auth/tutorial.html) tutorial.

For this, we will use NGINX (pronounced "engine X" /ˌɛndʒɪnˈɛks/ EN-jin-EKS). It is possible to configure Galaxy with Apache and potentially other webservers but this is not the configuration that receives the most testing. We recommend NGINX unless you have a specific need for Apache.

10 changes: 5 additions & 5 deletions topics/admin/tutorials/interactive-tools/tutorial.md
Expand Up @@ -112,7 +112,7 @@ We will use several Ansible roles for this tutorial. In order to avoid repetetiv
>
> ```yaml
> - src: geerlingguy.docker
> version: 2.6.0
> version: 6.1.0
> - src: usegalaxy_eu.gie_proxy
> version: 0.0.2
> ```
@@ -409,7 +409,7 @@ As we use Let's Encrypt in staging mode, the wildcard certificates generated wit
> #certbot_auth_method: --webroot
> ```
>
> Although this is not explicitly required (setting `cerbot_dns_provider` as we do overrides this setting), doing so is less confusing in the future, since it makes it clear that the "webroot" method for Let's Encrypt WEB-01 challenges is no longer in use for this server.
> Although this is not explicitly required (setting `certbot_dns_provider` as we do overrides this setting), doing so is less confusing in the future, since it makes it clear that the "webroot" method for Let's Encrypt WEB-01 challenges is no longer in use for this server.
>
> - Add the following lines to your `group_vars/galaxyservers.yml` file:
>
@@ -515,7 +515,7 @@ As we use Let's Encrypt in staging mode, the wildcard certificates generated wit
> #certbot_auth_method: --webroot
> ```
>
> Although this is not explicitly required (setting `cerbot_dns_provider` as we do overrides this setting), doing so is less confusing in the future, since it makes it clear that the "webroot" method for Let's Encrypt WEB-01 challenges is no longer in use for this server.
> Although this is not explicitly required (setting `certbot_dns_provider` as we do overrides this setting), doing so is less confusing in the future, since it makes it clear that the "webroot" method for Let's Encrypt WEB-01 challenges is no longer in use for this server.
>
> - Add the following lines to your `group_vars/galaxyservers.yml` file:
>
@@ -586,7 +586,7 @@ A few Interactive Tool wrappers are provided with Galaxy, but they are [commente
> </toolbox>
> ```
>
> 2. We need to modify `job_conf.xml` to instruct Galaxy on how run Interactive Tools (and specifically, how to run them in Docker). We will begin with a basic job conf:
> 2. We need to modify `job_conf.xml` to instruct Galaxy on how to run Interactive Tools (and specifically, how to run them in Docker). We will begin with a basic job conf:
>
> Create `templates/galaxy/config/job_conf.xml.j2` with the following contents:
>
@@ -644,7 +644,7 @@ A few Interactive Tool wrappers are provided with Galaxy, but they are [commente
> ```
> {% endraw %}
>
> Next, inform `galaxyproject.galaxy` of where you would like the `job_conf.xml` to reside, that GxITs should be enabled, and where the GxIT map database can be found:
> Next, inform `galaxyproject.galaxy` of where you would like the `job_conf.xml` to reside, that GxITs should be enabled, and where the GxIT map database can be found. Watch for other conflicting configurations from previous tutorials (e.g. `job_config: ...`):
>
> {% raw %}
> ```yaml
4 changes: 2 additions & 2 deletions topics/genome-annotation/tutorials/apollo/slides.html
@@ -83,9 +83,9 @@

- A Human finds problems algorithms can't

.pull-left[.image-90[![Schema showing how automated annotation, experimental evidences (cDNAs, HMM domain searches, RNASeq, similarity with other species), and human analysis are used by Apollo to manually curate an annotation](../../images/apollo/apollo_workflow.png)]]
.pull-left[.image-75[![Schema showing how automated annotation, experimental evidences (cDNAs, HMM domain searches, RNASeq, similarity with other species), and human analysis are used by Apollo to manually curate an annotation](../../images/apollo/apollo_workflow.png)]]

.pull-right[.image-40[![Apollo screenshot showing how RNASeq reads align mostly within some exons limits, but not perfectly](../../images/apollo/rnaseq_cov.png)]]
.pull-right[.image-25[![Apollo screenshot showing how RNASeq reads align mostly within some exons limits, but not perfectly](../../images/apollo/rnaseq_cov.png)]]

???

17 changes: 17 additions & 0 deletions topics/metagenomics/faqs/kraken.md
@@ -0,0 +1,17 @@
---
title: Kraken2 and the k-mer approach for taxonomy classification
area: format
box_type: details
layout: faq
contributors: [bebatut]
---

In the $$k$$-mer approach for taxonomy classification, we use a database containing DNA sequences of genomes whose taxonomy we already know. On a computer, the genome sequences are broken into short pieces of length $$k$$ (called $$k$$-mers), usually 30bp.

**Kraken** examines the $$k$$-mers within the query sequence, searches for them in the database, looks for where these are placed within the taxonomy tree inside the database, makes the classification with the most probable position, then maps $$k$$-mers to the lowest common ancestor (LCA) of all genomes known to contain the given $$k$$-mer.

![Kraken2]({{site.baseurl}}/topics/metagenomics/faqs/images/kmers-kraken.jpg "Kraken sequence classification algorithm. To classify a sequence, each k-mer in the sequence is mapped to the lowest common ancestor (LCA, i.e. the lowest node) of the genomes that contain that k-mer in the database. The taxa associated with the sequence's k-mers, as well as the taxa's ancestors, form a pruned subtree of the general taxonomy tree, which is used for classification. In the classification tree, each node has a weight equal to the number of k-mers in the sequence associated with the node's taxon. Each root-to-leaf (RTL) path in the classification tree is scored by adding all weights in the path, and the maximal RTL path in the classification tree is the classification path (nodes highlighted in yellow). The leaf of this classification path (the orange, leftmost leaf in the classification tree) is the classification used for the query sequence. Source: {% cite Wood2014 %}")

__Kraken2__ uses a compact hash table, a probabilistic data structure that allows for faster queries and lower memory requirements. It applies a spaced seed mask of _s_ spaces to the minimizer and calculates a compact hash code, which is then used as a search query in its compact hash table; the lowest common ancestor (LCA) taxon associated with the compact hash code is then assigned to the k-mer.

You can find more information about the __Kraken2__ algorithm in the paper [_Improved metagenomic analysis with Kraken 2_](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1891-0).
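The LCA mapping and root-to-leaf (RTL) path scoring described above can be sketched as follows. This is a toy illustration using a hand-made taxonomy and $$k$$-mer database, not the actual Kraken implementation (which uses far longer k-mers and an optimized on-disk database):

```python
from collections import Counter

K = 4  # toy value; real Kraken databases typically use much longer k-mers

# parent links of a tiny taxonomy tree: root -> bacteria -> two species
PARENT = {"bacteria": "root", "e_coli": "bacteria", "salmonella": "bacteria"}

# k-mer -> LCA taxon of all database genomes containing that k-mer
DB = {"ACGT": "e_coli", "CGTA": "e_coli", "GTAC": "bacteria", "TACG": "salmonella"}

def path_to_root(taxon):
    """Return the list of taxa from `taxon` up to the root."""
    path = [taxon]
    while taxon in PARENT:
        taxon = PARENT[taxon]
        path.append(taxon)
    return path

def classify(read):
    # weight each taxon by the number of the read's k-mers mapped to it
    weights = Counter()
    for i in range(len(read) - K + 1):
        taxon = DB.get(read[i:i + K])
        if taxon is not None:
            weights[taxon] += 1
    if not weights:
        return "unclassified"
    # score each RTL path by summing the weights along it and report
    # the taxon at the end of the maximal path
    return max(weights, key=lambda t: sum(weights.get(x, 0) for x in path_to_root(t)))

print(classify("ACGTAC"))  # two k-mers hit e_coli, one hits bacteria -> e_coli
```

Here the k-mers `ACGT` and `CGTA` vote for *e_coli* and `GTAC` for its ancestor *bacteria*, so the path root → bacteria → e_coli scores 3 and wins.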
2 changes: 2 additions & 0 deletions topics/metagenomics/faqs/taxon.md
@@ -29,6 +29,8 @@ Family | Felidae
Genus | *Felis*
Species | *F. catus*

From this classification, one can generate a tree of life, also known as a phylogenetic tree. It is a rooted tree that describes the relationships of all life on Earth. At the root sits the "last universal common ancestor", and the three main branches (in taxonomy also called domains) are bacteria, archaea and eukaryotes. The key idea is that all life on Earth is derived from a common ancestor, so when comparing any two species you will, sooner or later, find a common ancestor of both.
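As a small illustration of this idea, one can represent each species by its lineage (root first) and walk both lineages in parallel until they diverge; the last shared taxon is the lowest common ancestor. The lineages below are abbreviated examples, not a complete taxonomy:

```python
# Abbreviated, root-first lineages (illustrative only)
cat = ["root", "Eukaryota", "Animalia", "Chordata", "Mammalia",
       "Carnivora", "Felidae", "Felis", "F. catus"]
dog = ["root", "Eukaryota", "Animalia", "Chordata", "Mammalia",
       "Carnivora", "Canidae", "Canis", "C. familiaris"]

def lowest_common_ancestor(lineage_a, lineage_b):
    """Return the last taxon shared by two root-first lineages."""
    lca = None
    for a, b in zip(lineage_a, lineage_b):
        if a != b:
            break
        lca = a
    return lca

print(lowest_common_ancestor(cat, dog))  # Carnivora
```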

Let's explore taxonomy in the Tree of Life, using [Lifemap](https://lifemap.univ-lyon1.fr/)

<iframe id="Lifemap" src="https://lifemap.univ-lyon1.fr/explore.html" frameBorder="0" width="100%" height="600px"></iframe>
10 changes: 1 addition & 9 deletions topics/metagenomics/tutorials/beer-data-analysis/tutorial.md
@@ -317,15 +317,7 @@ One of the main aims in microbiome data analysis is to identify the organisms se

Taxonomic assignment or classification is the process of assigning an **Operational Taxonomic Unit** (OTU; that is, a group of related individuals, a taxon) to sequences. To assign an OTU to a sequence it is compared against a database, but this comparison can be done in different ways, with different bioinformatics tools. Here we will use **Kraken2** ({% cite wood2019improved %}).

> <details-title>Kraken2 and the k-mer approach for taxonomy classification</details-title>
>
> In the $$k$$-mer approach for taxonomy classification, we use a database containing DNA sequences of genomes whose taxonomy we already know. On a computer, the genome sequences are broken into short pieces of length $$k$$ (called $$k$$-mers), usually 30bp.
>
> **Kraken** examines the $$k$$-mers within the query sequence, searches for them in the database, looks for where these are placed within the taxonomy tree inside the database, makes the classification with the most probable position, then maps $$k$$-mers to the lowest common ancestor (LCA) of all genomes known to contain the given $$k$$-mer.
>
> ![Kraken2](../../images/metagenomics-nanopore/kmers-kraken.jpg "Kraken sequence classification algorithm. To classify a sequence, each k-mer in the sequence is mapped to the lowest common ancestor (LCA, i.e. the lowest node) of the genomes that contain that k-mer in the database. The taxa associated with the sequence's k-mers, as well as the taxa's ancestors, form a pruned subtree of the general taxonomy tree, which is used for classification. In the classification tree, each node has a weight equal to the number of k-mers in the sequence associated with the node's taxon. Each root-to-leaf (RTL) path in the classification tree is scored by adding all weights in the path, and the maximal RTL path in the classification tree is the classification path (nodes highlighted in yellow). The leaf of this classification path (the orange, leftmost leaf in the classification tree) is the classification used for the query sequence. Source: {% cite Wood2014 %}")
>
{: .details}
{% snippet topics/metagenomics/faqs/kraken.md %}

> <hands-on-title>Kraken2</hands-on-title>
>
@@ -264,13 +264,7 @@ One of the key steps in metagenomic data analysis is to identify the taxon to wh

To perform the taxonomic classification we will use __Kraken2__ ({% cite Wood2019 %}). This tool uses the minimizer method to sample the k-mers (all of a read's subsequences of length _k_) in a deterministic fashion in order to reduce memory consumption and processing time. In addition, it masks low-complexity regions in the reference sequences by using __dustmasker__.
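The minimizer idea can be sketched as follows: for every k-mer of a read, keep only the lexicographically smallest l-mer (with l < k) it contains; adjacent k-mers usually share a minimizer, so far fewer distinct strings have to be queried against the database. This is a simplified illustration (Kraken2 additionally applies a spaced seed mask and hashing):

```python
def minimizers(seq, k=8, l=4):
    """Return the deduplicated sequence of minimizers over all k-mers of seq."""
    out = []
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # the minimizer is the smallest l-mer contained in this k-mer
        m = min(kmer[j:j + l] for j in range(k - l + 1))
        # consecutive k-mers often repeat the same minimizer; keep it once
        if not out or out[-1] != m:
            out.append(m)
    return out

print(minimizers("ACGTACGTAC"))  # all three k-mers share the minimizer "ACGT"
```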


> <comment-title></comment-title>
> __Kraken2__ uses a compact hash table, a probabilistic data structure that allows for faster queries and lower memory requirements. It applies a spaced seed mask of _s_ spaces to the minimizer and calculates a compact hash code, which is then used as a search query in its compact hash table; the lowest common ancestor (LCA) taxon associated with the compact hash code is then assigned to the k-mer.
> You can find more information about the __Kraken2__ algorithm in the paper [_Improved metagenomic analysis with Kraken 2_](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1891-0).
{: .comment}

![Taxonomic classification](../../images/metagenomics-nanopore/kmers-kraken.jpg "Kraken2 sequence classification algorithm. To classify a sequence, each l-mer is mapped to the lowest common ancestor (LCA) of the genomes that contain that l-mer in a database. In the classification tree, each node has a weight equal to the number of l-mers in the sequence associated with the node’s taxon. Image originally published in {% cite Wood2014 %}.")
{% snippet topics/metagenomics/faqs/kraken.md %}

For this tutorial, we will use the __SILVA database__ ({% cite Quast2012 %}). It includes over 3.2 million 16S rRNA sequences from the _Bacteria_, _Archaea_ and _Eukaryota_ domains.
