Integration of human single cell datasets across the gastrointestinal tract.
Welcome to the Human Pan-Gastrointestinal Cell Atlas GitHub repository! Here you will find notebooks and scripts related to the publication, along with a tutorial on how to map new data to the atlas. You can explore and download the atlas at gutcellatlas.org.
For a comprehensive explanation of the atlas, please check out the paper. In brief, the Pan-GI atlas brings together 25 single-cell RNA sequencing datasets from across the gastrointestinal (GI) tract and the human lifespan. The resource encompasses a healthy reference atlas of ~1.1 million cells from GI samples of 189 healthy controls and an extended atlas with an additional 500k cells from 5 GI diseases. The data were processed uniformly using a newly developed automated quality control approach (scAutoQC), described in the paper. The atlas was used to investigate cellular changes in inflammatory intestinal diseases (coeliac disease, ulcerative colitis and Crohn’s disease), identifying metaplastic cells with inflammatory signalling circuits. Users can investigate their own hypotheses within the existing datasets, and can map newly generated data to the Pan-GI atlas to help annotate and contextualise it.
To explore and download the atlas, please go to gutcellatlas.org.
Browse: A cellxgene browser on gutcellatlas.org lets you explore the main atlas of 1.6 million cells. Cellxgene browsers for each cell lineage are under construction and will allow the data to be browsed in more detail. Please check back for updates on their availability, likely in early 2025. In the meantime, you can use the cell selection and subsetting functions in cellxgene.
Download: The website provides download links for the core atlas, the extended atlas and the lineage-specific objects (.h5ad and .rds). It also links to the scANVI-based reference models for mapping and annotating your own data against the atlas, and to CellTypist models for annotating new data.
Map and annotate: To map and annotate newly generated data, you will need to download objects and/or models, depending on the research question and method. To map and annotate using scANVI-based reference models, please see our tutorial. To annotate using CellTypist, please see the existing tutorials on the CellTypist website. Any single-cell data can be mapped and annotated using the atlas. However, for the most accurate results we recommend first processing the data with scAutoQC (link to github page and tutorial).
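As a rough illustration of the CellTypist route described above, the sketch below annotates a query dataset with a downloaded model. It is a minimal example under stated assumptions, not the workflow used in the paper: the file names are hypothetical, the query object is assumed to hold raw counts in `.X`, and `summarise_labels` is a helper introduced here purely for illustration. Please follow the CellTypist tutorials for the recommended procedure.

```python
from collections import Counter


def annotate_with_celltypist(query_h5ad, model_pkl):
    """Annotate a query dataset with a downloaded CellTypist model (sketch)."""
    # Heavy dependencies are imported inside the function so that the
    # helper below can be used without them installed.
    import celltypist
    import scanpy as sc

    adata = sc.read_h5ad(query_h5ad)

    # Assumption: adata.X holds raw counts. CellTypist expects expression
    # values log1p-normalised to 10,000 counts per cell.
    sc.pp.normalize_total(adata, target_sum=1e4)
    sc.pp.log1p(adata)

    # majority_voting=True refines per-cell predictions within local clusters.
    predictions = celltypist.annotate(adata, model=model_pkl, majority_voting=True)

    # to_adata() adds 'predicted_labels', 'over_clustering',
    # 'majority_voting' and 'conf_score' columns to .obs.
    return predictions.to_adata()


def summarise_labels(labels):
    """Return (label, count) pairs, most frequent label first."""
    return Counter(labels).most_common()
```

With a query file and a model downloaded from the website, usage might look like `annotated = annotate_with_celltypist("my_query.h5ad", "pan_gi_model.pkl")` followed by `summarise_labels(annotated.obs["majority_voting"])`; both file names are placeholders for your own paths.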
If you have questions or run into problems, please submit an issue to this GitHub repository.
notebooks: this folder contains all notebooks
scripts: this folder contains scripts
Core datasets (remapped with the STARsolo pipeline v1.0, using STAR v2.7.9a and the Cell Ranger 2020-A human reference):
- Williams et al., Cell, 2021, DOI
- Caetano et al., eLife, 2021, DOI
- Huang et al., Cell, 2019, DOI
- Jaeger et al., Nat. Comm., 2021, DOI
- Elmentaite et al., Dev. Cell, 2020, DOI
- Wang et al., J. Exp. Med., 2020, DOI
- Kinchen et al., Cell, 2018, DOI
- Parikh et al., Nature, 2019, DOI
- James et al., Nat. Immunol., 2020, DOI
- Lee et al., Nature Genetics, 2020, DOI
- Uzzan et al., Nat. Med., 2022, DOI
- Che et al., Cell Discov., 2021, DOI
- Domínguez Conde et al., Science, 2022, DOI
- Elmentaite et al., Nature, 2021, DOI
- Holloway et al., Cell Stem Cell, 2020, DOI
- Yu et al., Cell, 2021, DOI
- Li et al., Nat. Immunol., 2019, DOI
- Martin et al., Cell, 2019, DOI
- He et al., Genome Biol., 2020, DOI
- Jeong et al., Clin. Cancer Res., 2021, DOI
- Kim et al., NPJ Precis. Oncol., 2022, DOI
- Madissoon et al., Genome Biol., 2019, DOI
- Pagella et al., iScience, 2021, DOI
- Chen et al., J. Dent. Res., 2022, DOI
- Costa-da-Silva et al., iScience, 2022, DOI
Extended atlas (original count matrices used):