Skip to content

boll3/metaboliteIDmapping

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

33 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

The metabolitIDmapping R package

The metaboliteIDmapping AnnotationHub package provides a comprehensive ID mapping for various metabolite ID formats. Within this annotation package, nine different ID formats and metabolite common names are merged in one large mapping table. ID formats include Comptox Chemical Dashboard IDs (DTXCID, DTXSID), Pubchem IDs (CID, SID), CAS Registry numbers (CAS-RN), Human Metabolome Database (HMDB), Chemical Entities of Biological Interest (ChEBI), KEGG Compounds (KEGG), and Drugbank (Drugbank)

The metabolite IDs and names were retrieved from four different publicly available sources and merged into one mapping table by means of the R script that is distributed alongside the AnnotationHub package.

For detailed information about the data sources please have a look in the vignette at our Bioconductor page

Installation

It is recommended to install the metaboliteIDmapping package via Bioconductor. Therefore, start R (version 4.0) and enter:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

# The following initializes usage of Bioc devel
BiocManager::install(version='devel')

BiocManager::install("metaboliteIDmapping")

Usage

There are two different ways to load the mapping ID table from this package.

First, simply load the metaboliteIDmapping package into your R session. When the package is loaded, the data will be available as tibble:

library( metaboliteIDmapping)

metabolitesMapping

Second, search for the mapping table in the AnnotationHub resource interface:

library( AnnotationHub)

ah <- AnnotationHub()
datasets <- query( ah, "metaboliteIDmapping")

Currently, there are two versions of the mapping table.

  • AH79817 represents the original ID mapping containing 9 different ID formats
  • AH83115 is the current mapping table which also includes common names for each compound

For implanting this data in your code, it is recommended to use the AHid for retrieval:

data <- ah[["AH83115"]]

LICENSE

Copyright (C) 2011 - 2020 Helmholtz Centre for Environmental Research UFZ.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the UFZ License document for more details: https://github.com/yigbt/metaboliteIDmapping/blob/master/LICENSE.md

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages