
Annotating Sample terms using OntoMaton

Yasset Perez-Riverol edited this page Jul 2, 2021 · 3 revisions

The current IDF is generated from the submission.px file format in ProteomeXchange, which submitters annotate at the moment of submission using the ProteomeXchange Submission Tool. However, the new SDRF for proteomics (the file format that contains the metadata about the samples and the RAW files) MUST be annotated manually by the PX submitters.

As of July 2021, the ProteomeXchange consortium has not yet developed an annotation tool for SDRF-Proteomics files. However, some existing tools that have been used to annotate SDRFs in transcriptomics can also be used to annotate SDRF-Proteomics files.

OntoMaton

OntoMaton provides ontology search and tagging within Google Spreadsheets. An SDRF-Proteomics file is, in a nutshell, a spreadsheet whose first line holds the column names, and each column name is a property of a sample (characteristics) or of a data file (comment). This makes OntoMaton a suitable tool for annotating the samples.
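The column layout described above can be illustrated with a short Python sketch. It assumes the usual SDRF-Proteomics header spelling, such as characteristics[organism] and comment[data file]. The sample header row and the helper name classify_columns are hypothetical and only serve the illustration.

```python
import re

# Matches headers such as "characteristics[organism]" or "comment[data file]".
HEADER_PATTERN = re.compile(r"^(characteristics|comment)\[(.+)\]$", re.IGNORECASE)


def classify_columns(header_row):
    """Group column names into sample properties, data file properties and others."""
    groups = {"characteristics": [], "comment": [], "other": []}
    for name in header_row:
        cleaned = name.strip()
        match = HEADER_PATTERN.match(cleaned)
        if match:
            # group(1) is the column type, group(2) the property name in brackets.
            groups[match.group(1).lower()].append(match.group(2))
        else:
            groups["other"].append(cleaned)
    return groups


# Hypothetical first line of an SDRF-Proteomics spreadsheet (tab-separated).
header = "source name\tcharacteristics[organism]\tcharacteristics[disease]\tcomment[data file]"
columns = classify_columns(header.split("\t"))
print(columns)
```

The property names collected under "characteristics" and "comment" are the values a submitter would look up in an ontology for each row.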

OntoMaton has the following advantages:

  • All the functionalities of Google Spreadsheets remain available, including copying and pasting values, moving values from one column to another, deleting values, etc.
  • OntoMaton lets you search for terms in the OLS without leaving the Google Spreadsheet, which makes it easier to find the proper term for a sample property.

Installation

With the new add-on infrastructure, installation is very easy.

  1. Click on the 'Add-ons' menu item in your Google Spreadsheet.
  2. Click on 'Get add-ons...' and then search for 'OntoMaton'. OntoMaton should appear in the search results; click on its entry to read more about it.
  3. To install, click on '+FREE'. You will need to authorise the OntoMaton add-on to access your spreadsheets and to connect to external services (the ontology search services we support).
  4. You'll then have the OntoMaton app installed.

You can access it through the 'Add-ons' menu.

Ontology Search

From OntoMaton, you can search three different services within one tool: NCBO BioPortal, Linked Open Vocabularies (LOV) and the EBI Ontology Lookup Service (OLS), and insert the terms directly into your Google Spreadsheet. Full term provenance is recorded, which supports later downstream analysis.
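To give an idea of what a term lookup against one of these services involves, the sketch below builds a search URL for the EBI Ontology Lookup Service. It is not OntoMaton's own code. The base URL and the q and ontology parameters are assumptions about the public OLS search API that should be checked against the current OLS documentation, and the function name build_ols_query is hypothetical. The code only builds the URL and makes no network request.

```python
from urllib.parse import urlencode

# Assumed base URL of the OLS search endpoint; verify before relying on it.
OLS_SEARCH_URL = "https://www.ebi.ac.uk/ols4/api/search"


def build_ols_query(term, ontology=None):
    """Return a search URL for a free-text term, optionally limited to one ontology."""
    params = {"q": term}
    if ontology:
        # Assumed parameter that restricts the search to a single ontology.
        params["ontology"] = ontology
    return OLS_SEARCH_URL + "?" + urlencode(params)


url = build_ols_query("breast cancer", ontology="efo")
print(url)
```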

Ontology Tagging

With OntoMaton, you can select a number of spreadsheet cells and then 'tag' them. This means that OntoMaton will take the terms in the cells and send them to BioPortal's Annotator service. The results come back as a list of the free-text terms, each shown with all of its matches in BioPortal.

Configuring OntoMaton - Settings

From the settings screen, you can configure:

  • How terms should be inserted into the spreadsheet when not in 'ISA mode' (that is, when the next columns aren't named 'Term Source REF' or 'Term Source Accession'). The two options are either as a hyperlink to the term in BioPortal/OLS/LOV or as a term name with the hyperlink in parentheses.
  • Restrictions, which specify, for zero or more columns (identified by the name in their first cell), limits on the search space for each of the ontology lookup services we use (BioPortal/OLS/LOV). For example, the column 'Label' can be restricted to terms from the Chemical Entities of Biological Interest ontology (ChEBI). Please note that a restriction defined for one service does not affect the others: if a column has a restriction over the BioPortal service, the restriction has no effect when searching terms with OLS.
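The two insertion styles from the first setting can be sketched as follows. This is an illustration only: the use of the Google Sheets HYPERLINK formula for the first style is an assumption about how a hyperlinked cell could be written, not a description of OntoMaton's internals, and both function names are hypothetical.

```python
def as_hyperlink(term_name, term_url):
    """Style 1: a cell that shows the term name and links to the term."""
    # Double any quotes so the label stays a valid formula string.
    label = term_name.replace('"', '""')
    return '=HYPERLINK("{}", "{}")'.format(term_url, label)


def as_name_with_link(term_name, term_url):
    """Style 2: the term name followed by the link in parentheses."""
    return "{} ({})".format(term_name, term_url)


# Example term: Homo sapiens in the NCBI Taxonomy ontology.
name = "Homo sapiens"
iri = "http://purl.obolibrary.org/obo/NCBITaxon_9606"
print(as_hyperlink(name, iri))
print(as_name_with_link(name, iri))
```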

Restricting OntoMaton's search space

When you add a restriction using the 'Settings' panel for the first time, a 'Restrictions' sheet will be added automatically. This sheet will have the following column headers: Column Name | Ontology | Branch | Version | Ontology Name | Service. You can then define, for a particular column header in your spreadsheet, which ontology (or list of ontologies) should be searched and over which service (BioPortal, OLS or LOV). A restriction only applies when the corresponding service is used for the search.

Additionally, within one ontology restriction, for BioPortal searches, you can restrict to a particular branch of an ontology, providing a way to further restrict the search space.
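The behaviour of the 'Restrictions' sheet can be modelled with a small Python sketch. The column headers are the ones listed above. The cell values in the example row (for instance the ontology acronym "CHEBI") and the helper name ontologies_for are hypothetical, and the matching logic is a simplified reading of the rules described here rather than OntoMaton's actual implementation.

```python
RESTRICTION_HEADERS = [
    "Column Name", "Ontology", "Branch", "Version", "Ontology Name", "Service",
]

# Hypothetical content of a 'Restrictions' sheet: one restriction on 'Label'.
restrictions = [
    dict(zip(RESTRICTION_HEADERS, [
        "Label", "CHEBI", "", "",
        "Chemical Entities of Biological Interest", "BioPortal",
    ])),
]


def ontologies_for(column_name, service, rows):
    """Return the ontologies to search for a column; an empty list means unrestricted."""
    return [
        row["Ontology"]
        for row in rows
        # A restriction applies only when both the column and the service match.
        if row["Column Name"] == column_name
        and row["Service"].lower() == service.lower()
    ]


print(ontologies_for("Label", "BioPortal", restrictions))
print(ontologies_for("Label", "OLS", restrictions))
```

The second call returns an empty list because the only restriction on 'Label' is tied to BioPortal, so an OLS search on that column stays unrestricted.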

An example of a Google Spreadsheet with such functionality can be viewed here: https://docs.google.com/spreadsheet/ccc?key=0Al5WvYyk0zzmdDNLeEcxWHZJX042dS0taXJPNXpJMHc

Video Tutorial

Access the video tutorial showing how to install and use OntoMaton (version 1) here.

Questions

If you have any queries, please email us at link. For bug reports, please use the issue page here.