New tool mzxmlshaper
Tool that enables the conversion of files to XCMS-compatible open formats
Quentin Ruin committed Sep 20, 2024
1 parent eca29d4 commit 189ec17
Showing 7 changed files with 224,074 additions and 0 deletions.
57 changes: 57 additions & 0 deletions tools/mzxmlshaper/README.md
@@ -0,0 +1,57 @@
# mzXML Shaper


Metadata
-----------

* **@name**: mz(X)ML Shaper
* **@galaxyID**: mzxmlshaper
* **@version**: 1.0.0
* **@authors**: Original code: Quentin Ruin (Contractual engineer - PFEM - INRAE - MetaboHUB) - Maintainer: Melanie Petera (PFEM - INRAE - MetaboHUB)
* **@init date**: 2024, September
* **@main usage**: This tool enables the conversion of netCDF, mzML or mzXML files into the mz(X)ML file formats supported by W4M's XCMS


Context
-----------

The tool was created to handle file formats that W4M's Galaxy XCMS may fail to read.

It can be used for any purpose that requires standardized mzML or mzXML files, whether visualization in third-party software, local workflows or W4M XCMS workflows.

The user interface is based on the Galaxy platform (homepage: https://galaxyproject.org/), an open, web-based platform for data-intensive biomedical research.
Whether on the free public server or your own instance, you can perform, reproduce, and share complete analyses.


Configuration
-----------

### Requirement:
* R software: version = 4.3.3 recommended
* bioconductor-msdata = 0.42.0
* bioconductor-mzr = 2.36.0
* r-w4mrutils = 1.0.0
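
To check a local R installation against these versions, something like the following sketch could be used. It assumes that the conda package versions listed above map one-to-one onto the R package versions (`msdata`, `mzR`, `W4MRUtils`); the helper name `check_versions()` is hypothetical and not part of the tool.

```r
# Versions pinned by the tool, keyed by R package name
# (assumption: same version numbers as the conda packages listed above).
required <- c(msdata = "0.42.0", mzR = "2.36.0", W4MRUtils = "1.0.0")

# Hypothetical helper: report installed versions next to the pinned ones.
check_versions <- function(required) {
  installed <- vapply(names(required), function(pkg) {
    if (requireNamespace(pkg, quietly = TRUE)) {
      as.character(utils::packageVersion(pkg))
    } else {
      NA_character_  # package is not installed
    }
  }, character(1))
  data.frame(
    package = names(required),
    required = unname(required),
    installed = unname(installed),
    matches = !is.na(installed) & unname(installed) == unname(required)
  )
}

print(check_versions(required))
```

A `FALSE` in the `matches` column only means the installed version differs from the pinned one; it does not by itself mean the tool will fail.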

Technical description
-----------

Main files:

- mzXMLShaper.R: R function (core script)
- mzXMLShaper.xml: XML wrapper (interface for Galaxy)
- 111-1_POS01.CDF (test file): RIKEN Metabolome Database (http://metabobank.riken.jp/metabo/db/plantMetabolomics/http:/metadb.riken.jp/db/plantMetabolomics/0.1/File/RPMM0054_111-1)
- BlancFin_POS_RA1_1_6869.mzML (test file): Internal blank sample, PFEM - INRAE (https://pfem.isc.inrae.fr/)
- example.mzXML (test file): PRIDE Toolsuite (https://github.com/PRIDE-Toolsuite/inspector-example-files/blob/master/peak-files/example.mzXML.gz)
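
The conversion performed by the core script can be sketched as follows. This is a minimal illustration, not the shipped `mzXMLShaper.R`: the function names `output_name()` and `convert_ms_file()` are hypothetical, and the explicit check on the output format is an addition made for this sketch. It uses only the mzR calls that appear in the script (`openMSfile()`, `spectra()`, `header()`, `writeMSData()`).

```r
# Hypothetical helper: build the output file name from the input name,
# replacing the extension (same idea as the output labels in the XML wrapper).
output_name <- function(input, outformat) {
  paste0(tools::file_path_sans_ext(basename(input)), ".", outformat)
}

# Hypothetical wrapper around the mzR calls used in mzXMLShaper.R.
# `outformat` takes the same values as the Galaxy wrapper: "mzml" or "mzXml".
convert_ms_file <- function(input, output, outformat = c("mzml", "mzXml")) {
  # Addition for this sketch: stop on any other value.
  outformat <- match.arg(outformat)
  ms <- mzR::openMSfile(input)  # netCDF, mzML or mzXML input
  pks <- mzR::spectra(ms)       # peak data for every spectrum
  hdr <- mzR::header(ms)        # per-spectrum metadata
  # writeMSData() expects the lowercase names "mzml" or "mzxml".
  mzR::writeMSData(pks, file = output, header = hdr,
                   outformat = tolower(outformat))
  invisible(output)
}
```

Assuming the test file is in the working directory, a call could look like `convert_ms_file("111-1_POS01.CDF", output_name("111-1_POS01.CDF", "mzml"), "mzml")`.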


Services provided
-----------

* Help and support: https://community.france-bioinformatique.fr/c/workflow4metabolomics/10



License
-----------

* GPL-3.0-or-later
49 changes: 49 additions & 0 deletions tools/mzxmlshaper/mzXMLShaper.R
@@ -0,0 +1,49 @@
#!/usr/bin/env Rscript

# R version 4.3.3
# Conversion tool mzXMLShaper.R (former cdf2mzml)
# Author: Quentin RUIN
# Creation: 25/04/2023
# Last updated: 09/09/2024

cat("\nJob starting time:\n", format(Sys.time(), "%a %d %b %Y %X"), "\n\n")

library(mzR)
library(msdata)
library("W4MRUtils")
library("tools")

args <- W4MRUtils::parse_args(args = commandArgs())

cat("\n\n--------------------------------------------------------------------",
"\nParameters used by the 'mz(X)MLShaper' tool:\n\n")
cat("--------------------------------------------------------------------\n\n")
print(args)

inputfilename <- args[[1]]
outputfileformat <- args[[2]]
outputfilename <- args[[3]]
spectrum <- mzR::openMSfile(inputfilename)

## Get the spectra
pks <- mzR::spectra(spectrum)

## Get the header
hdr <- mzR::header(spectrum)

## Write the spectra and header in the requested output format
if (outputfileformat == "mzml") {
  writeMSData(pks, file = outputfilename, outformat = "mzml", header = hdr)
}

if (outputfileformat == "mzXml") {
  writeMSData(pks, file = outputfilename, outformat = "mzxml", header = hdr)
}


cat("\n--------------------------------------------------------------------",
"\nInformation about R (version, Operating System, attached or loaded packages):\n\n")
sessionInfo()
cat("--------------------------------------------------------------------\n",
"\nJob ending time:\n", format(Sys.time(), "%a %d %b %Y %X"))
152 changes: 152 additions & 0 deletions tools/mzxmlshaper/mzXMLShaper.xml
@@ -0,0 +1,152 @@
<tool id="mzxmlshaper" name="Mz(X)ML Shaper" version="1.0.0+galaxy0" profile="21.01">
<description>
enables the conversion of open format files into XCMS-supported mz(X)ML file formats
</description>
<requirements>
<requirement type="package" version="4.3.3">r-base</requirement>
<requirement type="package" version="0.42.0">bioconductor-msdata</requirement>
<requirement type="package" version="2.36.0">bioconductor-mzr</requirement>
<requirement type="package" version="1.0.0">r-w4mrutils</requirement>
</requirements>
<command detect_errors="aggressive"><![CDATA[
Rscript '$__tool_directory__/mzXMLShaper.R'
inputfilename '$infile'
outputfileformat '$outformat'
#if str($outformat) == 'mzml':
outputfilename '$mzml_file'
#end if
#if str($outformat) == 'mzXml':
outputfilename '$mzXml_file'
#end if
]]></command>
<inputs>
<param name="infile" type="data" format="mzml,mzxml,netcdf" label="File(s) to convert" help="file or collection"/>
<param name="outformat" type="select" label="Desired output format">
<option value="mzml">mzML</option>
<option value="mzXml">mzXML</option>
</param>
</inputs>
<outputs>
<data format="mzml" label="${infile.name.rsplit('.',1)[0]}.mzml" name="mzml_file">
<filter> outformat=="mzml"</filter>
</data>
<data format="mzxml" label="${infile.name.rsplit('.',1)[0]}.mzXml" name="mzXml_file">
<filter> outformat=="mzXml"</filter>
</data>
</outputs>
<tests>
<test expect_num_outputs="1">
<param name="infile" value="original_data/111-1_POS01.CDF"/>
<param name="outformat" value="mzml"/>
<output name="mzml_file" file="conversion_result_cdf_to_mzml.mzml" />
</test>
<test expect_num_outputs="1">
<param name="infile" value="original_data/111-1_POS01.CDF"/>
<param name="outformat" value="mzXml"/>
<output name="mzXml_file" file="conversion_result_cdf_to_mzxml.mzxml"/>
</test>
<test expect_num_outputs="1">
<param name="infile" value="original_data/example.mzXML"/>
<param name="outformat" value="mzml"/>
<output name="mzml_file" file="conversion_result_mzxml_to_mzml.mzml" />
</test>
<test expect_num_outputs="1">
<param name="infile" value="original_data/example.mzXML"/>
<param name="outformat" value="mzXml"/>
<output name="mzXml_file" file="conversion_result_mzxml_to_mzxml.mzxml"/>
</test>
<test expect_num_outputs="1">
<param name="infile" value="original_data/BlancFin_POS_RA1_1_6869.mzML"/>
<param name="outformat" value="mzml"/>
<output name="mzml_file" file="conversion_result_mzml_to_mzml.mzml" />
</test>
<test expect_num_outputs="1">
<param name="infile" value="original_data/BlancFin_POS_RA1_1_6869.mzML"/>
<param name="outformat" value="mzXml"/>
<output name="mzXml_file" file="conversion_result_mzml_to_mzxml.mzxml"/>
</test>
</tests>
<help><![CDATA[
.. class:: infomark
**Credits**
| **Original tool wrapping:** Quentin Ruin - INRAE - MetaboHUB
| **Tool maintainer:** Mélanie Pétéra - INRAE - MetaboHUB
.. class:: infomark
**Help and support:** https://community.france-bioinformatique.fr/c/workflow4metabolomics/10
---------------------------------------------------
******************
Mz(X)ML Shaper
******************
===========
DESCRIPTION
===========
This tool enables the user to convert any mass spectrometry file from netCDF (.cdf), .mzml or .mzxml format to "Pwiz-style" .mzml or .mzXml, as these formats are more widely supported by recent software. Files must be in a standard MS format (see Known issues below).
It is based on the R package mzR. This package (https://github.com/sneumann/mzR/) allows the user to access, read and copy the content of common MS file formats such as netCDF, mzML and mzXML. The output copy can be made in a different format, thus converting the file.
For more information about this package, see the related documentation (https://bioconductor.org/packages/release/bioc/html/mzR.html).
---------------------------------------------------
==========================
ALIGNMENT WITH OTHER TOOLS
==========================
-----------
INPUT FILES
-----------
Open format MS files (.netCDF, .mzML, .mzXML)
------------
OUTPUT FILES
------------
A file (or collection of files) converted to mzML or mzXML, with the same name as the input file (except for the extension)
----------------------------
EXAMPLE OF WORKFLOW POSITION
----------------------------
Used as the first step in any workflow supporting mzML or mzXML formats, just before ReadMSData.
---------------------------------------------------
===============
TOOL PARAMETERS
===============
- Desired output format: mzML or mzXML
---------------------------------------------------
==================
OUTPUT DESCRIPTION
==================
Standardized mzML or mzXML files (depending on the user's configuration of the tool)
---------------------------------------------------
==============
EXAMPLE OF USE
==============
.netCDF file upload -> Mz(X)MLShaper -> ReadMSData -> any workflow
------------
KNOWN ISSUES
------------
mzR does not read Bruker's DataAnalysis mzML or mzXML files, as they are not "standard" or "Pwiz-style" files (as of August 2024). This may also be the case for other vendor-converted mzML/mzXML files; in that case, an "unsupported file format" error is raised. However, this tool works with mzML/mzXML files produced by ProteoWizard's MSConvert.
]]>
</help>
<citations>
<citation type="doi">10.18129/B9.bioc.mzR</citation>
</citations>
</tool>
8 changes: 8 additions & 0 deletions tools/mzxmlshaper/shed.yml
@@ -0,0 +1,8 @@
categories: [Metabolomics]
description: 'Convert mzML, mzXML or netCDF files to mzML or mzXML'
homepage_url: https://rdrr.io/bioc/mzR/
long_description: 'The R script uses the R package mzR to convert MS files into formats compatible
with XCMS-like workflows and software, especially XCMS tools of W4M'
name: mzxmlshaper
owner: workflow4metabolomics
remote_repository_url: https://github.com/workflow4metabolomics/tools-metabolomics
Binary file added tools/mzxmlshaper/test-data/111-1_POS01.CDF
Binary file not shown.
59,662 changes: 59,662 additions & 0 deletions tools/mzxmlshaper/test-data/BlancFin_POS_RA1_1_6869.mzML

Large diffs are not rendered by default.
