Support for EAD parameters #359

Open
michaelwitting opened this issue Sep 14, 2022 · 16 comments

@michaelwitting
Contributor

Dear all,

I'm preparing to upload some EAD spectra. MassBank is missing fields for several parameters that are important for EAD.
These are:

  • kinetic energy
  • electron current
  • reaction time

I'm suggesting the following subtags under AC$MASS_SPECTROMETRY

KINETIC_ENERGY
ELECTRON_CURRENT
REACTION_TIME

These should only be used if FRAGMENTATION_MODE is EAD (compare with #347).

Best,

Michael

@meier-rene
Contributor

Hi @michaelwitting,
can you already provide a preliminary example file for unit tests, so that I can implement this in the code and also in the documentation?
Regards, Rene

@michaelwitting
Contributor Author

This would be a potential example:

MSBNK-MPC-RPStd00000408.txt

I have an open PR (#347) with the updated documentation for adding EAD. Shall I also add the parameters there?

@meier-rene
Contributor

Thanks for the example. I would like to integrate that into MassBank. We are trying to increase the use of controlled vocabulary from the respective domain-specific ontologies. I'm not familiar with this technique, and the resources I can find are limited. In principle, EAD should be a child of dissociation method, right? As far as I understand it, the method fragments molecules in an electron beam with an adjustable level of kinetic energy. Maybe EAD is just a vendor's marketing term and it is actually ECD. The problem with ECD in the MS ontology is its annotation, which limits the method to multiply protonated precursors in positive mode. I need to clarify whether we want a new method EAD here or whether we want to change the description of ECD in the ontology. Please give me a little time to figure this out.

@michaelwitting
Contributor Author

You are right: EAD is a type of ECD. It uses a different energy range than ECD, which is why it is considered a different method. At the moment, Sciex is the only vendor offering it.

@meier-rene
Contributor

OK, we have a pull request with EAD and the ZenoTOF 7600 instrument pending at the psi-ms-CV. That's good. So I think I can go ahead with this.

@meier-rene
Contributor

Hi Michael,
EAD has been officially accepted into the MS ontology, as has your instrument. As explained in #361, I would like to modify your request a bit. Do you think you could prepare your planned submission with the terms in the CV term structure explained there?

But please do not start right away. I would like to wait a little bit for comments from the community.

@michaelwitting
Contributor Author

Hi Rene,

no problem. Just send me a message when I can start. In the meantime, I will change the R script that generates my MassBank records accordingly.

@michaelwitting
Contributor Author

Hi all,

I'm now preparing several EAD spectra and would like to export them to MassBank format as well.
What is our status at the moment? The CV term discussion never really started.

Can I basically use the following subtags under AC$MASS_SPECTROMETRY

KINETIC_ENERGY
ELECTRON_CURRENT
REACTION_TIME

until we have another solution? I don't want to create my own flavour or style of MassBank just to be able to export the data.
If that is okay for you, I would append the changes to my open PR on the addition of EAD.

@meier-rene
Contributor

Hi Michael,
there is some progress, but I couldn't completely finish it. I was about to contact you and suggest going with a not 100% perfect solution, so that you can already start. Now that you have raised the question, it's a good opportunity to actually do it.
What's already there:

  • I included EAD in the psi-ms ontology
  • We discussed how we want to format ontology terms in the MassBank record format, and I included a section about that in the record format spec. At the moment it's still in the dev branch.
  • I implemented a parser for this CV term format, which successfully parses the syntax. It's included in the dev branch of the software.

What's still missing:

  • I cannot validate the content of these CV terms at the moment.
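
For readers who want a feel for what syntax-only parsing of a CV term might look like, here is a minimal Python sketch. It is not the parser from the MassBank dev branch. It assumes a term is four comma-separated fields in square brackets, as in `[MS, MS:1003294, electron activated dissociation, ]` (the form used further down in this thread), that no field contains a comma, and the field names `cv_label`, `accession`, `name` and `value` are my own labels. Like the parser described above, it checks the shape only and does not validate the content against the ontology.

```python
import re
from typing import NamedTuple


class CvTerm(NamedTuple):
    # Field names are illustrative assumptions, not taken from the record spec.
    cv_label: str   # e.g. "MS"
    accession: str  # e.g. "MS:1003294"
    name: str       # e.g. "electron activated dissociation"
    value: str      # may be empty


# Four comma-separated fields inside square brackets; any field may be empty.
# Assumes that fields never contain commas or brackets themselves.
_CV_TERM = re.compile(
    r"^\[([^,\[\]]*),([^,\[\]]*),([^,\[\]]*),([^,\[\]]*)\]$"
)


def parse_cv_term(text: str) -> CvTerm:
    """Split a bracketed CV term into its four fields (syntax check only)."""
    match = _CV_TERM.match(text.strip())
    if match is None:
        raise ValueError(f"not a bracketed CV term: {text!r}")
    return CvTerm(*(field.strip() for field in match.groups()))
```

Calling `parse_cv_term("[MS, MS:1003294, electron activated dissociation, ]")` yields a `CvTerm` with an empty `value`, while a plain string such as `"EAD"` raises `ValueError`.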

@michaelwitting
Contributor Author

Hi Rene,

that is really nice, but how much longer will it take?
We will then also need to adjust other software such as MsBackendMassBank.

But the subtags are fine for the moment, right?

@meier-rene
Contributor

meier-rene commented Jan 25, 2023

My proposal, so that you can continue:

You provided an example above. The EAD section would just change a little bit to

AC$MASS_SPECTROMETRY: FRAGMENTATION_MODE [MS, MS:1003294, electron activated dissociation, ]
AC$MASS_SPECTROMETRY: KINETIC_ENERGY 15 eV
AC$MASS_SPECTROMETRY: ELECTRON_CURRENT 6500 nA
AC$MASS_SPECTROMETRY: REACTION_TIME 65 ms

Is that possible for you? Or do we need to substitute "EAD" with "[MS, MS:1003294, electron activated dissociation, ]" after you have created your content?
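
If the substitution had to happen after export, it could be sketched roughly as follows in Python. This is only an illustration: it assumes the exported record contains the line `AC$MASS_SPECTROMETRY: FRAGMENTATION_MODE EAD`, and the helper names `substitute_ead` and `ead_parameter_lines` are made up for this example. The units (eV, nA, ms) are the ones used in the lines above.

```python
EAD_CV_TERM = "[MS, MS:1003294, electron activated dissociation, ]"
MODE_PREFIX = "AC$MASS_SPECTROMETRY: FRAGMENTATION_MODE "


def substitute_ead(record_text: str) -> str:
    """Replace a plain 'EAD' fragmentation mode with the CV term form."""
    out = []
    for line in record_text.split("\n"):
        # Only touch the FRAGMENTATION_MODE line whose value is exactly "EAD".
        if line.startswith(MODE_PREFIX) and line[len(MODE_PREFIX):].strip() == "EAD":
            line = MODE_PREFIX + EAD_CV_TERM
        out.append(line)
    return "\n".join(out)


def ead_parameter_lines(kinetic_energy_ev, electron_current_na, reaction_time_ms):
    """Format the three EAD subtags with the units used in this thread."""
    return [
        f"AC$MASS_SPECTROMETRY: KINETIC_ENERGY {kinetic_energy_ev} eV",
        f"AC$MASS_SPECTROMETRY: ELECTRON_CURRENT {electron_current_na} nA",
        f"AC$MASS_SPECTROMETRY: REACTION_TIME {reaction_time_ms} ms",
    ]
```

With the values from the example, `ead_parameter_lines(15, 6500, 65)` reproduces the three subtag lines shown above, and all other lines of a record pass through `substitute_ead` unchanged.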

I need to merge the current state of the software into main to finally make this work on our side. I am working on that right now, but I would like to test it a bit before I merge.

@michaelwitting
Contributor Author

Thanks, I can work like this!

@meier-rene
Contributor

meier-rene commented Jan 25, 2023

OK, I did some testing and went through the example file you provided. I identified some issues in this file, fixed them, and attach the result here:

MSBNK-MPC-RPSTD00000408.txt

And here is an explanation of the issues:

  • we only want capitals in the ACCESSION RPStd00000408 -> RPSTD00000408
  • TITLE uses ";" as separator and follows the syntax: "Name"; "AC$INSTRUMENT_TYPE"; "AC$MASS_SPECTROMETRY: MS_TYPE"; "your additional information"
    your example will become: RECORD_TITLE: deoxycholic acid; LC-ESI-QTOF; MS2; [M+FA-H]-, EAD
  • DATE uses specific format like DATE: 2016.01.15 so your example would be: DATE: 2022.09.14
  • LICENSE should be LICENSE: CC BY-SA, not LICENSE: CC-BY-SA 4.0
  • SPLASH is missing
  • MS$FOCUSED_ION: PRECURSOR_INT 100 should be MS$FOCUSED_ION: PRECURSOR_INTENSITY 100
  • we need a PK$NUM_PEAK: 90 line
  • no empty line at the end of the file
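
A few of the points above are mechanical enough to check in an export pipeline before submission. The following Python sketch covers only a subset (capitals in ACCESSION, the DATE format, presence of the SPLASH and PK$NUM_PEAK lines, and the trailing empty line). It is not the MassBank validator: the function name `lint_record` is made up, and it assumes the SPLASH line uses the tag `PK$SPLASH` and that fields are written as `TAG: value`.

```python
import re


def lint_record(text: str) -> list:
    """Return a list of human-readable problems found in a record's text."""
    problems = []

    # Keep the first value seen for each "TAG: value" line.
    fields = {}
    for line in text.split("\n"):
        tag, sep, value = line.partition(": ")
        if sep:
            fields.setdefault(tag, value)

    accession = fields.get("ACCESSION", "")
    if accession != accession.upper():
        problems.append("ACCESSION contains lowercase letters")

    # Only checks that the value starts with a YYYY.MM.DD date.
    if not re.match(r"\d{4}\.\d{2}\.\d{2}", fields.get("DATE", "")):
        problems.append("DATE is not in YYYY.MM.DD form")

    if "PK$SPLASH" not in fields:  # tag name is an assumption
        problems.append("PK$SPLASH line is missing")

    if "PK$NUM_PEAK" not in fields:
        problems.append("PK$NUM_PEAK line is missing")

    if text.endswith("\n\n"):
        problems.append("empty line at the end of the file")

    return problems
```

An empty list means none of these particular checks fired. It says nothing about the remaining points, such as the title syntax or the license string.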

I hope you can solve these issues in your pipeline. If you have problems, please contact me and we can try to find a solution.

One more thing. Your peaklist looks like profile data. If you are able to process the spectrum to centroid mode, we would prefer centroid mode.

The current record spec document can be found here record spec

@michaelwitting
Contributor Author

In the documentation it is PRECURSOR_INT. I used MsBackendMassBank for the export. Other things can be easily updated.

@michaelwitting
Contributor Author

All issues should be solved by now. I will start another round of processing with my pipeline later today.

@meier-rene
Contributor

> In the documentation it is PRECURSOR_INT. I used MsBackendMassBank for the export. Other things can be easily updated.

You are absolutely right. I didn't know this. In the past we had different terms for the same thing, in this case PRECURSOR_INTENSITY and PRECURSOR_INT. I made this uniform some time ago and probably forgot to update the manual. Thanks for spotting this.
