XCMS-fillChromPeaks: more options for the remaining NA #141

Open
melpetera opened this issue Jul 29, 2019 · 5 comments

Comments

@melpetera
Member

Hi there,

Here is a suggestion concerning the XCMS step, on how to deal with values that are still NA after fillChromPeaks. Currently, we have the option to leave these values as 'NA' or to convert them to '0'. The idea would be to offer a third choice that provides a controlled random value instead of 0.

The random value used to replace each NA could be defined as an integer randomly selected between inf.range and sup.range, where:

  • inf.range is the minimum of the random range, specified by the user (default 0) and coded between 0 and 1: 0 means 0, and any other value is a proportion of the ion's minimum value. Example: if the minimum value of a given ion is 1500 and the user sets inf.range to 0.5, then the minimum of the random range will be 750.
  • sup.range is the maximum of the random range, also specified by the user (default 1) and coded the same way as inf.range. Example: setting it to 0.5 means that the maximum of the random range will be 750 if the minimum value of a given ion is 1500.

Note: since the replacement is random, a "seed" option is needed so that the user can obtain the same result when re-running.
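To make the arithmetic concrete, here is a minimal sketch of how the range bounds and the seed could work for a single ion. The helper name random_range and the example values are assumptions for illustration only, and the draw uses runif, which returns real numbers rather than integers.

```r
# Hypothetical helper: bounds of the random range for one ion,
# expressed as proportions of the ion's minimum non-NA value.
random_range <- function(ion, inf.range = 0, sup.range = 1) {
  ion_min <- min(ion, na.rm = TRUE)
  c(lower = inf.range * ion_min, upper = sup.range * ion_min)
}

# Example ion (made-up values) whose minimum is 1500
ion <- c(1500, NA, 4200, NA, 2300)
bounds <- random_range(ion, inf.range = 0.5, sup.range = 1)
print(bounds)  # lower = 750, upper = 1500

# Fixing the seed before drawing makes the replacement values reproducible
draw <- function(seed) {
  set.seed(seed)
  runif(sum(is.na(ion)), min = bounds[["lower"]], max = bounds[["upper"]])
}
first <- draw(42)
second <- draw(42)
print(identical(first, second))
```

Two runs with the same seed give the same replacement values, which is the behaviour the "seed" option is meant to guarantee.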

@lecorguille do not hesitate to ask if this request is not clear!

Have a nice day,
@jfrancoismartin and @melpetera

@melpetera
Member Author

melpetera commented Jul 29, 2019

Efficient code from @jfrancoismartin

Notes:

  • the function replaces each NA with a random value drawn with runif from the interval between inf.range × min and sup.range × min, where min is the minimum non-NA value (so between 0 and min with the proposed defaults); it is called through apply so that it runs on every column of the dataMatrix
  • idm = given_ion
  • the example call applies it to a dataMatrix DM, replacing the NA of each variable (column)

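# Replace the NA values of one ion (idm) with random values drawn uniformly
# between inf.range * min and sup.range * min of its non-NA values.
imputNA <- function(idm, inf.range = 0, sup.range = 1) {
  if (anyNA(idm)) {
    nbNA <- sum(is.na(idm))
    minVal <- min(idm, na.rm = TRUE)
    idm[is.na(idm)] <- runif(nbNA, min = inf.range * minVal, max = sup.range * minVal)
  }
  # Return the vector in every case, including when it contains no NA
  idm
}
# Ions are in columns (MARGIN = 2); the range arguments are passed on to imputNA
DM <- apply(X = DM, MARGIN = 2, FUN = imputNA, inf.range = 0, sup.range = 1)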
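
As a rough check of the idea on a toy matrix, the sketch below wraps the same per-column logic with the seed option requested in the issue. The name impute_na_random, the seed argument, the toy values, and the choice to leave all-NA columns untouched are assumptions for illustration, not existing xcms or wrapper behaviour.

```r
# Hypothetical wrapper, not part of xcms or of the existing tool
impute_na_random <- function(dm, inf.range = 0, sup.range = 1, seed = NULL) {
  stopifnot(is.matrix(dm), inf.range >= 0, sup.range <= 1,
            inf.range <= sup.range)
  if (!is.null(seed)) set.seed(seed)
  for (j in seq_len(ncol(dm))) {
    is_missing <- is.na(dm[, j])
    # Assumption: skip columns with nothing to impute, and columns with
    # no observed value at all (no minimum can be computed for them)
    if (!any(is_missing) || all(is_missing)) next
    ion_min <- min(dm[, j], na.rm = TRUE)
    dm[is_missing, j] <- runif(sum(is_missing),
                               min = inf.range * ion_min,
                               max = sup.range * ion_min)
  }
  dm
}

# Toy dataMatrix (made-up values): samples in rows, ions in columns
DM <- matrix(c(1500, NA, 4200,
               800, 950, NA,
               NA, NA, NA),
             nrow = 3,
             dimnames = list(paste0("sample", 1:3), paste0("ion", 1:3)))

filled <- impute_na_random(DM, inf.range = 0.5, sup.range = 1, seed = 1)
print(filled)
```

With these settings the NA in ion1 is drawn between 750 and 1500, the NA in ion2 between 400 and 800, and ion3 stays NA because it has no observed minimum.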

@lecorguille
Member

Hum, for me, it's typically something that should be integrated into the main xcms package: https://github.com/sneumann/xcms
What do you think about that?

@jfrancoismartin
Member

Hum hum... actually, xcms fillpeaks tries to replace NA with a value extracted from the raw MS file. It is an analytical replacement. If fillpeaks can't find a value, then it becomes a statistical issue, outside the scope of xcms. And we can propose this kind of NA imputation, which is more elegant than just replacing with 0.

@lecorguille
Member

I guess that one of the purposes of xcms is to provide input for statistical analysis.
So it could be an xcms issue :)

What do you think about that @sneumann and @jorainer?

My idea is to reduce the amount of code in the wrapper. If this is not something worth adding to XCMS, should we add this code to our future utils R package?

@jorainer

Note that xcms already has some imputation functionality: xcms::imputeRowMinRand and xcms::imputeRowMin. Nothing spectacular though.
