ncbi_egapx: form seems to work and generates what looks like valid yaml #29
Needs tests, but I cannot even run it here, let alone test it: no machine here has 120GB or 31 cores, which the resource requirements baked into the docker image demand.
Looks good!
Attention: deployment failure! https://github.com/richard-burhans/galaxytools/actions/runs/10781909881
@fubar2 or @richard-burhans: we need to add an option that lets EGAPx take a protein FASTA file as an annotation source (see https://github.com/ncbi/egapx?tab=readme-ov-file#input-data-format).
@nekrut: If that protein FASTA is independent of NCBI, then it may make sense to use it in the HMM. If an NCBI protein FASTA exists for a taxon, it is AFAIK an output of the internal NCBI pipeline that has become egapx. So predicting proteins with egapx while relying on a FASTA that was itself predicted by the father of egapx may yield biased and uninterpretable results, AFAIK, because some of the inputs used for prediction would lack statistical independence. Not an expert, but this is a good question for one of the NCBI authors.