some conflicts and still can not run #3

Open
poursalavati opened this issue Jun 26, 2021 · 1 comment

Comments

@poursalavati
Hi,
I'm trying to run this tool on our HPC cluster, but there are still problems. I fixed some of them, as described below; the code may need to be modified accordingly:

1- The download address for gi_taxid_prot.dmp.gz has changed. Please replace it with:

ftp://ftp.ncbi.nih.gov/pub/taxonomy/obsolete/gi_taxid_prot.dmp.gz

(instead of: ftp://ftp.ncbi.nih.gov/pub/taxonomy/gi_taxid_prot.dmp.gz)
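
A download step that tolerates this kind of move could try the original location first and then fall back to the obsolete/ directory. The following is only a sketch: the function names are hypothetical, it assumes `curl` is installed, and only the two URLs come from this report.

```shell
# Hypothetical helper: list candidate locations for gi_taxid_prot.dmp.gz,
# original path first, then the obsolete/ path reported in this issue.
gi_taxid_urls() {
  base="ftp://ftp.ncbi.nih.gov/pub/taxonomy"
  printf '%s\n' \
    "$base/gi_taxid_prot.dmp.gz" \
    "$base/obsolete/gi_taxid_prot.dmp.gz"
}

# Try each candidate in turn and stop at the first successful download.
# Assumption: curl is available; wget would serve equally well.
fetch_gi_taxid() {
  for url in $(gi_taxid_urls); do
    if curl --fail --silent --show-error -o gi_taxid_prot.dmp.gz "$url"; then
      echo "downloaded from $url"
      return 0
    fi
  done
  echo "no candidate URL worked" >&2
  return 1
}
```

Calling `fetch_gi_taxid` performs the download; `gi_taxid_urls` alone only prints the two candidates.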

2- Errors related to loadTaxonomy.pl execution:

  • According to lines 17 and 29, this argument must be added when running the script: -acc_wgs acc2taxid.nucl

  • These two lines should be added to GetOptions() (I added them at lines 30 and 31); otherwise the script will not run. The two corresponding arguments must also be passed when running the script.

"dead_prot=s"=> \$data_dead_acc_prot,
"dead_nucl=s"=> \$data_dead_acc_nucl,
  • Add another } at the end of the file, after exit(1). The closing brace is missing.

  • Finally, with these changes, the script can be run as in this example:

./loadTaxonomy.pl -struct taxonomyStructure.sql -index taxonomyIndex.sql -acc_prot acc2taxid.prot -acc_nucl acc2taxid.nucl -names names.dmp -nodes nodes.dmp -gi_prot gi_taxid_prot.dmp -acc_wgs acc2taxid.nucl -dead_nucl dead_nucl.accession2taxid -dead_prot dead_prot.accession2taxid
And these are the messages it prints:

2021/06/26 12:23:20  INFO> loadTaxonomy.pl-bac:122 main::_create_sqlite_db - Creating database.
2021/06/26 12:23:22  INFO> loadTaxonomy.pl-bac:78 main::_insertingCSVDataInDatabase - Inserting tables into database...
2021/06/26 12:23:22  INFO> loadTaxonomy.pl-bac:80 main::_insertingCSVDataInDatabase - nodes
2021/06/26 12:23:22  INFO> loadTaxonomy.pl-bac:80 main::_insertingCSVDataInDatabase - nucl_accession2taxid
2021/06/26 12:23:22  INFO> loadTaxonomy.pl-bac:80 main::_insertingCSVDataInDatabase - prot_accession2taxid
2021/06/26 12:23:22  INFO> loadTaxonomy.pl-bac:80 main::_insertingCSVDataInDatabase - names
2021/06/26 12:23:22  INFO> loadTaxonomy.pl-bac:80 main::_insertingCSVDataInDatabase - gi_prot

Unfortunately, even after all of these fixes, taxonomy.tmp.sqlite is still only 80 kB!
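
A database this small suggests that the input tables were empty or were never read. One way to narrow this down is to confirm that every input file exists and is non-empty before running loadTaxonomy.pl, and to check the size of the result afterwards. This is a minimal sketch; both function names are hypothetical, and the file names are the ones from the command above.

```shell
# Hypothetical pre-flight check: report every input file that is
# missing or empty, and return non-zero if any such file was found.
check_inputs() {
  status=0
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      status=1
    fi
  done
  return "$status"
}

# Print the size of a file in bytes (for example, the generated database).
db_size_bytes() {
  wc -c < "$1" | tr -d ' '
}
```

For example, `check_inputs acc2taxid.prot acc2taxid.nucl names.dmp nodes.dmp gi_taxid_prot.dmp` followed by `db_size_bytes taxonomy.tmp.sqlite`. If the sqlite3 command-line shell is installed, `sqlite3 taxonomy.tmp.sqlite .tables` lists the tables that were actually created.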

3- The other fix concerns the PFAM taxonomy section of the manual.

This section needs to be corrected: mkdir pfam should be mkdir fasta.

Unfortunately, executing this command:

ls -1 pfam*.FASTA | sed 's,^\(.*\)\.FASTA,./gi2taxonomy.pl -i & -o \1.tax.txt -db taxonomy.tmp.sqlite -r,' | bash
gives a large number of these warning messages:

WARN - tax_id not found for gi: ########

This is probably caused by taxonomy.tmp.sqlite not having been fully created in the previous section.
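
To make this step easier to inspect, the ls | sed | bash pipeline above can be written as an explicit loop that first prints the commands it would run, and the warnings can be counted from a saved log. This is a sketch under assumptions: both function names are hypothetical, and the gi2taxonomy.pl options are copied unchanged from the command above.

```shell
# Print (without running) one gi2taxonomy.pl command per FASTA file
# passed as an argument, mirroring the ls | sed | bash pipeline.
tax_commands() {
  for fasta in "$@"; do
    out="${fasta%.FASTA}.tax.txt"   # pfamX.FASTA -> pfamX.tax.txt
    printf './gi2taxonomy.pl -i %s -o %s -db taxonomy.tmp.sqlite -r\n' \
      "$fasta" "$out"
  done
}

# Count the "tax_id not found" warnings in a saved log file.
# grep -c exits non-zero when the count is 0, hence the "|| true".
count_missing_taxids() {
  grep -c 'tax_id not found for gi' "$1" || true
}
```

Running `tax_commands pfam*.FASTA` shows the generated commands; piping that output to `bash` executes them. Comparing the count from `count_missing_taxids` with the number of sequences in the input gives a rough idea of how much of the taxonomy is missing from the database.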

Thank you for your help in resolving this issue; please update the code wherever changes are needed.
Sincerely yours,
Naser

@marieBvr
Contributor

marieBvr commented Jul 5, 2021

Hi Naser,
Thank you for pointing out all these issues.
It seems you are using the new documentation but posting the issue on an old repository. I suggest you use the current project at https://github.com/marieBvr/virAnnot. Be sure to use the slurm-branch if your server uses Slurm.

Nevertheless, I will check and update the current project and its documentation based on your suggestions.

Let me know (on the other repository) if you still face issues.
Sincerely yours,
Marie
