When I run or rerun speciesprimer (in a Docker container with 15.51 GB of RAM allocated to it), one of the BLAST results files is not created successfully. I cannot determine from the logs what the problem is. Any help would be appreciated. (I noticed that the BLAST DB is 250 GB, not 60 GB, and I don't know why. Could this be part of the problem?)
Settings are as follows:
{'blastseqs': 500, 'skip_tree': False, 'minsize': 75, 'path': '/primerdesign', 'mfethreshold': 90, 'nolist': False, 'ignore_qc': False, 'maxsize': 150, 'probe': False, 'offline': False, 'nontargetlist': [...], 'assemblylevel': ['complete'], 'skip_download': False, 'target': 'Azotobacter_chroococcum', 'intermediate': False, 'qc_gene': ['rRNA'], 'exception': [], 'mpprimer': -3.5, 'blastdbv5': False, 'customdb': None, 'mfold': -3.0}
The following problem shows up in the logs:
Run: run_blast - Start BLAST
27 Jun 2023 05:18:42: Run blastn -task blastn-short -num_threads 4 -query primer.part-0 -evalue 500 -out primer_0_results.xml -outfmt 5 -db nt
27 Jun 2023 14:41:00: Run blastn -task blastn-short -num_threads 4 -query primer.part-1 -evalue 500 -out primer_1_results.xml -outfmt 5 -db nt
27 Jun 2023 23:47:50: Run blastn -task blastn-short -num_threads 4 -query primer.part-2 -evalue 500 -out primer_2_results.xml -outfmt 5 -db nt
28 Jun 2023 09:20:30: Run blastn -task blastn-short -num_threads 4 -query primer.part-3 -evalue 500 -out primer_3_results.xml -outfmt 5 -db nt
28 Jun 2023 18:47:13: Run blastn -task blastn-short -num_threads 4 -query primer.part-4 -evalue 500 -out primer_4_results.xml -outfmt 5 -db nt
29 Jun 2023 03:47:32: Run blastn -task blastn-short -num_threads 4 -query primer.part-5 -evalue 500 -out primer_5_results.xml -outfmt 5 -db nt
29 Jun 2023 13:16:20: Run blastn -task blastn-short -num_threads 4 -query primer.part-6 -evalue 500 -out primer_6_results.xml -outfmt 5 -db nt
29 Jun 2023 22:27:04: Run blastn -task blastn-short -num_threads 4 -query primer.part-7 -evalue 500 -out primer_7_results.xml -outfmt 5 -db nt
30 Jun 2023 07:32:29: > Blast duration: 3 days, 2:13:47
30 Jun 2023 07:32:29: Run: run_blastparser(Azotobacter_chroococcum), primer
30 Jun 2023 07:32:29: Run: blast_parser
30 Jun 2023 07:32:29: Run: blastresults_files(Azotobacter_chroococcum)
30 Jun 2023 07:32:46: > A problem with the BLAST results file /primerdesign/Azotobacter_chroococcum/Pangenome/results/primer/primerblast/primer_4_results.xml was detected. Please check if the file was removed and start the run again
30 Jun 2023 07:32:46: ['fatal error while working on', 'Azotobacter_chroococcum', 'check logfile', '/primerdesign/speciesprimer_2023_06_25.log']
fatal error while working on Azotobacter_chroococcum
Traceback (most recent call last):
File "/pipeline/speciesprimer.py", line 4168, in main
run_pipeline_for_target(target, config)
File "/pipeline/speciesprimer.py", line 4082, in run_pipeline_for_target
config, primer_dict).run_primer_qc()
File "/pipeline/speciesprimer.py", line 3537, in run_primer_qc
self.call_blastparser.run_blastparser("primer")
File "/pipeline/speciesprimer.py", line 2588, in run_blastparser
align_dict = self.blast_parser(self.primerblast_dir)
File "/pipeline/speciesprimer.py", line 2518, in blast_parser
align_dict = self.bp_parse_xml_files(blast_dir)
File "/pipeline/speciesprimer.py", line 2485, in bp_parse_xml_files
blastrecords = self.parse_BLASTfile(filename)
File "/pipeline/speciesprimer.py", line 2155, in parse_BLASTfile
record_list = list(blast_records)
File "/usr/local/lib/python3.5/dist-packages/Bio/Blast/NCBIXML.py", line 824, in parse
expat_parser.Parse(NULL, True) # End of XML record
xml.parsers.expat.ExpatError: no element found: line 3874641, column 0
30 Jun 2023 07:32:46: > Error report:
30 Jun 2023 07:32:46: > for target Azotobacter_chroococcum
30 Jun 2023 07:32:46: > Error 1:
30 Jun 2023 07:32:46: > A problem with the BLAST results file /primerdesign/Azotobacter_chroococcum/Pangenome/results/primer/primerblast/primer_4_results.xml was detected. Please check if the file was removed and start the run again
30 Jun 2023 07:32:46: > for target Azotobacter_chroococcum
30 Jun 2023 07:32:46: > Error 2:
30 Jun 2023 07:32:46: > fatal error while working on Azotobacter_chroococcum check logfile /primerdesign/speciesprimer_2023_06_25.log
I attached the broken file 4 and a working file 3 for comparison, renamed to .txt so GitHub will let me upload them: primer.part-4.txt, primer.part-3.txt
Hi,
From the log it looks like the BLAST output file primer_4_results.xml is incomplete. This may be caused by a very large number of results combined with insufficient RAM, although 15 GB should normally be enough.
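To confirm which output files are truncated before rerunning anything, one option is to stream each XML file through a parser and see which ones fail. This is only a minimal sketch: the helper names `is_complete_xml` and `find_broken_results` are made up for illustration and are not part of speciesprimer; the `*_results.xml` pattern is taken from the file names in the log.

```python
import xml.parsers.expat
from pathlib import Path


def is_complete_xml(path):
    """Return True if the file is well-formed XML from start to end."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        with open(path, "rb") as handle:
            # ParseFile reads the file in chunks, so a large BLAST
            # output is not loaded into memory all at once.
            parser.ParseFile(handle)
    except xml.parsers.expat.ExpatError:
        # A truncated file ends before its closing tags, which expat
        # reports as an error (as in the traceback in the log).
        return False
    return True


def find_broken_results(blast_dir):
    """List the *_results.xml files in blast_dir that fail to parse."""
    return sorted(
        p.name
        for p in Path(blast_dir).glob("*_results.xml")
        if not is_complete_xml(p)
    )
```

Pointing `find_broken_results` at the primerblast directory from the log should list every file that was cut short, not only the first one the pipeline stumbles over.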
There is a chance that it will work if you reduce blastseqs to a value below 500.
You could remove the primerblast directory, change blastseqs in the configuration to a value below 500, and re-run the pipeline.
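The cleanup could look roughly like the sketch below. It assumes the settings are stored in a JSON file with a `blastseqs` key; that file format, the `config_file` path, and the helper name `reset_primerblast` are assumptions for illustration, so adjust them to wherever your configuration actually lives. The value 100 is just an example below 500.

```python
import json
import shutil
from pathlib import Path


def reset_primerblast(primerblast_dir, config_file, blastseqs=100):
    """Delete the primer BLAST results and lower blastseqs in a JSON config.

    Both paths are supplied by the caller; the JSON layout of the
    config file is an assumption, not confirmed pipeline behaviour.
    """
    blast_dir = Path(primerblast_dir)
    if blast_dir.is_dir():
        # Remove partial results so the BLAST step starts from scratch.
        shutil.rmtree(blast_dir)

    config_path = Path(config_file)
    settings = json.loads(config_path.read_text())
    settings["blastseqs"] = blastseqs
    config_path.write_text(json.dumps(settings, indent=2))
    return settings
```

The primerblast directory in the log is /primerdesign/Azotobacter_chroococcum/Pangenome/results/primer/primerblast. Note that deleting it also discards the result files that did complete, so the BLAST step will run again in full.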
Another option is to use the ref_prok_rep_genomes database, which contains far fewer redundant sequences.
As for the size of your nt database: it has grown a lot in recent years, so the current size seems legitimate.
Please let me know whether it works. I may need to change the BLAST output format from .xml to .csv/.txt, because there I can select exactly which columns are written to the output file, and this may reduce the required RAM.
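To illustrate the idea of a column-limited output, here is a rough sketch that builds the same blastn call as in the log but with tabular output (`-outfmt 6` plus standard BLAST+ column specifiers) and reads the hits one line at a time. It is not how speciesprimer currently works; the column selection and the function names `build_blastn_command` and `iter_hits` are illustrative assumptions.

```python
import csv

# Standard BLAST+ tabular column specifiers; this particular
# selection is only an example of "writing fewer fields".
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "evalue", "bitscore"]


def build_blastn_command(query, out, db="nt", threads=4):
    """Build a blastn call like the one in the log, with tabular output."""
    return [
        "blastn", "-task", "blastn-short", "-num_threads", str(threads),
        "-query", query, "-evalue", "500", "-out", out,
        "-outfmt", "6 " + " ".join(COLUMNS), "-db", db,
    ]


def iter_hits(handle):
    """Yield one dict per hit, reading the tab-separated file lazily."""
    reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        yield row
```

The command list can be passed to `subprocess.run(cmd, check=True)`. Because `iter_hits` yields rows lazily, the parser never has to hold a whole results file in memory, unlike building a full list of XML records.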
Cheers