primer pairs left after secondary amplicon QC #14

Longhx1112 · 2021-11-01T09:00:33Z

core genes: 2017
single copy core genes: 1836
Number of conserved sequences: 1812
species specific conserved sequences: 536
potential primer pair(s): 4578
primer pairs with good target binding: 4260
primer pairs left after non-target QC: 615
primer pairs left after secondary amplicon QC: 0
primer pairs left after mfold: 0
primer pairs left after primer QC: 0

Hi,
I have a question about using speciesprimer.
"primer pairs left after secondary amplicon QC" is zero. Which parameters can I modify to change this?
I have already tried "ignore_qc" and "skip_tree", but that didn't help.
Looking forward to your reply, thank you very much!

biologger · 2021-11-01T09:46:18Z

Hi,
The ignore_qc and skip_tree options only affect the input quality control, not the primer quality control.
The secondary amplicon check takes 10 input assemblies and uses MFEprimer to check that each primer pair produces only one amplicon per assembly.
You can check the results in the /primerdesign/your_target/Pangenome/results/primer/primerQC/MFEprimer_assembly.csv file.
For a less stringent selection, pass a lower value to the --mfethreshold option. The default is 90; you could also try 85 or 80, but I would not recommend going below 70.
If MFEprimer_assembly.csv is empty, there is probably a problem with the database.
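
A quick way to tell the "empty file" case apart from the "everything was filtered out" case is to count the rows in the results file. The following is only a minimal sketch: the helper name `count_csv_rows` is hypothetical, it is not part of speciesprimer, and it assumes nothing about the column layout of MFEprimer_assembly.csv beyond it being a regular CSV file.

```python
import csv
from pathlib import Path


def count_csv_rows(csv_path):
    """Count the non-blank rows of a CSV file (hypothetical helper).

    Returns 0 if the file is missing or contains no data. Every non-blank
    row is counted, including a header row if the file has one.
    """
    path = Path(csv_path)
    if not path.is_file():
        return 0
    with path.open(newline="") as handle:
        # Skip rows that are empty or contain only whitespace cells.
        return sum(
            1 for row in csv.reader(handle)
            if any(cell.strip() for cell in row)
        )
```

Calling `count_csv_rows(".../primerQC/MFEprimer_assembly.csv")` with your own path in place of the placeholder and getting 0 back would point to the database problem mentioned above rather than to the threshold.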

lanying · 2021-11-07T12:34:37Z

Why does the secondary amplicon check use only 10 input assemblies?

biologger · 2021-11-07T18:00:06Z

It is a matter of speed and computing power.
With, for example, 500 input assemblies, the MFEprimer database would get too large and the QC would take forever.
The pipeline selects the 10 assemblies by completeness: Complete Genomes > Chromosome > Scaffolds > Contigs.
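
To illustrate the selection order, here is a minimal sketch of ranking assemblies by completeness and keeping the first 10. It is not the pipeline's actual code: the function name, the `(name, level)` tuple format and the level strings are assumptions made for this example.

```python
# Preference order from the reply above; a lower rank is preferred.
# The exact level strings are an assumption for this sketch.
LEVEL_RANK = {
    "Complete Genome": 0,
    "Chromosome": 1,
    "Scaffold": 2,
    "Contig": 3,
}


def select_reference_assemblies(assemblies, limit=10):
    """Return the names of the `limit` most complete assemblies.

    `assemblies` is a list of (name, level) tuples. Unknown levels are
    ranked last. sorted() is stable, so assemblies with the same level
    keep their input order.
    """
    ranked = sorted(
        assemblies,
        key=lambda item: LEVEL_RANK.get(item[1], len(LEVEL_RANK)),
    )
    return [name for name, _level in ranked[:limit]]
```

With more than 10 complete genomes in the input, such a selection would contain complete genomes only, and which ones are kept would depend on the input order.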

The number of assemblies can be changed by editing the speciesprimer.py script, in the PrimerQualityControl class:

class PrimerQualityControl:
    def __init__(self, configuration):
        self.referencegenomes = 10  # <-- change this number

To check more than 20 input assemblies, I would recommend splitting the assemblies across several DBs to speed up database indexing; however, it will still take a lot of time.
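
The splitting itself could be sketched as below. This only shows how a list of assembly files might be divided into batches, one batch per database. The function name and the batch size are hypothetical, and building and querying the MFEprimer databases is not covered here.

```python
def split_into_batches(items, batch_size):
    """Split a list into consecutive batches of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        items[start:start + batch_size]
        for start in range(0, len(items), batch_size)
    ]
```

For example, 25 assemblies with a batch size of 10 would give three batches of 10, 10 and 5 assemblies, each of which could then be indexed as its own database.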

Maybe an additional option to define the number of assemblies can be implemented in a future version.
