Input database "./GCA_019458185.1.faa" has the wrong type (Generic) #923

fengqingling · 2024-12-18T02:50:42Z

I want to use mmseqs to annotate an faa file with Pfam.
I am sure the faa file is in FASTA format and contains amino acid sequences.

Steps to Reproduce (for bugs)

First, I downloaded the Pfam-A.seed database.
mmseqs databases Pfam-A.seed pfam_seed/pfam tmp --threads 10

Then I created the index.
mmseqs createindex pfam_seed/pfam tmp -k 5 -s 7

But when I ran mmseqs search to annotate the sequences with Pfam, I got an error.
mmseqs search ./GCA_019458185.1.faa pfam_seed/pfam mmseq_result.txt tmp

MMseqs Output (for bugs)

MMseqs Version: 15.6f452
Substitution matrix aa:blosum62.out,nucl:nucleotide.out
Add backtrace false
Alignment mode 2
Alignment mode 0
Allow wrapped scoring false
E-value threshold 0.001
Seq. id. threshold 0
Min alignment length 0
Seq. id. mode 0
Alternative alignments 0
Coverage threshold 0
Coverage mode 0
Max sequence length 65535
Compositional bias 1
Compositional bias 1
Max reject 2147483647
Max accept 2147483647
Include identical seq. id. false
Preload mode 0
Pseudo count a substitution:1.100,context:1.400
Pseudo count b substitution:4.100,context:5.800
Score bias 0
Realign hits false
Realign score bias -0.2
Realign max seqs 2147483647
Correlation score weight 0
Gap open cost aa:11,nucl:5
Gap extension cost aa:1,nucl:2
Zdrop 40
Threads 128
Compressed 0
Verbosity 3
Seed substitution matrix aa:VTML80.out,nucl:nucleotide.out
Sensitivity 5.7
k-mer length 0
Target search mode 0
k-score seq:2147483647,prof:2147483647
Alphabet size aa:21,nucl:5
Max results per query 300
Split database 0
Split mode 2
Split memory limit 0
Diagonal scoring true
Exact k-mer matching 0
Mask residues 1
Mask residues probability 0.9
Mask lower case residues 0
Minimum diagonal score 15
Selected taxa
Spaced k-mers 1
Spaced k-mer pattern
Local temporary path
Rescore mode 0
Remove hits by seq. id. and coverage false
Sort results 0
Mask profile 1
Profile E-value threshold 0.1
Global sequence weighting false
Allow deletions false
Filter MSA 1
Use filter only at N seqs 0
Maximum seq. id. threshold 0.9
Minimum seq. id. 0.0
Minimum score per column -20
Minimum coverage 0
Select N most diverse seqs 1000
Pseudo count mode 0
Min codons in orf 30
Max codons in length 32734
Max orf gaps 2147483647
Contig start mode 2
Contig end mode 2
Orf start mode 1
Forward frames 1,2,3
Reverse frames 1,2,3
Translation table 1
Translate orf 0
Use all table starts false
Offset of numeric ids 0
Create lookup 0
Add orf stop false
Overlap between sequences 0
Sequence split mode 1
Header split mode 0
Chain overlapping alignments 0
Merge query 1
Search type 0
Search iterations 1
Start sensitivity 4
Search steps 1
Prefilter mode 0
Exhaustive search mode false
Filter results during exhaustive search 0
Strand selection 1
LCA search mode false
Disk space limit 0
MPI runner
Force restart with latest tmp false
Remove temporary files false

Input database "./GCA_019458185.1.faa" has the wrong type (Generic)
Allowed input:

Index
Nucleotide
Profile
Aminoacid


fengqingling · 2024-12-19T07:40:03Z

OK, I can use easy-search to get the result.
mmseqs easy-search file pfam_seed/pfam result.txt tmp
But I would still like to know how the results of easy-search and search differ, and why search fails with this error.

martin-steinegger · 2025-01-02T05:05:04Z

mmseqs search can only read MMseqs2 databases, not FASTA files. So you need to call createdb on ./GCA_019458185.1.faa first and pass the resulting database to search.

martin-steinegger closed this as completed Jan 2, 2025
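
The fix described above can be sketched as a three-step script: convert the FASTA file with createdb, run search on the resulting database, then export the result database to a tab-separated file with convertalis. This is roughly the sequence that easy-search performs in one call, which is why easy-search accepted the FASTA file directly. The names queryDB, resultDB and result.m8 are placeholders chosen for illustration and do not appear in the thread. The DRY_RUN switch is also an addition for this sketch: by default the script only prints the commands, and DRY_RUN=0 executes them.

```shell
#!/bin/sh
set -eu

QUERY_FASTA=./GCA_019458185.1.faa   # FASTA file from the report
QUERY_DB=queryDB                    # placeholder name for the converted query database
TARGET_DB=pfam_seed/pfam            # Pfam database downloaded in the report
RESULT_DB=resultDB                  # placeholder name for the search result database
RESULT_TSV=result.m8                # placeholder name for the exported table

# Print commands by default; set DRY_RUN=0 to actually execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# 1. Convert the FASTA file into an MMseqs2 sequence database.
run mmseqs createdb "$QUERY_FASTA" "$QUERY_DB"

# 2. Search the query database (not the FASTA file) against Pfam.
run mmseqs search "$QUERY_DB" "$TARGET_DB" "$RESULT_DB" tmp

# 3. Convert the result database into a tab-separated alignment table.
run mmseqs convertalis "$QUERY_DB" "$TARGET_DB" "$RESULT_DB" "$RESULT_TSV"
```

Run with DRY_RUN=0 once mmseqs is on the PATH and the Pfam database from the earlier steps exists.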
