ValueError #3
Comments
Hello, The error message indicates that MeTAfisher is not able to map the gene IDs from the hmmsearch results (originally sourced from the faa file) to the CDS entries of the gff file. MeTAfisher expects to find the gene ID either in the attribute To diagnose this issue, can you verify whether the
It should have a corresponding line in the gff file with the same
NC_010645.1 Protein Homology CDS 31171 31482 . + 0 gene=rpsJ;protein_id=WP_003806903.1;transl_table=11
or
NC_010645.1 Protein Homology CDS 31171 31482 . + 0 ID=WP_003806903.1;Parent=gene-BAV_RS00140;Dbxref=Genbank:WP_003806903.1,GeneID:41391953
Best regards, Jean
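A quick way to run this check over whole files is sketched below. It is only an illustration, not part of MeTAfisher: the helper names (`open_text`, `faa_ids`, `gff_cds_ids`, `missing_ids`) are made up for this example. It assumes the protein ID is the first whitespace-delimited token of each FASTA header, and that the gff attributes to look at are `protein_id` and `ID`, as in the two example lines above.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed file in text mode."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path)


def faa_ids(lines):
    """Return the set of IDs taken from the first token of each FASTA header."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    }


def gff_cds_ids(lines, keys=("protein_id", "ID")):
    """Collect the values of the given attribute keys from the CDS rows of a gff."""
    found = set()
    for line in lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # A feature row has 9 tab-separated columns; column 3 is the feature type.
        if len(cols) < 9 or cols[2] != "CDS":
            continue
        for field in cols[8].split(";"):
            key, sep, value = field.partition("=")
            if sep and key.strip() in keys:
                found.add(value.strip())
    return found


def missing_ids(faa_lines, gff_lines):
    """Return the faa IDs that no CDS row of the gff refers to, sorted."""
    return sorted(faa_ids(faa_lines) - gff_cds_ids(gff_lines))
```

Calling `missing_ids(open_text(faa_path), open_text(gff_path))` with your own two paths returns the protein IDs that have no matching CDS entry. An empty list means every ID in the faa file was found in the gff file.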
Unfortunately, the gff and faa files generated by Prodigal do not seem to be compatible with metafisher's expectations. This is something that could be enhanced in the code for better compatibility.
As a quick fix, you could process the gff file to include the "protein_id" attribute that metafisher requires. To accomplish this, you can use an AWK command:
awk 'BEGIN {OFS="\t"} $1 != prev_first {counter = 1; prev_first = $1} $1 !~ /^#/ {print $0"protein_id="$1"_"counter++";"}' your_initial.gff > your_new.gff
This appends a new "protein_id" attribute at the end of each line, composed of the contig's name and a counter. After applying this command, you can try to run metafisher using the newly generated gff file.
Best,
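For anyone who would rather do this step in Python, here is a rough equivalent of the AWK one-liner. It is a sketch under a few assumptions, not code from metafisher: the function name `add_protein_ids` is made up, and it assumes a tab-separated gff with nine columns in which all features of a contig are listed together. It differs from the AWK command in two small ways: comment lines are kept rather than dropped, and a ";" is added before the new attribute if the attribute column does not already end with one.

```python
def add_protein_ids(lines):
    """Yield gff lines with 'protein_id=<contig>_<n>;' appended to feature rows."""
    prev_seqid = None
    counter = 0
    for line in lines:
        stripped = line.rstrip("\n")
        cols = stripped.split("\t")
        # Comments, blank lines and malformed rows pass through unchanged.
        if not stripped or stripped.startswith("#") or len(cols) < 9:
            yield stripped
            continue
        seqid = cols[0]
        if seqid != prev_seqid:
            # New contig: restart the per-contig counter.
            prev_seqid, counter = seqid, 0
        counter += 1
        attrs = cols[8]
        if attrs and not attrs.endswith(";"):
            attrs += ";"
        cols[8] = f"{attrs}protein_id={seqid}_{counter};"
        yield "\t".join(cols)
```

To use it, read the lines of the original gff, pass them through `add_protein_ids`, and write each returned line followed by a newline to the new file. The resulting IDs take the form contig name, underscore, counter, which is the same pattern the AWK command produces.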
Hello, teacher, I'm sorry to bother you again. I modified the gff file with the awk command you suggested, but the run still fails. May I ask which software you used for protein prediction?
Hello, teacher!
I followed the installation steps on your GitHub page, then used Prodigal to predict the viral proteins, which produced the .faa and .gff files. I then ran the following command:
nohup ./metafisher/metafisher.py --gff data_test/GCF_000070465.1/vOTU-gene.gff.gz \
--faa data_test/GCF_000070465.1/vOTU-protein.faa.gz \
--outdir metafisher_results1 \
--diamond_db TA_data/type_II_TA.dmnd -v &
The error is as follows:
I would appreciate your help.