pVACseq: Discrepancy between {sample_id}.all_epitopes.tsv and {sample_id}.fasta ? #1118

Open
yiolino opened this issue Jun 26, 2024 · 6 comments

@yiolino
Contributor

yiolino commented Jun 26, 2024

Hi, thank you for developing such excellent software.

pVACseq: Discrepancy between {sample_id}.all_epitopes.tsv and {sample_id}.fasta ?

Description

I have successfully run pVACseq without any errors.
However, I have observed that certain peptides listed in {sample_id}.all_epitopes.tsv are not present in {sample_id}.fasta.

Expected Behavior

I expected that all peptides listed in the {sample_id}.all_epitopes.tsv would be included in the {sample_id}.fasta.

Actual Behavior

Certain peptides present in the {sample_id}.all_epitopes.tsv are missing from the {sample_id}.fasta.
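A check of this kind can be sketched as follows. This is only a minimal illustration, not code from pVACtools: the function names are hypothetical, the column name "MT Epitope Seq" is an assumption about the all_epitopes.tsv header, and "present in the fasta" is assumed to mean that the epitope occurs as a substring of at least one FASTA sequence.

```python
import csv


def read_fasta_sequences(handle):
    """Return the sequences of a FASTA file as a list of strings."""
    sequences, current = [], []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if current:
                sequences.append("".join(current))
                current = []
        elif line:
            current.append(line)
    if current:
        sequences.append("".join(current))
    return sequences


def find_missing_epitopes(tsv_handle, fasta_handle, column="MT Epitope Seq"):
    """Return the sorted epitopes from the TSV that occur in no FASTA sequence.

    `column` is an assumed header name; adjust it to match the actual file.
    """
    sequences = read_fasta_sequences(fasta_handle)
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    epitopes = {row[column] for row in reader if row.get(column)}
    return sorted(e for e in epitopes if not any(e in s for s in sequences))
```

Passing open file handles for {sample_id}.all_epitopes.tsv and {sample_id}.fasta returns the list of epitopes that cannot be found in any FASTA entry.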

Questions

  1. Is it expected behavior that not all peptides listed in the {sample_id}.all_epitopes.tsv file are included in the {sample_id}.fasta?
  2. If this is expected, could you please explain the logic behind this behavior? I am particularly interested in the section of the code in generate_protein_fasta.py here. Is my assumption correct that this part of the code is related to the observed behavior?

Additional Information

  • pVACseq version: Docker 3.0.1
  • OS: Ubuntu 22.04
@susannasiebert
Contributor

This is generally not expected. Would you be able to share your input VCF with us for further investigation? A VCF file with just the variant in question would be sufficient for us to be able to investigate this on our end.
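One way to produce such a single-variant VCF is sketched below. It is a minimal illustration under stated assumptions: the input is an uncompressed, tab-delimited VCF, the variant is identified by chromosome and 1-based position, and the function name `subset_vcf_lines` is hypothetical and not part of pVACtools.

```python
def subset_vcf_lines(lines, chrom, pos):
    """Yield all header lines plus the records at the given chromosome and position.

    `lines` is any iterable of VCF lines (for example an open text file).
    `pos` is the 1-based POS value as it appears in the VCF.
    """
    for line in lines:
        if line.startswith("#"):
            # Keep the full header so the output remains a valid VCF.
            yield line
            continue
        fields = line.split("\t")
        if len(fields) > 1 and fields[0] == chrom and fields[1] == str(pos):
            yield line
```

Writing the yielded lines to a new file gives a VCF that keeps the original header and only the record or records at that position.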

@yiolino
Contributor Author

yiolino commented Jul 14, 2024

@susannasiebert
Sorry for the late reply.
I have created a VCF with unnecessary information masked.
I have sent it to you via email.
Please check it.

@susannasiebert
Contributor

I tried running pVACtools with the VCF you provided but I'm unable to replicate this issue. Can you please provide me with an example pVACseq command where you noted this problem?

@susannasiebert
Contributor

@yiolino I wanted to ping you about this issue. I was unable to replicate this problem so it would help if you could provide an example pVACseq command where you noticed this discrepancy.

@yiolino
Contributor Author

yiolino commented Aug 25, 2024

@susannasiebert
I apologize for the late reply.

The command I used is as follows:

pvacseq run \
    "$in_vcf" \
    "$sample_name" \
    "$HLAtype" \
    $binding_alogs \
    "$out_dir" \
    --n-threads "$threads" \
    --iedb-install-directory /opt/iedb \
    --trna-cov 0 \
    --trna-vaf 0 \
    --expn-val 0 \
    --phased-proximal-variants-vcf "$phased_vcf" \
    --pass-only

Would you need the contents of the phased VCF file to help replicate the issue?

Best regards,
yiolino

@susannasiebert
Contributor

I think I do, yes. Please also include the values of $HLAtype and $binding_alogs for this particular run.

Additionally, it would help with debugging if you could attach the all_epitopes.tsv and fasta files where you noticed the discrepancy.
