Potential issue if we officially support exome (interval file) data analysis #29

Open
oskarvid opened this issue Feb 19, 2020 · 0 comments
Labels
question Further information is requested

Comments

@oskarvid

I did some reading about exome data analysis with the tools that we use in Selma, and apparently it would produce suboptimal results according to this discussion: https://gatkforums.broadinstitute.org/gatk/discussion/6894/gatk-best-practices-for-exome-targeted-capture-small-region
The important points are the following quotes:

  • you should not use BQSR on [exome data]
  • You are probably better off doing hard filtering for a small target region [instead of using VQSR]

This discussion also has good information about why BQSR is not advised for datasets with fewer than 100 million bases.

We discussed running hap.py on exome (interval file) data analyses, but based on these points this may not be a good use of our time, given that we shouldn't run the BQSR, VQSR and ApplyVQSR tools on small datasets.
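
If we do end up supporting interval files, a check along these lines could decide per run whether to skip the recalibration steps. This is only a rough sketch, not something that exists in Selma: the function names and the `mean_depth` parameter are made up for illustration, and it assumes the 100 million base figure refers to sequenced bases, which are estimated here as target bases times mean coverage. It also assumes a BED-style interval file (0-based start, exclusive end) with no overlapping intervals, since overlaps would be counted twice.

```python
# Hypothetical helper, not part of Selma: estimate whether a targeted
# dataset is too small for BQSR/VQSR, using the 100 million base figure
# quoted in the linked discussion.

BQSR_MIN_BASES = 100_000_000  # threshold taken from the discussion above


def count_target_bases(bed_lines):
    """Sum interval lengths from BED-formatted lines.

    Assumes 0-based start and exclusive end, so length = end - start.
    Overlapping intervals are NOT merged and would be double counted.
    """
    total = 0
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines, comments and BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        start, end = int(fields[1]), int(fields[2])
        total += end - start
    return total


def should_skip_recalibration(target_bases, mean_depth, min_bases=BQSR_MIN_BASES):
    """Return True when the estimated sequenced bases fall below min_bases.

    Estimating sequenced bases as target_bases * mean_depth is an
    assumption made for this sketch.
    """
    return target_bases * mean_depth < min_bases
```

In practice this would be called with the lines of the interval file, e.g. `count_target_bases(open("targets.bed"))` (file name is a placeholder), and the result passed to `should_skip_recalibration` together with an expected coverage value.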

@oskarvid added the question (Further information is requested) label Feb 19, 2020