Whenever I pass a BAM file as input, I get no output. #375
Comments
Hi @ankushreddy - thanks for checking out Guacamole. Most of the callers are still in progress, but hopefully we can help you test them out. For Does your BAM file have MD tags? If not, we do support computing MD tags in Guacamole as well, but looking over the code, this isn't configured correctly right now. I will file an issue for that and fix it.
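One way to answer the MD tag question is to look at the optional fields of the alignment records. The sketch below assumes the reads have been dumped to SAM text first (for example with `samtools view`); the helper names `has_md_tag` and `md_tag_fraction` are hypothetical and are not part of Guacamole or ADAM.

```python
def has_md_tag(sam_line: str) -> bool:
    """Return True if a SAM alignment line carries an MD:Z: optional field."""
    fields = sam_line.rstrip("\n").split("\t")
    # The first 11 columns are mandatory; optional TAG:TYPE:VALUE fields follow.
    return any(field.startswith("MD:Z:") for field in fields[11:])


def md_tag_fraction(sam_lines) -> float:
    """Fraction of alignment records (header lines skipped) that have an MD tag."""
    records = [line for line in sam_lines
               if line.strip() and not line.startswith("@")]
    if not records:
        return 0.0
    return sum(has_md_tag(line) for line in records) / len(records)
```

A fraction near zero would suggest the file was written without MD tags and that they need to be computed against the reference before a caller that depends on them can use the reads.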
Hi @arahuja, thanks for the quick reply. I have a few more questions; I am new to genomics, so I don't know exactly what is going on in the code.
I added the MD tags with this adam-submit command:
./adam-submit transform /shared/avocado_test/NA06984.454.ssaha.SRP000033.2009_10.bam /shared/avocado_out/NA06984.454.ssaha.SRP000033.2009_10.bam_tags.adam -add_md_tags /shared/avocado_test/human_b36_male.fa
drwxr-xr-x - asugured hdfs 0 2016-01-28 12:37 /shared/avocado_out/NA06984.454.ssaha.SRP000033.2009_10.bam_tags.adam
Later I passed /shared/avocado_out/NA06984.454.ssaha.SRP000033.2009_10.bam_tags.adam to spark-submit for Guacamole. It throws an Avro/Parquet schema error:
16/01/28 12:46:07 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, istb1-l2-b11-01.hadoop.priv): org.apache.parquet.io.InvalidRecordException: Parquet/Avro schema mismatch. Avro field 'recordGroupPredictedMedianInsertSize' not found.
16/01/28 12:46:07 INFO TaskSetManager: Starting task 1.1 in stage 0.0 (TID 2, istb1-l2-b11-01.hadoop.priv, NODE_LOCAL, 1516 bytes)
Driver stacktrace:
Could you please help me understand this?
Thanks & Regards,
What version of ADAM are you using? If you have moved to a new version of ADAM, the schema of the ADAM format may be different. If you can use the BAM output of ADAM that may work better, but I have not used that before.
I am using the latest version of ADAM; I just cloned it from git and started using it. Please let me know whether I am following the correct process.
Hey @ankushreddy If you can try those steps again using that version of ADAM, guacamole should be able to read the Or, as @arahuja said, you can try using
@ryan-williams Hi Ryan, thanks for guiding me. I used ADAM 0.18.1 and am getting a null pointer exception.
16/01/28 20:05:38 INFO MemoryStore: ensureFreeSpace(303352) called with curMem=0, maxMem=5556991426
16/01/28 20:05:46 INFO TaskSetManager: Starting task 1.1 in stage 0.0 (TID 2, istb1-l2-b14-19.hadoop.priv, NODE_LOCAL, 1590 bytes)
16/01/28 20:06:00 INFO TaskSetManager: Starting task 1.3 in stage 0.0 (TID 4, istb1-l2-b14-07.hadoop.priv, NODE_LOCAL, 1590 bytes)
Driver stacktrace:
Any kind of help is appreciated.
Thanks & Regards,
Your NPE is coming from this line; some contig is
Hi @ryan-williams, I have tried different .bam files but am facing the same issue. Could you please let me know when Guacamole is ready to handle both a .bam file and a FASTA reference file? Thanks & Regards,
Hi @ankushreddy Can you tell us more about the error you hit using a BAM file? As we mentioned earlier, if the ADAM format is different from the one we support, we see issues, but that part of the code has seen little use/testing as well. We recently upgraded our ADAM input if you want to retest. However, if you are seeing a similar error with a BAM input, that would be good to know about.
Hi @arahuja, I just want to check: which version of ADAM should I currently use, or is it enough to use a BAM or SAM file that is aligned to the reference? I will test it once again and let you know the results.
@ankushreddy We actually now only support loading the reference explicitly and do not rely on md-tags anymore. Also, we aren't really supporting
@arahuja Thanks for the reply. I am testing it on the SAM file, but I see a lot of variants being called. Is there any way to reduce the number of variants, based on quality or anything else?
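One generic way to trim a call set after the fact is to filter the records on their QUAL column. The sketch below is not a Guacamole feature: it assumes the calls were written as VCF text and that the caller fills in QUAL, and the function name `filter_vcf_by_qual` and the threshold of 30 are hypothetical choices for illustration.

```python
def filter_vcf_by_qual(lines, min_qual=30.0):
    """Yield header lines unchanged and keep only records with QUAL >= min_qual."""
    for line in lines:
        if line.startswith("#"):
            yield line  # meta-information and column header lines pass through
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        qual = fields[5]  # QUAL is the sixth VCF column
        if qual == ".":
            continue  # missing quality: dropped in this sketch
        if float(qual) >= min_qual:
            yield line
```

Applied to an open file, this would look like `kept = list(filter_vcf_by_qual(open(path)))`, where `path` is whatever VCF the run produced. Records with a missing QUAL are dropped here; keeping them instead is an equally defensible choice.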
Hi team,
When I submit the job, it runs successfully but does not call any genotypes.
Could you please help me with this issue?
Please see the output of the spark-submit run below.
16/01/27 16:14:02 INFO YarnScheduler: Adding task set 19.0 with 1 tasks
16/01/27 16:14:02 INFO TaskSetManager: Starting task 0.0 in stage 19.0 (TID 14, istb1-l2-b12-07.hadoop.priv, PROCESS_LOCAL, 1432 bytes)
16/01/27 16:14:02 INFO BlockManagerInfo: Added broadcast_15_piece0 in memory on istb1-l2-b12-07.hadoop.priv:38654 (size: 1808.0 B, free: 2.1 GB)
16/01/27 16:14:02 INFO DAGScheduler: Stage 19 (count at VariationRDDFunctions.scala:144) finished in 0.101 s
16/01/27 16:14:02 INFO TaskSetManager: Finished task 0.0 in stage 19.0 (TID 14) in 88 ms on istb1-l2-b12-07.hadoop.priv (1/1)
16/01/27 16:14:02 INFO YarnScheduler: Removed TaskSet 19.0, whose tasks have all completed, from pool
16/01/27 16:14:02 INFO DAGScheduler: Job 5 finished: count at VariationRDDFunctions.scala:144, took 0.115971 s
16/01/27 16:14:02 INFO VariantContextRDDFunctions: Write 0 records
16/01/27 16:14:02 INFO MapPartitionsRDD: Removing RDD 22 from persistence list
16/01/27 16:14:02 INFO BlockManager: Removing RDD 22
*** Delayed Messages ***
Called 0 genotypes.
Region counts: filtered 0 total regions to 0 relevant regions, expanded for overlaps by NaN% to 0
Regions per task: min=NaN 25%=NaN median=NaN (mean=NaN) 75%=NaN max=NaN. Max is NaN% more than mean.
16/01/27 16:14:03 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
16/01/27 16:14:03 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
16/01/27 16:14:03 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
16/01/27 16:14:03 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
16/01/27 16:14:03 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
16/01/27 16:14:03 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
16/01/27 16:14:03 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
Thanks & Regards,
Ankush Reddy