OutOfMemoryError #2

Open
flangelier opened this issue Mar 28, 2015 · 4 comments

@flangelier
Contributor

Error:

2015-03-22 17:29:25 ERROR Executor:96 - Exception in task 0.0 in stage 1.0 (TID 16)
java.lang.OutOfMemoryError: GC overhead limit exceeded

Steps to reproduce:

docker run -ti --rm --name client-genomics -v /data:/data gelog/avocado /bin/bash avocado-submit /data/SRR062634.adam /data/chr1.fa /data/SRR062634.avr /usr/local/avocado/avocado-sample-configs/basic.properties

You can get the files by following the steps from:

@sebastienbonami
Member

Have you tried reducing the executor and driver memory in Spark? See: GELOG/adamcloud#11 (comment)
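
For reference, a minimal sketch of one way to set both values through a Spark properties file. `spark.driver.memory` and `spark.executor.memory` are standard Spark properties, but the `2g` values are placeholders, and whether the gelog/avocado image picks up a config directory set this way is an assumption, not something confirmed in this thread.

```shell
# Hypothetical location; the gelog/avocado image may keep its Spark config elsewhere.
SPARK_CONF_DIR="${SPARK_CONF_DIR:-/tmp/spark-conf}"
mkdir -p "$SPARK_CONF_DIR"

# Standard Spark memory properties. The 2g values are placeholders,
# not tuned for this dataset or host.
cat > "$SPARK_CONF_DIR/spark-defaults.conf" <<'EOF'
spark.driver.memory    2g
spark.executor.memory  2g
EOF

# Spark reads spark-defaults.conf from this directory when the variable is set.
export SPARK_CONF_DIR
```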

@davidonlaptop
Member

How did you generate SRR062634.adam?

  • Please describe the ADAM commands as well (URL of original data, command used to run ADAM).

@flangelier
Contributor Author

In order, you have to:

Getting the index

mkdir -p /data/
wget -O /data/chr1.fa.gz http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/chr1.fa.gz
gzip -d /data/chr1.fa.gz

Indexing with snap

docker run --rm=true -ti -v /data:/data gelog/snap index /data/chr1.fa /data/snap-index.chr1

Getting the chromosome

wget -O /data/SRR062634.filt.fastq.gz ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data/HG00096/sequence_read/SRR062634.filt.fastq.gz
gzip -d /data/SRR062634.filt.fastq.gz

Aligning the chromosome

docker run --rm=true -ti -v /data:/data gelog/snap single /data/snap-index.chr1/ /data/SRR062634.filt.fastq -o /data/SRR062634.sam

Running adam

docker run --rm=true -ti -v /data/:/data gelog/adam adam-submit transform /data/SRR062634.sam /data/SRR062634.adam

You should now have all the needed files
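
A quick way to confirm that, sketched as a small shell check. The file names are the ones produced by the steps above; `check_inputs` and the `DATA_DIR` override are hypothetical helpers added for illustration and are not part of any of the images.

```shell
# DATA_DIR defaults to the /data volume used in the commands above.
DATA_DIR="${DATA_DIR:-/data}"

# check_inputs is a hypothetical helper: it reports each expected path
# and returns non-zero if any of them is missing.
check_inputs() {
  missing=0
  for name in chr1.fa snap-index.chr1 SRR062634.filt.fastq SRR062634.sam SRR062634.adam; do
    if [ -e "$DATA_DIR/$name" ]; then
      echo "ok       $DATA_DIR/$name"
    else
      echo "missing  $DATA_DIR/$name"
      missing=1
    fi
  done
  return "$missing"
}

check_inputs || echo "Some inputs are missing; rerun the matching step above."
```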

@davidonlaptop
Member

The SRR062634.filt.fastq.gz file contains the bogus reads that were filtered out after quality control.

Try to feed this file to ADAM:
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data/HG00096/alignment/HG00096.chrom20.ILLUMINA.bwa.GBR.low_coverage.20120522.bam
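
A sketch of what that could look like, reusing the `adam-submit transform` invocation from the earlier comment with the BAM as input. The output name is an assumption (same base name with an `.adam` suffix), and `RUN` is a hypothetical dry-run switch: by default the commands are only printed.

```shell
# RUN defaults to echo, so the commands are only printed.
# Set RUN= (empty) to actually execute them.
RUN="${RUN-echo}"

BAM_URL="ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data/HG00096/alignment/HG00096.chrom20.ILLUMINA.bwa.GBR.low_coverage.20120522.bam"
BAM_FILE="/data/$(basename "$BAM_URL")"

# Assumed output name: same base name with an .adam suffix.
ADAM_OUT="${BAM_FILE%.bam}.adam"

# Download the aligned reads, then convert them with ADAM.
$RUN wget -O "$BAM_FILE" "$BAM_URL"
$RUN docker run --rm=true -ti -v /data/:/data gelog/adam adam-submit transform "$BAM_FILE" "$ADAM_OUT"
```

Since this BAM covers chromosome 20, the chr1.fa reference from the earlier steps would likely not match it.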
