So today they filled their hard drive while running the pipeline.
The fastq files they are running are very large, since their runs contain only 12 or 24 samples (fewer samples per run means more reads, and bigger files, per sample).
An example project of theirs:
Raw fastq (1G x 2) + filtered fastq (1G x 2) + trimmed fastq (1G x 2) + bam (790M) ≈ 6.8G
They are running several of these samples, so you can see how this adds up.
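For scale, here is a quick back-of-the-envelope sketch (assuming the ~6.8G per-sample footprint above and the 12/24-sample run sizes mentioned; no other numbers are from the report):

```python
# Rough per-run disk usage, assuming the per-sample footprint above.
per_sample_gb = 2 * 1 + 2 * 1 + 2 * 1 + 0.79  # raw + filtered + trimmed fastq pairs + bam
for n in (12, 24):
    print('%d samples: ~%dG' % (n, round(n * per_sample_gb)))
# 12 samples: ~81G
# 24 samples: ~163G
```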
At WRAIR this is less of an issue because we have de-duplication on the storage server.
As a test I gzipped one of the fastq files: it went from 1.2G down to 330M (about a quarter of the original size), which is a pretty great storage savings.
Maybe we should force gzip output from all stages?
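As a rough sketch of what that could look like, using only the stdlib gzip module (the helper name and the remove-the-original behavior are my assumptions, not anything currently in the pipeline):

```python
import gzip
import os
import shutil

def gzip_stage_output(fastq_path, remove_original=True):
    """Hypothetical helper: compress a stage's fastq output.

    Writes fastq_path + '.gz' and optionally removes the uncompressed
    original so only the much smaller .gz copy stays on disk.
    """
    gz_path = fastq_path + '.gz'
    with open(fastq_path, 'rb') as src, gzip.open(gz_path, 'wb') as dst:
        shutil.copyfileobj(src, dst)
    if remove_original:
        os.remove(fastq_path)
    return gz_path
```

Many downstream tools can read gzipped fastq directly (bwa does, for example), so this might not even require extra decompression steps, though that would need verifying per stage.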
Another issue: right now ngs_filter symlinks the data from convert-formats when no filtering is done.
We could fix this by skipping the ngs_filter call altogether within runsample. Are there any other symbolic links being used in the pipeline?
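Something like this minimal sketch, assuming a hypothetical stage-runner callable and made-up config keys (none of these names are from the actual runsample code):

```python
def filtering_requested(config):
    # Made-up keys standing in for ngs_filter's real filtering options.
    return bool(config.get('dropNs') or config.get('indexQualityMin'))

def maybe_run_ngs_filter(config, reads_dir, run_stage):
    """Return the directory the next stage should read from.

    If no filtering was requested, hand the convert-formats output
    straight to the next stage instead of letting ngs_filter symlink it.
    """
    if not filtering_requested(config):
        return reads_dir
    return run_stage('ngs_filter', reads_dir)
```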
Related #204