Testing shows it works on a 16-GB V100: #18 (comment)
For a large dataset such as Parihaka (4.8 GB), an 8-GB device cannot generate correct unzipped data. The peak memory usage should be around 1.5x the input datum size (here, about 7.2 GB), because the workflow can be stated as:

1. the datum (1x size) is loaded to the GPU and generates quant. code of at most 0.5x size;
2. the outliers are preserved in CSR form, sized according to the compression ratio, after which the 1x datum is freed.
In one case, the 4.8-GB datum generates about 800 MB of CSR; added to the 1.5x working set, this may exceed the device memory capacity. A rough estimate is sketched below.
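To make the arithmetic concrete, here is a minimal back-of-the-envelope sketch in plain C++; `estimate_peak_bytes` is a hypothetical helper (not part of cuSZ), and the numbers are taken from the log below.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper (not part of cuSZ): estimated peak device memory in bytes,
// assuming a 4-byte f32 input, a 2-byte quant. code (0.5x), and a CSR outlier
// buffer whose size is a fraction of the input.
std::size_t estimate_peak_bytes(std::size_t input_bytes, double csr_fraction)
{
    std::size_t quant_bytes = input_bytes / 2;  // 2-byte quant. code vs. 4-byte input
    auto        csr_bytes   = static_cast<std::size_t>(input_bytes * csr_fraction);
    // Peak: the 1x datum, the 0.5x quant. code, and the CSR coexist briefly
    // before the 1x datum is freed.
    return input_bytes + quant_bytes + csr_bytes;
}

int main()
{
    std::size_t parihaka = 4850339584ULL;  // bytes, from the log below
    double      csr_frac = 0.8 / 4.8;      // ~800 MB of CSR per 4.8 GB, as observed
    std::printf("estimated peak: %.2f GB\n",
                estimate_peak_bytes(parihaka, csr_frac) / 1e9);
}
```

With the Parihaka numbers this comes out to roughly 8.1 GB, which is consistent with the run failing on an 8-GB card while fitting on a 16-GB V100.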
> ./bin/cusz -f32 -m r2r -e 1.0e-4.0 -i ~/Parihaka_PSTM_far_stack.f32 -D parihaka -z
[info] datum: /path/to/Parihaka_PSTM_far_stack.f32 (4850339584 bytes) of type f32
[dbg] original len: 1212584896 (padding: 34823)
[dbg] Time loading data: 3.19802s
[info] quant.cap: 1024 input eb: 0.0001
[dbg] Time inspecting data range: 0.0232662s
[info] eb change: (input eb) x 12342.2 (rng) = 1.23422 (relative-to-range)
[dbg] 2-byte quant type, 4-byte internal Huff type
[info] Commencing compression...
[info] nnz.outlier: 16607 (0.00136955%)
[dbg] Optimal Huffman deflating chunksize 131072
[info] entropy: 3.85809
[dbg] Huffman enc: #chunk=9252, chunksze=131072 => 212256108 4-byte words/6792051551 bits
[dbg] Time writing Huff. binary: 0.501431s
[info] Compression finished, saved Huffman encoded quant.code.
[dbg] Time tar'ing 1.01489s
[info] Written to: /path/to/Parihaka_PSTM_far_stack.f32.sz
> ./bin/cusz -i ~/Parihaka_PSTM_far_stack.f32.sz -x --origin ~/Parihaka_PSTM_far_stack.f32 --skip write.x
[info] Commencing decompression...
[info] Huffman decoding into quant.code.
[info] Extracted outlier from CSR format.
[info] Decompression finished.
[info] Huffman metadata of chunking and reverse codebook size (in bytes): 150336
[info] Huffman coded output size: 849024432
[info] To compare with the original datum
[info] Verification start ---------------------
| min.val -6893.359375
| max.val 5448.8828125
| val.rng 12342.2421875
| max.err.abs.val 6893.359375
| max.err.abs.idx 706941819
| max.err.vs.rng 0.55851759107283360795
| max.pw.rel.err 1
| PSNR 32.837037623060211899
| NRMSE 0.022811199295531395248
| correl.coeff -NAN
| comp.ratio.w/o.gzip 5.709997
[info] Verification end -----------------------
[info] Decompressed file is written to /path/to/Parihaka_PSTM_far_stack.f32.szx.
[info] Please use compressed data (*.sz) to calculate final comp ratio (w/ gzip).
[info] Skipped writing unzipped to filesystem.
misc. todo:

- the "writing to filesystem" info is not correct
- verification is too slow
- trailing slash when printing the output file path
The problem was previously solved in an ad hoc manner: releasing memory on the fly to keep the program running. However, deallocation is too expensive, so a better solution is needed, e.g., a data partitioner in which only a (small) portion of the data is compressed at a time; a sketch follows.
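A minimal sketch of that partitioner idea in plain C++, assuming a hypothetical per-chunk entry point `compress_chunk` (not cuSZ's actual API):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Stub standing in for the real per-chunk pipeline (hypothetical name): a real
// implementation would copy `len` floats to the device, run prediction/quantization
// plus Huffman encoding, and free its device buffers before returning.
std::vector<unsigned char> compress_chunk(const float* /*host_data*/, std::size_t /*len*/)
{
    return {};
}

// Sketch of the partitioner: only one chunk is resident on the GPU at a time,
// so the peak device memory is ~1.5x the chunk size, not 1.5x the whole datum.
std::vector<std::vector<unsigned char>>
compress_partitioned(const float* host_data, std::size_t total_len, std::size_t chunk_len)
{
    std::vector<std::vector<unsigned char>> archives;
    for (std::size_t offset = 0; offset < total_len; offset += chunk_len) {
        std::size_t len = std::min(chunk_len, total_len - offset);
        archives.push_back(compress_chunk(host_data + offset, len));
    }
    return archives;
}
```

Splitting the input this way caps the peak device memory at roughly 1.5x the chunk size; how the per-chunk archives are concatenated, and how chunking interacts with prediction across chunk boundaries, is left open.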