support external compression/decompression programs #720

Closed
ciscon opened this issue Mar 7, 2016 · 6 comments

@ciscon

ciscon commented Mar 7, 2016

there doesn't appear to be native support for any multi-threaded compression/decompression methods. a simple way to get that would be to allow the user to specify a custom compression/decompression command that communicates over stdin/stdout.
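For illustration, a minimal sketch of what such a hook could look like, assuming a user-supplied command (pigz here) that compresses stdin to stdout; the `compress_chunk`/`decompress_chunk` helpers and the command lines are hypothetical examples, not borg API:

```python
# hypothetical sketch, not borg code: pipe each chunk through a
# user-specified external compressor over stdin/stdout
import subprocess

COMPRESS_CMD = ["pigz", "-p", "8", "-1", "-c"]   # example: parallel gzip, fast level, to stdout
DECOMPRESS_CMD = ["pigz", "-d", "-c"]            # decompress from stdin to stdout

def compress_chunk(data: bytes) -> bytes:
    """Feed one chunk to the external compressor via stdin, collect stdout."""
    proc = subprocess.run(COMPRESS_CMD, input=data,
                          stdout=subprocess.PIPE, check=True)
    return proc.stdout

def decompress_chunk(data: bytes) -> bytes:
    """Same in reverse: stdin -> external decompressor -> stdout."""
    proc = subprocess.run(DECOMPRESS_CMD, input=data,
                          stdout=subprocess.PIPE, check=True)
    return proc.stdout

if __name__ == "__main__":
    blob = b"some chunk of backup data " * 1000
    assert decompress_chunk(compress_chunk(blob)) == blob
```

Spawning one process per chunk like this is exactly the pipe/process overhead discussed in the next comment.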

@ThomasWaldmann
Member

there has been some work on internal multithreading, but it will still take a while until it is usable - see the multithreading branch and the pull request against it (it is not just about multithreaded compression, but about doing all CPU-heavy work in threads and also avoiding or making use of I/O wait times).

multithreading just the compression would only help with very slow compression algorithms, and doing it externally would even add some slowdown/latency (communication over pipes) and complexity (process management). so I currently don't think this is a valuable goal, and would rather invest developer time into debugging/improving the multithreading branch.

if you want quicker compression, you can always use lz4 (which is extremely fast) or gzip at a lower level (also fast).

using high compression (higher gzip levels or lzma) only saves you a few percent of space over that - it is only worthwhile if you have a very slow network connection to the backup server and CPU cycles to spare.
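That trade-off is easy to measure with the standard library alone; a rough sketch (the synthetic sample data and chosen levels are arbitrary, real numbers depend entirely on your data):

```python
# rough level-vs-speed comparison using only the standard library;
# the synthetic data below compresses unrealistically well, so treat
# the ratios as illustrative only
import lzma
import time
import zlib

data = b"some moderately repetitive backup-ish data " * 50_000

def bench(name, fn):
    t0 = time.perf_counter()
    out = fn(data)
    dt = time.perf_counter() - t0
    print(f"{name:8s} -> {len(out) / len(data):6.1%} of original in {dt:.3f}s")

bench("zlib -1", lambda d: zlib.compress(d, 1))
bench("zlib -9", lambda d: zlib.compress(d, 9))
bench("lzma -0", lambda d: lzma.compress(d, preset=0))
bench("lzma -6", lambda d: lzma.compress(d, preset=6))
```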

@ciscon
Author

ciscon commented Mar 8, 2016

It's good to hear that there's work being done on internal multithreading; I'll have to take a look at that branch. I've tried lz4 and gzip at level 1, but both still end up much slower than my manual backup procedure using multithreaded compression (pigz and xz, both at very low compression levels but using ~20 threads). I'm up to an average of 30 minutes for a backup of an LVM-backed VM using lz4 on a single thread, compared to about 5 minutes with pigz/xz.
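For context, the manual pipeline described above looks roughly like this (sketched with subprocess; the snapshot device path, thread count and compression level are placeholders, not a recommendation):

```python
# sketch of the manual approach: stream an LVM snapshot through a
# multithreaded compressor; paths and settings are placeholders
import subprocess

SNAPSHOT = "/dev/vg0/vm-snap"        # hypothetical LVM snapshot device
OUTPUT = "/backup/vm-snap.img.gz"

with open(SNAPSHOT, "rb") as src, open(OUTPUT, "wb") as dst:
    # pigz -1 = fastest gzip level, -p 20 = ~20 compression threads
    subprocess.run(["pigz", "-1", "-p", "20", "-c"],
                   stdin=src, stdout=dst, check=True)
```

This writes one flat compressed image with no deduplication, so compression is essentially the only CPU cost, unlike borg's per-chunk hashing and chunking (see the next comment).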

@ThomasWaldmann
Member

Well, if borg with lz4 is slower than what you use now, it is unlikely to be due to compression. Even with just one thread, lz4 is very fast. But there are a lot of other things to compute in borg, like sha256 (or hmac-sha256), maybe AES, crc32, and the buzhash for chunking. Plus some I/O wait.
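A stdlib-only micro-benchmark (not borg code; chunk size and dummy data are arbitrary) gives a feel for those per-chunk costs next to fast compression:

```python
# rough, stdlib-only look at the per-chunk costs mentioned above
# (sha256 / hmac-sha256 / crc32) next to fast zlib compression;
# chunk size and dummy data are arbitrary
import hashlib
import hmac
import time
import zlib

chunk = b"\x5a" * (2 * 1024 * 1024)   # one 2 MiB chunk of dummy data
key = b"k" * 32

def bench(name, fn, repeat=50):
    t0 = time.perf_counter()
    for _ in range(repeat):
        fn(chunk)
    dt = time.perf_counter() - t0
    mib = repeat * len(chunk) / (1024 * 1024)
    print(f"{name:12s} {mib / dt:8.1f} MiB/s")

bench("sha256", lambda c: hashlib.sha256(c).digest())
bench("hmac-sha256", lambda c: hmac.new(key, c, hashlib.sha256).digest())
bench("crc32", lambda c: zlib.crc32(c))
bench("zlib -1", lambda c: zlib.compress(c, 1))
```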

@tgharold
Contributor

Tracking this because I'm interested in multi-threaded LZMA/LZMA2 compression (at least enough to use two cores, hopefully 4+ cores).

(Also planning on testing compression speed and ratio on a large ~2 TB dataset in the next few weeks so I can compare time/size trade-offs.)
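As an aside, chunk-wise parallel LZMA is straightforward to sketch with the standard library. This is not borg's design; chunk size, preset and worker count below are arbitrary, and it relies on CPython's lzma module releasing the GIL while compressing so a thread pool can use several cores:

```python
# illustration only: compress independent chunks with LZMA in parallel;
# chunk size, preset and worker count are arbitrary choices
import lzma
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4 * 1024 * 1024   # 4 MiB chunks
PRESET = 2                     # low preset = faster, slightly larger output

def compress_chunks(data: bytes, workers: int = 4) -> list[bytes]:
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: lzma.compress(c, preset=PRESET), chunks))

if __name__ == "__main__":
    blob = b"example data " * (1024 * 1024)
    compressed = compress_chunks(blob)
    restored = b"".join(lzma.decompress(c) for c in compressed)
    assert restored == blob
```

Compressing chunks independently loses a little ratio compared to one long stream, which is roughly the trade-off any per-chunk multithreading makes.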

@ThomasWaldmann
Member

Closing this - forking external binaries is not desired, and internal multithreading is already being developed (see the multithreading branch).

@ThomasWaldmann
Member

See also #1633.
