Open for Discussion: Folder Sending #137
Conversation
I initially didn't want to implement the part of the protocol with the zip files, because it's ugly and has a lot of downsides. Most notably, not everything can be zipped and building large zip files is really annoying (especially if the transfer fails and you have to rebuild it). Instead, I hacked something using tarballs sent as files that at least support all files and can be streamed from/to disk. Of course, this has its downsides too. I am open to a zip based solution though. Maybe we can keep both implementations in a
If we instead go full in on the zip file approach, I'd like to see the following improvements (can be done in follow-up PRs):
Not likely, sorry. If we choose Tokio, we indeed have more dependency choices, but we also force it upon all dependents. Async-std is a lot easier there, because it spins up its own runtime. I'd be willing to work towards more executor-agnostic code though. This probably won't be possible for the whole transfer part, but I'm positive we could do it for the

Heads up: I have started factoring out the transfer protocol into versioned submodules in preparation for v2. You will probably have to rebase onto
Yeah, I can see the logic against it in general, but it's how other clients work and it'd be nice to interop with them properly even if it's viewed as a "legacy" routine.
This makes the most sense to me, yeah.
Logic makes sense though I'll defer to you on what those size limits and/or flags should be, since I wager you have more insight into what feels more natural for a wormhole client anyway.
This is mostly automatic on both counts due to how I built it (using tempfile - just waiting on this). So we should be good here barring anything that comes up.
I'm definitely open to missing something here, but I don't see anything in here that couldn't be done with
I think my point is actually illustrated here... by using the less de-facto async executor, you're making (forcing) it harder for things to interop well. Regardless, I'm not looking to start a religious war about async executors - more capable interop would probably work too. If I still feel like
Heh, yeah, I didn't see that PR until it was too late. I don't think rebasing should be too bad though, so I'll probably just keep adding to this and then rebase on that whenever it's merged. Can push out my app in the meantime and just peg it to my fork and then update it when your stuff is good to go. One last question: the hash building in the current

Thanks again!
Just to be clear: the current hack is compatible. It only reduced user experience, and not interoperability :D
Yes. It's easier to use an async-std library in a tokio project than the other way around.
The hash sum is part of my hack to allow streaming sending: the problem is that we need to know the size of the compressed archive beforehand, because of a protocol quirk. To work around having to build a temporary archive file, I simply archive the file twice: once to see how big it will get, and the second time while actually sending it. The hash is only there to detect any potential file system smear (i.e. concurrent file modifications) that would invalidate the process.
Heh, we'll have to agree to disagree there. Regardless, I appreciate you considering this PR.
Ah, got it - this is helpful so I appreciate the clarification. It looks like my changes to
I'm not sure, is the current code incompatible with other implementations in some way?
Not pretty, but I've "merged" (pushed) the relevant changes onto master.
Mmm, compatible and interoperable are two different takes here - I'm more so saying the latter. It's of course compatible insofar as transferring a file works. If it doesn't unzip on the other end, then no, I wouldn't necessarily consider it interoperable - i.e. you're not sending a folder in the sense of how both sides see it. There's then additional work for another user to go through. I think there's a comment in the codebase that even explains that it doesn't unpack it on the other side, but I'm on mobile so can't confirm at the moment. I can see your logic for a tarball ultimately being better, I just value working with the existing expectations more, hence the agree to disagree. 'S not meant to be a dig.
I'm increasingly in favor of throwing my crappy tarball hack out of the window and just sending the zip files without being too smart about it. Get the feature working for compatibility, and focus on transfer-v2 for all the real improvements.
Will transfer-v2 remove the whole prepacking into archives or something? I feel streaming files makes more sense here, in terms of memory, storage, and CPU usage; in terms of bandwidth it probably makes little difference. That said, I would expect the zip to be a
Yes. The future here is Dilated File Transfer, which doesn't depend on any archive format but instead uses the protocol itself to bundle files into cohesive offers in any order/grouping the client software decides. It is not worth the time tweaking the v1 transfer, IMO -- either implementations support it, or they don't. I certainly agree that ZIP isn't an ideal format here -- but at least it has wide support, and you can stream with it (see recent Python for example). While more or different performance characteristics could be gained by tweaking the exact archive format and options, any effort I have in this direction will be spent on v2 / next-gen / Dilated file transfer.
I was looking at this after some time away, and wanted to get (proper) folder sending working. Opening this PR as it's easier to have a discussion with code being linked, but this isn't truly a PR in the correct sense.
Anyway.
This change has a `send_zip_file` method in `transfer.rs`, which takes an already zipped file and sends it via the `PeerMessage` directory mode. The actual mechanics of zipping up a file are something I need to extract from my project - and I'm happy to include this in that PR, once I can do so - but this method should be testable by just redirecting `send_file_or_folder` to `send_zip` for a `.zip`.

So ultimately some questions:
This actually works, as I use it and files/folders are transferred without issue. That said, I'm sure I could be missing something about how the protocol works here, so I wanted to get more seasoned eyes on it. I do compute a hash of this elsewhere that I could pass in to `send_zip_file` if the checking logic currently at the bottom of `send_folder` is needed.

My work that I need to extract and add on to here mimics the Python implementation by writing to a temporary spooled file, which spills to disk after a default of ~10MB. I assume that'd be useful to include as well?
This is completely tangential, but is there any chance this repo could be migrated from async_std to tokio? The fractured async ecosystem in Rust is already a pain, and trying to look at ways to do something like async-zip-file-writing is a headache when the bulk of the ecosystem is in Tokio.
Or support both, I suppose, but that seems like more hassle than it'd be worth.