Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Threaded modules need to execute data.drop on all threads #117

Open
alexcrichton opened this issue Jul 19, 2019 · 9 comments
Open

Threaded modules need to execute data.drop on all threads #117

alexcrichton opened this issue Jul 19, 2019 · 9 comments
Assignees

Comments

@alexcrichton
Copy link
Collaborator

In the description of the passive segments portion of linking it mentions that __wasm_init_memory will be used to initialize all memory segments. This is presumably called on the first thread, and LLD today also executes data.drop for each memory segment.

I think, though, that all new threads also need to execute data.drop for all memory segments to avoid keeping them around, right? Should data.drop not be part of __wasm_init_memory? Or perhaps another synthetic function to drop segments?

FWIW we've had a strategy of doing this in wasm-bindgen prior to LLVM 9 which involved injecting a start function which did an atomic add to initialize a thread id counter, and based off the thread ID it'd initialize memory or drop segments. I wonder if perhaps most modules need something like that anyway to get synthesized during LLD as well?

@tlively
Copy link
Member

tlively commented Jul 19, 2019

Good point! I hadn't thought of dropping the unused segments on the other threads. I think it would make sense to remove the data.drops from __wasm_init_memory and instead put them in a separately synthesized __wasm_drop_data that can be called from all threads.

I think I would rather leave it up to the runtime to figure out when to call these functions, rather than having LLD synthesize a coordination scheme with an atomic counter. Although if such a scheme were to be synthesized anywhere, LLD would be the best place for it since it is already laying out the memory. @sbc100 do you have any thoughts on that?

@tlively tlively self-assigned this Jul 19, 2019
@sbc100
Copy link
Member

sbc100 commented Jul 19, 2019

This stuff is all rather web-specific, and probably won't apply if/when runtimes get actually shared-module threading. But then again the whole active/passive segment thing is web specific.

@quantum5 is currently working on TLS, which also needs startup code to run on each thread so we should follow the same model is possible.

@alexcrichton are you saying the you call this start function both on the main thread and on the background threads? Is this start function different to _start which run main or the function in the start section? I think to be more explict I'd be OK with exporting a separate function for initializing a thread.

And yes, I think lld is probably the right place to do this stuff.

@alexcrichton
Copy link
Collaborator Author

Currently the way set it up for the web (and that's actually a really good point, this is all very web-specific!) is that the module has a start function (via the start function section). This start function will do an atomic add to acquire a thread ID, and then it'll compare that against zero to see if it's the first thread or a child thread, doing various tasks on both. All threads, however, always execute data.drop

We don't really have great support for a "main"-like function (e.g. _start) and most modules we produce don't have one at all, it's just executed from JS at some point. If a module does have _start, though, then it's only executed from the main thread as the entry point.

Also I don't mind too much about the layout particularly here, we're allocating 4-bytes at __heap_base already (this is allocated after linking since it's a runtime thing we're not baking into the wasm files yet) to store the thread ID counter and we can still do that for sure.

One of the major issues with the scheme we have right now is that thread local data doesn't work super well, but the work done in #116 looks super promising and we'll definitely try to give it a spin and help test it out when it's ready in LLVM!

@quantum5
Copy link
Member

@alexcrichton the proposal in #116 has already been implemented in LLVM. Feel free to try it out. Thread local storage already works in emscripten. See emscripten-core/emscripten#8976.

@alexcrichton
Copy link
Collaborator Author

Oh awesome!

I got far enough along that I was mega confused and stumped for a few hours only to end up finding a bug and realizing llvm/llvm-project@21aafc2 had already fixed it, but otherwise things are looking good so far!

@tlively
Copy link
Member

tlively commented Jul 22, 2019

While we're making changes to pull the data.drop calls out of __wasm_init_memory into their own function, it's probably worth stepping back and figuring out what we want the big picture to be. I like the scheme @alexcrichton described that uses the WebAssembly start function to initialize memory. We have nothing better to do with the start function and that scheme eliminates the need for the host runtime to participate in initializing memory. The only tradeoff is that we are synthesizing more magical variables and functions in the linker, but I think this is probably worth the complexity. How does everyone else feel about having LLD synthesize an atomic __wasm_meminit_flag variable and a start function that conditionally initializes memory and drops all the segments?

@alexcrichton
Copy link
Collaborator Author

Would it be possible to basically only have a start function? That way it'd handle everything from static constructors, TLS initialization, memory initialization, etc. The other items could perhaps optionally be exported and perhaps be LLD-specific, but otherwise I'm not sure if so much needs to be exported.

I think in general it'd be good to assume as little runtime as possible because at least for the Rust use case there isn't much of a runtime (or it's quite minimal) and it'd be pretty slick if we could get a threading-capable module vanilla out of LLD ready to put in the browser. I imagine tools like Emscripten/wasm-bindgen may want to tweak how this works and inject their own startup/initialization, but that could relatively easily be done as a postprocessing pass of the wasm module rather than what we have to do today which is postprocess the module to make it ready to go onto the web.

@tlively
Copy link
Member

tlively commented Jul 23, 2019

Unfortunately we can’t call static constructors from the start function because they might want to call out to imported functions which might in turn want to call back in via exported functions, but those aren’t available yet during the start function. All the memory initialization could safely be done in the start function, though.

In addition, TLS initialization requires an allocator, which lld doesn’t not know about, so it needs to be supplied by the runtime.

@alexcrichton
Copy link
Collaborator Author

Ah that's true, but I think that something like a "boot" function of some form could still be exported. That being said the more I think about this the more I'm realizing it may just not be feasible. When spawning new threads something needs to allocate a stack and something needs to allocate space for TLS variables, but that can't really be synthesized easily with the linker. It may end up being the case that the only thing which can happen automatically is the memory initialization step, but having that "automatic" as part of the start function still seems reasonable!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants