What makes CropTarget special to require an asynchronous creation? #17
I think it's easiest to answer with Chrome as the concrete example, thereby keeping the discussion simpler; this generalizes to other browsers. Chrome has a central "browser process," and documents are hosted in "render processes." (For simplicity, let's pretend every document has a dedicated render process.) Let's examine multiple documents embedded together in another document, all living together in the same tab. Again for simplicity, we'll call the document where the crop-target lives SLIDE, and the document which holds the track VC. I find this easier than talking about D1, D2, etc., as we can have a practical example in our mind's eye. If necessary, map (SLIDE, VC) to (D1, D2). CropTarget is essentially a token. That token is produced in SLIDE and passed elsewhere. It may be passed to VC directly or indirectly. A design that allows it to be safely passed through other documents is preferable, as it requires less care from developers. To be safely passed through other documents (and therefore processes), it should encode the minimum amount of information. This is mostly true for JS-exposed information, but non-JS-exposed information that lives in the render process holding the token is also theoretically accessible to malicious documents under certain conditions. So, to keep the information to a minimum, the token should not actually encode that it originates in SLIDE. Instead, this knowledge is recorded in the trusted browser process as a T<->SLIDE mapping. When the token is minted, this mapping has to be recorded in the browser process, which requires IPC, which means that minting the token should be asynchronous. (Minting can fail if the browser process refuses to record more mappings.) To generalize away from Chrome: other UA implementers will either run into similar implementation constraints, or else they can just return a pre-resolved Promise and not worry about it.
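To illustrate, a minimal sketch of the renderer-side minting logic under this design. This is not Chrome's actual internals; `createOpaqueToken` and `browserProcess.registerCropTarget` are hypothetical helpers standing in for the renderer-to-browser IPC:

```js
// Hypothetical minting flow. The token itself carries no information about
// SLIDE; the trusted browser process records the token<->SLIDE mapping,
// and that IPC round trip is what forces the API to be asynchronous.
async function produceCropTarget(element) {
  const token = createOpaqueToken(); // encodes nothing about its origin
  // Ask the browser process to record token <-> SLIDE's document.
  const accepted = await browserProcess.registerCropTarget(token, element);
  if (!accepted) {
    // The browser process may refuse to record more mappings.
    throw new DOMException("CropTarget limit reached", "QuotaExceededError");
  }
  return token;
}
```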
When transferring a MessagePort or an RTCDataChannel from SLIDE to VC, there is a need to identify that the transferred object originates from SLIDE, just like with CropTarget. How is it possible to be synchronous for those objects but not for CropTarget?
There could be multiple documents in between SLIDE and VC. Each hop only exposes its own origin. So if the token is sent SLIDE->TOP_LEVEL->VC, then VC would not know the origin of SLIDE, only of TOP_LEVEL.
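A sketch of the relay step, assuming TOP_LEVEL embeds VC in an iframe with id `vc` (the markup is an assumption for illustration):

```js
// In TOP_LEVEL: forward the token to VC. VC's message event will report
// TOP_LEVEL's origin, not SLIDE's; the token itself reveals nothing more.
window.addEventListener("message", ({ data }) => {
  if (data.cropTarget) {
    const vc = document.querySelector("iframe#vc");
    vc.contentWindow.postMessage({ cropTarget: data.cropTarget }, "*");
  }
});
```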
Again, the same could be said about transferring an RTCDataChannel from SLIDE->TOP_LEVEL->VC.
I don't know how common it is to transfer RTCDataChannel objects. (I suspect somewhat uncommon...?) CropTarget is designed for transfer, and would have been wholly unnecessary otherwise. Given the cheap cost of ruggedizing CropTarget against being utilized by an attacker, I see it as desirable. I don't know enough about RTCDataChannel to say whether this was necessary there too but prohibitively complicated, or whether there was another reason.
What about MessagePort then? MessagePorts are very common and designed specifically for being transferred.
Does the transferred object remain bound to something in the document that created it?
It remains bound to the MessagePort with which it was jointly created through MessageChannel.
For RTCDataChannel, WHATWG Streams, and MediaStreamTrack: yes, always; the 'source' remains in D1. MessagePort is a bit specific in that ports are created as a pair through MessageChannel and can keep communicating with each other. MessageChannel.port1 can be synchronously transferred to another process, as can MessageChannel.port2. port1 and port2 remain bound together in any case.
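For reference, the pairing looks like this (standard MessageChannel usage; the iframe receiver is an assumption for illustration):

```js
// port1 and port2 are minted together as a linked pair; transferring one
// of them synchronously does not break the link between them.
const { port1, port2 } = new MessageChannel();
const iframe = document.querySelector("iframe");
iframe.contentWindow.postMessage("port for you", "*", [port2]);
port1.onmessage = ({ data }) => console.log("from the iframe:", data);
```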
Another example that might be closer to CropTarget is OffscreenCanvas.
@eladalon1983 said in #11 (comment):
cropTo already returns a promise, so querying "all" documents in the single viewport captured to identify the crop target seems reasonable to me. Cost to implementers is low on the priority of constituencies when it comes to shaping APIs.
I think you mean "browsing context" where you said "viewport".
Cost to implementers is low priority, but non-zero. It's only a problem if accommodating implementers comes at a non-trivial cost to a higher-priority constituency. Does it?
No, each iframe has its own browsing context, which is nested (immediately or multiple levels down) under the top-level browsing context. I loosely mean all documents in this capture, which presumably translates to the top-level browsing context's document and all documents in the nested browsing contexts of iframes that intersect the viewport.
Looking up the CropTarget shouldn't be the bottleneck in extreme cases, so this should scale fine.
I've been carrying that mistake around for a while. Thanks for enlightening me.
IPC with multiple processes is neither simple, nor performant, nor robust. The cost to implementers is greatly reduced when avoiding this. What's the downside to any other constituency?
This is a known problem that is solved in modern browsers. A transferred WritableStream should not do multiple IPCs to locate the process of its sink when writing new values.
It is more costly to both web developers and web engines. It is not consistent with existing Web APIs, AFAIK.
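A sketch of the transferred-stream pattern referenced above (assuming a browser with transferable streams; the frame wiring is illustrative, not part of the spec text):

```js
// In SLIDE: the sink (the write() callback) stays in this document.
const stream = new WritableStream({
  write(chunk) { console.log("received back in SLIDE:", chunk); }
});
frames[0].postMessage(stream, "*", [stream]);

// In VC: writes are routed to SLIDE's sink over plumbing established at
// transfer time, rather than re-located on every write.
window.onmessage = async ({ data: transferred }) => {
  const writer = transferred.getWriter();
  await writer.write("hello from VC");
};
```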
Given this is a solved problem for other APIs and given these solutions are applicable to CropTarget as well, can we converge on moving away from Promises?
I believe I have explained why we have implemented things this way in Chrome. This is a real issue.
The cost to Web developers is negligible. Crop-target production is a rare occurrence; it does not matter to the Web developer whether it completes asynchronously. I can pull in Web developers currently using Region Capture (in origin trial) for major products with a high level of polish, and they could comment as much. Would you find that convincing? (If not, please pull in a similarly qualified Web developer who could comment to the contrary.)
Let's converge towards Promises, given that it's an important implementation issue for Chrome. (And I believe that when the time comes for Safari and Firefox to implement this, they'll find it equally problematic.)
You explained a real issue, which I would classify as an optimization problem (though at some point you alluded to security concerns as well). Is that correct? The argument is that the same optimization issue exists for already deployed APIs, and was solved without making use of promises. If we do not want to follow this preexisting pattern, we need a clear justification. To move forward, let's try with more narrowly focused questions:
As a recap, here are some APIs that I think face the same issue:
In WebKit, I implemented some of the APIs that I think face the issue you are describing. Answering the above questions might help figure out which problems I might be overlooking.
Note: When we started trying to transfer MediaStreamTracks in Chrome, the synchronous nature of the transfer gave us major problems in implementation. So the idea that synchronous = solved problem is not universally true.
It is great to hear that Chrome's implementation of transferring tracks is making progress. It is also great to hear Chrome implemented the transfer of synchronously-created MediaStreamTracks in a secure and efficient manner.
The point is more that create-then-transfer-synchronously is a solvable problem (do we all agree?); it has been solved multiple times already. To break with this existing pattern, compelling motivations seem necessary. Another reason not to use promises: what happens if the element gets transferred to another document, which then gets destroyed (and the element gets reattached to another document), all of this during creation of the CropTarget? Should we reject the promise? A synchronous API is simpler for edge cases as well as for web developers.
I had the same concern. I lost that concern when I learned that Elements are not transferable. (But do correct me if I am wrong.)
It is generally possible for a CropTarget to outlive its Element, and that's OK. The document discusses what happens then. The summary is:
The point I am making is about what happens during the creation of the CropTarget, i.e. while the promise is not yet settled.
The thing I am still not getting is: what's going to happen to the Element during that time? The worst that could happen is that it gets garbage-collected. I don't think that's a problem. It doesn't seem to matter if the Element is GCed before/after its CropTarget is produced. (And getting GCed after CropTarget production should be a normal occurrence.)
I want to settle this discussion about Promises. And I don't want to leave your message unanswered. Let's briefly examine the three other APIs you've brought up:
I hope we can proceed without trifurcating the discussion. I did not want to leave your points unanswered, but deep-diving into these three examples would be unwise. We have an independent engineering question here, and it can be resolved on its own merits. These precedents do not seem applicable. Nor should we assume that mistakes and compromises were not made in the design of these other APIs. Let's discuss our own case on its own merits. I believe I've made a compelling case for why asynchronicity is needed here.
Let's go with an asynchronous produceCropTarget().
I should have posted #11 (comment) here. To summarize it: at least two highly skilled technical people were confused by the current API into thinking it does more than it does. That's a cost to web developers that we should, and regularly do, avoid, as @youennf shows.
This is an incorrect implementation since produceCropTarget is infallible.
I humbly disagree. I believe I have argued the point.
You gave two examples. I present you a third: multiple targets. Assume the application has two sub-sections that the user can switch between sharing by clicking an in-content button. If I were to implement this app, I'd see a click from the user as telling me: (i) stop capturing (and/or transmitting remotely) the old target's content, and (ii) start capturing the new target's content. I'd pause the track until (ii) is completed. This depends on the app; it would only matter to some.
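Roughly, a sketch of the app logic just described (`cropTo` as in the Region Capture draft; the pausing policy is this hypothetical app's own choice, not something the API mandates):

```js
// Switch which sub-section is being captured, pausing the track in between
// so no frames of the old target leak out while the re-crop is in flight.
async function switchTarget(track, newTarget) {
  track.enabled = false;          // (i) stop capturing the old target's content
  await track.cropTo(newTarget);  // re-crop; resolves once the new crop applies
  track.enabled = true;           // (ii) resume with the new target's content
}
```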
It seems we are stuck on this complexity assessment. I hope we agree, though, that Chrome's current implementation is putting additional complexity on the web page's shoulders for the sake of this optimisation.
I gave examples where pause is not required. Your third example is a generalisation of the second example, where pause is not required, at least for the apps I can think of.
I do not see what would drive users to expect pausing here. If the app is so keen on that kind of optimisation, it can do it on its side faster than any UA-side optimisation:
I still do not understand which apps would actually want to do what you described, and for which reasons.
Holding multiple tracks but only using one at any given time may incur performance costs, or be subject to implementation-specific practical limitations. However, I don't think this is core to our discussion, so I suggest we drop this particular sub-topic. It should be pretty clear that applications calling …
Not at all cost; see the replaceTrack precedent, for instance.
@eladalon1983 I'm sorry, are you still arguing for

```js
function totallySyncFunction() {
  getCropTargetAndSendItToCapturer(element).then(() => {});
}
```

Back to performance. I've shown a sync minting API in JS is actually faster, because otherwise you're essentially serializing the two steps of generating the key and postMessaging it. This seems to undermine your hypothesis that your design is faster. Do you disagree? @yoavweiss's response suggests your design only eliminates the “token hasn’t yet arrived” case, eliminating the need for timeouts and lengthy failures in case an invalid token is used by VC. So your design sacrifices the optimal success path of … And in your response here you're finally claiming performance doesn't matter? Color me confused. Or are you criticizing my measurement fiddle for not demonstrating when the performance I measure would matter? If so, it's true I struggled to imagine an example where this would matter, but that would seem to be on you to produce. You've mentioned multiple targets... please show how the shape of …
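The serialization being described, side by side. (The synchronous constructor is the alternative shape proposed in #11, not the current draft; the async form assumes the origin-trial `navigator.mediaDevices.produceCropTarget`.)

```js
// Async minting: postMessage cannot start until the promise settles,
// so minting and messaging run one after the other.
const target = await navigator.mediaDevices.produceCropTarget(element);
parent.postMessage({ target }, "*");

// Sync minting (proposed): the token ships in the same task that minted it.
parent.postMessage({ target: new CropTarget(element) }, "*");
```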
You seem to be in the "no-downside" crowd, claiming there are no downsides in the face of me clearly explaining them (having to reevaluate state assumptions, fueling the proliferation of unnecessary async, non-idiomatic JS confusing web devs into thinking the API does more than it does, and finally, performance). Let me blow up my footnote that spelled out the relevant point it seems you flat-out missed:
Messages only make sense in their proper context.
Did you read my response?
I don't think this exaggeration promotes mutual understanding. Let's please keep to the arguments raised and not devolve into misrepresenting each other's arguments, even as temporary polemic tools.
I've asked you for the downsides. To your mention of these specific downsides, I answered:
Hello there. Aziz Javed from Google Meet here, working with Lindsay from Google Editors. We've already built a very significant feature based on this API's origin trial. I'm happy to inform y'all that we were not at all inconvenienced by the async nature of produceCropId. Exposure on MediaDevices vs. on CropTarget is likewise not at all a concern for us; either approach works. Generally speaking, all alternatives debated here are perfectly reasonable for us, so long as they're performant. We'd favor whichever API shape would allow this feature to be shipped in as many browsers as possible, as early as possible.
@azizj1 glad to hear this! I believe we've already resolved exposure on MediaDevices, so for me that API shape is what @youennf proposed in #11 (comment): `new CropTarget(element)`. Sorry this is taking so long to get to the bottom of.
@jan-ivar, have you read this part of Aziz's message?
Quickly discussed at today's editors' call; hopefully we can discuss and resolve this issue at the next WG interim (next Tuesday).
Repeating from elsewhere for better visibility: CC @aboba and @dontcallmedom
For those interested, the rescheduled meeting will take place 2022-06-23. See more details here.
After looking over the slides and listening to the arguments, my take is the following:
It says to use sync if the "API ... will not be blocked by ... inter-process communication" (emphasis on blocked). I hope I showed that any IPC in …
We talked about this on the call, and my recollection was this argument reduced to convenience of implementation, i.e. not a need.
Exactly. [in Chrome: mint a dud …]
You missed "locks" inside the ellipsis.
It can, though.
"disputed" != "settled". There's still an ongoing discussion.
That would create a significantly worse developer experience, as failures would be disconnected from the code/origin/party that causes them.
Are there "locks" involved other than IPC? What I meant is that the FPWD says it cannot fail. I agree about settling #48 first. Who "caused" the (global?) resource error seems important only if they can do anything about it. We have no API for relinquishing CropTargets, only minting more, so any recovery would violate § 5.3, "Don't expose garbage collection."
Locks were suggested as an alternative to renderer->browser IPCs, for off-main-thread renderer<=>renderer communication. I was suggesting that such a solution also requires an async API, to avoid the lock blocking the main thread.
I don't think it does. Any API can trigger in parallel work. Returning a promise is only valuable if the caller needs the result (success being a result). If we "avoid blocking" then it says to use sync (if the "API ... will not be blocked by ... a lock.") |
If …

Moving some or all resource allocation to …

Recently I was asked to investigate a bug in a sync API that is supposed to always return immediately. In the most common use cases, it did return immediately (< 0.1 ms), no matter how much load was placed on it. However, in a particular (reproducible) set of circumstances, it blocks for 30-50 ms. The cause appears to be a mixture of resource (de)allocation and blocking IPC. It's probably a bug, not a feature, but sometimes sync APIs might not be able to return immediately for unanticipated reasons. I've also seen sync APIs where the "in parallel" work turned out to be more complicated than expected. For example, in ORTC the …
Causality runs the other way: letting CropTargets fail allows for implementations vulnerable to exhaustion attacks; not doing so doesn't. A sensible implementation should be invulnerable to resource exhaustion attacks, simply by not tying resources to a token that anyone can so easily create.
What resource allocation is needed? A sensible … Chrome has implemented a neat but premature optimization, and refuses to implement the fallback needed to hide the resource exhaustion they've exposed themselves to. I don't find the idea that creating a …
The spec does not compel anyone to replicate our alleged neat mistakes. |
The issue is not really about other UA implementations but about web developer impact. It would be good to understand whether Chrome has a plan to fix this global resource exhaustion issue. I already said that, but working on #48 may allow us to make progress on this particular issue.
As previously explained, the global limit is an artifact of the current implementation, and can be changed in a number of ways. The minimal step forward is to make the limitation per-iframe rather than per-tab; that much is likely quite simple, and I doubt more would be needed.
I'll be happy to share more about my plans for prioritizing work in Chrome if you would be interested in likewise sharing Apple's timelines for implementing this feature. Given the high engagement, may I infer that both Mozilla and Apple are prioritizing this work highly and intend to accomplish it in the near future?
We started discussing this topic (whether CropTarget creation should use promise or not) as part of #11 and it seems best to use a dedicated thread for this issue.
@eladalon1983 mentioned implementation issues that make using promises desirable, in particular security and potential race conditions if a CropTarget is created synchronously, since a CropTarget has to keep a link to the process it was created in. The following case was provided:
@eladalon1983, please correct this description if it is not aligned with your thoughts.
Existing objects like MessagePort, WHATWG Streams, RTCDataChannel, and MediaStreamTrack can be created then postMessaged synchronously, and UAs implement this today, hopefully without security/race-condition issues.
AIUI, it seems consistent to use the same approach for CropTarget (synchronous creation), unless we find something specific to CropTarget that actually prevents this existing pattern.