Tab capture control #13
@eladalon1983 @jan-ivar, this above issue should complete the action item I took during the WG's meeting. |
Thank you for this in-depth feedback!
Just a quick mention that this interpretation might be correct for scroll-forwarding, but is incorrect for zoom-controls. More on this below.
I am supportive of empowering user agents to employ heuristics if they wish; if you want explicit text in the spec, I can add it. But we should retain an explicit way for applications to request the behaviors introduced in this specification. I cannot imagine a perfect heuristic; they are always liable to occasionally (a) fire when undesired and (b) fail to fire when desired.
If you want to reshape the API as follows, for future-proofing...
dictionary ForwardedGestures {
  boolean wheel = false;
  // Future-proof for pinch etc.
};
partial interface CaptureController {
  Promise<undefined> forwardGestures(
      HTMLElement element,
      optional ForwardedGestures gestures = {});
};
...then I am supportive. Wdys? (Spoiler alert - this shape also solves other issues you have pointed out, and which I address further below.)
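To make the dictionary-default semantics of the IDL above concrete, here is a minimal sketch of the bookkeeping that shape implies. `MockCaptureController` and `isForwarding()` are hypothetical stand-ins invented for illustration; they are not part of the spec or of any browser. The sketch assumes that a call with `wheel: false`, or with no dictionary at all, stops forwarding from that element.

```javascript
// Hypothetical stand-in for the proposed CaptureController.forwardGestures().
// Not a real browser API; it only models per-element bookkeeping.
class MockCaptureController {
  constructor() {
    // Maps each element to the set of gesture names forwarded from it.
    this.forwarded = new Map();
  }

  // Mirrors `optional ForwardedGestures gestures = {}` with `wheel = false`.
  async forwardGestures(element, { wheel = false } = {}) {
    if (wheel) {
      this.forwarded.set(element, new Set(["wheel"]));
    } else {
      // Assumption: requesting no gestures stops forwarding for this element.
      this.forwarded.delete(element);
    }
  }

  // Illustration-only helper for inspecting the mock's state.
  isForwarding(element, gesture) {
    return this.forwarded.get(element)?.has(gesture) ?? false;
  }
}
```

Under these assumptions, `forwardGestures(e1)` with no second argument behaves like `{wheel: false}`, which follows from the `wheel = false` default in the dictionary.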
Currently, not requested by Web developers, so not a priority for me. But if you want to add that, I won't oppose. Please see the comment I have left in the WebIDL above.
I am not familiar with "keyboard zoom commands". I am, however, familiar with previous/next and page-down/page-up. These could be discussed as possible later extension; at the moment, I am hesitant about the security properties here. Let's start small.
These make me very nervous. Let's not have this in the MVP.
I am absolutely opposed to forwarding clicks. Such actions require a completely different model - one which I have presented in the past ("Video Portal"). But it is not in scope for us right now, and it's not mutually-exclusive with the current model. (That is, even if the Video Portal model were alive right now, it'd only serve some applications, and I'd argue for the introduction of Captured Surface Control APIs for other types of applications.)
Any model that requires opt-in from the captured app, does not solve the problem, because the majority of web pages will not opt-in, and therefore the user will not be properly served.
If you want the spec to also explicitly say that user agents MAY offer gesture-forwarding even when the application does not opt-in, I am happy to add it. Is this the case?
Chrome Security wanted a permission policy, and I agree with their reasoning. However, user agents that wish to impute permission without a prompt, may do so while remaining trivially compliant with the spec. If you'd like the Captured Surface Control spec to state as much explicitly, then I'd gladly add that. Should I?
Sorry, no, there is no such intent at the moment.
Web developers have asked for the ability to control zoom, and I think anyone who ever shared a tab during a video call, can immediately understand what this solves. I am not aware of any gesture that supports this behavior. As I have shown here, pinch-controls are NOT an alternative. Also, even if pinching were identical to zoom - which it is not - we'd still need to serve users without touchscreens, and applications that want to show zoom in/out buttons. Further yet, the read-access provided by Zoom-control is totally in-scope for me, and irreplaceable by pinch-controls.
Fine by me, although I am not aware of any Web developers requesting this. The API shape I have proposed earlier in this comment allows this neatly, because developers can do the following:
// Start for e1 and e2.
controller.forwardGestures(e1, {wheel: true});
controller.forwardGestures(e2, {wheel: true});
// Stop e1 only.
controller.forwardGestures(e1, {wheel: false});
(You will note that I have made
Thank you for this concrete suggestion. Humbly, I disagree with it. :-) I see a few issues, but I think it's enough to name just one - it does not allow for a permission policy, for those user agents that wish to include one. Thanks again for this in-depth feedback. In summary, I believe that:
What do you say? |
That is an error in the IDL of the spec, it is indeed intended to be nullable and passing null stops the forwarding. The text of the spec assumes the parameter is nullable. However, the spec allows |
I understand the urge to quickly go to API but I think it is a bit premature.
I would suggest splitting these in sub issues that the WG will discuss in this repo. |
Your previous comment raised specific concerns, and I engaged with them, providing what I believe to be real solutions to real concerns. It would be nice if you could indicate which of these issues were satisfactorily resolved for you.
The length of the previous comment indicated to me that it was an exhaustive list. I now hear that it wasn't. Is the current list exhaustive, or do you anticipate additional issues?
No Web developer has yet requested this.
If the promise returned by
If you have any issues with Chrome, your bug reports would be most appreciated.
To me, it's obviously a
I believe it is permissible for user agents to differ in their implementation of permissions policies, including prompts. On Chrome, sometimes we get one-time permissions, and that too is JS observable. If there is a problem here, please explain it to me.
It is requested by Web developers. Why should it not be part of the MVP?
Such controls are in the captured app. The entire point of this API is to provide controls in the capturing app, without forcing the user to switch to the captured app.
Easily extensible at a later time. No need to delay the MVP on such extensions, which were NOT requested by Web developers. (But please feel free to send a PR if you are interested.)
Any reason for this to be in the MVP?
Doesn't sound like a blocking issue to me. I'm fine either way. I would really appreciate it if you could:
|
Lots to respond to. But first, my impression of https://captured-surface-control.glitch.me/
|
Thanks!
The demo is merely for illustration. For a real use example directly from a professional app, see this instead:
Thanks for trying, but I don't think the Meet or Chrome teams are eager to adopt this UX. Controls belong directly in the Web application. Our user studies show quite poor discoverability for anything outside of the viewport. |
Co-chair Jan-Ivar, I worry that it would not help facilitate reaching consensus, if we were to bring yet more topics into a thread that is so chock-full of them. Youenn is limited to this format because he's not yet sure he can comment on the original repo. Since you have no such limitation, please consider filing separate issues on the original repo. Thank you. Lest I seem to be avoiding your questions, my responses follow. But if at all possible, please spin off further discussion into distinct issues.
Discoverability is always a challenge. The general way to address it is to place the controls in the most immediately relevant place. In the case of dynamic-switching, that means moving the controls closer to the would-be newly-captured surface. In contrast, in the case of mic/camera controls, this means in the viewport, thanks to PEPC.
Product managers and UX experts from both Web developers as well as Chrome browser have expressed the desire to see these specific controls in the capturing Web application. I have no reason to doubt their expertise in this matter, impressed though I am with your own UX-design insights. In our estimation, the UX you suggest is NOT preferable. (Ample context provided earlier in this comment, in our previous WG meetings, and during editors meetings.) I am looking forward to seeing Firefox's experimentation with these browser-level controls. You will note that the Captured Surface Control spec does not hinder your ability to bake such functionality into your browser. Please do share any results you can once you have them.
Uniformity across applications is definitely a plus. But so are:
|
With my co-chair hat, I recommend we separate issues in this repo then where we all can engage. I've moved my permission issue to #14 to start. My preference would be for the reporter or proposer to do this, but happy to help. I'm also open to engage here. With my member hat, my feedback was that if browsers can reasonably provide zoom through their own UX for now, this seems useful for triaging: it lets us increment, which perhaps means we can focus on forwarding gestures for now. I plan to comment on @youennf's suggestion next, either here or in whatever issue we prefer. |
Agreed. And I further recommend that those who bring up new issues like this one, do so in a new issue.
Chrome does not plan to provide zoom through its UX. Does Firefox? Does Safari? |
I favor this API. Reusing the playback coupling through
I do not favor this API. It allows mistakes like forwarding to a capture other than the one playing. Also, dictionary default rules mean I'm also not convinced apps need to cherrypick behaviors in the first version. I'm not convinced about divs.
Firefox might put zoom controls in the video element (next to its PiP button). |
As we have covered on multiple threads ([1], [2]):
Additionally:
Points 2 and 3 are arguably captured by the previous discussion here, where I am still waiting for your response.
The change from
I'd argue that:
It was only out of a desire to be flexible and accommodating of Youenn, that I offered to change the API this way. But since this is blocking consensus rather than helping it - because Mozilla objects to this new shape - let's go back to
I don't understand this part. Please clarify...?
We have Web developer feedback that this is necessary, and their rationale is clear and reasonable; namely, they want to overlay the video with an element on which they draw emoji reactions and announcements, and these must not block scrolling. If you have reasons to doubt these Web developers, please share your thoughts. But let's concentrate on our role of serving Web developers and users.
First, your alternative does not solve the problem - Web developers need the ability to overlay customized zoom-controls at their chosen position, alongside additional app-level controls, or overlaid on top of the video preview tile. Developers need to be able to do either of these, and not all be forced into the same design. Second, you do not provide a real commitment by Firefox to your proposed alternative. You say Firefox "might" do it; this implies it also might NOT do it. |
The UA can prompt instead of scroll (or prompt instead of zoom +/- button). No promise needed.
This sounds more like a general input problem, not something that needs to affect API. Doesn't CSS already have:
.overlay {
  pointer-events: none; /* Allows pointer events to pass through */
}
The UA controls the horizontal and the vertical, so I'm sure we can come up with a rule where it must forward scrolling here. Worst case: a
The attack vector was explained in #14 (comment). Tying scrolling to playback seems a reasonable direction to help UAs mitigate this.
I would leave it up to the UA to make a determination.
I don't believe it does that. If we agree on "multiple target-elements", I assume we agree interaction is carried upstream so it affects all of them.
Both With my co-chair hat, I ask that we not "assign intent or interpretations to other contributors' comments". Mozilla has not formally objected to anything yet. |
This thread is getting long. What do people think of the list of items in #13 (comment)? I see one item seems already covered by #14. I'll comment there. Maybe this is the first question we can sort out? |
That sounds reasonable to me. |
The established pattern is to use permission policies and return a promise through which the application can detect whether permission was granted or not. Your suggestion does not align with any precedents I am familiar with, nor does it sound desirable on its own merits. (For instance, how does the application discover the result of the prompt? Is it a blocking prompt?) Further, this comment shows how Youenn made the perfect argument for permission policies. Namely, he raised the need for user agents to allow revocation. I completely agree, and here we benefit from the fact that permission policies have already been specified and implemented, saving us the need to roll our own - and the mistakes we would surely have made.
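As a rough illustration of the promise-based pattern described above, the sketch below shows how an application could learn the outcome of a permission-gated call from the returned promise. `makeGatedCall()` and `tryEnable()` are hypothetical helpers invented for this example, and the use of a `NotAllowedError` name on denial is an assumption borrowed from other permission-gated Web APIs, not something this thread specifies.

```javascript
// Hypothetical model of a permission-gated API call. Not a real browser API.
function makeGatedCall(permissionState) {
  return async function gatedCall() {
    if (permissionState !== "granted") {
      // Assumption: denial surfaces as a rejection named "NotAllowedError".
      const err = new Error("permission denied");
      err.name = "NotAllowedError";
      throw err;
    }
    // Granted: resolves with undefined, like Promise<undefined> in the IDL.
  };
}

// The application discovers the outcome by awaiting the promise.
async function tryEnable(gatedCall) {
  try {
    await gatedCall();
    return "enabled";
  } catch (err) {
    return err.name === "NotAllowedError" ? "denied" : "failed";
  }
}
```

With this shape the application can update its own UI (for example, hide its scroll or zoom controls) as soon as the promise settles, without needing a separate signal for the prompt result.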
We have clear feedback from developers about what they need to serve their users.
It has been sufficiently demonstrated that limiting to specific elements would not mitigate that attack vector. It would only annoy developers, but it won't stop abuse. (Nor do we have reason to expect abuse, btw. So much so that you argue we shouldn't even have a permission prompt...)
I can easily think of how these heuristics could backfire.
I don't understand this comment.
If you have a name that matches with the
You wrote: "I do not favor this API."
I would appreciate:
|
There's no singular pattern. Some examples of instant permission failure without prompt:
But when and where to prompt is often up to UAs. If we wanted the first scroll attempt to fail with a prompt, I don't see why not. That said, I'd prefer #14 to conclude with no permission needed. Just trying to untangle discussion.
// Query the permission, then log each subsequent state change.
const permission = await navigator.permissions.query({name: "captured-surface-control"});
permission.onchange = () => console.log(`changed to ${permission.state}`);
No. 1. By "synchronous" I mean the check is done in the synchronous part of the algorithm before going "in parallel". The "rejected" state is synchronously observable in web console upon function return. |
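The distinction drawn above between the synchronous part of an algorithm and its "in parallel" steps can be sketched as follows. `gatedOperation()` is a hypothetical function invented for illustration, and the timer is only a stand-in for work that a user agent would run in parallel; none of this is taken from the spec.

```javascript
// Hypothetical model of an algorithm with a synchronous section followed by
// "in parallel" steps. Names and error type are illustrative only.
function gatedOperation(allowed, log) {
  // Synchronous section: runs before the function returns to the caller.
  log.push("sync check");
  if (!allowed) {
    const err = new Error("blocked");
    err.name = "NotAllowedError";
    // The promise is already rejected when the caller receives it.
    return Promise.reject(err);
  }
  // "In parallel" section, modeled here with a timer.
  return new Promise((resolve) => {
    setTimeout(() => {
      log.push("parallel work");
      resolve();
    }, 0);
  });
}
```

In this model the check has always run by the time the call returns, whereas the parallel work is only recorded later, once the timer fires.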
This issue is related to the https://screen-share.github.io/captured-surface-control/ proposal, based on my reading of the spec and experimenting with https://captured-surface-control.glitch.me/ to control Google Slides and Google Maps.
First, the use case is fine. The current state of the prototype and spec does not seem sufficient for a consistent user experience though. This casts some doubt on the approach (or at least we should understand what needs to happen next to make the user experience good).
The spec/implementation is focusing on specific inputs (zoom and scrolling). Focusing on a more general principle may be beneficial. Some thoughts that I hope can help:
A few additional thoughts:
captureWheel API is in scope, resize events as well AIUI (I think that is what is used for zoom-in/zoom-out).
setZoomLevel on the side for now until we understand what it solves that forwarding user gesture approach cannot.
captureWheel name is not great if we plan to extend the user gestures that can be forwarded.
captureWheel(HTMLMediaElement e) is probably not right. captureWheel(HTMLMediaElement? e) might be better to allow unsetting (seems like a MUST have). And probably captureWheel(HTMLVideoElement? v) might be even better from a type perspective.
Based on this, I would look at an API along those lines:
And maybe, a secondary optional API to allow web page to know what is going on:
This kind of API shape adds some flexibility in how much of the user's gestures the UA forwards (say the user enables forwarding and then disables it in the middle of the call) and unties the API from permissions.