Why text? #119

jcayzac · 2020-06-17T07:52:34Z

The ability to link to and focus on text fragments only seems rather limited. What if the content is non-textual?

The ability to focus and highlight any part of a document using a selector grammar, be it that of XPath or CSS, would have much more value IMHO. Both grammars also support matching on text content.

bokand · 2020-06-22T19:29:55Z

Using a CSS selector was our initial attempt. This is detailed in the explainer. In summary:

Allowing arbitrary CSS selectors poses a much larger attack surface
Creating a subset of the selector syntax proved to be rather complex
We expect text is more stable/robust
Text is easier for less-technical users to reason about

I think there are interesting use cases for a more general design and I wouldn't be opposed to it, but there's a lot left to work out about how it should work.

jcayzac · 2020-06-25T02:12:15Z

Thanks. The timing attack concern makes sense. Maybe these things should be decoupled, with the current proposal targeting text only and a sister proposal targeting structure only but forbidding matching text or pseudo-elements.

bokand · 2020-06-25T03:38:44Z

It's not just timing. This becomes an issue if an attacker can determine that the text fragment succeeded (e.g. by finding some way of observing scroll; see #76). That should be non-trivial and we've taken steps to prevent it, but 1) bugs happen and 2) it's possible for pages to have a rare combination of attributes that makes it possible.

Being able to discover text is bad, but the damage is usually limited. More arbitrary selectors of DOM would be much worse, as pages can store CSRF tokens and other security-critical information. Given the risk, I think the expressiveness of such a proposal would have to be quite limited.

I think there's a strong use case for images in particular, and I have received requests from multiple parties for that. I think it's a natural addition, and if we did images I think having something more generic for "resources" would make sense (e.g. videos, audio, etc.). I'm not sure of use cases beyond this (it'd be useful to list them if you have any); perhaps a very limited set of selectors (e.g. _tag name_[src=_resource url_]) might work.
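To make the "very limited set of selectors" idea concrete, here is a rough Python sketch of a parser that accepts only the tag[src=url] shape mentioned above. The name `parse_resource_selector`, the quoted-URL syntax, and the allow-list of tags are all assumptions made for illustration; nothing here comes from the proposal or any browser implementation.

```python
import re
from typing import NamedTuple, Optional

# Assumption: only these media elements would be addressable.
ALLOWED_TAGS = {"img", "video", "audio"}


class ResourceSelector(NamedTuple):
    tag: str
    src: str


# Exactly one lowercase tag name followed by exactly one [src="..."] clause.
_PATTERN = re.compile(r'([a-z]+)\[src="([^"\]]+)"\]')


def parse_resource_selector(text: str) -> Optional[ResourceSelector]:
    """Return the parsed selector, or None if it is outside the limited grammar."""
    match = _PATTERN.fullmatch(text)
    if match is None or match.group(1) not in ALLOWED_TAGS:
        return None
    return ResourceSelector(tag=match.group(1), src=match.group(2))
```

Because the grammar admits a single attribute (src) on an allow-listed tag, a selector such as input[value="..."] is rejected outright, which is the kind of restriction the CSRF-token concern seems to call for.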

jcayzac · 2020-06-25T08:02:19Z

> More arbitrary selectors of DOM would be much worse as pages can store CSRF tokens and other security critical information.

Exactly, so that "other" matching mechanism probably shouldn't allow values (text nodes and attribute values) in the selector grammar, only structural selectors, e.g. #~element~.container section:nth-child(2) > video:nth-child(1).
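As a rough illustration of what "only structural selectors" could mean, the Python sketch below accepts tag names, #id, .class, :nth-child(N), and the descendant and child combinators, and rejects everything else, including attribute selectors and pseudo-elements. The function name `is_structural_selector` and the exact subset are assumptions for the sake of the example, not a proposed grammar.

```python
import re

_IDENT = r"[A-Za-z_][\w-]*"

# One compound selector: optional tag name (or *), any number of #id / .class
# parts, then an optional :nth-child(N) with a literal integer.
# Attribute selectors ([...]) and pseudo-elements (::...) never match.
_COMPOUND = re.compile(
    rf"(?:{_IDENT}|\*)?(?:[#.]{_IDENT})*(?::nth-child\(\d+\))?"
)


def is_structural_selector(selector: str) -> bool:
    """True if the selector uses only the structural subset sketched above."""
    # Split on the child combinator (>) or the descendant combinator (whitespace).
    parts = re.split(r"\s*>\s*|\s+", selector.strip())
    # An empty part means an empty selector or a dangling/doubled combinator.
    if any(part == "" for part in parts):
        return False
    return all(_COMPOUND.fullmatch(part) is not None for part in parts)
```

With this subset, .container section:nth-child(2) > video:nth-child(1) is accepted, while input[name=csrf] and p::first-line are rejected, so the selector itself can never carry a guess at an attribute value or text content.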

bokand closed this as completed Jun 22, 2020
