Why text? #119

Closed
jcayzac opened this issue Jun 17, 2020 · 4 comments

Comments

jcayzac commented Jun 17, 2020

The ability to link to and focus on text fragments only seems rather limited. What if the content is non-textual?

The ability to focus and highlight any part of a document using a selector grammar, be it that of XPath or CSS, would have much more value IMHO. Both grammars also include matching text content.

bokand (Collaborator) commented Jun 22, 2020

Using a CSS selector was our initial attempt. This is detailed in the explainer. In summary:

  • Allowing arbitrary CSS selectors poses a much larger attack surface
  • Creating a subset of the selector syntax proved to be rather complex
  • We expect text is more stable/robust
  • Text is easier for less-technical users to reason about

I think there are interesting use cases for a more general design and I wouldn't be opposed to it, but there's a lot left to work out about how it should work.

bokand closed this as completed Jun 22, 2020

jcayzac (Author) commented Jun 25, 2020

Thanks. The timing attack concern makes sense. Maybe these things should be decoupled, with the current proposal targeting text only and a sister proposal targeting structure only but forbidding matching text or pseudo-elements.

bokand (Collaborator) commented Jun 25, 2020

It's not just timing. This becomes an issue if an attacker can determine that the text fragment succeeded (e.g. by finding some way of observing scroll; see #76). That should be non-trivial and we've taken steps to prevent it, but 1) bugs happen and 2) it's possible for pages to have a rare combination of attributes that makes it possible.

Being able to discover text is bad but usually limited in damage. More arbitrary selectors of DOM would be much worse as pages can store CSRF tokens and other security critical information. Given the risk I think the expressiveness of such a proposal would have to be quite limited.

I think there's a strong use case for images in particular, and I have received requests from multiple parties for that. I think it's a natural addition, and if we did images, having something more generic for "resources" (e.g. video, audio) would make sense. I'm not sure of use cases beyond this (it'd be useful to list any you have); perhaps a very limited set of selectors (e.g. _tag name_[src=_resource url_]) might work.
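
As an illustration of how narrow such a selector set could be, here is a minimal sketch in Python that accepts only the tag-plus-src form and rejects everything else. The function name, the allow-list of tags, and the requirement that the value be double-quoted are all assumptions made for this example; none of them come from the explainer or any spec.

```python
import re
from typing import Optional, Tuple

# Hypothetical allow-list of resource-bearing tags (an assumption, not from any spec).
ALLOWED_TAGS = {"img", "video", "audio"}

# Exactly: tag[src="value"], with a double-quoted value and nothing before or after.
_RESOURCE_SELECTOR = re.compile(r'([a-z]+)\[src="([^"\\]*)"\]')


def parse_resource_selector(selector: str) -> Optional[Tuple[str, str]]:
    """Return (tag, src) if the selector has the restricted form, else None."""
    match = _RESOURCE_SELECTOR.fullmatch(selector)
    if match is None:
        return None
    tag, src = match.group(1), match.group(2)
    if tag not in ALLOWED_TAGS:
        return None
    return tag, src
```

Under these assumptions, a selector such as `img[src="https://example.com/a.png"]` is accepted, while anything that names another attribute, another tag, or a selector list is rejected outright rather than partially interpreted.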

jcayzac (Author) commented Jun 25, 2020

> More arbitrary selectors of DOM would be much worse as pages can store CSRF tokens and other security critical information.

Exactly, so that "other" matching mechanism probably shouldn't allow values (text nodes and attribute values) in the selector grammar, only structural selectors, e.g. #~element~.container section:nth-child(2) > video:nth-child(1).
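
A rough sketch of what a "structural only" check might look like, in Python. It takes just the selector portion of the example above and accepts tag names, classes, `:nth-child(n)` with a literal index, and the usual combinators, while refusing attribute selectors and pseudo-elements. The function name and the exact accepted subset are assumptions for illustration, not a proposed grammar.

```python
import re

# One compound selector: optional tag name or *, then any number of
# .class or :nth-child(<integer>) parts. No [attr] selectors, no ::pseudo-elements.
_COMPOUND = re.compile(
    r'(?:[a-zA-Z][a-zA-Z0-9-]*|\*)?(?:\.[a-zA-Z_][\w-]*|:nth-child\(\d+\))*'
)
# Combinators: child (>), adjacent sibling (+), general sibling (~), or descendant (whitespace).
_COMBINATOR = re.compile(r'\s*[>+~]\s*|\s+')


def is_structural_only(selector: str) -> bool:
    """True if every compound in the selector uses only the structural subset."""
    parts = _COMBINATOR.split(selector.strip())
    # Empty parts (e.g. a dangling combinator) are rejected.
    return all(bool(part) and _COMPOUND.fullmatch(part) is not None for part in parts)
```

With this subset, `.container section:nth-child(2) > video:nth-child(1)` passes, whereas a selector that probes an attribute value, such as `input[name="csrf"][value^="a"]`, is rejected.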
