URLPattern usage #52
Comments
For compression dictionaries, the plan right now is to not allow regexp at all and to bring URLPattern's use closer to what we already had in place with the wildcard syntax (which there has already been pushback on from the IETF side). For what it's worth, servers largely don't need to parse the URLPattern or use it for matching. They provide the match pattern (likely an opaque, application-controlled string) and then the client does the matching. The server largely only cares whether the requested resource is available with the requested dictionary after the client has already decided which dictionary it should use. |
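To make the division of labour in the comment above concrete, here is a minimal sketch of client-side matching with a wildcard-only pattern. It is not the URLPattern matching algorithm: `wildcard_matches`, `candidate_dictionaries`, and the pattern-to-ID mapping are hypothetical names used only for illustration, and the sketch assumes `*` is the only special character.

```python
import re
from typing import Dict, List


def wildcard_matches(pattern: str, path: str) -> bool:
    """Return True if `path` matches `pattern`, where `*` matches any run of characters.

    Assumption: `*` is the only special character. Everything else is escaped
    and compared literally, so no regexp syntax from the pattern is honoured.
    """
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.fullmatch(".*".join(literal_parts), path) is not None


def candidate_dictionaries(path: str, stored: Dict[str, str]) -> List[str]:
    """Client side: list the IDs of stored dictionaries whose pattern matches `path`.

    `stored` maps a match pattern to a hypothetical dictionary ID. The server
    never runs this; it only sees which dictionary the client decided to use.
    """
    return [
        dict_id
        for pattern, dict_id in stored.items()
        if wildcard_matches(pattern, path)
    ]
```

How a client ranks several matching dictionaries is left out here, since that is decided by the draft rather than by this sketch.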
I think that really depends on the client implementation, no? One might want to implement the matching in the networking layer and as such similar concerns apply. And that also seems like a client-server-centric view as server-server will also want to use this technology. |
Yes, sorry - in that case one of the servers is the client to another server. "Client" here is in the HTTP sense, the "thing that sends a request", not necessarily an end-user device. Middle-boxes like CDNs or other reverse proxies still largely behave like "servers" for compression dictionaries: they don't need to store dictionaries and match them to a given path, and would still rely on the end client to do the matching. |
It's probably also worth noting that the |
I think we might still be talking past each other, but a variant of URLPattern that does not have regular expression support makes sense to me. |
It's always been the goal of URLPattern to support some call sites rejecting URLPatterns that contain regexp groups. The original plan was just to have the calling specs do something like:
where [parsing] and [has regexp groups] are @yoshisatoyanagisawa expressed concern about giving this "has regexp groups" predicate only to spec-writers, and not to web developers. That is, he thought it would be too difficult for web developers to debug why their URL patterns were getting rejected by (for example) service worker static routing, since His proposal is whatwg/urlpattern#191. Then, web developers can figure out easily if an input string |
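For readers outside the JS API, the "has regexp groups" predicate discussed above can be approximated with a small scanner. This is a simplified sketch that assumes an unescaped `(` always opens a regexp group and that a backslash escapes the following character; the tokenizer in the URLPattern spec is authoritative, and `has_regexp_group` is a hypothetical name.

```python
def has_regexp_group(pattern: str) -> bool:
    """Return True if `pattern` appears to contain a regexp group.

    Simplifying assumptions (not the spec algorithm):
      * a backslash escapes the next character, whatever it is;
      * any other `(` starts a regexp group.
    """
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            # Skip the backslash and the character it escapes.
            i += 2
            continue
        if ch == "(":
            return True
        i += 1
    return False
```

A call site that disallows regexp could run a check like this first and report a clear error, instead of silently rejecting the pattern.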
Hi @pmeenan, @jeremyroman has seen hints somewhere that there's a desire to split the URL pattern usage into some sort of (That said, I can't find any matches for those keywords in current public documentation, so maybe we're off base?) |
It's part of the in-flight changes to the draft with the move to URLPattern from the wildcard string. Compression dictionary transport's use of a match pattern In the vast majority of cases the search part will likely use the default wildcard so it's only the path portion that most users will be using. Is there a problem with specifying a more-rigid URLPatternInit instead of adding the complexity of the USVString case? Otherwise we'd also need a bunch of language about the subset of pattern strings that are allowed. |
Yes, this fragments the API used across the web platform. In particular, you're not using URLPatternInit, or the actually-recommended URLPatternCompatible; you're using a subset of URLPatternInit's fields.
This isn't true. You can just ensure that the matches always return false. Concretely, if someone supplies |
Do you have a link to docs for URLPatternCompatible? I can't find it in the spec or explainer for URLPattern and Google search isn't finding it.
We need to do the same-origin check at the time the dictionary is validated and stored to make sure that the pattern refers to the same origin as i.e. If the dictionary resource is It's certainly doable, but practically that means constructing the URL pattern, constructing a URL from the URLPattern components, and then comparing the origin of the reconstructed URL to the origin of the dictionary request to make sure they are the same. That's compared to guaranteeing the origin matches by not allowing any of the origin components to be specified and writing spec language that says that even though the string URLPattern constructor is being used, the string I don't mind making the change, but it would help if the requirement to always use strings, and not allow dictionaries or other component-based construction for URLPattern, was solidified (as best I can tell, it is being discussed but hasn't been merged). |
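The storage-time check described in this comment can be sketched roughly as follows. This is an illustration under stated assumptions, not the draft's algorithm: it resolves the match pattern against the dictionary's URL with ordinary URL resolution (treating the pattern text as if it were a URL reference) and compares origins. `origin_of`, `pattern_is_same_origin`, and `DEFAULT_PORTS` are hypothetical names.

```python
from urllib.parse import urljoin, urlsplit

# Assumption: only http and https matter for this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str):
    """Return (scheme, host, port) for `url`, filling in the scheme's default port."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def pattern_is_same_origin(match_pattern: str, dictionary_url: str) -> bool:
    """Return True if `match_pattern`, resolved against `dictionary_url`, stays same-origin.

    A relative pattern such as "/app/*" inherits the dictionary's origin; an
    absolute or scheme-relative pattern must name exactly that origin.
    """
    try:
        resolved = urljoin(dictionary_url, match_pattern)
        return origin_of(resolved) == origin_of(dictionary_url)
    except ValueError:
        # For example, a non-numeric port: treat the pattern as not same-origin.
        return False
```

Running a check like this when the dictionary is stored means an invalid dictionary is rejected once, rather than filtered out on every match.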
It's in an in-flight PR at the moment; preview available here: https://whatpr.org/urlpattern/199.html#typedefdef-urlpatterncompatible
Naively I wouldn't have expected additional logic to be required, except possibly to assist developers trying impossible things. I'm imagining that:
In that case, even if The upside of using the general syntax is greater consistency with other uses of URL patterns (and more likely to set a precedent for other uses in HTTP header fields that may have slightly different constraints). Though at least here I think the transformation is not too bad ( The downside of the general syntax is that it might create a slight expectation of being able to match things that aren't actually possible in this context, like cross-origin URLs and URLs with a fragment. I think this would be stronger in cases where this would be processed by a server that might not know the protocol and/or port the client is seeing (due to reverse proxy) and so constructing patterns with wildcards in those fields might be effectively necessary. Aside: that draft is currently using the request URL (unclear to me if it's the initial or final one); hopefully that corresponds to the URL after redirects (and possibly after service worker magic, if this processing might occur after that -- not sure where this is layered). |
Thanks. I'll have to wait for the PR to get resolved before it is something I could reference, but as it looks right now, URLPatternCompatible is specific to the JavaScript API. Probably more interesting is the JSON interface below it, which can take a sparse map of URLPatternInit components and is not terribly different (just how the map is structured).
Not quite. There is no mandate on where or how the matches should be stored but, at least with Chrome, they are partitioned by network isolation key (document origin, not request origin). It's certainly possible to also store the original request origin for each dictionary and that can be an implementation detail but it's a lot less efficient than not allowing the invalid dictionary to be stored in the first place. The same-origin check still needs to be specified at either match time or storage time even if it is just to mention that the dictionary is only valid if the URLPattern is same-origin with the request.
As far as HTTP is concerned, redirects are their own requests that happen to have a 301/302 response code so this is guaranteed to be the final request that actually serves the dictionary. It's a browser thing that redirects are transparently handled under the covers. |
https://whatpr.org/urlpattern/199.html#urlpattern-build-a-urlpattern-from-a-webidl-value looks fine if you pass a string, but it does seem like a dedicated string entry point (that calls this under the covers and maybe also creates a pretend realm or ensures that isn't needed somehow) would help clarify things for non-IDL consumers. I guess that's what the XXX below is saying as well.
How is that guaranteed? What if somebody provided a dictionary body with their 3xx response? |
It doesn't really need to be guaranteed since the 3xx response is for a request on the origin where the 3xx response came from, not for the target location. I can't imagine most clients would store a dictionary from a 3xx response but there's also no issue if they do. |
Sorry, I should clarify. As far as the IETF draft is concerned it shouldn't matter. When the fetch part of the spec is drafted then it should be made clear that the only a |
Closing this out as the latest IETF drafts have integrated with the URL pattern spec with regex disallowed and using the intended integration points (not relying on the JS API) |
Colleagues and I were curious if Chromium had any plans it could share around whatwg/urlpattern#191, as running JS regular expressions in networking doesn't seem like it will fly.
Is the idea to have some kind of safe subset?
cc @domenic @pmeenan @cdumez