Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: url_path(hostname: String) query #306

Open
dopry opened this issue Jan 20, 2023 · 7 comments
Open

feat: url_path(hostname: String) query #306

dopry opened this issue Jan 20, 2023 · 7 comments

Comments

@dopry
Copy link
Collaborator

dopry commented Jan 20, 2023

Headless builders typically represent a single site and they know which site that is. Currently headless sites need to parse url_path to stripe out the 'site path' from a given set of pages. To simplify URL construction for headless sites it would be nice if they could fetch url_paths relative to a given site.

  • Given hostname is not provided,

    • Then the path relative to default site is returned.
  • Given hostname is provided,

    • When a site exists for the given hostname
      • When the page is the Root Page
        • Then '/' is returned (this is to distinguish from the empty string returned when site is not matched)
      • When the page is not in the given site
        • Then an empty string is returned
      • Then the path relative to the site matching the hostname is returned
    • When a site does not exist for the given hostname
      • Then an empty string is returned
@zerolab
Copy link
Member

zerolab commented Jan 20, 2023

Drive-by note:
what about something like

{
    site(hostname: "my.domain") {
        pages(urlPath: $path) {
            title
        }
    }
}

https://wagtail-grapple.readthedocs.io/en/latest/general-usage/graphql-types.html#siteobjecttype

This is off the top of my head, while in-between tasks

@dopry
Copy link
Collaborator Author

dopry commented Jan 20, 2023

@zerolab so based on you suggestion the PageInterface.url_path resolver would vary the resolved urlPath based on whether it was a child of Site or not?

@dopry
Copy link
Collaborator Author

dopry commented Jan 30, 2023

@zerolab I figured out why I've been getting None for urls... when you're running headless and don't mount the wagtail urls page.url will always return None, see: wagtail/wagtail#9984
I found a workaround to mount the urls in an unreachable location.

urlpatterns = [
    path("admin/", admin.site.urls),
    # the order of grapple_urls and wagtailadmin_urls is important here. When initially implementing
    # if wagtailadmin_urls were first, then the /graphql and /graphiql endpoints were unreachable.
    path("", include(grapple_urls)),
    path("", include(wagtailadmin_urls)),
    path("documents/", include(wagtaildocs_urls)),
    ## the following lambda is meand to be a catch-all to make the wagtail_urls inaccessible
    path("", lambda: HttpResponse("404 Not Found", status=404)),
    ## load wagtail_urls so that wagtail_serve is mounted so RichTextBlock url generation works
    ## properly. This is a work around for https://github.com/wagtail/wagtail/pull/9984
    path("", include(wagtail_urls)),
]

Once I got this setup I did a basic multisite experiment

{
  site(hostname: "default.com") {
    hostname
    page(id: 66) {
      title
      url
      urlPath
    }
  }
   page(id: 66) {
      title
      url
      urlPath
    }
}

and got the response

{
  "data": {
    "site": {
      "hostname": "default.com",
      "page": {
        "title": "Page Test",
        "url": "http://test.com/page-test/",
        "urlPath": "/default/test/page-test/"
      }
    },
    "page": {
      "title": "Page Test",
      "url": "http://test.com/page-test/",
      "urlPath": "/default/test/page-test/"
    }
  }
}

It looks like we need to update site(hostname: "") to somehow trick Site.find_for_request, and it looks like we would do that by overwriting request._wagtail_site. I'm not sure if that would work in a scenario like

{ 
  site(hostname: "default.com") { pages { url } } 
  site(hostname: "test.com") { pages { url } } 
}

There may need to be some architecture work done at the core level to divorce site resolution from the request, or we're going to need to figure out how to shim in replacements for the core components that depend on that in the grapple resolvers....

@zerolab
Copy link
Member

zerolab commented Jan 30, 2023

Thank you for furthering this.

Will do a deeper dive during the weekend. The best place to fix this would be core in that to be truly headless we should not necessarily rely on wagtail_urls.

One thing (off the top of my head) is to try and make use of wagtail.models.sites.get_site_for_hostname in our site queries with some monkeypatching.
Or... perhaps wagtail-headless-preview as preview is one thing you absolutely want to have working when in a headless context 🤔

@dopry
Copy link
Collaborator Author

dopry commented Feb 2, 2023

I think we could also just override the url resolver in the short term to accept a hostname... but it would be nice if core query sets allowed us to specify an alternate hostname for url generation similar to what you proposed here... As it is, multisite domain resolution doesn't behave according to my expectations. In my example above the default site is default.com. I would expect that if I queried a page using the default site home page as the root of my query it would return as default site. What is actually happening is domain for the first parent to the page is being used... It's probably because I don't know the use cases that nested multisite support was built around...

@seb-b
Copy link
Contributor

seb-b commented May 8, 2023

+1 for the idea to have some way to get the relative url without the base path, would really reduce a bit of our back and forth when generating static paths and fetching pages.

Could there be a new field on the page interface specifically for this? It could be site agnostic, url_path seems to work ok even when the page urls aren't registered (we don't do this either and have ran into some weird urls problems)

class PageInterface(...):
    relative_url = graphene.String()
    
    def resolve_relative_url(self, info, **kwargs):
       parts = [part for part in self.url_path.split("/") if part]
       if len(parts) == 1:
          return "/"
       return f{"/".join(parts[1:])}/"    

@dopry
Copy link
Collaborator Author

dopry commented May 10, 2023

Unfortunately, we can't can't strip off the first folder in a path. With multisite setups any a site/hostname can use any page as a root. We need to know the 'site' in order to figure out the relative path, hence my proposal to pass hostname as an argument.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants