[Server, Streamer] HTTP header rel=prefetch Links, prioritized media-types? #97

Open
danielweck opened this issue May 10, 2019 · 8 comments

Comments

@danielweck
Member

Given the necessity to introduce limits in the number of HTTP header rel=prefetch Links (due to server/client limitations in the total number of bytes supported in HTTP headers), should CSS be prioritized over JS over fonts? (which order?)

Right now, the r2-streamer-js implementation generates prefetch Link for these HTTP Content-Types / media types:

["text/css",
"text/javascript", "application/javascript",
"application/vnd.ms-opentype", "font/otf", "application/font-sfnt",
"font/ttf", "application/font-sfnt",
"font/woff", "application/font-woff", "font/woff2"]

...however, there is no prioritization heuristic: prefetch links are generated in JSON document order, as found when walking the resources array property of the ReadiumWebPubManifest. Such a prioritization algorithm is trivial to implement, so this is not a technical problem, just an important design consideration now that there is an artificial limit in place.

Relevant code diffs:
readium/r2-streamer-js@v1.0.10...v1.0.11
readium/r2-streamer-js@v1.0.11...v1.0.12

Related issue: #96

@danielweck
Member Author

danielweck commented May 10, 2019

Example implementation:
https://github.com/readium/r2-streamer-js/pull/45/files

Priority list:

  • "text/css"
  • "text/javascript"
  • "application/javascript"
  • "application/vnd.ms-opentype"
  • "font/otf"
  • "application/font-sfnt"
  • "font/ttf"
  • "application/font-sfnt"
  • "font/woff"
  • "application/font-woff"
  • "font/woff2"

As soon as the number of HTTP prefetch links reaches the configured maximum (default is 10), the remainder of the prioritized list of prefetch-able resources is ignored. For example, if there are many CSS and JS files, fonts may not be prefetched at all.
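The behaviour described above can be sketched roughly as follows. This is not the code from the linked pull request: the `Resource` shape, the `selectPrefetchResources` name and the `maxLinks` parameter are assumptions made for illustration, and the duplicate "application/font-sfnt" entry is listed only once.

```typescript
// Hypothetical minimal shape of a manifest resource (assumption for this sketch).
interface Resource {
  href: string;
  type: string; // media type, e.g. "text/css"
}

// Priority order taken from the list in this comment (duplicate entry omitted).
const PREFETCH_PRIORITY: readonly string[] = [
  "text/css",
  "text/javascript",
  "application/javascript",
  "application/vnd.ms-opentype",
  "font/otf",
  "application/font-sfnt",
  "font/ttf",
  "font/woff",
  "application/font-woff",
  "font/woff2",
];

// Keeps only prefetch-able resources, orders them by media-type priority
// (document order is preserved within the same media type), then cuts the
// list off at maxLinks.
function selectPrefetchResources(
  resources: readonly Resource[],
  maxLinks: number = 10,
): Resource[] {
  const rank = (r: Resource): number => PREFETCH_PRIORITY.indexOf(r.type);
  return resources
    .filter((r) => rank(r) >= 0)
    .map((r, i) => ({ r, i }))
    .sort((a, b) => rank(a.r) - rank(b.r) || a.i - b.i)
    .slice(0, maxLinks)
    .map(({ r }) => r);
}
```

With a publication that has ten or more stylesheets, this sketch returns stylesheets only, which is exactly the "fonts may not be prefetched at all" case.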

@JayPanoz
Contributor

A few quick notes off the top of my head, as I’d be expecting this to primarily impact fixed-layout EPUB in the near future.

  1. first, you have vendors recommending one stylesheet per page, so if, say, the publication is 200 pages long, you’ll get 200+ stylesheets (because they also recommend a reset, so theoretically at least one additional file);
  2. then you have files with lots of fonts (the typical use case seems to be PDF conversion, as PDF allows fonts to be subset for each page of text);
  3. the worst-case scenario would be 1 + 2.

@HadrienGardeur
Contributor

In an ideal world, we would know for each HTML resource which CSS, JS and fonts are used. This would enable us to trigger the prefetch strictly when each HTML resource is requested.

While we could eventually achieve that with some heavy processing, we need something "less than ideal" in the meantime.

Here's my take on this:

  • prefetching can be delegated to the Web Viewer, which would inject the prefetch link in its own HTML and have full control over everything
  • prefetching should become optional in the publication server, with a hard limit on the number of resources and some priorities based on what affects rendering the most (I'll let @JayPanoz chime in on that one)

Prefetching through HTML links won't be affected by the same limitations as the Link header and shouldn't block the browser from doing its normal job.
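As a rough illustration of the first option, a viewer could build the link elements as markup and insert them into its own HTML. The function names below (`escapeAttribute`, `buildPrefetchLinkTags`, `injectIntoHead`) are hypothetical, and the string-based insertion before `</head>` is a deliberately naive assumption; a real viewer would more likely use the DOM.

```typescript
// Escapes a value so it can be placed inside a double-quoted HTML attribute.
function escapeAttribute(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Builds one <link rel="prefetch"> element per resource URL.
function buildPrefetchLinkTags(hrefs: readonly string[]): string {
  return hrefs
    .map((href) => `<link rel="prefetch" href="${escapeAttribute(href)}">`)
    .join("\n");
}

// Naive injection: inserts the tags just before the first </head>.
// Returns the HTML unchanged when no </head> is found.
function injectIntoHead(html: string, tags: string): string {
  const idx = html.indexOf("</head>");
  return idx < 0 ? html : html.slice(0, idx) + tags + "\n" + html.slice(idx);
}
```

Because the links live in the document body rather than in response headers, the header size ceiling discussed in this issue does not apply to them.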

@JayPanoz
Contributor

Well, I guess CSS should be highest, given it’s render/layout-blocking, independently of the rendition (e.g. reflow/pre-paginated).

Fonts are critical for fixed-layout EPUB. Otherwise, browsers decided a long time ago that they are not (cf. the font-display CSS property). That said, you can’t necessarily swap them or make them optional in EPUB reflow, for example, because of fragmentation/pagination.

Scripts, I don’t have enough insights/data/anecdotes. Personally, I’ve always put them at the end of the <body> tag as a best practice but with all the authoring tools out there, my gut feeling is that in EPUB they might well be parser-blocking (<head>, and not async/defer) by “default.” Maybe that’s something Rookland (?) could talk about?

However, I guess adjusting the prioritization in this doc could be a good start: https://docs.google.com/document/d/1bCDuq9H1ih9iNjgzyAL0gpwNFiEP4TZS-YLRp_RuMlc/edit#

Especially as it can also serve as a ref longer term, for heuristics.

It seems CSS always wins over fonts and JS, cf. the default priorities section there: https://developers.google.com/web/fundamentals/performance/resource-prioritization

Note, however, that there are now priority hints (for cases like “this script is async but it’s also important”), so it kinda is a can of worms.

@JayPanoz
Contributor

Also

“Prefetching through HTML links won't be affected by the same limitations as the Link header”

I guess it means the server should send an "Accept-Ranges: bytes" response header in case the user clicks on a link while something is being prefetched?

@stadskle

stadskle commented Jan 9, 2020

We found a quirk with the r2 streamer + AWS load balancer today that is relevant to this.

We are running the streamer behind an AWS ALB to create manifest.json on ingestion. That works fine, but a tiny fraction of ebooks fail with the not-so-informative HTTP 502 code. What we discovered is that some books with many CSS files caused the prefetch headers to exceed the AWS ALB hard limit on header size (https://docs.aws.amazon.com/elasticloadbalancing/latest/userguide/how-elastic-load-balancing-works.html). All the load balancer then does is return a 502 with no real message explaining why.

For anyone needing this to run safely on AWS, I guess some kind of size limit configuration would be useful.
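A size limit of that kind could look something like the sketch below, which caps the Link header value by bytes rather than by link count. The `buildLinkHeader` and `utf8ByteLength` names and the `maxBytes` parameter are hypothetical and not part of r2-streamer-js; the right budget depends on the proxy or load balancer in front of the server, and it must also leave room for every other header in the response.

```typescript
// UTF-8 byte length of a string (assumes well-formed UTF-16 input).
function utf8ByteLength(s: string): number {
  let bytes = 0;
  for (let i = 0; i < s.length; i++) {
    const code = s.charCodeAt(i);
    if (code < 0x80) {
      bytes += 1;
    } else if (code < 0x800) {
      bytes += 2;
    } else if (code >= 0xd800 && code <= 0xdbff) {
      bytes += 4; // surrogate pair: one 4-byte code point
      i++; // skip the low surrogate
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

// Builds a Link header value from already-prioritized hrefs, stopping at the
// first entry that would push the value past maxBytes.
function buildLinkHeader(hrefs: readonly string[], maxBytes: number): string {
  const parts: string[] = [];
  let used = 0;
  for (const href of hrefs) {
    const part = `<${href}>; rel=prefetch`;
    // Entries after the first are joined with ", " (2 bytes).
    const cost = utf8ByteLength(part) + (parts.length > 0 ? 2 : 0);
    if (used + cost > maxBytes) {
      break; // stop rather than skip, so priority order is respected
    }
    parts.push(part);
    used += cost;
  }
  return parts.join(", ");
}
```

Stopping at the first entry that does not fit (instead of trying later, shorter entries) keeps the output a strict prefix of the prioritized list.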

(In our use case prefetching has no value, so we will just disable it. But if anyone else is using the streamer on AWS, it is something to be aware of.)

@danielweck
Member Author

Thank you @stadskle, very useful feedback :)

@danielweck
Member Author

Note that since version 1.0.12, r2-streamer-js supports a configurable number of prefetch HTTP header links, with a default of 10.
https://github.com/readium/r2-streamer-js/blob/develop/CHANGELOG.md


4 participants