OPDS1: html text constructs must be unescaped #12

Open
llemeurfr opened this issue Dec 3, 2020 · 0 comments

The Columbia feed, for instance, contains HTML summaries in its OPDS entries. These are formatted like this:

<summary type="html">&lt;p&gt;Benchmark for Faithful Digital Reproductions of Monographs and Serials. Version 1. .... Columbia University Catalog: go to CLIO&lt;/p&gt;
      &lt;p&gt;
   &lt;a href="https://clio.columbia.edu/catalog/14642100"&gt;Go to catalog record in CLIO.&lt;/a&gt;
      &lt;/p&gt;</summary>

Because an OPDS 1 feed is an extension of an Atom feed, the rules for Atom feeds apply, in particular RFC 4287, section 3.1.1.2 (https://tools.ietf.org/html/rfc4287#section-3.1.1.2):

If the value of "type" is "html", the content of the Text construct MUST NOT contain child elements and SHOULD be suitable for handling as HTML [HTML]. Any markup within MUST be escaped; for example, "<br>" as "&lt;br>". HTML markup within SHOULD be such that it could validly appear directly within an HTML <DIV> element, after unescaping. Atom Processors that display such content MAY use that markup to aid in its display.

Such content (of type "html") must therefore be unescaped, then sanitized for security, before it is injected into the webview.
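
To make the intended processing order concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under assumptions, not this project's implementation: the function name `summary_to_safe_html`, the `AllowlistSanitizer` class, and the tag and attribute allowlists are all hypothetical, and a real application would more likely rely on a maintained sanitizer library. Note that an XML parser already resolves `&lt;` and `&gt;` when it returns the element text, so "unescaping" mostly means treating that text as HTML markup instead of displaying it literally.

```python
import html
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

ATOM_NS = "{http://www.w3.org/2005/Atom}"
# Hypothetical allowlists, chosen for illustration only.
ALLOWED_TAGS = {"p", "a", "br", "em", "strong", "ul", "ol", "li"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_URL_PREFIXES = ("http://", "https://")


class AllowlistSanitizer(HTMLParser):
    """Re-emits only allowlisted tags and attributes; drops everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, ()):
                continue  # drops event handlers such as onclick
            if not value or not value.lower().startswith(SAFE_URL_PREFIXES):
                continue  # drops javascript: and other unexpected schemes
            kept.append(f' {name}="{html.escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in ALLOWED_TAGS or tag == "br":
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(html.escape(data, quote=False))


def summary_to_safe_html(entry_xml: str) -> str:
    """Return the entry summary as HTML that is safe to inject."""
    entry = ET.fromstring(entry_xml)
    summary = entry.find(f"{ATOM_NS}summary")
    if summary is None:
        return ""
    text = summary.text or ""  # the XML parser has already unescaped this
    if summary.get("type") != "html":
        return html.escape(text)  # plain text: display literally
    sanitizer = AllowlistSanitizer()
    sanitizer.feed(text)
    sanitizer.close()
    return "".join(sanitizer.out)
```

Given an Atom entry whose summary is escaped as in the Columbia sample, `summary_to_safe_html` returns the paragraph and link markup ready for display, while script elements, event-handler attributes, and non-HTTP(S) link targets are removed.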
