The validator cannot be built on top of the Dom\Document we are using for the sanitizer and optimizer, as it requires precise line/column/length coordinates for pinpointing validation issues in the source files.
The NodeJS validator uses a SAX parser to traverse the HTML; the actual validation engine is a handler that gets triggered by the SAX events, e.g. startTag(), endTag(), ...
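To illustrate the handler-driven approach and why source coordinates matter, here is a minimal sketch in JavaScript. It is a toy scanner, not the NodeJS validator's actual parser: `locate`, `parse`, and `DisallowedTagHandler` are hypothetical names made up for this example, and the regex-based tag matching is a simplification that would not survive real-world HTML.

```javascript
// Illustrative sketch only: a toy event-driven (SAX-style) tag scanner.
// It is NOT the AMP validator's parser; all names here are hypothetical.

// Translate an absolute string offset into a 1-based line and 0-based column.
function locate(html, offset) {
  let line = 1;
  let lineStart = 0;
  for (let i = 0; i < offset; i++) {
    if (html[i] === '\n') {
      line++;
      lineStart = i + 1;
    }
  }
  return { line, col: offset - lineStart };
}

// Walk the markup once and fire handler callbacks with source positions.
function parse(html, handler) {
  // Simplified tag pattern: ignores comments, CDATA, and '>' inside attributes.
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9-]*)[^>]*>/g;
  let match;
  while ((match = tagPattern.exec(html)) !== null) {
    const pos = { ...locate(html, match.index), length: match[0].length };
    const name = match[2].toLowerCase();
    if (match[1] === '/') {
      handler.endTag(name, pos);
    } else {
      handler.startTag(name, pos);
    }
  }
}

// The "validation engine" is just a handler that records issues with positions.
class DisallowedTagHandler {
  constructor(disallowed) {
    this.disallowed = new Set(disallowed);
    this.errors = [];
  }
  startTag(name, pos) {
    if (this.disallowed.has(name)) {
      this.errors.push({ message: `Disallowed tag: ${name}`, ...pos });
    }
  }
  endTag(name, pos) {
    // Nothing to check on closing tags in this sketch.
  }
}

const source = '<html>\n  <body>\n    <font color="red">hi</font>\n  </body>\n</html>';
const handler = new DisallowedTagHandler(['font']);
parse(source, handler);
```

After this runs, `handler.errors` holds a single entry pointing at line 3, column 4, with a length of 18 characters. That is the kind of coordinate information a DOM tree no longer carries once the document has been parsed and normalized.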
After looking at existing HTML SAX parsers in PHP, my conclusion is to port over the parser implementation from NodeJS instead of reusing an existing PHP HTML SAX parser, for the following reasons:
no implementation was recently maintained;
third-party dependencies should be avoided whenever we can for the toolbox;
a lot of the hard-coded logic of the parser is already found in the toolbox because we needed parts for other tools...
...therefore it can be ported with only modest effort.
As already discussed in the meeting, Masterminds HtmlParser doesn't fulfil the condition for solving the original issue ampproject/amp-wp#3293, and I'd like to avoid pulling it in as a dependency just for the SAX parser, knowing that there's no guarantee that the parsing will result in the exact same tokenization as the one from the NodeJS library. Given that the porting work is not huge, avoiding this dependency seems preferable to me.