About the implementation of HTML parser #466
Closed
byted-meow started this conversation in General
Implementation of HTML parser
Background
The HTML loader needs a good way to parse user-created HTML and resolve its dependencies. This requires a spec-compliant, performant, and multi-functional HTML parser.
None of webpack, Vite, or Parcel uses an HTML parser implemented in Rust, so we need to do it ourselves.
We need to parse the HTML into an AST and design a good API for operating on it. That API will be used by the HTML loader and its plugin system.
HTML parser
Requirements
To load an HTML file as an entry, we need an HTML parser that implements the following methods:
Parser Candidates
lol_html
swc html parser
kuchiki based on html5ever
Choice
I prefer to choose between lol_html and swc_html.
When compared, swc_html is:
And lol_html is:
Minimizer
Minimizer is not required as
Candidates
minify_html
swc_html_minifier
Choice
If we choose swc_html, we can use swc_html_minifier together.
If we use lol_html, and we need a minifier, we can use minify_html.
API Design
Do we need to keep the detailed implementation at a low level and expose an abstraction layer to plugin users?
Core API
The core API works with or without the abstraction layer.
modify_ast
modify_text
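To make the two core entry points concrete, here is a minimal sketch using only the standard library. The `Node` type and the signatures of `modify_ast` and `modify_text` are assumptions for illustration; the real shapes would depend on whether swc_html or lol_html is chosen, and neither library's API is shown here.

```rust
/// Hypothetical minimal AST. A real parser would supply its own node types.
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Element {
        tag: String,
        attrs: Vec<(String, String)>,
        children: Vec<Node>,
    },
    Text(String),
}

/// Hypothetical `modify_ast`: visits every node depth-first (parent before
/// children) and lets the callback mutate it in place.
pub fn modify_ast<F: FnMut(&mut Node)>(node: &mut Node, f: &mut F) {
    f(&mut *node);
    if let Node::Element { children, .. } = node {
        for child in children.iter_mut() {
            modify_ast(child, f);
        }
    }
}

/// Hypothetical `modify_text`: rewrites only text nodes, built on `modify_ast`.
pub fn modify_text<F: FnMut(&str) -> String>(node: &mut Node, mut f: F) {
    modify_ast(node, &mut |n: &mut Node| {
        if let Node::Text(t) = n {
            let replaced = f(t.as_str());
            *t = replaced;
        }
    });
}
```

In this sketch `modify_text` is just a filtered `modify_ast`, which is one way the core API could stay small regardless of whether an abstraction layer sits on top of it.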
Shortcuts
The shortcut APIs work only with the abstraction layer, because we need to recreate the representation of elements and their attributes and implement the operation methods. Something like:
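A minimal sketch of what an element representation with attribute shortcuts could look like. The `Element` type and the `get_attr`, `set_attr`, and `remove_attr` names are hypothetical, chosen only to illustrate the idea; they are not taken from swc_html or lol_html, and attribute names are compared case-sensitively here for simplicity.

```rust
/// Hypothetical element representation owned by the abstraction layer.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Element {
    pub tag: String,
    // Ordered list, so serialization can keep the author's attribute order.
    attrs: Vec<(String, String)>,
}

impl Element {
    pub fn new(tag: &str) -> Self {
        Element { tag: tag.to_string(), attrs: Vec::new() }
    }

    /// Returns the attribute value, if the attribute is present.
    pub fn get_attr(&self, name: &str) -> Option<&str> {
        self.attrs
            .iter()
            .find(|(k, _)| k == name)
            .map(|(_, v)| v.as_str())
    }

    /// Overwrites an existing attribute or appends a new one.
    pub fn set_attr(&mut self, name: &str, value: &str) {
        match self.attrs.iter_mut().find(|(k, _)| k == name) {
            Some((_, v)) => *v = value.to_string(),
            None => self.attrs.push((name.to_string(), value.to_string())),
        }
    }

    /// Removes the attribute and returns its previous value, if any.
    pub fn remove_attr(&mut self, name: &str) -> Option<String> {
        let idx = self.attrs.iter().position(|(k, _)| k == name)?;
        Some(self.attrs.remove(idx).1)
    }
}
```

A plugin could then rewrite a dependency with something like `script.set_attr("src", "/dist/index.js")` without touching the parser's own node types, which is the main argument for paying the cost of the extra layer.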
Conclusion