Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document fragment now do not handle 2 siblings inputs #652

Closed
acarpe opened this issue Apr 6, 2023 · 8 comments · Fixed by #696
Closed

Document fragment now do not handle 2 siblings inputs #652

acarpe opened this issue Apr 6, 2023 · 8 comments · Fixed by #696

Comments

@acarpe
Copy link

acarpe commented Apr 6, 2023

Bug Report

with the latest version if I have two sibling inputs (in the erb file):

<div>
  <div id="label-container">
    <label>
      <input type="file" accept="image/*" />
      <input type="hidden" value="" />
    </label>
  </div>
  <div id="after-label"></div>
</div>

Describe the bug

after parsing with Nokogiri.parse instead of Nokogiri::HTML5 it drops the second input and "move" the closing label tag outside the original div.

To Reproduce

you can add this test in document_fragment_test

test "should properly parse two siblings input" do
    raw_html = "<div><div id=\"label-container\"><label><input type=\"file\" accept=\"image/*\"><input type=\"hidden\" value=\"\"></label></div><div id=\"after-label\"></div></div>"
    fragment = StimulusReflex::HTML::DocumentFragment.new(raw_html)
    assert_equal 2, fragment.document_element.at_css('input').children.size
    assert_equal "<div><div id=\"label-container\"><label><input type=\"file\" accept=\"image/*\"><input type=\"hidden\" value=\"\"></label></div><div id=\"after-label\"></div></div>", fragment.to_html.squish
    assert_equal "<div><div id=\"label-container\"><label><input type=\"file\" accept=\"image/*\"><input type=\"hidden\" value=\"\"></label></div><div id=\"after-label\"></div></div>", fragment.outer_html.squish
end

Expected behavior

parse correctly the input tags

Versions

StimulusReflex

  • Gem: [3.5.0.rc2]
  • Node package: [N/A]

External tools

  • Ruby: [3.1.2]
@julianrubisch
Copy link
Contributor

Thanks for reporting this! We're already looking at our options...

@arambert
Copy link

arambert commented Nov 8, 2023

Hi,
First of all : thank you very much for this awesome gem!

I can confirm that this bug has larger scope than just 2 siblings inputs, it strips any html after the input.

For instance this HTML will not be preserved:

<div>
  <label>X</label>
  <div>
    <input type="text">
    <span></span>
  </div>
</div>
raw_html = "<div> <label>X</label> <div> <input type=\"text\"> <span></span> </div> </div>"
output_html = StimulusReflex::HTML::DocumentFragment.new(raw_html).to_html.squish
# returns "<div> <label>X</label> <div> <input type=\"text\"> </div></div>" => the span has been removed

This is an important issue for us because we need to be able to add icons after the input, and because of this bug, the icons are arbitrarily removed after each reflex.

This bug is really hard to confirm and to understand when you see it for the first time, I think we can assume that more Stimulus Reflex users have encountered it but could not find the cause.

@tomclose
Copy link

tomclose commented Apr 1, 2024

Just wanted to highlight that this is a really nasty issue to debug and one that is probably affecting a lot of people. For us it manifested as "we upgraded stimulus reflex from 3.4.1 and now seemingly random parts of our html don't make it back to the page". It's not at all obvious that this issue and associated PR are relevant, unless you've already found the root cause.

It looks like #674 is currently on hold, waiting for Nokogiri to make some upstream fixes. @marcoroth would it be possible to merge that PR as an interim fix, so that Stimulus Reflex itself doesn't appear broken in mysterious ways in the meantime?

@tomclose tomclose mentioned this issue Apr 1, 2024
2 tasks
@tomclose
Copy link

tomclose commented Apr 1, 2024

In case it's useful for anyone to know, #674 doesn't handle the following case:

# having a `->` in your input breaks the fix:
str = '<div class="foo"><input data-action="input->autocomplete#search"><span>some text</span></div>'

StimulusReflex::HTML::DocumentFragment.new(str).inner_html
#=> "<input data-action=\"input- /&gt;autocomplete#search\">" # span is missing

I'm sure this is just summarising what others already know, but I now understand that the difficult thing about this that there currently isn't a parser in Nokogiri that can handle the following:

  1. <input><input> (can't be parsed by Nokogiri::XML)
  2. <tr><td></td></tr> (can't be parsed by Nokogiri::HTML5::Document)
  3. <head><title>hello</title></head> (can't be parsed correctly by Nokogiri::HTML5::DocumentFragment)

It seems like the possibilities are:

  1. Wait for Nokogiri to fix Nokogiri::HTML5::DocumentFragment, so it can parse 3
  2. Use a fix like Properly handle HTML fragments with self closing tags #674 that's a bit fragile, but could be adapted to cover the common cases
  3. Make a (different) tradeoff between 1, 2, 3 above
  4. Switch from Nokogiri to a library that can handle all the cases

I don't know what the best path forwards is here, but tend to think that any of 2/3/4 are preferable to 1, assuming that the Nokogiri issue isn't going to be fixed imminently. I've put up #692 as a possibility for option 3. Maybe #674 is still the best thing to go for.

@marcoroth
Copy link
Member

marcoroth commented Apr 5, 2024

Thanks for summarizing this @tomclose! You are right! I think the best way to solve this is to do 4., at least for the time being.

There's an approach where we render out the whole page (as we do now), but instead of parsing and grabbing out the elements out of the parsed HTML via Nokogiri, we could send down the whole page and let the browser parse and grab out the elements we need.

That way, the browser is handling the valid/invalid HTML in both cases: 1) on first page load through the regular Rails request and 2) on re-render after a morph.

Ideally we could fix most of these things in Nokogiri itself. After talking to @flavorjones at RubyConf last year, he mentioned that the bahviour of the HTML5 part of Nokogiri itself might not be right/consistent in all cases. So if we could collect and report all of the weird findings, we might be able to get them fixed in upstream Nokogiri.

I'll see if I can implement a workaround, which might not be the most efficient on wire-size, but at least works the right and expected way so this issue can be unblocked.

@flavorjones
Copy link

👋 Nokogiri maintainer here, this is the first I've heard of these problems other than sparklemotion/nokogiri#3023

I just want to reiterate what I told @marcoroth in person, which is that there is no spec for fragment parsing in XML or HTML4, and so we adopted some conventions in Nokogiri that aren't serving us well with HTML5.

And the HTML5 fragment behavior is well-defined but requires a "context node" be provided because the HTML5 parsing rules are context-dependent -- for example, to parse <tr><td> you would need to provide a "context node" of <table> (I hope that makes sense, I can explain more if necessary or you can peek at the HTML5 spec's "in table" insertion mode section).

So I guess what I'm saying is, it seems unlikely that another library is going to handle your use cases unless and until we can figure out how to accomplish what stimulus-reflex is doing in an HTML5-compatible way. And I'd love to figure out what that looks like in collaboration with y'all.

I'm hoping to get to sparklemotion/nokogiri#3023 in the next week or two, I'm going to be spending more time than previously on open source this month. It might be worth having another real-time conversation about this once I've swapped my mental context back in.

@flavorjones
Copy link

he mentioned that the bahviour of the HTML5 part of Nokogiri itself might not be right/consistent in all cases

This isn't quite right -- the HTML5 parser we're using passes all the current HTML5 tests. The parser is correct and follows the spec. And if it doesn't follow the spec, it's a bug.

I think what I meant to communicate is that the HTML5 behavior is going to be different from the HTML4 behavior in some cases because the HTML5 spec is different and better-specified, and the gumbo parser is spec-compliant.

@marcoroth
Copy link
Member

@acarpe just wanted to tag you and mention that #696 should be solving this issue! If you get a chance, I'd love to hear how it works for you! Thank you!

marcoroth added a commit that referenced this issue May 5, 2024
…t parsing (#696)

This pull request implements the
[`nokogiri-html5-inference`](https://github.com/flavorjones/nokogiri-html5-inference)
to deal with parsing HTML5 fragments to be used with Selector Morphs.

The `nokogiri-html5-inference` gem come about out of the discussions in
#652 and sparklemotion/nokogiri#3023.

Fixes #652, Resolves
#692, Resolves
#674

Big thanks to @flavorjones to talking and working this through with me!

## Why should this be added

This pull request makes the handling of document fragments for selector
morphs more natural/predictable and resolves the last open issues to
release the final versions of StimulusReflex 3.5.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment