The other day, I received a spam e-mail with a text/html body part like
this:
==============================================================
blah blah<br><br
<a href=http://domain/path.html target=_blank>Go!</a><br><p>blah
==============================================================
My spam filter failed to parse the href URL from the message body due to
the unclosed "<br" tag. Closing it causes HTML::Parser to correctly
parse the URL.
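
As a stopgap, the markup could be repaired before it reaches the parser. Below is a minimal sketch of that idea. It is written in Python and uses Python's standard html.parser purely for illustration; it is not HTML::Parser and says nothing about that module's internals. The names close_unclosed_tags, HrefCollector and extract_hrefs are hypothetical. The regex is a heuristic that assumes any tag reaching the next "<" without a ">" is unclosed, so it will misfire on text or attribute values that contain a literal "<".

```python
import re
from html.parser import HTMLParser

# Heuristic (an assumption, not HTML::Parser behaviour): a tag that reaches
# the next "<" without encountering ">" is treated as unclosed.
UNCLOSED_TAG = re.compile(r"<(/?[A-Za-z][^<>]*?)(\s*)(?=<)")


def close_unclosed_tags(html):
    """Insert the missing ">" in front of any trailing whitespace."""
    return UNCLOSED_TAG.sub(r"<\1>\2", html)


class HrefCollector(HTMLParser):
    """Collects the href attribute of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def extract_hrefs(html):
    collector = HrefCollector()
    collector.feed(html)
    collector.close()
    return collector.hrefs


# The body part quoted above, with the unclosed "<br" before the link.
SAMPLE = (
    "blah blah<br><br\n"
    "<a href=http://domain/path.html target=_blank>Go!</a><br><p>blah"
)

if __name__ == "__main__":
    # prints ['http://domain/path.html']
    print(extract_hrefs(close_unclosed_tags(SAMPLE)))
```

The same pattern should carry over to a Perl substitution applied to the body before it is handed to HTML::Parser, though I have not measured how often the heuristic damages otherwise valid mail.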
I noticed that http://search.cpan.org/dist/HTML-Parser/Parser.pm#BUGS says:
«Unclosed start or end tags, e.g. "<tt<b>...</b</tt>" are not recognized.»
I don't understand what the implication of this is, however. Is it a
conscious decision not to support unclosed tags, or has there just been
no use case for a fix?
I tested how various browsers handle the HTML code from the spam message
above:
At least the following render the link despite the preceding broken
"<br" tag: Firefox 3, Konqueror from KDE 3.5.9, Safari 3 & 4, Mail.app.
At least the following do NOT render the link: IE 6, Opera 9.63.
I'd appreciate it if an option could be added to HTML::Parser to
recognize unclosed tags.
Migrated from rt.cpan.org#47748 (status was 'new')
Requestors:
From [email protected] on 2009-07-09 17:02:41