Make iframe parsing configurable [rt.cpan.org #46099] #11

Open
oalders opened this issue Aug 24, 2020 · 0 comments

Migrated from rt.cpan.org#46099 (status was 'open')

Requestors:

From [email protected] on 2009-05-15 06:15:45:

Since the latest versions of HTML::Parser do not parse the content of
iframes, some of my applications using HTML::SimpleLinkExtor have
broken. The text between the iframe tags is what the browser displays
and is usually more HTML, and I need to be able to extract any links in
that text.

I'd like to at least be able to turn on parsing for iframes, even if it
is off by default.

From [email protected] on 2009-06-20 09:17:40:

On Fri May 15 02:15:45 2009, BDFOY wrote:
> Since the latest versions of HTML::Parser do not parse the content of
> iframes, some of my applications using HTML::SimpleLinkExtor have
> broken. The text between the iframe tags is what the browser displays
> and is usually more HTML, and I need to be able to extract any links in
> that text.

Browsers that support iframes are supposed to ignore everything inside the iframe.  They are 
supposed to render the HTML found at the 'src' location.

> I'd like to at least be able to turn on parsing for iframes, even if it
> is off by default.

I see the point if you need to emulate the behaviour of very old browsers.

A workaround is to invoke a subparser on the iframe content text.  I'll see if I can find an easier
way to do this.
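
The subparser workaround can be sketched roughly as follows. This is an illustration in Python rather than HTML::Parser code: the regular expression only stands in for the outer parser handing over the iframe body as uninterpreted text, and the names `IFRAME_BODY`, `LinkCollector` and `iframe_fallback_links` are hypothetical, not part of any library discussed in this ticket.

```python
import re
from html.parser import HTMLParser

# Stand-in for the outer parser's literal-text event: capture whatever sits
# between <iframe ...> and </iframe> without interpreting it.
IFRAME_BODY = re.compile(
    r"<iframe\b[^>]*>(.*?)</iframe\s*>", re.IGNORECASE | re.DOTALL
)


class LinkCollector(HTMLParser):
    """Subparser: records href/src attribute values from any start tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def iframe_fallback_links(html):
    """Return the links found in the fallback markup inside each iframe."""
    links = []
    for body in IFRAME_BODY.findall(html):
        sub = LinkCollector()  # a fresh subparser per iframe body
        sub.feed(body)
        sub.close()
        links.extend(sub.links)
    return links
```

Only links inside the iframe body are returned here; links elsewhere in the document, and the iframe's own 'src', would still come from the outer parser.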

From [email protected] on 2009-06-20 09:24:09:

The TODO file has this entry:

- make literal tags configurable.  The current list is hardcoded to be "script", "style", "title",
"iframe", "textarea", "xmp", and "plaintext".

which would be my preferred way to fix this.
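
To make the idea concrete, here is a toy tokenizer in Python whose literal-tag set is a parameter. It is only a sketch of what "configurable literal tags" could mean, not HTML::Parser's implementation or a proposed API: `tokenize`, `DEFAULT_LITERAL_TAGS` and the event tuples are hypothetical, and the tag matching is deliberately simplistic (for instance, it lets "plaintext" end at a closing tag).

```python
import re

# The hardcoded list quoted from the TODO entry, used here as the default.
DEFAULT_LITERAL_TAGS = frozenset(
    {"script", "style", "title", "iframe", "textarea", "xmp", "plaintext"}
)

# Very rough tag matcher; real HTML tokenizing has many more cases.
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


def tokenize(html, literal_tags=DEFAULT_LITERAL_TAGS):
    """Return a list of ("start" | "end" | "text", value) events."""
    events = []
    pos = 0
    while pos < len(html):
        match = TAG.search(html, pos)
        if match is None:
            events.append(("text", html[pos:]))
            break
        if match.start() > pos:
            events.append(("text", html[pos:match.start()]))
        closing, name = match.group(1), match.group(2).lower()
        events.append(("end" if closing else "start", name))
        pos = match.end()
        if not closing and name in literal_tags:
            # Literal mode: everything up to the matching end tag is
            # reported as one text event instead of being parsed as markup.
            close = re.compile(r"</%s\s*>" % re.escape(name), re.IGNORECASE)
            found = close.search(html, pos)
            end = found.start() if found else len(html)
            if end > pos:
                events.append(("text", html[pos:end]))
            pos = end
    return events
```

Passing a set without "iframe" makes the iframe body come back as ordinary start, text and end events, which is the behaviour the original request asks to be able to switch on.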

From [email protected] on 2011-09-20 17:20:09:

Making literal tags configurable would also be useful for those doing
javascript templates with <script type="text/html"> tags.

From [email protected] on 2012-10-17 22:22:02:

On Sat Jun 20 05:17:40 2009, GAAS wrote:
> > I'd like to at least be able to turn on parsing for iframes, even if it
> > is off by default.
> 
> I see the point if you need to emulate the behaviour of very old
> browsers.

What is the point of not parsing the content of iframes?  I can't find 
any justification, and it seems at odds both with the spec and user 
expectations.  Removing this special case would make HTML::Parser simpler 
and more uniform.

Andrew

From [email protected] on 2012-10-18 22:09:53:

I explained the point just above the text you quoted. What's "the spec" you're
referring to?

