Remove parser function output parsing when unneeded #9

Open
wants to merge 6 commits into
base: master

Derugon
Contributor

@Derugon Derugon commented Nov 29, 2024

Motivation

Parser functions and tags output a string, which may be unparsed wikitext, semi-parsed wikitext (on which replaceVariables was called), a raw string, or a number (as a string).

Currently, most of these functions/tags treat their output as unparsed wikitext, and ask the parser to parse the output into semi-parsed wikitext:

  • parser functions set the noparse => false option on the result (parsing in a child frame),
  • tags call $parser->replaceVariables (parsing in the same frame).

When the output is not unparsed wikitext, 1 or 2 pre-processor nodes are still created and then expanded, which adds a small amount of parsing time without affecting the result.

Proposed changes

Remove noparse options or replaceVariables calls when we know the output does not contain any unparsed wikitext.

Note: This PR does not change whether unescaped wikitext is parsed or not. In practice, if we unescape braces or angle brackets, we have to parse the output; otherwise we should not have to (this needs further testing). Changing that would require changing how the ParserPower::unescape function works, so I left it out of this proposal.

@Derugon Derugon marked this pull request as draft November 29, 2024 16:35
@Derugon
Contributor Author

Derugon commented Nov 29, 2024

It seems list functions that return unescaped wikitext parse it twice, so I'm gonna work on it a little more.

@RheingoldRiver
Member

Sounds good, thanks so much for your contributions already!!

@Derugon
Contributor Author

Derugon commented Nov 29, 2024

Well, thank you all for still maintaining it, and for taking the time to review these PRs. :)

@Derugon
Contributor Author

Derugon commented Dec 4, 2024

I wrote that this had no impact on the result, but that is not true.

The base issue

Applying the changes from this PR to templates from an existing wiki caused a subtle change. It was beneficial in my case, but it is still a breaking change: generated wikitext with transcluded wikitext syntax will have it evaluated.

For example, {{#trim: {{(}}{{(}}!{{)}}{{)}} }} yields |, while I would expect it to produce the same result as {{(}}{{(}}!{{)}}{{)}}, i.e. {{!}}.
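The difference can be reproduced with a toy model. This is only a sketch, not MediaWiki's parser: `TEMPLATES`, `expand_once` and the two `trim_*` helpers are hypothetical names, and the only assumption is that {{(}}, {{)}} and {{!}} expand to {, } and | respectively.

```python
import re

# Hypothetical template table standing in for the wiki's templates.
TEMPLATES = {"(": "{", ")": "}", "!": "|"}

# Matches a template call with no nested braces, e.g. {{!}} or {{(}}.
_CALL = re.compile(r"\{\{([^{}]*)\}\}")


def expand_once(text: str) -> str:
    """Expand the template calls present in `text`, without rescanning the output."""
    return _CALL.sub(lambda m: TEMPLATES.get(m.group(1), m.group(0)), text)


def trim_with_reparse(arg: str) -> str:
    """Model of a #trim whose output is handed back to the parser."""
    return expand_once(expand_once(arg).strip())


def trim_without_reparse(arg: str) -> str:
    """Model of a #trim whose output is returned as-is."""
    return expand_once(arg).strip()


arg = " {{(}}{{(}}!{{)}}{{)}} "
print(trim_with_reparse(arg))     # |
print(trim_without_reparse(arg))  # {{!}}
```

The second pass is what turns the generated {{!}} into |; dropping it leaves the generated braces untouched.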

This means that this PR, as it stands, is a breaking change for the #trim and #or parser functions, and only those two.

The issue with unescaping

This is a side-effect we have to deal with when unescaping: after the text is unescaped, we need to parse it again, so we parse {{#uesc: \{\{!\}\} }} the same way as {{#uesc: {{(}}{{(}}!{{)}}{{)}} }}, and both yield |.

Changing it would mean variables can no longer contribute reliably to unescaping, e.g. {{#uesc: {{X}} }} would yield {{!}} (with Template:X containing \{\{!\}\}), which would completely break the purpose of unescaping.
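A toy model of the unescaping case, as a sketch only. It is not the ParserPower implementation: `TEMPLATES`, `expand_once`, `unescape_braces` and the two `uesc_*` helpers are hypothetical, only the brace escapes are modelled, and whitespace is stripped just to keep the comparison simple.

```python
import re

# Hypothetical template table; "X" plays the role of Template:X.
TEMPLATES = {"(": "{", ")": "}", "!": "|", "X": r"\{\{!\}\}"}

# Matches a template call with no nested braces, e.g. {{!}} or {{X}}.
_CALL = re.compile(r"\{\{([^{}]*)\}\}")


def expand_once(text: str) -> str:
    """Expand the template calls present in `text`, without rescanning the output."""
    return _CALL.sub(lambda m: TEMPLATES.get(m.group(1), m.group(0)), text)


def unescape_braces(text: str) -> str:
    # Replace backslash-escaped braces with plain braces; nothing else is modelled.
    return text.replace("\\{", "{").replace("\\}", "}")


def uesc_with_reparse(arg: str) -> str:
    """Expand the argument, unescape it, then parse the unescaped text again."""
    return expand_once(unescape_braces(expand_once(arg).strip()))


def uesc_without_reparse(arg: str) -> str:
    """Same as above, but the unescaped text is returned without a second parse."""
    return unescape_braces(expand_once(arg).strip())


print(uesc_with_reparse(r" \{\{!\}\} "))              # |
print(uesc_with_reparse(" {{(}}{{(}}!{{)}}{{)}} "))   # |
print(uesc_with_reparse(" {{X}} "))                   # |
print(uesc_without_reparse(" {{X}} "))                # {{!}}
```

With the second parse, all three spellings agree; without it, the escaped braces coming from the template are left as literal {{!}}.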

So I'll double down on what I said last week, and not suggest removing the extra parsing of unescaped text in this PR.

@Derugon Derugon marked this pull request as ready for review December 4, 2024 16:51