Expand template contains itself #193

Open
xxyzz opened this issue Jan 31, 2024 · 8 comments

@xxyzz
Collaborator

xxyzz commented Jan 31, 2024

Page: https://ru.wiktionary.org/wiki/footer
Template: https://ru.wiktionary.org/wiki/Шаблон:длина_слова

The "длина_слова" template calls itself when it is substituted; otherwise it uses the "main other" template. I think the "{{{|safesubst:}}}" in the template delays the expansion of the arguments, but our code doesn't do that and expands the "длина_слова" template recursively.

wikitext docs:

@kristian-clausal
Collaborator

Oh, that is such a headache... I have no recollection of how we handle subst: and safesubst:, but it can't be pretty. The proper way to do it, based on the documentation, would be to actually expand them before anything else... I guess we just do everything in one go.

@xxyzz
Collaborator Author

xxyzz commented Jan 31, 2024

Current code ignores them:

    # Strip safesubst: and subst: prefixes
    tname = (
        tname.strip()
        .removeprefix("subst:")
        .removeprefix("safesubst:")
    )

Expand before anything else? I think MediaWiki does the opposite. And the "длина слова" template expands to a category link, so this bug doesn't affect the ru extractor much.
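
For readers following along, here is a minimal standalone sketch of what that prefix stripping does to a template name. The wrapper function `strip_subst_prefix` is a hypothetical name introduced only for illustration; the chained calls are the ones quoted in this comment, and `str.removeprefix` needs Python 3.9 or later.

```python
def strip_subst_prefix(tname: str) -> str:
    """Hypothetical wrapper around the quoted snippet: drop subst:/safesubst:."""
    return (
        tname.strip()
        .removeprefix("subst:")
        .removeprefix("safesubst:")
    )


# After stripping, the name is looked up like any ordinary transclusion,
# so the expander cannot tell substitution apart from transclusion.
print(strip_subst_prefix("safesubst:ifsubst"))    # ifsubst
print(strip_subst_prefix(" subst:длина слова "))  # длина слова
print(strip_subst_prefix("ifsubst"))              # ifsubst
```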

@kristian-clausal
Collaborator

kristian-clausal commented Jan 31, 2024

https://en.wikipedia.org/wiki/Help:Substitution#Technical_implementation

This means that substitution necessarily occurs before any actions performed at the time of page rendering (conversion of the stored wikitext to HTML). In particular, substitutions are done before transclusions.

Substitutions change the page source itself. Normally templates transclude and are called each time.

EDIT:

You are correct, I meant that subst: is evaluated before other things, but of course you'd need to expand the template that contains the subst: in order to get to it...
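
The difference between the two phases described in the quoted help page can be sketched with a toy model. This is only an illustration under simplifying assumptions (no arguments, no nesting); the `TEMPLATES` store and the `save_page` and `render_page` helpers are hypothetical names, not MediaWiki or wiktextract code.

```python
import re

TEMPLATES = {"greeting": "hello"}  # hypothetical template store


def save_page(source: str) -> str:
    """Save-time pass: {{subst:name}} is replaced in the stored source."""
    return re.sub(
        r"\{\{subst:(\w+)\}\}", lambda m: TEMPLATES[m.group(1)], source
    )


def render_page(stored: str) -> str:
    """Render-time pass: {{name}} is transcluded again on every view."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: TEMPLATES[m.group(1)], stored)


stored = save_page("{{subst:greeting}} and {{greeting}}")
print(stored)  # hello and {{greeting}}

TEMPLATES["greeting"] = "goodbye"  # the template is edited later
print(render_page(stored))  # hello and goodbye
```

The substituted copy is frozen in the stored source, while the transcluded copy follows later edits to the template.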

@kristian-clausal
Collaborator

I am staring at that template, and my head is starting to hurt. {{{|safesubst:}}} is a hack used in templates so that a template can actually output safesubst: as a keyword. If you wrote {{safesubst:...}} directly in a template, it would be evaluated as soon as you saved the template, removing the safesubst: from the template's source and locking the template into whatever it evaluated to. So if you want to return safesubst: from a template to another template or to the main article, you need to trick the parser into returning safesubst: as a string by making it the default value (the part after the pipe) of a triple-brace parameter. Because the parameter name before the pipe is empty and so never set, the triple-braces always fall back to safesubst: as a string, which lands inside the double-braces and outputs {{safesubst:...}} on the next level...
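
A small sketch of the fallback described here, assuming that no argument with an empty name is ever passed to the template. The regex and the `resolve_params` helper are hypothetical and handle only flat, non-nested `{{{name|default}}}` parameters; a real parser is considerably more involved.

```python
import re

# Matches {{{name|default}}} where name and default contain no braces.
PARAM_RE = re.compile(r"\{\{\{([^{}|]*)\|([^{}]*)\}\}\}")


def resolve_params(text: str, args: dict[str, str]) -> str:
    """Replace each {{{name|default}}} with args[name], else the default."""

    def repl(m: re.Match) -> str:
        name, default = m.group(1).strip(), m.group(2)
        return args.get(name, default)

    return PARAM_RE.sub(repl, text)


# The empty-named parameter is never set, so the default always wins and
# the text "safesubst:" ends up inside the surrounding double braces.
body = "{{{{{|safesubst:}}}#ifeq:a|b|c|d}}"
print(resolve_params(body, {"1": "x"}))  # {{safesubst:#ifeq:a|b|c|d}}
```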

@kristian-clausal
Collaborator

Ok, there might be a brighter future here:

  1. I do not believe that you can find {{subst:...}} on any article page (unless someone has used that kludge where you put a subst: inside a subst:, so that the page has to be saved twice, or more if it goes even deeper, before all the substitutions take place).
  2. safesubst is meant to be usable both when substituting and when not. In effect, safesubst can be ignored, which is what we do now.

We might be able to get away with small changes... The problem with the template in the first post might not be safesubst, it might be "ifsubst" which checks if we're in "substitution mode" (never applicable for us) or "transclusion mode", and if that breaks that might explain the infinite loop?

@xxyzz
Collaborator Author

xxyzz commented Feb 1, 2024

I agree we could continue to ignore "safesubst". I guess the problem is that the template arguments passed to "#ifeq" are expanded before the condition is checked?

Here is a simplified test:

    def test_test(self):
        self.ctx.add_page("Template:ifsubst", 10, """{{#ifeq:yes|yes
 |{{{2|}}}
 |{{{1|}}}
}}""")
        self.ctx.add_page("Template:длина слова", 10, "{{ifsubst|{{длина слова}}|aaa}}")
        self.ctx.start_page("")
        self.assertEqual(self.ctx.expand("{{ifsubst|yes|no}}"), "no")
        self.assertEqual(self.ctx.expand("{{длина слова}}"), "aaa")
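
The suspected cause can be modelled outside the parser. The sketch below is a toy model, not wikitextprocessor code: templates are plain Python functions, all names are hypothetical, and the "lazy" variant passes arguments as zero-argument callables so that only the selected branch is ever expanded.

```python
from typing import Callable

Thunk = Callable[[], str]  # an argument whose expansion is deferred


def ifsubst_lazy(arg1: Thunk, arg2: Thunk) -> str:
    # Mirrors the simplified Template:ifsubst in the test: the condition
    # is always true, so only the second argument is ever forced.
    return arg2()


def dlina_slova_lazy() -> str:
    # {{ifsubst|{{длина слова}}|aaa}} with deferred argument expansion.
    return ifsubst_lazy(lambda: dlina_slova_lazy(), lambda: "aaa")


def ifsubst_eager(arg1: str, arg2: str) -> str:
    return arg2


def dlina_slova_eager() -> str:
    # The first argument is expanded before ifsubst runs, so the template
    # keeps calling itself and never reaches the condition.
    return ifsubst_eager(dlina_slova_eager(), "aaa")


print(dlina_slova_lazy())  # aaa
try:
    dlina_slova_eager()
except RecursionError:
    print("eager expansion never terminates")
```

If this model matches what happens in the expander, deferring argument expansion until a branch is selected would be enough to make the test pass without special handling of safesubst.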

@kristian-clausal
Collaborator

kristian-clausal commented Feb 1, 2024

If we get the behavior of subst: right, then ifsubst should theoretically work... It relies on the difference between the behavior of subst when substituting and transcluding (safesubst has the 'same' behavior for each).

I am still checking all the pages for {{subst:, and I've only found one broken one that was left on the page because something weird broke it; couldn't fix it, so removed the malformed template from the article ("leftback" if anyone wants to check it out.) Otherwise, not a single hit, and I've probably grepped through millions of pages by now.

Templates have them, but those templates are supposed to be used once with {{subst: (which will enable recursive subst:ing).

EDIT: There's no Template:ifsubst on en.wiktionary.org, but there is one on en.wikipedia.org and ru.wiktionary.org

@kristian-clausal
Collaborator

I'm an idiot, ifsubst uses safesubst:...

xxyzz added a commit to xxyzz/wiktextract that referenced this issue Mar 19, 2024