kwargs are not checked for unexpected parameters #1238

timmc · 2022-03-26T02:35:04Z

Repro case: return markdown.markdown(post, output='html5')
Expected behavior: Error or warning
Actual behavior: Typo is ignored

For a while I had the code return markdown.markdown(post, output='html5'), which seemed to be working OK. However, it turns out that was a typo -- I should have been using output_format. Normally, the runtime would catch this, but instead **kwargs are collected and passed to the Markdown class, where keys are retrieved as needed.

It's not a security issue in this library, as far as I can tell, but this pattern has lead to security issues elsewhere. (Imagine if there were a safe_output kwarg that someone typo'd.)

I think this could be as simple as having a known-keys set that the kwargs dict's keys are checked against before processing. I'd be happy to contribute a PR if this would be an acceptable approach.

The text was updated successfully, but these errors were encountered:

waylan · 2022-03-28T14:09:28Z

Over the years we have changed the list of accepted keys. When a key is deprecated, we will raise a warning for a few versions, then remove it. However, we can't do that when adding a key. The point of accepting **kwargs is that the code still runs whether we add or remove a key. I suppose we could raise an error on receiving an unknow key, but I don't really see the need.

In fact, sometimes it can be useful. For example, in #816 we are passing all kwargs to the Pygments lexer. However, the lexer for each language has a different set of keys (most have none at all). Because Pygments' lexers don't error on unknown keys, it works correctly. If they did raise errors, we would need to add complexity to our config to clearly define which options go with which lexers. I realize that is not calling markdown, but it is an example were not raising an error on unknown keys is useful.

timmc · 2022-03-28T15:47:06Z

Hmm, so with Pygments and such you're basically stuck with it being a mostly-blind passthrough.

To give an example of the safety concerns, there's a Python JWT library that accepts an "algorithms" keyword. If it isn't specified, tokens are allowed to declare what algorithm they're using, which is a severe vulnerability (both in the library, and the spec). But compounding this, the decode() function takes a kwargs and passes it to a shared utility without further validation. I've seen someone pass algorithm instead of algorithms and unwittingly open their application to attack.

Again, I'm not aware of a specific vulnerability like that in Python-Markdown, but I did want to give an example of the kind of trouble this pattern can bring.

In my case it was just about 15 minutes of confusion, but warnings on unknown keys would have cleared it up very quickly. So there's also an argument to be made from the developer experience side.

waylan added needs-decision A decision needs to be made regarding request. feature Feature request. core Related to the core parser code. labels Mar 28, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

kwargs are not checked for unexpected parameters #1238

kwargs are not checked for unexpected parameters #1238

timmc commented Mar 26, 2022

waylan commented Mar 28, 2022

timmc commented Mar 28, 2022

kwargs are not checked for unexpected parameters #1238

kwargs are not checked for unexpected parameters #1238

Comments

timmc commented Mar 26, 2022

waylan commented Mar 28, 2022

timmc commented Mar 28, 2022