Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up processing junk before or after first or last boundary #178

Closed
wants to merge 2 commits into from

Conversation

defnull
Copy link
Contributor

@defnull defnull commented Oct 28, 2024

This should fix #162 by removing log lines that would trigger once per character of junk before or after the first or last boundary. The parser now skips over junk after the last boundary, and triggers an error if there is too much junk in front of the first boundary. You may want to cherry-pick only the first commit if you want to stay compliant to the specification, which requires junk to be accepted and ignored.

Junk before the first or after the last boundary is permitted according to the specification, but should not trigger one log message per character. This patch removes the log lines, and also skips over all remaining bytes after the last boundary if possible.
Fail if there are more than 16 superfluous new-lines in front of the first boundary, as this indicates a broken or malicious client. The spec technically allows it, but no browser or http client should ever do that.
@defnull
Copy link
Contributor Author

defnull commented Oct 28, 2024

Tests pass, but I did not write new ones so coverage dropped to 99%

@@ -1102,10 +1102,17 @@ def data_callback(name: CallbackName, end_i: int, remaining: bool = False) -> No
c = data[i]

if state == MultipartState.START:
# Stop parsing if there is no boundary within the first chunk
if i == 16:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 16?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I'm not happy about that either. Two line breaks should be allowed, that's not that uncommon. More are uncommon, but who knows? It's just an arbitrary number between 2 and "too much to waste time on it". Maybe ignore the second commit for now and just remove the log lines?

Copy link
Contributor Author

@defnull defnull Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is lying, I was first checking for i == length-1 but that broke tests. This needs more work I guess.

@defnull
Copy link
Contributor Author

defnull commented Nov 28, 2024

This is worked on in -> #189 now

@defnull defnull closed this Nov 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Question about the package
2 participants