-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: enable reuse of FromData instance #6853
Conversation
aiohttp/formdata.py
Outdated
@@ -132,8 +131,8 @@ def _gen_form_urlencoded(self) -> payload.BytesPayload: | |||
|
|||
def _gen_form_data(self) -> multipart.MultipartWriter: | |||
"""Encode a list of fields using the multipart/form-data MIME format""" | |||
if self._is_processed: | |||
raise RuntimeError("Form data has been processed already") | |||
if not self._fields: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is safe to reprocess since it looks like get_payload
supports streaming payloads which may only be able to be consumed once
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, if we want to support this (and I've no idea if we do), then we'd probably need to subclass this, or have different logic depending on the type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an open issue about trying to resend form data on a redirect, which is probably something we want to look at sometime. So, still not sure about the implementation of this, but the test atleast should probably be updated to show the redirect issue, rather than reusing a form in 2 completely different requests (which I'm not sure we want to support).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is safe to reprocess since it looks like
get_payload
supports streaming payloads which may only be able to be consumed once
Just realized this review today Imao.
Some thoughts to this. To my understanding, FormData
is just a wrapper class for generating the MultipartWriter
object. In which _gen_from_data
is a helper method for generating Payload
objects, from our value passed to add_field
(when _is_multipart
). It is weird to me that MultipartWriter.write()
is actually reusable in current code, but this FormData
wrapper class is not. That's why I think _gen_from_data
should directly return the generated MultipartWriter
object when it is called for multiple times.
Yet, I do agree with your concern here about the safety issue, but I believe this safety issue you are worrying about actually is an already-existent issue in current code, based on the recallable MultipartWriter.write()
. Quick reproduction to what I have mentioned: mpwriter.append(io.BytesIO(b"...")); await session.put("https://bing.com/", data=mpwriter)
,which bing.com
302 redirects me to a local domain, mpwriter.write()
is then called again upon redirect and cause IO operation on closed file (that io.BytesIO
).
I agree that this is a critical issue and thus I will be working on it to solve together with this FormData reuse. However, this PR is old, many new changes have been made over time, and the objectives have changed if we would like to solve the issue above. I might be opening another PR today after work to try fix and discuss this.
Yet, thanks for your review, I will let you know when the new PR is ready.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an open issue about trying to resend form data on a redirect, which is probably something we want to look at sometime. So, still not sure about the implementation of this, but the test atleast should probably be updated to show the redirect issue, rather than reusing a form in 2 completely different requests (which I'm not sure we want to support).
I might be opening another PR here based on some new observations (see my comments above), will let you know when the new PR is ready.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Random thought: If we know the writer is not callable twice, should we consider enabling expect100 by default? That would increase the chance of a server giving us the redirect before we write the body, which would atleast solve the issue in some cases...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Random thought: If we know the writer is not callable twice, should we consider enabling expect100 by default? That would increase the chance of a server giving us the redirect before we write the body, which would atleast solve the issue in some cases...
emm... I'm not sure if enabling Expect 100 by default is a good approach, cuz for my understanding it would somehow increase latency (and some web servers may not have the mechanism to handle Expect)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Dreamsorcerer @bdraco new PR is here #9201; Will update the test coverage issue and the CHANGES file later, but can take a look first, thanks
I have been facing a situation that an API I was requesting would have some retryable error especially with form-data requests. While with reusing the FormData instance in my request, the RuntimeError was raised. In this case, I was confused that why the FormData instance would raise error upon re-gen.
With looking into the source code of aiohttp, I cannot discover any possible errors that may occurs with the reuse of FormData instance, especially the MultipartWriter seems to be designed with the ability to write() for more than once. As a result, I tried the change in this pr in my own code to test and proved that the resulting form-data to be send is correctly outputted.
To work more, with the fact that MultipartWriter seems to be designed for more than one time usage, the FormData may also accept further adding fields into the writer after already generated form data for a certain number of times. With this, it might be more flexible for we to deal with form-data request.
What do these changes do?
Are there changes in behavior for the user?
Users may reuse the FormData instance without the need of creating a new instance for every form-data requests.
Related issue number
n/a
Checklist
CONTRIBUTORS.txt
CHANGES
folder<issue_id>.<type>
for example (588.bugfix)issue_id
change it to the pr id after creating the pr.feature
: Signifying a new feature..bugfix
: Signifying a bug fix..doc
: Signifying a documentation improvement..removal
: Signifying a deprecation or removal of public API..misc
: A ticket has been closed, but it is not of interest to users.