I'm not entirely sure what the intent is here, so I hesitate to file a PR. We saw some errors thrown by our webapp (running under gunicorn) and traced them to request.encget():
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/webob/request.py", line 495, in url
    url = self.path_url
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/webob/request.py", line 467, in path_url
    bpath_info = bytes_(self.path_info, self.url_encoding)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/webob/descriptors.py", line 70, in fget
    return req.encget(key, encattr=encattr)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/webob/request.py", line 165, in encget
    return bytes_(val, 'latin-1').decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 66: invalid start byte
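My read of util.bytes_ is that, when passed a string, it performs val.encode() on it. So the following code in encget():
    return bytes_(val, 'latin-1').decode(encoding)
is the same as doing:
    return val.encode('latin-1').decode(encoding)
Based on our exception we can see that the value of encoding is "utf-8", which gives us:
    return val.encode('latin-1').decode('utf-8')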
I'm not sure why we'd ever be explicitly encoding a string as latin-1 and then decoding it as UTF-8 in the first place -- a simpler return val.encode(encoding) would seem more appropriate here -- but again, there's probably nuance that I'm not understanding, hence the issue report.
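For reference, a minimal sketch of the failure described above. It assumes that bytes_ simply calls val.encode(encoding) on string input; the bytes_ and encget_like functions below are stand-ins written for this illustration, not WebOb's actual code, and the sample path is made up.

```python
def bytes_(val, encoding="latin-1"):
    """Stand-in for WebOb's helper (assumed behavior): encode str, pass bytes through."""
    return val.encode(encoding) if isinstance(val, str) else val


def encget_like(val, encoding="utf-8"):
    # Mirrors the line shown in the traceback: latin-1 encode, then decode.
    return bytes_(val, "latin-1").decode(encoding)


# Hypothetical PATH_INFO carrying the raw byte 0xC0, presented the way a WSGI
# server would hand it over: bytes decoded as latin-1 into a str.
bad_path = b"/files/\xc0".decode("latin-1")

error = None
try:
    encget_like(bad_path)
except UnicodeDecodeError as exc:
    error = exc
    print(exc)
```

The byte 0xC0 can never start a valid UTF-8 sequence, so any path containing it raises this error regardless of what follows it.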
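This is because HTTP doesn't officially support Unicode in request paths. As explained in https://peps.python.org/pep-3333/#unicode-issues, the server hands HTTP paths/URIs to the application as strings that must be treated as latin-1. Encoding the value as latin-1 therefore recovers the original bytes, which can then be decoded with the request's encoding.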
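The round trip described in PEP 3333 can be sketched as follows. This is an illustration using only built-in str and bytes methods, with a made-up path; it is not taken from WebOb.

```python
# Raw bytes of a request path as sent by a client that uses UTF-8.
raw_path = "/caf\u00e9".encode("utf-8")       # b'/caf\xc3\xa9'

# A PEP 3333 server exposes those bytes as a str by decoding them as latin-1,
# so each byte becomes one code point in the range 0-255.
environ_path = raw_path.decode("latin-1")

# Encoding as latin-1 reverses that step and recovers the wire bytes exactly.
recovered = environ_path.encode("latin-1")

# Only now is it meaningful to decode with the encoding the app expects.
path = recovered.decode("utf-8")

# Encoding the environ string directly as UTF-8 re-encodes the latin-1
# code points instead, producing different (double-encoded) bytes.
direct = environ_path.encode("utf-8")

print(recovered == raw_path, direct == raw_path)
```

Under these assumptions, a plain val.encode(encoding) would return bytes rather than text, and those bytes would not match what the client sent. The decode step can still fail, as in the traceback above, when the client sends bytes that are not valid in the expected encoding.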