avoid reading in whole blob during NamedBlobFile validation (slows TUS uploads) #155

Open
wants to merge 7 commits into
base: master

djay
Member

@djay djay commented Nov 29, 2023

Following on from plone/plone.restapi#1690.

Currently the NamedBlobFile and NamedImageFile fields read the entire file contents during validation, which can be slow and memory-hungry for large files.
The validation itself doesn't do much: it only checks that the data field has a length of 0 or more.
This fix bypasses that check and instead only verifies that the object provides the right schema.
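
A minimal sketch of the idea, using hypothetical stand-in classes rather than the real plone.namedfile or zope.schema APIs. `LazyBlob`, `validate_by_reading` and `validate_by_type` are invented names for illustration only: the first validator touches `data` (and so loads the whole payload), the second only checks what kind of object it was given.

```python
class LazyBlob:
    """Hypothetical stand-in for a blob-backed file (not the plone.namedfile class)."""

    def __init__(self, payload):
        self._payload = payload
        self.reads = 0  # how many times the full contents were loaded

    @property
    def data(self):
        # Simulates loading the whole blob into memory.
        self.reads += 1
        return self._payload


def validate_by_reading(value):
    # Mimics a length check on `data`: its cost grows with the file size.
    if len(value.data) < 0:
        raise ValueError("invalid length")


def validate_by_type(value):
    # Only checks that the value is the expected kind of object.
    if not isinstance(value, LazyBlob):
        raise TypeError("wrong type")


blob = LazyBlob(b"x" * 1024)

validate_by_reading(blob)
reads_after_length_check = blob.reads  # the payload was loaded once

validate_by_type(blob)
reads_after_type_check = blob.reads  # unchanged: no further load
```

In this sketch the type-only check never touches `data`, which is the property the fix is after for large uploads.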

@mister-roboto

@djay thanks for creating this Pull Request and helping to improve Plone!

TL;DR: Finish pushing changes, pass all other checks, then paste a comment:

@jenkins-plone-org please run jobs

To ensure that these changes do not break other parts of Plone, the Plone test suite matrix needs to pass, but it takes 30-60 min. Other CI checks are usually much faster and the Plone Jenkins resources are limited, so when done pushing changes and all other checks pass either start all Jenkins PR jobs yourself, or simply add the comment above in this PR to start all the jobs automatically.

Happy hacking!

@djay
Member Author

djay commented Dec 13, 2023

@davisagli Any feedback on this solution?
plone/plone.restapi#1690 depends on a fix for this problem, or at least its test does.

@djay djay changed the title from "avoid reading in whole blob during NamedBlobFile validation" to "avoid reading in whole blob during NamedBlobFile validation (slows TUS uploads)" Dec 13, 2023
Member

@davisagli davisagli left a comment

@djay Sorry for the delay looking at this. I think you're on to something but have a suggestion to make sure we don't get rid of other parts of the validation.

plone/namedfile/field.py (outdated, resolved)
plone/namedfile/tests/test_storable.py (outdated, resolved)
@davisagli
Member

@djay Could you add a changelog entry please?

@davisagli
Member

@jenkins-plone-org please run jobs

@djay
Member Author

djay commented Jan 15, 2024

@davisagli It looks like restapi uses that schema to generate the JSON for the field.

The failing test is below (the image test fails the same way):

    def test_file_type(self):
        response = self.api_session.get("/@types/File")
        response = response.json()
        self.assertIn("fieldsets", response)
        self.assertIn("file.data", response["properties"]["file"]["properties"])  # noqa
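
A rough sketch of why narrowing the field's schema could make this assertion fail. It assumes the serializer builds its `properties` keys by walking the schema's fields, and that the narrower schema has no `data` field. `FULL_SCHEMA`, `NARROW_SCHEMA` and `field_properties` are hypothetical stand-ins, not plone.restapi code.

```python
# Hypothetical schemas, modelled as plain dicts of field name -> JSON type.
FULL_SCHEMA = {"data": "string", "contentType": "string", "filename": "string"}
NARROW_SCHEMA = {"contentType": "string", "filename": "string"}  # assumed: no `data`


def field_properties(field_name, schema):
    # Builds dotted property keys such as "file.data" from the schema's fields.
    return {f"{field_name}.{name}": {"type": kind} for name, kind in schema.items()}


full = field_properties("file", FULL_SCHEMA)
narrow = field_properties("file", NARROW_SCHEMA)

has_data_with_full_schema = "file.data" in full
has_data_with_narrow_schema = "file.data" in narrow
```

Under those assumptions, any code that reads the field's schema to describe the field sees the narrower set of properties, so the schema used for validation and the schema exposed on the field need to stay separate.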

@davisagli
Member

@djay We're getting into hack territory now but maybe you could do something like this:

    def _validate(self, value):
        self.schema = INamedTyped
        try:
            super()._validate(value)
        finally:
            self.schema = INamedBlobFile

But probably it would be cleaner to copy the code from Object._validate so you can pass in INamedTyped instead of self.schema
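
The two options can be compared side by side in a simplified sketch. Everything here is a hypothetical stand-in (`check`, `SwappingField`, `ExplicitField`, and the `FULL` / `NARROW` markers playing the roles of INamedBlobFile and INamedTyped); it does not reproduce the actual body of Object._validate.

```python
class SchemaNotProvided(Exception):
    """Hypothetical error raised when a value does not provide a schema."""


FULL = "full"      # plays the role of INamedBlobFile
NARROW = "narrow"  # plays the role of INamedTyped


class Value:
    # Hypothetical value that declares which schemas it provides.
    provides = {FULL, NARROW}


def check(schema, value):
    # Stand-in for the schema check; never touches the value's data.
    if schema not in getattr(value, "provides", ()):
        raise SchemaNotProvided(schema)


class SwappingField:
    """Option 1: temporarily swap self.schema around the check."""

    schema = FULL

    def _validate(self, value):
        original = self.schema
        self.schema = NARROW
        try:
            check(self.schema, value)
        finally:
            self.schema = original  # restored even if the check raises


class ExplicitField:
    """Option 2: pass the narrower schema explicitly; self.schema is never mutated."""

    schema = FULL

    def _validate(self, value):
        check(NARROW, value)
```

Both variants validate against the narrower schema. The second never changes state on the field object, so nothing else can observe the narrowed schema while validation is running; that is one plausible reading of why it is the cleaner choice, at the cost of duplicating the check.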

@djay
Member Author

djay commented Jan 16, 2024

@jenkins-plone-org please run jobs

@djay
Member Author

djay commented Jan 17, 2024

But probably it would be cleaner to copy the code from Object._validate so you can pass in INamedTyped instead of self.schema

@davisagli I did this. I didn't like copying all that code, but it seemed less hacky. It now passes everything.

3 participants