Add an example of using Django's file storage API to open files #3997

Open
wants to merge 1 commit into
base: main
from

Conversation

lancegoyke

I have been working with pymupdf inside of a Django project and thought it might be useful to document how to use Django's File Storage API to write code using pymupdf that works with a local filesystem and a remote storage backend like S3.

I was able to figure out what to do without these docs based on what is already there, so feel free to close this if you don't think it's necessary.

Also, I have not tested that the docs will build yet. Wasn't sure how to do that and I'm out of time at the moment.

Contributor

github-actions bot commented Oct 28, 2024

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@lancegoyke
Author

I have read the CLA Document and I hereby sign the CLA

github-actions bot added a commit that referenced this pull request Oct 28, 2024
Collaborator

@jamie-lemon jamie-lemon left a comment

Can you add a bit more to this example to explain which Django libraries need to be installed (and how to install them, with pip I assume)? A basic setup for someone who doesn't know Django would be good. I would like to be able to test this but don't know Django :)

import pymupdf
from django.core.files.storage import default_storage

from .models import MyModel
Collaborator

Where does "MyModel" come from - is that part of Django?

Author

Yes, so MyModel is my made up representation of a database table.

@lancegoyke
Author

The setup is definitely non-trivial. It involves installing packages (django and django-storages[s3]), creating access keys for an object storage service (I used S3), and adjusting your Django project settings to tell it how to store files. It doesn't work right after installation.

I did create a barebones project if you'd like to test it out for yourself with your own API keys: https://github.com/lancegoyke/pymupdf-django

Perhaps this is more than we care to add in here and the user can fend for her or himself.

The default_storage variable is picking whatever configuration the user has already set. If they've gone through the trouble to configure AWS S3 storage for their project, they could be using it as their default storage.
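For readers unfamiliar with how that configuration is set, here is a minimal sketch of a settings fragment. It assumes Django 4.2 or later (which uses the `STORAGES` setting), a recent django-storages release installed with `pip install django "django-storages[s3]"`, and credentials supplied through boto3's usual environment variables. The bucket name and region defaults are placeholders, not values from this PR.

```python
# settings.py (fragment): a sketch, not a complete Django configuration
import os

STORAGES = {
    # default_storage resolves to whatever backend is configured here.
    "default": {
        "BACKEND": "storages.backends.s3.S3Storage",
        "OPTIONS": {
            # Placeholder values; replace with your own bucket and region.
            "bucket_name": os.environ.get("AWS_STORAGE_BUCKET_NAME", "my-example-bucket"),
            "region_name": os.environ.get("AWS_S3_REGION_NAME", "us-east-1"),
        },
    },
    # Static files stay on Django's standard backend in this sketch.
    "staticfiles": {
        "BACKEND": "django.contrib.staticfiles.storage.StaticFilesStorage",
    },
}
```

With no `STORAGES` override at all, Django falls back to `FileSystemStorage`, so code written against `default_storage` should behave the same way locally.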

Based on the last hour or two of work on the repo I linked above, I would recommend we not try to help the user set up object storage for their project.

Let me know if this is helpful at all or if there's something else I can try to clarify.

----------


Opening Django Files
Collaborator

This is actually better as "Opening Files from the Django Storage Area", as we are still opening PDFs etc. that might be stored there, right? (i.e. we are not opening "Django system files")

Author

Yes, you are totally correct. I'm embarrassed to see that you saw the original title!

The convention in Django is to call these "media files" or "user-uploaded media". How do you feel about "Opening Media Files in a Django Application"?

Django implements a `File Storage API <https://docs.djangoproject.com/en/5.1/ref/files/storage/>`_ to store files. The default is the `FileSystemStorage <https://docs.djangoproject.com/en/5.1/ref/files/storage/#the-filesystemstorage-class>`_, but the `django-storages <https://django-storages.readthedocs.io/en/latest/index.html>`_ library provides a number of other storage backends.

You can open the file, move the contents into memory, then pass the contents to |PyMuPDF| as a stream.
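The three steps in that sentence could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: the helper names `read_stored_bytes` and `open_stored_pdf` and the path `uploads/report.pdf` are hypothetical, and the stored file is assumed to be a PDF.

```python
def read_stored_bytes(storage, name):
    """Read a stored file fully into memory through the storage's open() method."""
    # Django's Storage.open() returns a file-like object usable as a context manager.
    with storage.open(name, "rb") as stored_file:
        return stored_file.read()


def open_stored_pdf(storage, name):
    """Open a stored PDF with PyMuPDF from an in-memory stream."""
    import pymupdf  # imported here so read_stored_bytes() works without PyMuPDF

    data = read_stored_bytes(storage, name)
    # With no filename to inspect, filetype tells PyMuPDF how to read the bytes.
    return pymupdf.open(stream=data, filetype="pdf")


# Hypothetical usage inside a configured Django project:
#   from django.core.files.storage import default_storage
#   doc = open_stored_pdf(default_storage, "uploads/report.pdf")
#   print(doc.page_count)
```

Because the helpers only rely on the storage object's `open()` method, the same code should work whether `default_storage` points at the local filesystem or at a remote backend such as S3. Note that the whole file is held in memory, which may matter for very large documents.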

Collaborator

How about including a standout note here? Something like:

.. note::

    This assumes some knowledge and familiarity with Django and that you have a Django project in place.
    

@jamie-lemon
Collaborator

I appreciate that setting up Django is non-trivial, so in this case I think we should just add some further notes about Django and the knowledge we assume around it. I know Python developers generally love Django, so I think having this kind of example is valuable!
