
License (in)compatibility #34

Open
rmast opened this issue Jan 1, 2022 · 4 comments

Comments

@rmast

rmast commented Jan 1, 2022

Hi, any progress on the license incompatibility with OCRmyPDF (MPL-2.0)?

Would gscan2pdf (GPLv3) be a better fit? I'll try to study the differences...

@MerlijnWajer
Collaborator

My idea was to make a binary/script that OCRmyPDF could call externally. So far there are no plans to switch from AGPLv3, but that also won't be necessary in this case.
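As a rough illustration of the "call it externally" idea, here is a minimal Python sketch of a caller that hands a PDF to a separate compressor process instead of importing its code. Everything named here is an assumption for illustration: `run_external_compressor`, the `tool_argv` command prefix, and the convention that the tool takes an input path and an output path as its last two arguments are hypothetical, and do not describe the actual command line of archive-pdf-tools or any plugin interface of OCRmyPDF.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_external_compressor(tool_argv, src, dst, timeout=600):
    """Run a separate compressor process on src, writing the result to dst.

    tool_argv is a hypothetical command prefix, e.g. ["some-compressor"];
    the input and output paths are appended as the last two arguments.
    """
    cmd = [*tool_argv, str(src), str(dst)]
    # The tool runs in its own process; the caller never imports its code.
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    if result.returncode != 0:
        raise RuntimeError(
            f"compressor failed ({result.returncode}): {result.stderr.strip()}"
        )
    dst = Path(dst)
    if not dst.exists():
        raise RuntimeError("compressor reported success but wrote no output")
    return dst


if __name__ == "__main__":
    # Stand-in "tool" for demonstration: a Python one-liner that copies
    # its first argument to its second. A real compressor would go here.
    stand_in = [
        sys.executable,
        "-c",
        "import shutil, sys; shutil.copyfile(sys.argv[1], sys.argv[2])",
    ]
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "in.pdf"
        src.write_bytes(b"%PDF-1.4 placeholder")
        out = run_external_compressor(stand_in, src, Path(tmp) / "out.pdf")
        print(out.name, out.stat().st_size)
```

The caller only depends on the tool's exit code and output file, so the two programs stay separate at the process boundary; the cost is process startup and file I/O on every call.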

I briefly took a look at replacing images in a PDF with MRC'd versions, but ran into some problems in PyMuPDF that I still have to file a bug for. Some other things came up this month; I'll try to resume that work in the next week.

@rmast
Author

rmast commented Jan 2, 2022 via email

@mara004
Contributor

mara004 commented Jan 4, 2022

> My idea was to make a binary/script that OCRmyPDF could call externally.

I find this license incompatibility fairly disappointing. Calling an external binary is far less elegant than a native interface, and will make the program less performant and functional than it technically could be. It also means a lot more development work that could theoretically be avoided.
OCRmyPDF and archive-pdf-tools are both libre / open-source tools, both written in Python. It is a pity that they can't be linked just because of licensing issues, but I understand it is hard to improve the situation, as archive-pdf-tools relies on PyMuPDF.

@MerlijnWajer
Collaborator

Well, there are a few things to consider:

  1. archive-pdf-tools is based on PyMuPDF (which is AGPLv3). I've invested quite some time in making the program really optimised and fast using PyMuPDF. Being fast was a big priority.
  2. AGPLv3 was the suggested license for the work I've been doing. I could ask to change it, but it won't make a difference because of (1).
  3. If you see archive-pdf-tools as a program to just compress the hell out of your PDF, it really shouldn't matter whether it is AGPLv3 or not. You can use OCRmyPDF (or some other tool, like archive-pdf-tools itself, although it is more low-level) to create PDFs the way you like. I have been thinking about adding OCRmyPDF-like features to archive-pdf-tools, but just plugging it in for compressing the PDF seems like a win-win for both projects.

Just the MRC interface could definitely be decoupled from PyMuPDF, and I can ask for permission to relicense that part (I don't believe it depends on libraries that require AGPLv3). So relicensing that part is not out of the question at all, I'd just rather get PDF recompression working well first, and then look at splitting out the MRC interface, making it reusable for OCRmyPDF or other projects.
