Fails with out-of-memory for a very, very large PDF file #125
Comments
Following the instructions in #119, cpdf also failed.
@LudeeD say> Pages: 308, file size: 1322590721 bytes. 1322590721 / 308 ≈ 4.3 MB per page. Hmm, that's big! You can change /FlateDecode (roughly PNG-style lossless compression) to /DCTDecode (JPEG) with Ghostscript.
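The suggested FlateDecode-to-DCTDecode conversion can be sketched with Ghostscript's pdfwrite device. The image-filter switches below are standard Ghostscript parameters; the filenames are placeholders, and DCTDecode is lossy, so the output quality should be checked:

```shell
# Re-encode embedded images as JPEG (lossy) instead of Flate (lossless).
# Filenames are hypothetical; run on a copy of the original file.
gs -o thesis-dct.pdf -sDEVICE=pdfwrite \
   -dAutoFilterColorImages=false -dColorImageFilter=/DCTDecode \
   -dAutoFilterGrayImages=false -dGrayImageFilter=/DCTDecode \
   thesis.pdf
```

Without the `-dAutoFilter...=false` switches, Ghostscript picks a filter per image on its own, so forcing them off is what makes the /DCTDecode choice stick.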
After running for 3 hours I gave up on this. Feel free to close this issue if handling > 1 GB files is not really a priority. Thanks for the help!
Can you share this file? It sure sounds interesting and I would like to
have a look at it.
Thanks,
Rogério Brito.
On Mon, Jul 1, 2019, 12:54, Luís Silva <[email protected]> wrote:
… I have a PDF file that is 1.3 GB in size (it's a master's thesis, which is why I am not attaching it here).
Okular can handle it pretty well, but it crashes Adobe.
While trying to use pdfsizeopt, it crashes too, with a memory error:
info: This is pdfsizeopt ZIP rUNKNOWN size=69734.
info: prepending to PATH: /home/ludee/Programs/pdfsizeopt/pdfsizeopt_libexec
info: loading PDF from: /home/ludee/Desktop/Dissertação_Ana_Antunes_201405897.pdf
info: loaded PDF of 1322590721 bytes
info: separated to 2269032 objs + xref + trailer
Traceback (most recent call last):
File "/proc/self/exe/runpy.py", line 162, in _run_module_as_main
File "/proc/self/exe/runpy.py", line 72, in _run_code
File "./pdfsizeopt.single/__main__.py", line 1, in <module>
File "./pdfsizeopt.single/m.py", line 6, in <module>
File "./pdfsizeopt.single/pdfsizeopt/main.py", line 5622, in main
File "./pdfsizeopt.single/pdfsizeopt/main.py", line 2664, in Load
File "./pdfsizeopt.single/pdfsizeopt/main.py", line 689, in __init__
File "./pdfsizeopt.single/pdfsizeopt/main.py", line 942, in Get
File "./pdfsizeopt.single/pdfsizeopt/main.py", line 1217, in ParseDict
File "./pdfsizeopt.single/pdfsizeopt/main.py", line 1148, in ParseSimpleValue
MemoryError
@rbrito say> It sure sounds interesting and I would like to have a look at it.
Use pdftk to process the file in parts.
pdfsizeopt indeed uses a lot of memory for large PDF files, because it keeps the parsed version of the entire PDF file in memory. It also keeps multiple versions of the compressed image data in memory for the image currently being optimized. Throwing more memory at it should make it work; unfortunately, there is no easy way to estimate the total memory required for a given input file. In the meantime, splitting the PDF file on a page boundary (with pdftk or qpdf), running pdfsizeopt on the split files individually, and joining the results may work for some PDFs. I'm keeping this issue open as a reminder to add memory optimizations.
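The split/optimize/join workaround can be sketched with qpdf. The `--split-pages` and `--pages` options are standard qpdf features; the filenames and the 100-page chunk size here are arbitrary choices:

```shell
# 1. Split into 100-page chunks; qpdf substitutes a group number for %d.
qpdf --split-pages=100 big.pdf part-%d.pdf

# 2. Optimize each chunk separately, so pdfsizeopt's peak memory use is
#    bounded by the largest chunk rather than the whole file.
for f in part-*.pdf; do
  pdfsizeopt "$f" "opt-$f"
done

# 3. Join the optimized chunks back into a single file.
qpdf --empty --pages opt-part-*.pdf -- big-optimized.pdf
```

As noted above, this only works for some PDFs: splitting can break cross-page links and destroys object sharing between chunks, so the joined result may be larger (or subtly different) than optimizing the file in one pass.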