Converted PDF have broken pages in Acrobat Pro DC #111

galaxy001 · 2019-03-19T05:13:32Z

I removed the broken XMP item, then ran ../pdfsizeopt.single nnGm.pdf nnGmo2.pdf 2>nnGmo2.log to produce the result.

nnGmo2.pdf can be viewed in the macOS Preview app, but in Acrobat Pro DC 2018 some pages are broken, for example the pages labeled xx, 68, 69, 70 and 226-230.

The PDF structure view also breaks after page 21.

nnGmo2.log
nnGmo2.pdf


pts · 2019-03-19T12:48:01Z

Thank you for reporting this in detail! This may indicate multiple bugs in pdfsizeopt.

Unfortunately I don't have a license for Acrobat Pro DC 2018, so I can't reproduce the problem. The indicated pages of nnGmo2.pdf work for me in Google Chrome and Evince.

Do you have more detailed error messages from Acrobat Pro DC 2018?

You may want to diagnose this further. First, try pdfsizeopt.single --use-pngout=no, to make the image processing faster. Then try pdfsizeopt.single --use-pngout=no --do-optimize-fonts=no. Does this fix all the problems you were encountering?

FYI, the full list of useful flags to try is: pdfsizeopt.single --do-optimize-images=no --do-optimize-fonts=no --do-optimize-objs=no --do-optimize-streams=no --do-decompress-most-streams=yes --do-generate-xref-stream=no --do-generate-object-stream=no
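
The flag-by-flag diagnosis suggested above can be scripted. What follows is only a sketch: the function name print_bisect_commands and the out-N.pdf naming scheme are hypothetical, and the flags are exactly the ones listed above. It prints one command per step, each adding one more flag, so the first output that opens correctly in Acrobat hints at which step is responsible.

```shell
#!/bin/sh
# Sketch: print pdfsizeopt.single commands that add one more flag per run.
# The flags come from the comment above; the function name and the
# out-N.pdf naming are hypothetical.
print_bisect_commands() {
  input=$1
  flags=""
  n=0
  for flag in \
    --do-optimize-images=no \
    --do-optimize-fonts=no \
    --do-optimize-objs=no \
    --do-optimize-streams=no \
    --do-decompress-most-streams=yes \
    --do-generate-xref-stream=no \
    --do-generate-object-stream=no
  do
    n=$((n + 1))
    flags="$flags $flag"
    # Printed rather than executed, so the commands can be reviewed first.
    echo "pdfsizeopt.single$flags $input out-$n.pdf"
  done
}
```

For example, print_bisect_commands nnGm.pdf lists seven commands. The sketch does not quote the input name, so it assumes a file name without spaces.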

galaxy001 · 2019-03-20T04:56:19Z

Even the last one, pdfsizeopt.single --do-optimize-images=no --do-optimize-fonts=no --do-optimize-objs=no --do-optimize-streams=no --do-decompress-most-streams=yes --do-generate-xref-stream=no --do-generate-object-stream=no, does not work in Acrobat.

However, after converting the file to QDF with qpdf, it works.

Then, processing the QDF with the last command above breaks it again.

Would you use the free trial license to give it a try?
https://acrobat.adobe.com/us/en/free-trial-download.html

I don't know how to get detailed error messages from Acrobat Pro DC 2018 yet.

galaxy001 · 2019-03-20T08:02:06Z

I got a fix with 'qpdf':

qpdf --decode-level=none --normalize-content=y fo.pdf for.pdf
qpdf --decode-level=none for.pdf forr.pdf

$ ls -1s fo.pdf for.pdf forr.pdf
 9984 fo.pdf
18440 for.pdf
10248 forr.pdf
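
The two qpdf commands above can be wrapped so that the intermediate file does not need to be managed by hand. This is a minimal sketch, not a tested fix: the function name qpdf_two_pass and the temporary-file handling are hypothetical, and only the qpdf options already shown above are used.

```shell
#!/bin/sh
# Sketch of the two-pass workaround from the comment above.
# The function name and the temporary-file handling are hypothetical.
qpdf_two_pass() {
  in=$1
  out=$2
  tmp=$(mktemp "${TMPDIR:-/tmp}/qpdf-pass1.XXXXXX") || return 1
  # Pass 1: the first command above (content normalization enabled).
  qpdf --decode-level=none --normalize-content=y "$in" "$tmp" &&
  # Pass 2: the second command above (this shrank the file in the listing).
  qpdf --decode-level=none "$tmp" "$out"
  status=$?
  rm -f "$tmp"
  return $status
}
```

Usage would be qpdf_two_pass fo.pdf forr.pdf. Note that the second pass is skipped whenever the first pass exits non-zero, which includes the case where qpdf only reports warnings.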

pts · 2019-03-20T10:57:20Z

Can you please convert fo.pdf, for.pdf and forr.pdf with pdfsizeopt, upload all 3*2 files to this issue, and declare which work in Acrobat and which don't? It would be awesome to have such short example PDF files which don't work in Acrobat.

Having uploaded the files here you may also want to report the bug to Adobe, and wait for analysis and comments of the Adobe engineers. Currently (without a meaningful error message from Adobe Acrobat) it's not obvious whether pdfsizeopt or Adobe Acrobat has the bug.

I'm developing pdfsizeopt on Linux. In order to use Adobe Acrobat, I'd have to give my credit card details to Adobe (that's the smaller issue), and I'd have to either buy a Mac or install Windows to one of my existing computers (or into a VM). I'm willing to do this only if I'm compensated for the licenses and the work in advance.

galaxy001 · 2019-03-21T07:27:47Z

man ls

     -s      Display the number of file system blocks actually used by each file, in units of 512 bytes, where
             partial units are rounded up to the next integer value.  If the output is to a terminal, a total
             sum for all the file sizes is output on a line before the listing.  The environment variable
             BLOCKSIZE overrides the unit size of 512 bytes.
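
To put the earlier ls -s listing into bytes, the block counts can be multiplied by the unit size quoted above. The helper names below are hypothetical, and the sketch assumes the default 512-byte unit (BLOCKSIZE not set). Keep in mind that ls -s reports allocated blocks, so the results approximate the file sizes rather than match them exactly.

```shell
#!/bin/sh
# Convert an `ls -s` block count to bytes, assuming the default
# 512-byte unit quoted above (BLOCKSIZE not set). Names are hypothetical.
blocks_to_bytes() {
  echo $(( $1 * 512 ))
}

# Growth from $1 to $2 in whole percent (integer division, rounds down).
growth_percent() {
  echo $(( ($2 - $1) * 100 / $1 ))
}
```

With the numbers from the listing, blocks_to_bytes 9984 prints 5111808, blocks_to_bytes 18440 prints 9441280 and blocks_to_bytes 10248 prints 5246976. growth_percent 9984 10248 prints 2, so forr.pdf is roughly 2% larger than fo.pdf.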

I tried extracting some pages. With only a few pages everything works, so I used this file set:

pdfsizeopt --do-keep-font-optionals=yes --do-regenerate-all-fonts=no --do-double-check-type1c-output=yes --do-ignore-generation-numbers=no --do-optimize-objs=no --use-multivalent=yes ex.pdf exo.pdf
qpdf --decode-level=none --normalize-content=y exo.pdf exor.pdf
qpdf --decode-level=none exor.pdf exorr.pdf

ex.pdf
exo.pdf
exor.pdf
exorr.pdf

Acrobat shows that page 12 is missing in exo.pdf.

babinslava · 2019-10-18T19:45:18Z

I have the same issue. Thank you @galaxy001 for the help: qpdf --decode-level=none --normalize-content=y fixes the file and doesn't even increase the file size.

@pts The error message in Adobe Acrobat Pro is: Expected a dict object.

pts · 2019-12-12T17:48:52Z

Thank you for reporting this. I'd love to debug and fix this, but unfortunately I don't have a copy of Acrobat Pro DC. The error message "Expected a dict object." is already helpful, but it is not specific enough: it could take hours or days to debug by trial and error. Any contributions are welcome.

pts · 2023-02-23T02:47:34Z

Nevertheless, it's still worth investigating what difference qpdf --decode-level=none --normalize-content=y makes to the PDF file. Maybe pdfsizeopt itself could do it.
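
One possible way to start that investigation is to dump two files in qpdf's QDF form and compare them as text. This is a sketch under assumptions: the function name qdf_diff is hypothetical, and it relies on the qpdf options --qdf and --object-streams=disable plus a diff that supports -a. Because qpdf rewrites the file when producing QDF, the dump of a file that fails in Acrobat may already differ from what Acrobat reads, so the diff is a starting point rather than a diagnosis.

```shell
#!/bin/sh
# Sketch: dump two PDFs in qpdf's QDF form and diff them as text.
# The function name is hypothetical.
qdf_diff() {
  a=$1
  b=$2
  dir=$(mktemp -d "${TMPDIR:-/tmp}/qdf-diff.XXXXXX") || return 1
  qpdf --qdf --object-streams=disable "$a" "$dir/a.qdf"
  qpdf --qdf --object-streams=disable "$b" "$dir/b.qdf"
  # -a treats the files as text even if they contain binary stream data.
  diff -a -u "$dir/a.qdf" "$dir/b.qdf"
  status=$?
  rm -rf "$dir"
  # Returns diff's status: 0 if identical, 1 if the dumps differ.
  return $status
}
```

For the files in this thread that would be, for instance, qdf_diff exo.pdf exor.pdf.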

Ndolam · 2023-02-23T15:52:57Z

Hi Péter,

I keep a copy of the last Adobe Reader for Linux (9.5.5) around, and just for your information, that program also doesn't show page 12 (and when I scroll down that far acroread spits out an error message "There was a problem reading this document (14).").

Also, my version of pdftk can't process this exo.pdf file.

To any Linux users reading this: you need quite a number of 32-bit libraries installed on your system to use acroread, and I have been told there are some security bugs in this version. So use it at your own risk.

galaxy001 changed the title ~~Convert PDF have broken pages in Acrobat Pro DC~~ Converted PDF have broken pages in Acrobat Pro DC Mar 19, 2019

galaxy001 mentioned this issue Mar 20, 2019

Assertion error with pdfsizeopt #110

Closed

pts added the bug label Feb 23, 2023
