PDF/A compliance #148

Open
Keks-Dose opened this issue Jan 4, 2021 · 11 comments

@Keks-Dose

I have been using pdfsizeopt for as long as it has existed. Great software: it shrinks my PDFs (produced by pdftex) by a factor of 10.

Times are changing. It seems that I'll have to provide pdfs according to standards like PDF/A.

As stated here, a »PDF/A-1b is a PDF with an OutputIntent and certain metadata.«

So this is my MWE:

    \documentclass{article}
    \usepackage[T1]{fontenc}

    \usepackage{libertine}

    \usepackage[pdfa]{hyperref}
    \usepackage{hyperxmp}

    \hypersetup{%
      pdftitle={Irgendeine Angabe},
      pdfauthor={Der Name},
      pdflang={de-DE},
      pdfapart=3, %set to 1 for PDF/A-1
      pdfaconformance=B
    }

    % Create an OutputIntent in order to correctly specify colours
    \immediate\pdfobj stream attr{/N 3} file{sRGB.icc}
    \pdfcatalog{%
      /OutputIntents [
      <<
      /Type /OutputIntent
      /S /GTS_PDFA1
      /DestOutputProfile \the\pdflastobj\space 0 R
      /OutputConditionIdentifier (sRGB)
      /Info (sRGB)
      >>
      ]
      /ViewerPreferences
      <</PrintScaling/None>>
    }

    \begin{document}

    And some text.

    \end{document}

According to veraPDF, this PDF complies with PDF/A-3B.

After running pdfsizeopt on it, however, it no longer does, even with this command:

    pdfsizeopt --do-unify-fonts=no --do-regenerate-all-fonts=no --do-optimize-images=no MWE-pdfsizeopt-pdfA.pdf MWE-pdfsizeopt-pdfA.pdf

veraPDF reports errors:

The stream keyword shall be followed either by a CARRIAGE RETURN (0Dh) and LINE FEED (0Ah) character sequence or by a single LINE FEED (0Ah) character. The endstream keyword shall be preceded by an EOL marker.

The object number and generation number shall be separated by a single white-space character. The generation number and obj keyword shall be separated by a single white-space character. The object number and endobj keyword shall each be preceded by an EOL marker. The obj and endobj keywords shall each be followed by an EOL marker.

The file trailer dictionary shall contain the ID keyword whose value shall be File Identifiers as defined in ISO 32000-1:2008, 14.4

Usually only the first error is important.
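The three rules quoted above concern byte-level file syntax only, so they can be spot-checked without a full validator. Below is a minimal sketch in Python under loud assumptions: `check_pdfa_syntax` is a hypothetical helper (not part of veraPDF or pdfsizeopt), it scans raw bytes with regular expressions instead of parsing the PDF, and it can therefore misfire on binary stream data. Files that use cross-reference streams (no `trailer` keyword) are not handled.

```python
import re

EOL = (b"\r", b"\n")


def check_pdfa_syntax(data: bytes) -> list:
    """Return human-readable findings; an empty list means nothing was flagged.

    Naive heuristic: matches keywords anywhere in the file, including inside
    binary stream content, so false positives are possible.
    """
    findings = []

    # Rule 1: "stream" must be followed by CR LF or a lone LF,
    # and "endstream" must be preceded by an EOL marker.
    for m in re.finditer(rb">>\s*stream", data):
        tail = data[m.end():m.end() + 2]
        if not (tail == b"\r\n" or tail[:1] == b"\n"):
            findings.append("offset %d: bad EOL after stream keyword" % m.start())
    for m in re.finditer(rb"endstream", data):
        if data[m.start() - 1:m.start()] not in EOL:
            findings.append("offset %d: endstream not preceded by EOL" % m.start())

    # Rule 2: "N G obj" uses single spaces and starts on its own line;
    # "obj" and "endobj" are each followed by an EOL marker.
    for m in re.finditer(rb"(\d+)(\s+)(\d+)(\s+)obj", data):
        before = data[m.start() - 1:m.start()]
        after = data[m.end():m.end() + 1]
        ok = (m.group(2) == b" " and m.group(4) == b" "
              and (m.start() == 0 or before in EOL)
              and after in EOL)
        if not ok:
            findings.append("offset %d: malformed object header" % m.start())
    for m in re.finditer(rb"endobj", data):
        before = data[m.start() - 1:m.start()]
        after = data[m.end():m.end() + 1]
        if before not in EOL or after not in EOL:
            findings.append("offset %d: endobj not delimited by EOLs" % m.start())

    # Rule 3: the (last) trailer dictionary must contain an /ID entry.
    idx = data.rfind(b"trailer")
    if idx != -1 and b"/ID" not in data[idx:]:
        findings.append("trailer dictionary has no /ID entry")

    return findings
```

Calling `check_pdfa_syntax(open("MWE-pdfsizeopt-pdfA.pdf", "rb").read())` before and after optimization would show which of the three rules a rewrite breaks. It is no substitute for veraPDF.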

If somebody has an idea how to deal with these errors, I'd be glad to hear it. Otherwise I'll have to provide larger PDFs, which is obviously not the end of the world.

This isn't a complaint; I doubt that it is even an issue with pdfsizeopt. But maybe there is an easy remedy, or somebody else stumbles upon this issue.

@zvezdochiot

@Keks-Dose said:

But maybe there is an easy remedy, or somebody else stumbles upon this issue.

See #119 .

@Keks-Dose
Author

@Keks-Dose said:

But maybe there is an easy remedy, or somebody else stumbles upon this issue.

See #119 .

Could you elaborate a bit, please? cpdf doesn't compress much, even with the option --squeeze. The resulting file is not about 10% of the original size, but more like 90%.

@zvezdochiot

zvezdochiot commented Jan 4, 2021

@Keks-Dose said:

Could you elaborate a bit, please?

You are using cpdf incorrectly, not for its intended purpose. Take a closer look at your own question.

@Keks-Dose
Author

@Keks-Dose said:

Could you elaborate a bit, please?

You are using cpdf incorrectly, not for its intended purpose. Take a closer look at your own question.

? No idea what you mean, sorry.

@zvezdochiot

@Keks-Dose said:

No idea what you mean.

cpdf is needed to fix the file after it has been processed with pdfsizeopt.

@Keks-Dose
Author

So you basically suggest:

  1. compile *.tex to in.pdf
  2. pdfsizeopt in.pdf
  3. cpdf -create-objstm -no-preserve-objstm in.pdf -o out.pdf

I tried all that, but in vain: the PDF file is not compliant with PDF/A.
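For anyone who wants to reproduce this, here are steps 2 and 3 written out as a small Python script with explicit intermediate file names. It is only a sketch: the tool flags are the ones quoted in this thread, while `build_commands`, `run_pipeline`, and the file names are placeholders of my own. It assumes `pdfsizeopt` and `cpdf` are on the PATH and does nothing to make the result pass veraPDF.

```python
import subprocess


def build_commands(src, optimized, final):
    """Return the two command lines as argument lists (no shell quoting needed)."""
    return [
        # Step 2: optimize, using the "input output" form of pdfsizeopt.
        ["pdfsizeopt", src, optimized],
        # Step 3: rewrite the optimized file with cpdf, flags as quoted above.
        ["cpdf", "-create-objstm", "-no-preserve-objstm", optimized, "-o", final],
    ]


def run_pipeline(src, optimized, final):
    """Run both steps in order; raise CalledProcessError if either tool fails."""
    for cmd in build_commands(src, optimized, final):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline("in.pdf", "opt.pdf", "out.pdf")
```

Keeping `in.pdf`, `opt.pdf`, and `out.pdf` as separate files makes it possible to validate each stage on its own and see which step breaks compliance.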

@zvezdochiot

@Keks-Dose
Author

https://github.com/qpdf/qpdf

Well, not cpdf, but qpdf now.

You haven't got a clue what you are talking about, have you? The qpdf manual offers thousands of commands, many of them really low-level. Sorry, your comments are not helpful at all.

@zvezdochiot

@Keks-Dose said:

You haven't got a clue what you are talking about, have you?

It's enough for me to understand the essence. You want to "fix" the processed PDF. Find the right tool.

@pts
Owner

pts commented Feb 21, 2023

@Keks-Dose: Thank you for reporting the issue and suggesting veraPDF to verify PDF/A compliance. It would be easy to add the command-line flag pdfsizeopt --write-pdfa=yes, which would fix the compliance issues above.

I'm keeping this issue open in case anyone volunteers to implement it.
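To give a potential volunteer a feel for what such a flag would have to change, here is a minimal, hedged sketch of a serializer that follows the three rules veraPDF complained about. It is not pdfsizeopt's code, and `--write-pdfa=yes` does not exist yet. `serialize_pdfa_friendly` is a hypothetical name, objects are assumed to be numbered 1..n with generation 0, and stream dictionaries are assumed to carry a correct `/Length` already. Deriving the `/ID` from an MD5 digest of the file body is one common choice, not a requirement.

```python
import hashlib


def serialize_pdfa_friendly(objects, root_num):
    """Serialize {obj_num: (head, stream_or_None)} into PDF bytes.

    head is the already-serialized object value; for stream objects it is
    the stream dictionary (including /Length) and stream is the raw data.
    """
    assert sorted(objects) == list(range(1, len(objects) + 1))

    # Header plus a binary comment line (bytes > 127) marking the file as binary.
    out = bytearray(b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n")
    offsets = {}
    for num in sorted(objects):
        head, stream = objects[num]
        offsets[num] = len(out)
        # Single spaces in "N G obj"; "obj" is followed by an EOL.
        out += b"%d 0 obj\n" % num
        out += head
        if stream is not None:
            # "stream" followed by a lone LF; "endstream" preceded by an EOL.
            out += b"\nstream\n" + stream + b"\nendstream"
        # "endobj" preceded and followed by an EOL.
        out += b"\nendobj\n"

    xref_pos = len(out)
    size = len(objects) + 1
    out += b"xref\n0 %d\n" % size
    out += b"0000000000 65535 f \n"  # each xref entry is exactly 20 bytes
    for num in range(1, size):
        out += b"%010d 00000 n \n" % offsets[num]

    # File identifier: two hex strings, here both derived from the body so far.
    digest = hashlib.md5(bytes(out)).hexdigest().upper().encode("ascii")
    out += b"trailer\n<</Size %d/Root %d 0 R/ID[<%s><%s>]>>\n" % (
        size, root_num, digest, digest)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)
```

The sketch only covers the syntactic framing (EOLs around `obj`, `stream`, `endstream`, `endobj`, and the trailer `/ID`). Everything else PDF/A requires, such as the OutputIntent and XMP metadata from the MWE, would have to be carried over from the input unchanged.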

@pts
Owner

pts commented Feb 21, 2023

@Keks-Dose: Could you please upload your input PDF to this issue?
