Errors in various example PDF files #226

Open
petervwyatt opened this issue May 26, 2023 · 3 comments

  • Most of the PDFs declare a PDF version that does not match their content. It's probably safer and far easier to just set the version to PDF 1.7 and not have to worry about it.

  • ProcSets were deprecated back in PDF 1.4 (early 2000s) and only apply to PDF 1.3 features, for PostScript compatibility reasons. They cannot be used with PDF 1.4 and later. Just remove them: no PDF processor requires them anymore, as no one has implemented PDF via PostScript procedure sets since the mid-2000s!

  • sample_form.pdf:

    • invalid PDF date string for key CreationDate (DocInfo): "20200816155332-04'00'"
    • invalid PDF date string for key ModDate (DocInfo): "20200816155332-04'00'"
    • Panose (StyleDict) in PDF 1.7 should be a 12-byte string but is zero bytes
  • unipdf-tables-division-layout.pdf:

    • Document Catalog minor version is earlier than PDF header version.
  • unipdf-templates-documentation.pdf:

    • Document Catalog minor version is earlier than PDF header version.
  • unipdf-tables-division-layout.pdf:

    • Has an undocumented key ColorComponents in Image XObjects
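
The issue does not say why the two date strings in sample_form.pdf were rejected. One plausible reading is that they lack the `D:` prefix that the ISO 32000 date syntax expects. The sketch below is a rough check written under that assumption; the regular expression and the helper name `is_valid_pdf_date` are illustrative, not taken from unipdf or the Arlington PDF Model, and they do not cover every corner of the specification.

```python
import re

# Approximation of the ISO 32000 date syntax (an assumption, not the normative grammar):
#   D:YYYY[MM[DD[HH[mm[SS]]]]] followed by an optional offset such as Z, +HH'mm or -HH'mm'
PDF_DATE = re.compile(
    r"^D:\d{4}"                        # mandatory prefix and year
    r"(?:(?:0[1-9]|1[0-2])"            # month 01-12
    r"(?:(?:0[1-9]|[12]\d|3[01])"      # day 01-31
    r"(?:(?:[01]\d|2[0-3])"            # hour 00-23
    r"(?:[0-5]\d"                      # minute 00-59
    r"(?:[0-5]\d)?)?)?)?)?"            # second 00-59
    r"(?:[+\-Z](?:(?:[01]\d|2[0-3])'(?:[0-5]\d'?)?)?)?$"  # optional UT offset
)


def is_valid_pdf_date(value: str) -> bool:
    """Return True if value looks like a well-formed PDF date string."""
    return PDF_DATE.match(value) is not None


reported = "20200816155332-04'00'"       # value quoted in the issue
candidate_fix = "D:" + reported          # hypothetical fix: add the prefix
print(is_valid_pdf_date(reported), is_valid_pdf_date(candidate_fix))
```

If the missing prefix is indeed the cause, writing the `D:` prefix when serializing CreationDate and ModDate would be enough to clear both reports.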

petervwyatt commented May 26, 2023

Further: in many example PDFs the border style dictionary is incorrect because it contains /Subtype /Border. It should always be /Type /Border. The affected files are:

pdf-custom-toc.pdf
unidoc-report-landscape.pdf
unidoc-report.pdf
unipdf-aviation-checklist.pdf
unipdf-draw-lines.pdf
unipdf-medical-bill.pdf
unipdf-rental-agreement.pdf
unipdf-security-report.pdf
unipdf-tables.pdf
unipdf-templates-documentation.pdf
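
To confirm a fix across the files listed above, a crude byte-level scan can flag the wrong key. This is only a sketch: the helper name `count_bad_border_styles` is hypothetical, and the scan only sees dictionaries stored uncompressed in the file body, so dictionaries inside compressed object streams would be missed. A data-model-based checker remains the more reliable option.

```python
import re

# Matches the incorrect "/Subtype /Border" entry, with or without whitespace between the names.
BAD_BORDER = re.compile(rb"/Subtype\s*/Border\b")


def count_bad_border_styles(pdf_bytes: bytes) -> int:
    """Count occurrences of /Subtype /Border in the raw bytes of a PDF file."""
    return len(BAD_BORDER.findall(pdf_bytes))


# Small in-memory fragments standing in for real border style dictionaries.
wrong = b"<< /Subtype /Border /W 1 /S /S >>"
right = b"<< /Type /Border /W 1 /S /S >>"
print(count_bad_border_styles(wrong), count_bad_border_styles(right))
```

For a real file, the bytes could be read with `open(path, "rb").read()` and passed to the same function.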


gunnsth commented May 26, 2023

Thanks @petervwyatt, we will check that out. We do have PDF/A support, which helps with conforming to certain sets of standards. But yeah, for sure it makes sense to produce output that is consistent with PDF versioning, even though most viewers probably ignore unused properties.
Were you using a specific tool to check this? Just wondering as it would make it easier to confirm when we address those items.

@petervwyatt

For PDF/A (all versions) and PDF/UA-1 (if you support it), please be sure to use the latest veraPDF for compliance checking. Later versions have improved checks.

I used my Arlington PDF Model to detect these and then double-checked manually with the latest PDF 2.0 ISO spec. Note that Arlington is a data model and my software is only PoC - other implementers are taking the data model and making more robust and friendly software to report non-conformances.
