Parse PDFs with broken xref section #101
Comments
Hello, you already did the next best thing to fixing it yourself. It seems something is fishy with the xrefs, but I have not looked at the details yet.
Thanks for your message. Yes, I can fix xrefs manually. In debugging, I found a process that adds the This process is called in the following process, which causes a panic due to index shifting. I hope this is of some help to you.
Not sure how to handle the file. It is quite broken.
The xref section is broken.
Please excuse the delay in replying. I'm guessing that the viewer either ignores the xref section or does not handle it as strictly as it should, since these files open fine in the viewer. Just as importantly, I would like to at least avoid the panic.
The xref section is broken, so it can't be used to access data in the file efficiently. It does not panic anymore as of baa9235.
Thank you so much for fixing that! PDFs with broken xrefs like this are probably really rare cases, so I can't insist that you spend the effort to parse the entire file. I'm very happy to see that this will be implemented as a fallback or similar.
I tried this crate to parse my PDF collection. It cannot parse ~100 files out of 11866,
That is to be expected.
I know. In my comment I tried to point out that all sufficiently "mature" libraries have a fallback in case there are problems with the xref,
The simple solution is to build a new xref section from the file data.
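To make the rebuild idea concrete, here is a minimal sketch of such a fallback. It is not this crate's implementation: `rebuild_xref` and its helpers are hypothetical names, and the approach (a linear scan for `<id> <generation> obj` headers, recording the byte offset of each) is only an assumption about how a recovery pass could work.

```rust
use std::collections::BTreeMap;
use std::convert::TryFrom;

/// PDF whitespace characters (NUL, tab, LF, FF, CR, space).
fn is_ws(b: u8) -> bool {
    matches!(b, b' ' | b'\t' | b'\r' | b'\n' | 0x0C | 0x00)
}

/// Read a run of ASCII digits starting at `pos`; return the value and the end position.
fn read_number(data: &[u8], mut pos: usize) -> Option<(u64, usize)> {
    let start = pos;
    let mut value: u64 = 0;
    while pos < data.len() && data[pos].is_ascii_digit() {
        value = value
            .checked_mul(10)?
            .checked_add(u64::from(data[pos] - b'0'))?;
        pos += 1;
    }
    if pos == start { None } else { Some((value, pos)) }
}

/// Skip at least one whitespace byte; return the position after the run.
fn skip_ws(data: &[u8], mut pos: usize) -> Option<usize> {
    let start = pos;
    while pos < data.len() && is_ws(data[pos]) {
        pos += 1;
    }
    if pos == start { None } else { Some(pos) }
}

/// Try to parse `<id> <generation> obj` at `pos`.
/// Returns (id, generation, position just past "obj").
fn parse_header(data: &[u8], pos: usize) -> Option<(u32, u16, usize)> {
    let (id, pos) = read_number(data, pos)?;
    let pos = skip_ws(data, pos)?;
    let (generation, pos) = read_number(data, pos)?;
    let pos = skip_ws(data, pos)?;
    if !data[pos..].starts_with(b"obj") {
        return None;
    }
    let end = pos + 3;
    // Reject longer keywords such as "objx".
    if end < data.len() && data[end].is_ascii_alphanumeric() {
        return None;
    }
    Some((u32::try_from(id).ok()?, u16::try_from(generation).ok()?, end))
}

/// Hypothetical fallback: map each object id to (generation, byte offset)
/// by scanning the whole file. A later definition replaces an earlier one,
/// on the assumption that appended updates come later in the file.
fn rebuild_xref(data: &[u8]) -> BTreeMap<u32, (u16, usize)> {
    let mut table = BTreeMap::new();
    let mut pos = 0;
    while pos < data.len() {
        // A header may only start at the file start or right after whitespace.
        let at_boundary = pos == 0 || is_ws(data[pos - 1]);
        if at_boundary && data[pos].is_ascii_digit() {
            if let Some((id, generation, end)) = parse_header(data, pos) {
                table.insert(id, (generation, pos));
                pos = end;
                continue;
            }
        }
        pos += 1;
    }
    table
}
```

A caller would only fall back to this after the regular xref lookup fails, since the scan reads the whole file. It is also naive: bytes inside a stream that happen to look like an object header produce false entries, and objects stored in compressed object streams are not found at all.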
Hi,
I got a panic when loading some PDFs. They are displayed fine in the viewer. I have attached an example PDF that reproduces the problem. Is there anything I can do about this kind of PDF?
Thanks.
invalid.pdf