Opening 24 MB PDF file with graphics takes very long and 100 % CPU usage #18307

Closed
paulmenzel opened this issue Jun 20, 2024 · 13 comments · Fixed by #18910

Comments

@paulmenzel

Attach (recommended) or Link to PDF file here:

425429.pdf from https://www.osce.org/files/f/documents/f/5/425429.pdf, 24 MB, MD5: 63f989b7058bb5bc2a48319ab3ca9139
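Before comparing timings, it may be worth confirming that a downloaded copy matches the file in the report. A minimal sketch using Python's standard `hashlib`; the expected digest is the one quoted above, while the helper names `md5_of_file` and `matches_report` and the local file path are assumptions for illustration.

```python
import hashlib

# MD5 digest quoted in the issue report for 425429.pdf.
EXPECTED_MD5 = "63f989b7058bb5bc2a48319ab3ca9139"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_report(path: str) -> bool:
    """True if the file at `path` has the digest given in the report."""
    return md5_of_file(path) == EXPECTED_MD5
```

Assuming the PDF was saved as `425429.pdf` in the current directory, `matches_report("425429.pdf")` would tell you whether you are testing the same bytes.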

Configuration:

Steps to reproduce the problem:

  1. Open the PDF
  2. Notice that rendering takes several minutes and one CPU thread stays at 100 % usage

What is the expected behavior? (add screenshot)

It should work

What went wrong? (add screenshot)

It didn’t render instantly.

PS: Evince 46.3 also has trouble rendering this.

@calixteman
Contributor

How do you expect a file containing 51 JPEG images of 9933x7016 pixels to render instantly?!
I have a powerful workstation with a lot of RAM, and it took 80 s in Chrome (vs. 180 s in Firefox); I gave up trying to render it with Acrobat.
I understand that from your point of view it's just a single basic page in the end, but internally this PDF is awful.
I'm not sure we have that much room for improvement here.
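
To get a feel for the scale involved, here is a rough estimate of how much memory the decoded images would occupy. The image count and dimensions come from the comment above; the 4 bytes per pixel (RGBA) figure is an assumption, and this sketch says nothing about how pdf.js actually allocates or caches image data.

```python
# Back-of-the-envelope size of the decoded images in the PDF.
IMAGE_COUNT = 51            # from the comment above
WIDTH, HEIGHT = 9933, 7016  # from the comment above
BYTES_PER_PIXEL = 4         # assumption: RGBA, 8 bits per channel


def decoded_bytes(width: int, height: int, bytes_per_pixel: int = BYTES_PER_PIXEL) -> int:
    """Uncompressed size in bytes of one image with the given dimensions."""
    return width * height * bytes_per_pixel


pixels_per_image = WIDTH * HEIGHT
per_image = decoded_bytes(WIDTH, HEIGHT)
total = per_image * IMAGE_COUNT

print(f"{pixels_per_image / 1e6:.1f} megapixels per image")
print(f"{per_image / 2**20:.0f} MiB per decoded image")
print(f"{total / 2**30:.1f} GiB if all {IMAGE_COUNT} images were held decoded at once")
```

Under these assumptions each image is close to 70 megapixels and roughly 266 MiB once decoded, so the full set would come to about 13 GiB, against a 24 MB file on disk.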

@paulmenzel
Author

Thank you for the analysis.

> How do you expect a file containing 51 JPEG images of 9933x7016 pixels to render instantly?!

;-)

> I have a powerful workstation with a lot of RAM, and it took 80 s in Chrome (vs. 180 s in Firefox); I gave up trying to render it with Acrobat.

Should this be reported to Firefox so that it is at least on par with Chrome?

> I understand that from your point of view it's just a single basic page in the end, but internally this PDF is awful.
> I'm not sure we have that much room for improvement here.

Understood. Feel free to close.

I wonder what machine the creator of the document used five years ago. The document properties say:

Adobe Photoshop for Windows -- Image Conversion Plug-in
Adobe Photoshop CC (Windows)
Wed 19 Dec 2018 14:28:24 +02:00
Thu 11 Jul 2019 10:29:41 +03:00

I guess I need to switch to Microsoft Windows.

@calixteman
Contributor

I'm on Windows 11 and it's slow.
Maybe it was fine in Photoshop and then they created the PDF without checking that the result is "correct" (you know, it's like those one-line patches you don't test because they're so obvious... and in the end you have a bug).
That said, if you really need to share such a PDF, you should flatten it by printing it to PDF!
I have some plans to use a built-in JPEG decoder, which should seriously improve the situation here, but as I said, 51 images at 9933x7016 is something...

@paulmenzel
Author

> I'm on Windows 11 and it's slow.

I guess you need Microsoft Windows 10 or 7. :P

Thank you for the hint with the printing to PDF.

> I have some plans to use a built-in JPEG decoder, which should seriously improve the situation here, but as I said, 51 images at 9933x7016 is something...

Any idea which decoder is used right now on GNU/Linux (GNOME Shell with X.Org here)?

As written, feel free to close.

@timvandermeij
Contributor

If it helps, we have compiled a list of optimization tips for PDF files at https://github.com/mozilla/pdf.js/wiki/Frequently-Asked-Questions#what-types-of-pdf-files-are-slow-in-pdfjs-can-i-optimize-a-pdf-file-to-make-pdfjs-faster that could make a significant difference if you're in control of the PDF file.

@MrSuddenJoy

@paulmenzel 256GB RAM, 4Gbps connection, downloading + rendering this file took 0.02s.

@paulmenzel
Author

@MrSuddenJoy, thank you for the feedback. I’d have assumed it’s limited by the CPU resources. Can you share your CPU model and environment and version too, please, so I can reproduce?

@MrSuddenJoy

> I’d have assumed it’s limited by the CPU resources.

@paulmenzel this is the most probable cause. But the question remains: why would someone limit CPU resources per process? The only answer that seems reasonable to me is that pdf.js is hosted on a limited-resource machine (shared hosting, a VPS, and the like)...

> your CPU model

Sure thing :)
[Screenshot 2024-06-28 at 12 10 38]

@paulmenzel
Author

> The only answer that seems reasonable to me is that pdf.js is hosted on a limited-resource machine (shared hosting, a VPS, and the like)...

pdf.js is part of the Firefox browser, so it runs on desktops with all kinds of configurations.

@MrSuddenJoy

@paulmenzel

  • I'm not using Firefox,
  • it is possible to run pdf.js as a standalone script...

@calixteman
Contributor

> @paulmenzel 256GB RAM, 4Gbps connection, downloading + rendering this file took 0.02s.

I don't really know how that's possible... I have a desktop machine (Windows 11) with 64 GB RAM and 32 cores (3.5 GHz).
In Firefox it takes 1 min 50 s to render, 40 s in Chrome and 12 s in Acrobat.
The main bottleneck in pdf.js is having to decode the images with a pure JS decoder.
@MrSuddenJoy I don't know what you're measuring here, and I don't know if the final rendering is correct, but I'm a bit doubtful about your 0.02s... maybe more realistically it's 0.02h...
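
For comparison, the timings quoted above can be put on a common scale. This is only arithmetic on the numbers in the comment; the helper name `to_seconds` is made up for illustration, and the figures are one person's measurements on one machine.

```python
def to_seconds(hours: float = 0, minutes: float = 0, seconds: float = 0) -> float:
    """Convert an hours/minutes/seconds duration to seconds."""
    return hours * 3600 + minutes * 60 + seconds


# Render times reported in the comment above.
render_times = {
    "Firefox (pdf.js)": to_seconds(minutes=1, seconds=50),
    "Chrome": to_seconds(seconds=40),
    "Acrobat": to_seconds(seconds=12),
}

fastest = min(render_times.values())
slowdown = {name: t / fastest for name, t in render_times.items()}

for name, t in render_times.items():
    print(f"{name}: {t:.0f} s, {slowdown[name]:.1f}x the fastest")

# The tongue-in-cheek "0.02h" converted to seconds.
suggested = to_seconds(hours=0.02)
print(f"0.02 h is about {suggested:.0f} s")
```

Read this way, 0.02 h comes to about 72 s, which sits between the Chrome and Firefox measurements, whereas 0.02 s would be several hundred times faster than the quickest viewer reported here.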

@MrSuddenJoy

@calixteman I'm using server hardware, meaning bare metal.

@peteygao

@MrSuddenJoy That's impossible due to physics. You only have a 4 Gbps downlink, not 10 Gbps. Downloading a 24 MB file will take 47 ms (do the math: 24 / (4096 / 8) = 0.046875). You couldn't have downloaded and rendered the file in 0.02 s (20 ms) because it takes more than double that time just to download the file!

Re-measure it, or else your results are null and void.
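
The download-time argument above can be checked with a few lines. This is a sketch of the lower bound only: it assumes the link runs at its full nominal rate and ignores latency and protocol overhead, so real transfers would be slower. The function name `transfer_seconds` is hypothetical.

```python
def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Minimum time to move `size_bytes` over a link of the given bit rate."""
    return size_bytes * 8 / link_bits_per_second


# Decimal units: 24 MB = 24e6 bytes, 4 Gbps = 4e9 bit/s.
decimal_estimate = transfer_seconds(24e6, 4e9)

# Binary units, as in the comment: 24 MiB over 4096 Mibit/s.
binary_estimate = transfer_seconds(24 * 2**20, 4096 * 2**20)

claimed_total = 0.02  # seconds, the claimed download plus render time

print(f"decimal units: {decimal_estimate * 1000:.1f} ms")
print(f"binary units:  {binary_estimate * 1000:.1f} ms")
print(f"claimed total: {claimed_total * 1000:.1f} ms")
```

Either convention gives roughly 47 to 48 ms for the download alone, more than twice the claimed total.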
