Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PI: optimize _decode_png_prediction #2068

Merged
merged 3 commits into from
Aug 10, 2023
Merged

Conversation

exiledkingcc
Copy link
Contributor

@exiledkingcc exiledkingcc commented Aug 8, 2023

this commit makes tests/test_page.py::test_image_new_property cost from 8.98s to 5.69s on my machine, about x1.5 faster.

@codecov
Copy link

codecov bot commented Aug 8, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.01% 🎉

Comparison is base (f9d25e6) 94.23% compared to head (18ad99c) 94.24%.

❗ Current head 18ad99c differs from pull request most recent head 19429ba. Consider uploading reports for the commit 19429ba to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2068      +/-   ##
==========================================
+ Coverage   94.23%   94.24%   +0.01%     
==========================================
  Files          41       41              
  Lines        7340     7355      +15     
  Branches     1445     1448       +3     
==========================================
+ Hits         6917     6932      +15     
  Misses        263      263              
  Partials      160      160              
Files Changed Coverage Δ
pypdf/filters.py 94.53% <100.00%> (+0.15%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pubpub-zz
Copy link
Collaborator

unless I'm wrong, this filter is only for images: wouldn't it be more efficient to use pillow ? your opinion?

@exiledkingcc
Copy link
Contributor Author

i can not find any thing like this in Pillow. do you have any idea how to do it with pillow ?

@MartinThoma MartinThoma added the soon PRs that are almost ready to be merged, issues that get solved pretty soon label Aug 8, 2023
@MartinThoma
Copy link
Member

Very nice work 👍

On my machine it's:

OLD: 22s - 25s
NEW: 14s - 15s
=> 1.78x to 1.47x

The paeth_predictor can be deprecated, but I'll take care of that.

@MartinThoma
Copy link
Member

I'm updating the test code a bit ( #2072 ), but this improvement will almost certainly go into the release this weekend :-)

@MartinThoma
Copy link
Member

I would like to add a new benchmark for this as well (before I merge, so that we can see the difference).

The current benchmarks are not really helpful. Lets see if such a big change is visible.

@pubpub-zz
Copy link
Collaborator

i can not find any thing like this in Pillow. do you have any idea how to do it with pillow ?

I had a quick review and found no solution.😳

@MartinThoma MartinThoma merged commit f0781db into py-pdf:main Aug 10, 2023
12 checks passed
@MartinThoma
Copy link
Member

MartinThoma commented Aug 10, 2023

https://py-pdf.github.io/pypdf/dev/bench/ -> look at the very bottom :-)

image

This shows iterations per second, hence higher is better. Still, I'm a bit disappointed that the magnitude there is way lower than what @exiledkingcc and me observed.

@exiledkingcc
Copy link
Contributor Author

there is a huge mount of math calculations in the method, which is Python's weeknees. besides, it costs a lot of time for pillow to save png file.

MartinThoma added a commit that referenced this pull request Aug 13, 2023
## What's new

### Performance Improvements (PI)
-  optimize _decode_png_prediction (#2068)

### Bug Fixes (BUG)
-  Fix incorrect tm_matrix in call to visitor_text (#2060)
-  Writing German characters into form fields (#2047)
-  Prevent stall when accessing image in corrupted pdf (#2081)
-  append() fails when articles do not have /T (#2080)

### Robustness (ROB)
-  Cope with xref not followed by separator (#2083)

[Full Changelog](3.15.0...3.15.1)
@exiledkingcc exiledkingcc deleted the optimize branch August 17, 2023 10:18
@MartinThoma MartinThoma removed the soon PRs that are almost ready to be merged, issues that get solved pretty soon label Sep 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants