-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Email 198 performance in image html email #239
base: master
Are you sure you want to change the base?
Email 198 performance in image html email #239
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @seanfabs
Thank you for your PR. Please see my comments.
commons-email2-jakarta/src/test/resources/scripts/example-script.js
Outdated
Show resolved
Hide resolved
commons-email2-jakarta/src/test/java/org/apache/commons/mail2/jakarta/ImageHtmlEmailTest.java
Outdated
Show resolved
Hide resolved
Hello @seanfabs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@seanfabs
Please see my comment.
Thanks for the suggestion, changes made |
67a88f4
to
59a9c16
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #239 +/- ##
============================================
+ Coverage 74.34% 76.71% +2.36%
- Complexity 353 635 +282
============================================
Files 22 39 +17
Lines 1072 1915 +843
Branches 126 233 +107
============================================
+ Hits 797 1469 +672
- Misses 216 345 +129
- Partials 59 101 +42 ☔ View full report in Codecov by Sentry. |
There are several fixes and improvements to the regex's here:
\\s*[^>]*?\\s+
part, this is because the\s
,[^>]
, and\\s+
parts can all match whitespace and so when excessive whitespace is encountered it will attempt all combinations to find a match. This is a fairly well documented phenomenon https://blog.codinghorror.com/regex-performance/<script>
tag matching did not match multiple scripts.<imgx ...>
[^\"']+
which gives a slight performance boost - we will always want to grab the whole url.One question is should I also update the similar class in the javax package?
Benchmark results
VM version: JDK 17.0.10, OpenJDK 64-Bit Server VM, 17.0.10+8-b1207.12
Result for old regex
(<[Ii][Mm][Gg]\s*[^>]*?\s+[Ss][Rr][Cc]\s*=\s*["'])([^"']+?)(["'])
32.339 ±(99.9%) 4.394 ops/s [Average]
(min, avg, max) = (20.590, 32.339, 40.601), stdev = 5.866
CI (99.9%): [27.945, 36.733] (assumes normal distribution)
Result for new regex - candidate non greedy url
(<[Ii][Mm][Gg](?=\s)[^>]*?\s[Ss][Rr][Cc]\s*=\s*["'])([^"']+?)(["'])
1244.602 ±(99.9%) 225.819 ops/s [Average]
(min, avg, max) = (518.410, 1244.602, 1512.089), stdev = 301.462
CI (99.9%): [1018.783, 1470.420] (assumes normal distribution)
Result for new regex
(<[Ii][Mm][Gg](?=\s)[^>]*?\s[Ss][Rr][Cc]\s*=\s*["'])([^"']+)(["'])
1480.002 ±(99.9%) 160.408 ops/s [Average]
(min, avg, max) = (986.366, 1480.002, 1651.095), stdev = 214.140
CI (99.9%): [1319.594, 1640.409] (assumes normal distribution)