Slow performance on smaller documents #279

Open
GUI opened this issue Mar 12, 2024 · 1 comment

Comments

@GUI

GUI commented Mar 12, 2024

I noticed that Commonmarker (v1.0.4) was performing worse than Kramdown on my own documents, which I wasn't expecting based on the benchmarks. I was able to replicate your benchmark results on my own computer (where Commonmarker is faster than Kramdown with the default 11MB benchmark Markdown file), but I found a discrepancy when I started benchmarking smaller documents: there, Commonmarker turns out to be one of the slower options.

This slowdown for small documents does not occur with the v0.23.10 release of Commonmarker, so I think it's somehow tied to the switch to comrak. I'm not sure whether the issue is in comrak or in the Ruby bindings, but I thought I'd start here since I stumbled into this while comparing Ruby libraries. I'm wondering if there's some fixed overhead in initializing things or calling into comrak that the large benchmark file masks: with a big document, each benchmark run makes fewer calls to Commonmarker, and more of the time goes to the underlying Markdown parsing. I'm happy to move this conversation over to comrak if you believe the issue is on their end.
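
A minimal sketch of how the fixed-overhead hypothesis could be checked: time the same converter on inputs of increasing size and compare the cost per byte. This uses only the standard library. `time_per_call`, `CONVERTER`, and `RESULTS` are hypothetical names for illustration, and the stand-in lambda is an assumption so the sketch runs without gems; it would be replaced with the real call (for example `Commonmarker.to_html(text)` in the v1.x gem).

```ruby
# Average seconds per call for a converter on a given input.
def time_per_call(converter, input, iterations: 200)
  converter.call(input) # warm-up call, excluded from the timing
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { converter.call(input) }
  (Process.clock_gettime(Process::CLOCK_MONOTONIC) - start) / iterations
end

# Stand-in converter (an assumption, not the gem's API).
# Swap in e.g. ->(text) { Commonmarker.to_html(text) } to test the real gem.
CONVERTER = ->(text) { text.gsub(/\*(\w+)\*/, '<em>\1</em>') }

PARAGRAPH = "Lorem *ipsum* dolor sit amet, consectetur adipiscing elit.\n\n"

# Time inputs that grow by 10x each step.
RESULTS = [1, 10, 100, 1000].map do |copies|
  input = PARAGRAPH * copies
  seconds = time_per_call(CONVERTER, input)
  { bytes: input.bytesize, seconds: seconds, ns_per_byte: seconds * 1e9 / input.bytesize }
end

RESULTS.each do |r|
  puts format("%8d bytes  %12.3f us/call  %8.2f ns/byte",
              r[:bytes], r[:seconds] * 1e6, r[:ns_per_byte])
end
```

If the ns/byte column falls sharply as the input grows, a fixed per-call cost is likely dominating the small inputs; if it stays roughly flat, the cost is mostly in the parsing itself.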

Here's what I'm seeing on my M1 MacBook Pro (observed both in Linux Docker images and directly on the Mac, but all of these tests would be using aarch64 binaries if that makes a difference):

v1.0.4 release with default benchinput.md input (11MB file)

input size = 11064832 bytes

ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin22]
Warming up --------------------------------------
           redcarpet     2.000 i/100ms
commonmarker with to_html
                         1.000 i/100ms
            kramdown     1.000 i/100ms
Calculating -------------------------------------
           redcarpet     23.045 (± 4.3%) i/s -    116.000 in   5.051665s
commonmarker with to_html
                          5.682 (± 0.0%) i/s -     29.000 in   5.119065s
            kramdown      0.401 (± 0.0%) i/s -      3.000 in   7.493124s

Comparison:
           redcarpet:       23.0 i/s
commonmarker with to_html:        5.7 i/s - 4.06x  slower
            kramdown:        0.4 i/s - 57.44x  slower

This is roughly in line with the results from the readme, where Commonmarker is ~4x slower than Redcarpet and Kramdown is ~60x slower than Redcarpet.

v1.0.4 release with lorem1.md input (3.7KB file)

I used this lorem1.md sample file for a smaller benchmark file that is more representative of the files I'm passing through the library.

input size = 3789 bytes

ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin22]
Warming up --------------------------------------
           redcarpet     8.262k i/100ms
commonmarker with to_html
                        65.000 i/100ms
            kramdown   198.000 i/100ms
Calculating -------------------------------------
           redcarpet     85.995k (± 1.9%) i/s -    437.886k in   5.093855s
commonmarker with to_html
                        653.203 (± 1.4%) i/s -      3.315k in   5.075908s
            kramdown      1.946k (± 3.7%) i/s -      9.900k in   5.093986s

Comparison:
           redcarpet:    85995.3 i/s
            kramdown:     1946.4 i/s - 44.18x  slower
commonmarker with to_html:      653.2 i/s - 131.65x  slower

As you can see, with this smaller file Kramdown is now 44x slower than Redcarpet but Commonmarker is 131x slower than Redcarpet (and about 3x slower than Kramdown).

v0.23.10 release with default benchinput.md input (11MB file)

I then switched over to the libcmark-gfm based release of this gem to see how these same test files performed on the benchmark suite in that branch:

input size = 11064832 bytes

ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin22]
Warming up --------------------------------------
           redcarpet     2.000 i/100ms
commonmarker with to_html
                         1.000 i/100ms
commonmarker with to_xml
                         1.000 i/100ms
commonmarker with ruby HtmlRenderer
                         1.000 i/100ms
commonmarker with render_doc.to_html
                         1.000 i/100ms
            kramdown     1.000 i/100ms
Calculating -------------------------------------
           redcarpet     26.032 (± 3.8%) i/s -    130.000 in   5.015141s
commonmarker with to_html
                         17.601 (± 5.7%) i/s -     88.000 in   5.027352s
commonmarker with to_xml
                         17.368 (± 5.8%) i/s -     87.000 in   5.022493s
commonmarker with ruby HtmlRenderer
                          3.156 (± 0.0%) i/s -     16.000 in   5.142554s
commonmarker with render_doc.to_html
                         14.082 (±21.3%) i/s -     67.000 in   5.045525s
            kramdown      0.446 (± 0.0%) i/s -      3.000 in   6.727232s

Comparison:
           redcarpet:       26.0 i/s
commonmarker with to_html:       17.6 i/s - 1.48x  slower
commonmarker with to_xml:       17.4 i/s - 1.50x  slower
commonmarker with render_doc.to_html:       14.1 i/s - 1.85x  slower
commonmarker with ruby HtmlRenderer:        3.2 i/s - 8.25x  slower
            kramdown:        0.4 i/s - 58.37x  slower

With the large file, Commonmarker v0.23.10 is around 1.5x slower than Redcarpet. This also shows that Commonmarker v1.0.4 is about 3x slower than Commonmarker v0.23.10 for this type of large file (5.7 i/s versus 17.6 i/s), for whatever that's worth, though I realize you may have expected some performance differences with the switch in libraries.

v0.23.10 release with lorem1.md input (3.7KB file)

input size = 3789 bytes

ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin22]
Warming up --------------------------------------
           redcarpet    10.071k i/100ms
commonmarker with to_html
                         8.733k i/100ms
commonmarker with to_xml
                         9.069k i/100ms
commonmarker with ruby HtmlRenderer
                         2.574k i/100ms
commonmarker with render_doc.to_html
                         7.132k i/100ms
            kramdown   132.000 i/100ms
Calculating -------------------------------------
           redcarpet     94.489k (± 3.8%) i/s -    473.337k in   5.016736s
commonmarker with to_html
                         90.672k (± 1.6%) i/s -    454.116k in   5.009576s
commonmarker with to_xml
                         90.453k (± 1.8%) i/s -    453.450k in   5.014733s
commonmarker with ruby HtmlRenderer
                         25.674k (± 3.7%) i/s -    128.700k in   5.020167s
commonmarker with render_doc.to_html
                         72.578k (± 4.2%) i/s -    363.732k in   5.020900s
            kramdown      1.322k (± 3.8%) i/s -      6.732k in   5.098611s

Comparison:
           redcarpet:    94488.7 i/s
commonmarker with to_html:    90671.7 i/s - same-ish: difference falls within error
commonmarker with to_xml:    90453.4 i/s - same-ish: difference falls within error
commonmarker with render_doc.to_html:    72577.6 i/s - 1.30x  slower
commonmarker with ruby HtmlRenderer:    25673.7 i/s - 3.68x  slower
            kramdown:     1322.3 i/s - 71.46x  slower

Here you can see the bigger difference between the old and new releases of Commonmarker: with this small file, Commonmarker is nearly as fast as Redcarpet under v0.23.10, but roughly 140x slower under v1.0.4 (about 90.7k i/s versus 653 i/s for to_html).
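
As a rough cross-check of the overhead idea, the four to_html results above can be fed into a two-point linear model, `seconds_per_call = overhead + per_byte * bytes`. This is only a sketch: the model is an assumption, the two input files differ in content as well as size, and `Measurement`, `fit`, `RUNS`, and `ESTIMATES` are hypothetical names. Under those assumptions, the fixed cost works out to roughly 1.5 ms per call for v1.0.4, while for v0.23.10 it comes out near zero (slightly negative, which mainly shows how crude a two-point fit is).

```ruby
# One benchmark result: input size in bytes and iterations per second.
Measurement = Struct.new(:bytes, :ips) do
  def seconds_per_call
    1.0 / ips
  end
end

# Solve seconds_per_call = overhead + per_byte * bytes from two points.
def fit(small, large)
  per_byte = (large.seconds_per_call - small.seconds_per_call) / (large.bytes - small.bytes)
  overhead = small.seconds_per_call - per_byte * small.bytes
  { overhead_ms: overhead * 1e3, ns_per_byte: per_byte * 1e9 }
end

# Sizes and i/s values copied from the to_html results reported above.
RUNS = {
  "v1.0.4"   => [Measurement.new(3789, 653.2),    Measurement.new(11_064_832, 5.682)],
  "v0.23.10" => [Measurement.new(3789, 90_671.7), Measurement.new(11_064_832, 17.601)]
}

ESTIMATES = RUNS.transform_values { |(small, large)| fit(small, large) }

ESTIMATES.each do |version, est|
  puts format("%-9s fixed overhead ~ %.3f ms/call, marginal cost ~ %.2f ns/byte",
              version, est[:overhead_ms], est[:ns_per_byte])
end
```

A per-call cost on the order of a millisecond would explain why the 11MB file hides the problem: there it is under 1% of each call, whereas for a 3.7KB file it is nearly the whole call.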

Let me know if you have any questions, but thanks for all your work on this gem!

@gjtorikian
Owner

Thank you for these numbers and the test file. I had a similar experience while working on updating the benchmarks in 0224ec8.

To be honest, I have not really looked into the performance of this gem, but I am starting to come around to it being the next thing to look at. I don't know when I'll have the time to dedicate to measuring the performance. I can say that it's pretty likely that any potential optimizations can be made in comrak; Commonmarker is just a really dumb wrapper around that lib.
