Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Smartypants plugin seems to be drastically slower #33

Open
calculuschild opened this issue Jun 2, 2023 · 11 comments
Open

Smartypants plugin seems to be drastically slower #33

calculuschild opened this issue Jun 2, 2023 · 11 comments

Comments

@calculuschild
Copy link

calculuschild commented Jun 2, 2023

In a 5-page document, Markdown 4.3.0 with the smartypants option takes ~140 ms in my application to render the document.

In v5.0.4, switching to the marked-smartypants extension, parsing now takes ~307 ms. I.e., more than half of the parsing time is spent inside this plugin. This makes live-editing a document very laggy.

Any ideas to speed this extension up? Performance profiling in Chrome show it is spending a lot of time in the educateQuotes() function, but in general the bulk of the processing is getting hung up on the many str = str.replace( functions:

Of the ~167 ms slowdown I see:

function slowdown
educateQuotes() 106 ms
educateDashesOldschool() 10 ms
educateBackticks() 9 ms
educateEllipses() 11 ms
ProcessEscapes() 24 ms
other 7 ms

I realize this is just including the smartyPants package directly, so editing the code isn't really feasible, but it is so slow compared to the simpler Marked.js approach. With Marked.js 5 deprecating this feature natively, this extension is a bit overkill for the "high speed" goal of Marked.js.

Thoughts?

@calculuschild
Copy link
Author

Perhaps some kind of "lite" option that just does the old behavior?


function smartyPantsLite() {
  return {
    tokenizer : {
      inlineText(src) {
        // don't escape inlineText
        const cap = this.rules.inline.text.exec(src);

        /* istanbul ignore next */
        if(!cap) {
          // should never happen
          return;
        }

        const text = cap[0]
          // em-dashes
          .replace(/---/g, '\u2014')
          // en-dashes
          .replace(/--/g, '\u2013')
          // opening singles
          .replace(/(^|[-\u2014/(\[{"\s])'/g, '$1\u2018')
          // closing singles & apostrophes
          .replace(/'/g, '\u2019')
          // opening doubles
          .replace(/(^|[-\u2014/(\[{\u2018\s])"/g, '$1\u201c')
          // closing doubles
          .replace(/"/g, '\u201d')
          // ellipses
          .replace(/\.{3}/g, '\u2026');

        return {
          type : 'text',
          raw  : cap[0],
          text : text
        };
      }
    }
  };
};

@UziTech
Copy link
Member

UziTech commented Jun 3, 2023

That could be a separate extension. Something like marked-smartypants-lite. The biggest problem with that approach is it doesn't follow the smartypants spec and it fails to convert quotes correctly.

I think the best approach would be to make smartypants faster.

@sernaferna
Copy link

Any update on this issue? I was about to update Marked (requiring this new extension) when I found this issue, so now I'm holding off until I know I'm not going to lose performance.

@calculuschild
Copy link
Author

I ended up making a "lite" version of this extension that didn't have such slowdown. I'll see if I can publish it as a package shortly.

@calculuschild
Copy link
Author

calculuschild commented Jun 22, 2023

@sernaferna
Copy link

Honestly, I'm going to leave it alone for now and leave marked at version 4.3.0. I don't like having an out of date package like that, but the whole "ecosystem" is a mess. It's not just smartypants, xhtml support has also been pulled into its own extension, which isn't playing nicely in TypeScript. (I could create my own definition file and probably get it working but... what problem would I find next?)

When things have settled down and all of the extensions are working properly I'll look into this again.

@UziTech
Copy link
Member

UziTech commented Jun 28, 2023

@sernaferna sticking with marked v4 should be fine, but if you are going to wait until marked doesn't have any more problems you are going to be waiting a long time (probably forever).

Since marked is free open source software not backed by any company those problems only go away when someone who find a problem has enough ambition and free time to fix it. A better approach than waiting would be to contribute a fix for problems that you find. That would make the problems that you care about go away faster. (Like marked-xhtml working well with typescript)

@sernaferna
Copy link

If I had the expertise I'd be happy to, but unfortunately I was leveraging this library because I don't have the expertise. I may have to move off of marked altogether at some point (maybe to PageDown?), and it's in my backlog to look into it, it just hasn't bubbled up to the top yet...

@UziTech
Copy link
Member

UziTech commented Jun 28, 2023

No one has the expertise until they try 😁. The best thing about open source software is that you don't need expertise to try.

@webketje
Copy link
Contributor

webketje commented Jan 30, 2024

A cursory look at the source code reveals that the smartypants library:

  • could use some ESlint love (no-else-return, no-var a.o.)
  • does not have the concept of an "instance" like marked does. Every time it is called, it does config normalization with 2 to 30!! if checks (the string 'weibq' will be the slowest)
  • re-checks and re-normalizes its config for every token (that is your input string divided by the amount of HTML tag/ plain text interruptions)
  • omits else checks for no reason which would avoid useless reprocessing like here, here, and here
  • redefines expensive regexes that could be defined once in every utility function run

@UziTech
Copy link
Member

UziTech commented Jan 31, 2024

@webketje that would be a good list to post on the smartypants repo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants