Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extensions #68

Open
5 of 7 tasks
dmitry-shechtman opened this issue Dec 22, 2015 · 25 comments
Open
5 of 7 tasks

Extensions #68

dmitry-shechtman opened this issue Dec 22, 2015 · 25 comments
Milestone

Comments

@dmitry-shechtman
Copy link
Contributor

@Knagis,

So I've had some solid progress on enabling extensions. Now I'd appreciate your feedback on pre-existing features in this regard.

  1. Strikeout

    seems to me as an obvious candidate.

  2. Line breaks

    I believe this one should also be an extension.

  3. Unicode bullet

    I didn't see it mentioned anywhere in the spec. Extension?

  4. HtmlFormatter + OutputDelegate

    Are two different formatters really required? AFAIK one of these was added as a means to enable customizations. Since we now have a modular design, no modifications to the bulk formatters should be required.

  5. TrackSourcePosition

    The implementation would obviously remain deeply integrated, just the API would be a little cleaner, similarly to the reference label case extension.

  6. UriResolver

    Need I say more?

  7. Anything else?

Thanks

Edit: Added omissions and turned into checklist.

@Knagis
Copy link
Owner

Knagis commented Dec 22, 2015

Strikeout can be an extension, line breaks could be as well. The problem is though that currently it is very easy for the developer to discover these additional features that are supported by default (through the switch). It would be nice that the discovery could be made just as easy.

Unicode bullet is not in the spec (I haven't heard any good arguments against supporting it and I actually made a PR at one point that added it both to the spec and both reference implementations, the PR is still open). It could be an extension though it is a question of does it really have to be - maybe it is easier to just leave it as is.

HtmlFormatter was mainly created because people wanted to do things like <h1 class="header">foo</h1> with as little code as possible.

@dmitry-shechtman
Copy link
Contributor Author

All valid points. Let me address each separately (these could easily become own issues).

@dmitry-shechtman
Copy link
Contributor Author

Strikeout can be an extension, line breaks could be as well. The problem is though that currently it is very easy for the developer to discover these additional features that are supported by default (through the switch). It would be nice that the discovery could be made just as easy.

Sure, it's easy to discover Strikethrough when it's the only available additional feature ;)

Admittedly I didn't dedicate much thought to extension discoverability. All extensions currently reside in the Extension namespace, which may quickly become bloated as settings and features are added.

So, for example, there is a FancyLists extension with FancyListsSettings, which in turn has FancyListsFeatures, NumericListStyles and AdditiveListStyles.

One possible solution would be to use type nesting (FancyLists.Features etc.), although I'm personally leaning against it.

There's always the option of restricting customizability (Yes, I might have gotten carried away with the multitude of settings and features). That doesn't seem to work in the above case, however, as most scenarios won't require UpperLatin, ArabicIndic and Hebrew being always on.

The path of the least resistance (and my personal favorite) would be to leave things as are and refer users to documentation (once that's available).

@dmitry-shechtman
Copy link
Contributor Author

Unicode bullet is not in the spec (I haven't heard any good arguments against supporting it and I actually made a PR at one point that added it both to the spec and both reference implementations, the PR is still open).

I'm sorry to hear that your PR didn't get through.

There is indeed no reason why it shouldn't be added to the spec. However, I'm not sure having an always-on feature that isn't in the spec (yet?) is such a good idea.

FancyLists have 4 bullet list types (I used a different code point for disc), all off by default (although that in itself is subject to a separate discussion).

@dmitry-shechtman
Copy link
Contributor Author

HtmlFormatter was mainly created because people wanted to do things like <h1 class="header">foo</h1> with as little code as possible.

In the proposed design one would

  1. create a custom formatter that extends HeaderFormatter
  2. override WriteOpening()
  3. create a custom extension that initializes the formatter
  4. register the extension in settings.Extensions.

That might require a little more user code, but zero changes to the library code.

P.S. How much a performance gain did you get with the delegate array trick? I didn't apply to formatters yet, but I suspect that it would also be somewhat faster than a virtual method call for each element (especially inline).

@Knagis
Copy link
Owner

Knagis commented Dec 23, 2015

About the extension visibility - one idea I had - each extension would expose an extension method on the CommonMark namespace, so one could write something like:

using CommonMark;

var settings = CommonMarkSettings.Default.Clone();
settings.EnableFancyLists(some settings here as arguments);

This approach would enable other developers to write extensions that are external but for a developer those could be added the same way.

This would also enable you to separate the extension classes in separate namespaces for code clarity since the additional level would not add any extra work to the developer.

@Knagis
Copy link
Owner

Knagis commented Dec 23, 2015

P.S. How much a performance gain did you get with the delegate array trick? I didn't apply to formatters yet, but I suspect that it would also be somewhat faster than a virtual method call for each element (especially inline).

The original code was something like

if (c == '*' || c == '_') HandleEmphasis();
else if (c == '!') HandleExclamation();
etc...

The delegate array approach resulted in the same performance but with the added flexibility of changing the handle methods on the fly and without the need to hardcode settings checks in the parser method.

@Knagis
Copy link
Owner

Knagis commented Dec 23, 2015

However, I'm not sure having an always-on feature that isn't in the spec (yet?) is such a good idea.

It's not. I was always hoping that it would get to the spec sooner than later.

@Knagis
Copy link
Owner

Knagis commented Dec 23, 2015

HtmlFormatter

HtmlFormatter could go if the same can be achieved with different means. What I would like to keep is HtmlFormatterSlim since that produces the best performance.

What performance are you seeing with the extension model?

@dmitry-shechtman
Copy link
Contributor Author

Extension methods could be nice, but I see two problems with these:

  1. won't work in .NET 2.0
  2. would make the extension code less pretty.

I have found a workaround for 1, and 2 may be just my distorted point of view, but still.

@dmitry-shechtman
Copy link
Contributor Author

What I would like to keep is HtmlFormatterSlim since that produces the best performance.

I assumed as much. I'll then remove HtmlFormatter and rename HtmlFormatterSlim.

What performance are you seeing with the extension model?

I didn't measure yet, but I'm expecting that to be at least as good as that of the monolithic design (I applied the same trick to all parser methods, so instead of an if/else forest it's just a couple of delegate invocations - alright, more than a couple, but those should still perform much better than virtual calls.).

@dmitry-shechtman
Copy link
Contributor Author

The following exist in cmark:

  1. Safe (a.k.a. HTML SanitizationMuting)

    Am I correct in my presuming that this wasn't ported yet?

  2. Smart (a.k.a Smart Punctuation)

    I am aware of the potential performance hit this might incur. Yet it's something that many users see as basic Markdown functionality (Smarty Pants has been around almost since the beginning).

  3. Tree (a.k.a. XML AST)

    A low priority one. Provided that Printer uses exactly the same tag names (does it?), this should be pretty straightforward.

@Knagis
Copy link
Owner

Knagis commented Dec 24, 2015

  1. won't work in .NET 2.0
  2. would make the extension code less pretty.

I don't think 1. is an issue - since most people even if they are forced to target .NET 2.0 will be using newer compilers that do support extension methods.

Why would it make it less pretty? In my mind the extension method was simply a shortcut for enabling the shortcut - if the dev wants he can still instantiate the extension and add it manually to the settings instance.

A similar approach is used with Owin in ASP.NET - the various frameworks add extension methods for IAppBuilder interface, so you end up simply calling app.MapSignalR().

PositionTracking

Perhaps, if that is indeed possible. Note #21 - if the implementation is changed drastically, this should be kept in mind - even if the implementation is not updated but at least to not make it harder to implement that down the road.

CustomFormatter

?

Safe (a.k.a. HTML Sanitization)

The implementation in cmark is not really HTML sanitization but rather disabling of HTML and unsafe URI schemes. I would prefer to use a separate external sanitization library that would actually process the HTML instead of prohibiting it. The problem is that the last time I looked, I could not find anything in the .NET world (MS had made one but deprecated it).

Smart

Yes, this one should be implemented. I once almost started but only almost.

Tree

It could be added and should be pretty easy to. Though I think I still prefer the plain text output in case I have to debug something. But there was at least one who was actually using that XML output in cmark for further processing.

@dmitry-shechtman
Copy link
Contributor Author

Why would it make it less pretty?

I was actually thinking about built-in extensions. Those would either need to open a separate namespace declaration, e.g.

namespace CommonMark
{
    public static partial class CommonMarkSettingsExtensions
    {
        public static void RegisterMathDollars(this CommonMarkSettings settings)
        {
            settings.Extensions.Register(new Extension.MathDollars(settings));
        }
    }
}

or a single file with a CommonMarkSettingsExtensions class would need to reference all built-in extensions at once (which is slightly prettier since I already have a RegisterAll() method that does the same).

I don't think 1. is an issue - since most people even if they are forced to target .NET 2.0 will be using newer compilers that do support extension methods.

This is a runtime issue rather than a compiler one. Trying to compile the above for .NET 2.0 results in

error CS1110: Cannot define a new extension method because the compiler required type 'System.Runtime.CompilerServices.ExtensionAttribute' cannot be found. Are you missing a reference to System.Core.dll?

As already mentioned, this is magically solved with a reference to LinqBridge. However, this begs the question: How many people actually use the .NET 2.0 version? Perhaps 2016 is the year it finally gets to retire?

@dmitry-shechtman
Copy link
Contributor Author

By CustomFormatter I meant OutputFormat.CustomDelegate. Sorry about the misnomer.

@dmitry-shechtman
Copy link
Contributor Author

The implementation in cmark is not really HTML sanitization but rather disabling of HTML and unsafe URI schemes. I would prefer to use a separate external sanitization library that would actually process the HTML instead of prohibiting it.

Yes, I noticed that after-the-factly. I think it's a matter of cmark compatibility. Surely CommonMark.NET is CommonMark-compliant, but it might look bad if it lacks an option some would consider a critical one. Maybe it should just ape cmark until a proper solution is found.

@Knagis
Copy link
Owner

Knagis commented Dec 24, 2015

The only reason I see to keep the 2.0 version is if someone somewhere wants to use it in an environment like SQL Server 2005 that allows to run managed code in its own process but only supports .NET 2.0. From what I read, we could add the attribute to the code the same as Lazy and Func.

There could be other reasons why for example JSON.NET still supports .NET 2.0.

OutputFormat.CustomDelegate

That is the current option (though ugly) to use custom formatters - the idea is that you could modify CommonMarkSettings.Default and thus change how another library that wraps commonmark.net is generating the output. It can go away as soon as there is a better way of specifying the formatter.

HTML sanitization

Yes, since there is no better solution for now, the simple disable-html flag could be the answer.

@dmitry-shechtman
Copy link
Contributor Author

From what I read, we could add the attribute to the code the same as Lazy and Func.

I tried that, it didn't work (although I might have done something wrong).

There could be other reasons why for example JSON.NET still supports .NET 2.0.

Sentimental value? 😛

Unlike JSON.NET, CommonMark.NET has the luxury of being in pre-release, which means it has much greater freedom in this regard.

Although I managed to work around missing .NET features so far, it feels like coding with one hand tied behind one's back. I recall at least fourfive occurrences of adding/removing/adding/removing Func flavors.

@dmitry-shechtman
Copy link
Contributor Author

It can go away as soon as there is a better way of specifying the formatter.

Gone 😎

@dmitry-shechtman dmitry-shechtman added this to the 1.0 milestone Dec 24, 2015
@Bryan-Legend
Copy link

Any chance you could add image size extensions?

https://github.com/jgm/CommonMark/wiki/Deployed-Extensions#image-size

@mike-ward
Copy link

These are common requests I get in Markdown Edit for extensions:

  • Language specification in fenced code blocks for syntax highlighting

    ```C#
    
  • Generate header/image identifiiers for later styling

  • Checkboxes similar to GitHub

  • Tables

@mdmoura
Copy link

mdmoura commented Nov 18, 2016

Math formulas for Math Jax?

@drusellers
Copy link

TABLES

@asbjornu
Copy link

  • I'd love to have checkboxes!

If I was to implement such an extension myself, how should I go about doing it?

@Knagis
Copy link
Owner

Knagis commented Oct 8, 2018

Sorry for the late response. Overall my intention for a while now has been to update the readme to say that this package is deprecated and instead recommend to use https://github.com/lunet-io/markdig

The reason is that when I started work on porting cmark to .net I was under the impression that the standard will be constantly improved and things like tables will be added to it so that there would not be such a significant need for extensions. Unfortunately that did not turn out to be true, so the design of markdig that enables extensions is much more suitable for the current situation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants