Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What's the best way to add extensions? #43

Open
yetanotherchris opened this issue Sep 15, 2015 · 18 comments
Open

What's the best way to add extensions? #43

yetanotherchris opened this issue Sep 15, 2015 · 18 comments
Labels

Comments

@yetanotherchris
Copy link

I want to add a few basic extensions using CommonMark.NET: image dimensions/alignment and tables.

Is there a way of plugging into the parser yet to do this? Some suggestions on the CommonMark forum suggest putting the image dimensions in the alt part of the image tag, which I can handle with a CustomHtmlFormatter but the tables looks a lot more involved.

Any suggestions would be welcomed!

@Knagis
Copy link
Owner

Knagis commented Sep 16, 2015

Adding inline rules is simple - they should be added to InlineMethods.InitializeParsers(). There is some advanced topics regarding the stack used in prioritizing overlapping inlines (such as emphasis and links) - the logic is described in the spec though there are slight differences now between the reference implementation and this project (though it started with identical approach).

Adding block rules is much more complex - they have to be added in BlockMethods.IncorporateLine(). The good news is that this method is kept almost identical (structure-wise) as the reference C implementation (although it is currently still missing an update for the new HTML block parsing rules).

@yetanotherchris
Copy link
Author

I'm a bit wary of changing your implementation of the spec, as you obviously have a lot more knowledge of the spec and C/C# parser.

Would it be possible to add hook points easily for the two method? And also extend BlockTag somehow, maybe add BlockTag.Extension or similar.

@yetanotherchris
Copy link
Author

For the table markdown, it may be best to have it parsed and turned into HTML before it reached CommonMark, so any columns with markdown inside them get styled correctly (if this is the right way to do it?). So in effect, I would just be adding extra inline parsing

@Knagis
Copy link
Owner

Knagis commented Sep 18, 2015

I have a good understanding of the inline parser but limited about the block parser - the code there is ported from the reference implementation and I have not really researched the logic.

I don't think that parsing tables before everything else is a good idea - if they are to work well with other structures such as lists such seperation might not be possible. You also would have to account for fenced code blocks etc.

@dmitry-shechtman
Copy link
Contributor

@yetanotherchris Did you make any progress on the tables? I'm looking to add them as well.

@Knagis
Copy link
Owner

Knagis commented Nov 19, 2015

@dmitry-shechtman - see https://github.com/kevin-montrose/CommonMark.NET/tree/ast-transforms-and-tables-squashed - it seems that there is a table parsing already implemented.

I haven't had the time to properly look into ir - perhaps @kevin-montrose can give some background on that implementation and if it is something that I should look for merging back into the main library.

@dmitry-shechtman
Copy link
Contributor

@Knagis Thanks for the reply. That looks quite promising. Some interesting AST Transforms stuff as well. Any plans on pulling it?

In the meantime, I implemented sub/sup, although I didn't test those extensively. Which brings me to another issue...

@kevin-montrose
Copy link

@dmitry-shechtman @Knagis

The table implementation is pretty adhoc, I try to match Github-style tables but wasn't able to find a proper "spec" for them. The basic approach is to do a pass over paragraphs and see if they can be turned into tables, basically waiting until the last possible moment to make the decision. The code works for our (currently, very limited - Q&A remains on pre-CommonMark Markdown) purposes at Stack Overflow, but I doubt it's performant enough to be merged upstream.

The AST Transforms are a little rough (things like "insert after" and "insert before" require lots of work, there's only helpers for "replace" and "remove" because that's all we needed), but might serve as an OK starting point (the biggest annoyance was figuring out adjustments). Biggest issue to merging is that it requires a full copy of the markdown be kept, which again hurts performance.

@dmitry-shechtman
Copy link
Contributor

@kevin-montrose Thanks for chiming in. I'm a SO user since beta, which makes this even more exciting!

You've done a very impressive job. I'm not worried about the performance, but I do have concerns, such as:

  1. My own changes. Although the outcomes so far are modest (sub/sup and case-sensitive reference labels), they required extensive modifications, such as propagating the settings through all inline methods down to NormalizeReference().
  2. The AST Transforms are not an actual requirement at this point. Are tables-squashed safe for merge?
  3. How do I go about the merge? My git(hub) knowledge is lacking. Can I create a pull request to myself?

Thanks again, and sorry about the noob question.

@kevin-montrose
Copy link

@dmitry-shechtman tables-squashed is lacking some fixes for bugs that were found after it and the AST branches were merged - I wouldn't merge it.

Easiest thing is to probably to pull the full branch, delete the AST stuff, then merge into your changes. If I recall correctly, the AST changes are all contained in the Transforms/ folder so just deleting it should do the trick.

Be aware, the table extensions add .EquivalentMarkdown to Block and Inline, .OriginalMarkdown to Block, and requires you parse with CommonMarkSettings.TrackSourcePosition = true. There's also a new CommonMarkAdditionalFeatures: GithubStyleTables. I think I also deleted some deprecated properties, hopefully you don't need them.

Quickest path to merge in git would be something like:

  1. git checkout -b tables
  2. git pull [email protected]:kevin-montrose/CommonMark.NET.git st-transforms-and-tables-squashed
  3. Delete what you don't need and commit
  4. git checkout master
  5. git merge tables

@dmitry-shechtman
Copy link
Contributor

@kevin-montrose Thank you so much, that worked (though I wish you hadn't put the settings in the middle of the block method args ;)

A couple more questions if I may:

  1. Aren't AST transforms an opt-in feature?
  2. Isn't TrackSourcePosition turned on automatically with GithubStyleTables?
  3. Should TableInBlockquote fail?

@dmitry-shechtman
Copy link
Contributor

In the meantime, an easy fix for .NET 2.0 compatibility (thanks to SO :)

AMDL@8f02e57

@dmitry-shechtman
Copy link
Contributor

(although it is currently still missing an update for the new HTML block parsing rules).

@Knagis is this still relevant?

@Knagis
Copy link
Owner

Knagis commented Dec 24, 2015

No, those spec updates were already implemented.

@dmitry-shechtman
Copy link
Contributor

No, those spec updates were already implemented.

This makes this issue dangerously close to being closed 👍

@yetanotherchris
Copy link
Author

@dmitry-shechtman Is the table support part of your PR? Or is it still pending? Looking at the SO implementation that goes beyond anything I would do, so I'm not likely to re-invent the wheel until that's stable and ready to use.

The only problem I have with that is I want to ditch Creole support for Roadkill, and move to Markdown/CommonMark but obviously can't until table support is there, unless I put some quick and dirty regex code in to support tables/img dimensions.

@dmitry-shechtman
Copy link
Contributor

@yetanotherchris The last time I checked the "SO implementation" wasn't ready (not stable enough for me, that is).

I'm not sure which PR you're referring to. I'm working on a major overhaul of CommonMark.NET that would allow to seamlessly add extensions, but pipe tables aren't part of it yet (although they started it all - note the name of the branch).

@ruffin--
Copy link

(Realizing I'm zombie threading a little...) I've had success with pre-processing when I added table support to MarkUpDown, a Windows 10 Markdown editor that I moved over to CommonMark.NET last year. The process is relatively painless, though tedious...

  1. Pass through your Markdown, removing & storing all of the things, like code blocks, that shouldn't be parsed.
    • So, for instance, if you had GitHub flavored table markup in a fenced code block, you don't want to turn that into a table. Take the code block out.
  2. Byte sniff any table markup.
  3. Remove and store the GitHub flavored table markup (or whatever flavor you're using).
  4. Place markers for the now-removed tables in the Markdown to reinsert processed tables later.
  5. Reinsert the "to ignore" text from step 1 into your table-free Markdown source.
  6. Process the Markdown sans tables (but with table markers!) through CommonMark.NET normally.
  7. Translate your table markup (from step 3) to html with your own logic.
  8. Process the contents of each cell of the table through CommonMark.NET individually (painful, but straightforward).
    • I think that's legitimate. I can't think of a reason what happens in another cell or elsewhere in the doc would affect how you process an individual cell.
    • Bullheaded and inefficient, but the result is fine.
  9. Replace the markers for tables (from step 4) in your now-processed Markdown (from step 6) with the html for your processed table(s) (from step 8).
  10. Profit?

Obviously it should be more efficient to integrate with a good parser, but this route "does no evil" and means any mistakes are your own, not CommonMark.NET's.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants