-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rust user observations #61
Comments
The big difference between There are some caveats though. This parser ( First of all it's hard / maybe impossible to have explicit "text content" nodes. I might give this another try though. I think it's still ok, as the information is present, but I see the impracticality. Another solution would be to provide methods to extract text in the binding library. The double tree structure is also caused by these restrictions. But please note that almost all markdown parsers use 2 passes (the spec even recommends to do so). Here as well it might make sense to abstract a bit more in the bindings and provide something like a There are some inefficiencies still and I'm trying to work on those. For the testing you did in (3.) you should make sure though that you use The error you mention in (4.) is definitely due to both libraries being loaded, as they both add C bindings with the same name and cargo can't deal with that. (That's ok though, as noone should probably use both libraries :D) Finally there is already support for tables. I just didn't get around to pushing these changes to crates.io. (Also tables are still not tested very well) |
Also, thanks! This is very valuable feedback |
* Implement `parse_with` for `MarkdownTree` * Add `MarkdownCursor` to traverse `MarkdownTree` more easily * Add tests for `MarkdownCursor` and pipe tables
I implemented a lot of your feedback. I added |
I've been testing both tree-sitter-md and tree-sitter-markdown over the past couple days and have the following observations. Note that each observation is standalone and they aren't listed in any particular order. Also note that my use case is general in that it is not specifically or only concerned with syntax highlighting but includes other uses such as general markdown parsing/processing similar to pulldown-cmark.
While tree-sitter-md (this crate) is more recently updated and maintained than tree-sitter-markdown, and depends on tree-sitter 0.20, its grammar seems more basic than tree-sitter-markdown's. In particular:
inline
kinds are present, and there are inline nodeinline
kinds, but there seems to be no kind for just the plain text (?), so this would require additional calculations to extract the plain text, or replace parts of the rendered content with the inline spans... (?)tree-sitter-md uses a 2-pass approach which forces walking those multiple trees. Perhaps the
MarkdownTree
struct could provide a method to walk all of the trees? tree-sitter-markdown doesn't work like this, and there aren't any drawbacks afaik, because whether a node is a block or inline can be determined by its kind. Here's an example of usage for this crate, but tree-sitter-markdown does not need the nested bit to walk the inline tree(s).In my initial testing, parsing 2 large but slightly different files both took roughly the same amount of time (50 ms) for each file, which should not be the case because it should only be processing the changes (right?). Using tree-sitter-markdown, it took 50 ms for the first one and 4 ms for the second one. Perhaps I'm doing something wrong?
Saw an error during
cargo test
:thread 'project::tests::doc_from_empty' panicked at 'Could not load injection query: QueryError { row: 0, column: 1, offset: 1, message: "inline", kind: NodeType }',
tree-sitter-md-0.1.1/bindings/rust/lib.rs:113:14
. I did not see this on a standalone attempt; it only cropped up when I combined both tree-sitter-md and tree-sitter-markdown into the same project, so would guess that is causing some specific conflict.The text was updated successfully, but these errors were encountered: