Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tree-sitter grammars must maintain compatibility with specific Tree-sitter #684

Open
Xophmeister opened this issue Feb 26, 2024 · 6 comments

Comments

@Xophmeister
Copy link
Member

tree-sitter is a direct dependency of Topiary. Each Tree-sitter grammar (e.g., tree-sitter-json, tree-sitter-rust, etc.) has a transitive dependency on tree-sitter. The direct dependency and transitive dependencies must maintain API compatibility, otherwise Topiary won't build.

Currently, we are using Tree-sitter 0.20. v0.21 is now available and some grammars allow this, which causes Cargo to build both 0.20 and 0.21 for Topiary and subsequently fail. We can use the [patch] directive in our Cargo.toml to pin the transitive dependency (e.g., e91534b).

This might be more of a "Cargo problem", than a Topiary one, but it would be nice if there were some tooling around this. Perhaps, at least, a CI step that checks there's only one version of tree-sitter in the Cargo.lock...

(#4 will definitely be impacted by this.)

@yannham
Copy link
Member

yannham commented Feb 27, 2024

Would pinning the git dependencies on those grammars help, or even with a pinned dep, cargo would still somehow fetch a newer version of tree-sitter if available?

@Xophmeister
Copy link
Member Author

Xophmeister commented Feb 27, 2024

Would pinning the git dependencies on those grammars help, or even with a pinned dep, cargo would still somehow fetch a newer version of tree-sitter if available?

Pinning both the grammar and its transitive dependency on Tree-sitter appears to work in practice. It's a bit verbose, mind you:

[dependencies]
tree-sitter = "= X"
tree-sitter-foo = { git = "https://github.com/tree-sitter-foo/tree-sitter-foo.git", rev = "Y" }

[patch."https://github.com/tree-sitter-foo/tree-sitter-foo"]
tree-sitter = "= X"

Obviously, one has to pick a rev Y for which Tree-sitter X will work.

@yannham
Copy link
Member

yannham commented Feb 27, 2024

I see, you still need to patch part, which is a bit unfortunate.

@ZedThree
Copy link

Since tree-sitter 0.23 this is no longer the case. Grammars should depend on tree-sitter-language, with only a dev dependency on tree-sitter itself.

@ZedThree
Copy link

Looks like it's not as straightforward as bumping to the latest tree-sitter, 0.24.4:

modified   Cargo.toml
@@ -75,12 +75,12 @@ test-log = "0.2"
 tokio = "1.32"
 tokio-test = "0.4"
 toml = "0.8"
-tree-sitter = "0.22.6"
+tree-sitter = "0.24"
 # tree-sitter-bash = { git = "https://github.com/tree-sitter/tree-sitter-bash", rev = "1b0321ee85701d5036c334a6f04761cdc672e64c" }
 # tree-sitter-css = { git = "https://github.com/tree-sitter/tree-sitter-css.git", rev = "02b4ee757654b7d54fe35352fd8e53a8a4385d42" }
-tree-sitter-json = { git = "https://github.com/tree-sitter/tree-sitter-json.git", rev = "94f5c527b2965465956c2000ed6134dd24daf2a7" }
+tree-sitter-json = { git = "https://github.com/tree-sitter/tree-sitter-json.git", rev = "4d770d31f732d50d3ec373865822fbe659e47c75" }
 # tree-sitter-nickel = { git = "https://github.com/nickel-lang/tree-sitter-nickel", rev = "43433d8477b24cd13acaac20a66deda49b7e2547" }
-tree-sitter-ocaml = { git = "https://github.com/tree-sitter/tree-sitter-ocaml.git", rev = "036226e5edb410aec004cc7ac0f4b2014dd04a0e" }
+tree-sitter-ocaml = { git = "https://github.com/tree-sitter/tree-sitter-ocaml.git", rev = "98c2130c59ca7553b47086f91c5d22180151ad55" }
 # tree-sitter-ocamllex = { git = "https://github.com/314eter/tree-sitter-ocamllex.git", rev = "4b9898ccbf198602bb0dec9cd67cc1d2c0a4fad2" }
 # tree-sitter-query = { git = "https://github.com/nvim-treesitter/tree-sitter-query", rev = "a0ccc351e5e868ec1f8135e97aa3b53c663cf2df" }
 # tree-sitter-rust = { git = "https://github.com/tree-sitter/tree-sitter-rust.git", rev = "e0e8b6de6e4aa354749c794f5f36a906dcccda74" }
@@ -96,8 +96,3 @@ topiary-tree-sitter-facade = { version = "0.5.1", path = "./topiary-tree-sitter-
 topiary-core = { version = "0.5.1", path = "./topiary-core" }
 topiary-config = { version = "0.5.1", path = "./topiary-config" }
 topiary-queries = { version = "0.5.1", path = "./topiary-queries" }
-
-# tree-sitter-json's dependency on Tree-sitter is looser than ours, so
-# we have to pin its version to maintain API compatibility
-[patch."https://github.com/tree-sitter/tree-sitter-json"]
-tree-sitter = "0.22.6"

gives build failures in topiary-tree-sitter-facade parser.rs and query.rs.

The parser is an easy fix:

modified   topiary-tree-sitter-facade/src/parser.rs
@@ -4,7 +4,7 @@
 mod native {
     use crate::{
         error::{IncludedRangesError, LanguageError, ParserError},
-        language::Language,
+        language::{Language, LanguageRef},
         logger::{Logger, LoggerReturn},
         point::Point,
         range::Range,
@@ -31,7 +31,7 @@ mod native {
         }
 
         #[inline]
-        pub fn language(&self) -> Option<Language> {
+        pub fn language(&self) -> Option<LanguageRef> {
             self.inner.language().map(Into::into)
         }

seems to work, but tree_sitter::QueryMatches now implements StreamingIterator instead of Iterator, so this requires further changes

@ZedThree
Copy link

I've had a stab at updating Query to use tree-sitter 0.24, but I'm struggling with lifetime issues and I'm afraid my rust just isn't good enough.

The streaming iterator returns references to the items, which means QueryMatch needs to store a reference to tree_sitter::QueryMatch, so in Query::matches we would now have:

        pub fn matches<
            'query,
            'cursor: 'query,
            'tree: 'query,
            T: tree_sitter::TextProvider<I> + 'query,
            I: AsRef<[u8]> + 'query,
        >(
            &'query self,
            node: &Node<'tree>,
            source: T,
            cursor: &'cursor mut QueryCursor,
        ) -> impl StreamingIterator<Item = QueryMatch<'query>> + 'query + use<'query, 'tree, T, I> {
            cursor
                .inner
                .matches(&self.inner, node.inner, source)
                .map(Into::into)
        }

and I don't know how to convince the .map(Into::into) that its argument will live as long as self. This is as far as I can get:

error: lifetime may not live long enough
  --> topiary-tree-sitter-facade/src/query.rs:36:23
   |
22 |             'query,
   |             ------ lifetime `'query` defined here
...
36 |                 .map(|q: &'query tree_sitter::QueryMatch<'query, 'tree>| -> QueryMatch<'query> {Into::into(q)})
   |                       ^  - let's call the lifetime of this reference `'1`
   |                       |
   |                       requires that `'1` must outlive `'query`

My rust is not so great and I don't understand the error -- I thought that the &'query on the closure argument is saying precisely that its lifetime will be 'query

Any help would be appreciated!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants