Generate syntax highlighting files #4113
Replies: 2 comments 1 reply
-
Antlr lexer grammars don't provide the level of detail required to achieve this goal. All lexer patterns are equal, whereas code editors typically expect different sets for control keywords (if, else ...), built-in types (string, int ...), punctuation pairs (parentheses, brackets ...), comments, and so forth.
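To make the gap concrete, here is a minimal sketch of the side table a highlighter would need on top of a lexer grammar. The token names (`IF`, `LPAREN`, ...) and the category names are hypothetical, chosen only for illustration; they are not taken from any particular grammar or editor.

```python
# Hypothetical token-rule names, as a lexer grammar might declare them.
# The grammar treats them all alike: nothing marks IF as a keyword
# or INT as a built-in type.
TOKEN_NAMES = ["IF", "ELSE", "STRING", "INT", "LPAREN", "RPAREN",
               "LINE_COMMENT", "ID"]

# The extra information the grammar does not carry: a hand-written
# mapping from token name to highlighting category (made-up names).
HIGHLIGHT_CLASS = {
    "IF": "keyword.control",
    "ELSE": "keyword.control",
    "STRING": "type.builtin",
    "INT": "type.builtin",
    "LPAREN": "punctuation.pair",
    "RPAREN": "punctuation.pair",
    "LINE_COMMENT": "comment",
}


def highlight_class(token_name: str) -> str:
    """Return the category for a token, or 'none' if nobody classified it."""
    return HIGHLIGHT_CLASS.get(token_name, "none")


def unclassified(token_names):
    """List the tokens the side table says nothing about."""
    return [name for name in token_names if name not in HIGHLIGHT_CLASS]
```

Calling `unclassified(TOKEN_NAMES)` shows which tokens still need a decision by a human; nothing in the lexer grammar can make that decision automatically.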
-
A grammar by itself is insufficient for syntax highlighting! You need additional information that associates the parts of the parse tree--or the tokens--that you are interested in with some highlighting property, such as color. A parser generator takes only a grammar as input. The extra information for syntax highlighting has basically nothing to do with a grammar or a parser generator. So, the correct design would be to use a parser generator to create a tool that takes the parse tree plus additional syntax highlighting tuples as input, then implements syntax highlighting. For example, TextMate tmLanguage files describe a set of tuples of regular expressions and types. The main problem I have with TextMate is that it's really grotesque as a specification:
Xtext implements syntax highlighting in Eclipse by assuming an association with certain names in the grammar, like "Identifier". What a horrible design! You know what they say about "assume"--"ass" "u" "me".

At one point, I prototyped a Visual Studio Code extension and C# LSP server that would take a trgen-generated parser from an Antlr grammar, plus a list of (XPath expression, type) tuples over the parse tree, and perform "semantic" highlighting. I found it useful for "debugging" a grammar visually by observing the syntax highlighting of some input file. You could "see" how the parse worked via color. Since the extension is an LSP implementation (in C#, which compiles and runs anywhere), the LSP server could theoretically be reused in extensions for other text editors.

https://github.com/kaby76/uni-vscode

So, for the grammars-v4/java/java grammar, there would be additional information needed (ha ha, as a JSON file!!!):
As I vaguely recall, LSP has standard types like "variable", "method", "keyword", so I used those--which, again, is just horrible. LSP defines these types, but it should not! People need to stop assuming the basics of what a language is. It's too limited. The main problem with such an implementation is that Antlr4 is not an incremental parser (although one could rewrite the codegen templates for Antlr4 to implement that). Worse, many of the grammars in the grammars-v4 repo are just terrible because people don't know how to write a grammar, and a parser generated from a bad grammar is horribly slow.
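The "parse tree plus highlighting tuples" design described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the uni-vscode implementation: it swaps XPath for a much simpler match on the trailing rule names of each leaf's path, and the rule names, token names, and tree shape are hypothetical rather than taken from the grammars-v4 Java grammar.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    name: str                       # rule name, or token type for a leaf
    text: str = ""                  # source text (leaves only)
    children: List["Node"] = field(default_factory=list)


def leaves_with_paths(node: Node, path: Tuple[str, ...] = ()) -> Iterator:
    """Yield (path of names from the root, leaf) for every leaf in the tree."""
    path = path + (node.name,)
    if not node.children:
        yield path, node
    for child in node.children:
        yield from leaves_with_paths(child, path)


# The extra information, kept outside the grammar: (path suffix, type).
# A stand-in for the XPath tuples; all names here are hypothetical.
RULES = [
    (("methodDeclaration", "identifier", "IDENTIFIER"), "method"),
    (("variableDeclaratorId", "identifier", "IDENTIFIER"), "variable"),
    (("INT",), "keyword"),
]


def highlight(tree: Node, rules) -> List[Tuple[str, str]]:
    """Return (text, type) for each leaf matched by the first fitting rule."""
    result = []
    for path, leaf in leaves_with_paths(tree):
        for suffix, kind in rules:
            if path[-len(suffix):] == suffix:
                result.append((leaf.text, kind))
                break
    return result


# A hand-built tree standing in for what a generated parser would produce.
tree = Node("methodDeclaration", children=[
    Node("INT", "int"),
    Node("identifier", children=[Node("IDENTIFIER", "add")]),
    Node("block", children=[
        Node("variableDeclaratorId", children=[
            Node("identifier", children=[Node("IDENTIFIER", "x")]),
        ]),
    ]),
])
```

Here the same `IDENTIFIER` token is classified differently depending on where it sits in the tree, which is exactly the information a token-level scheme cannot express. The rule list lives outside the grammar, so it has to be maintained alongside it.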
-
After you create a parser, you have to write syntax highlighting files for the language your parser parses. Whenever you change the parser, you have to update these files manually.
It would be good if you could generate such syntax highlight files for: