From 8a3d0bed16cca4f827c2acce21ae22446b9cb80f Mon Sep 17 00:00:00 2001
From: Markus Rudolph
Date: Wed, 13 Dec 2023 10:59:09 +0100
Subject: [PATCH] Add a guide for multiple languages support (#191)

---
 hugo/content/guides/multiple-languages.md | 440 ++++++++++++++++++++++
 1 file changed, 440 insertions(+)
 create mode 100644 hugo/content/guides/multiple-languages.md

diff --git a/hugo/content/guides/multiple-languages.md b/hugo/content/guides/multiple-languages.md
new file mode 100644
index 00000000..e9b0ee0d
--- /dev/null
+++ b/hugo/content/guides/multiple-languages.md
@@ -0,0 +1,440 @@
+---
+title: "Multiple dependent languages"
+weight: 400
+---
+
+This guide is about integrating multiple dependent languages in one Langium project.
+
+One common situation where it makes sense to create dependent languages is when you only want to read concepts in one language and predefine them in another file (probably also a built-in one). Think of splitting SQL into a defining part (`CREATE TABLE table (...)`) and a reading part (`SELECT * FROM table`).
+
+> Notice that for `n` independent languages, you can simply create `n` independent Langium projects.
+
+If you want to see a working example, I recommend visiting the [`requirements` example](https://github.com/eclipse-langium/langium/tree/main/examples/requirements) of the main [Langium repository](https://github.com/eclipse-langium/langium).
+
+## Our plan
+
+The entire change touches several files. Let's summarize what needs to be done:
+
+1. the **grammar** (the `*.langium` file) needs to be split into three parts, one per language
+2. the **Langium configuration** (the `langium-config.json` file in the Langium project root) needs to split the language configuration into three parts
+3. the **module file** of your language (`XXX-module.ts`) needs to create the new language services as well
+4. last, but not least, you have to **clean up all dependent files**; here we can only give general hints
+5. if you have a **VSCode extension**
+    1. the `package.json` needs to be adapted
+    2. the extension entry point file (`src/extension/main.ts`) needs to be changed slightly
+
+
+## Our scenario
+
+To keep this guide easy, I will use the [`hello-world` project](/docs/getting-started/).
+
+Let's imagine that we have three languages:
+
+* the first language **defines** persons
+* the second language **greets** persons of the first language
+* the third language **configures** which person you are
+
+Just as a finger exercise, let's require that you cannot greet yourself.
+
+{{<mermaid>}}
+flowchart
+    Implementation -->|requires| Definition
+    Configuration -->|requires| Definition
+    Implementation -->|requires| Configuration
+{{</mermaid>}}
+
+## Let's start!
+
+### Grammar
+
+The most relevant change might be in the grammar. Here is the original grammar from the `hello-world` example, which is generated by Langium's Yeoman generator:
+
+```langium
+grammar MultipleLanguages
+
+entry Model:
+    (persons+=Person | greetings+=Greeting)*;
+
+Person:
+    'person' name=ID;
+
+Greeting:
+    'Hello' person=[Person:ID] '!';
+
+hidden terminal WS: /\s+/;
+terminal ID: /[_a-zA-Z][\w_]*/;
+terminal INT returns number: /[0-9]+/;
+terminal STRING: /"(\\.|[^"\\])*"|'(\\.|[^'\\])*'/;
+
+hidden terminal ML_COMMENT: /\/\*[\s\S]*?\*\//;
+hidden terminal SL_COMMENT: /\/\/[^\n\r]*/;
+```
+
+Now, split it into three new files (let's call the entry rules "units" and name the files `multiple-languages-(configuration|definition|implementation).langium`):
+
+Our definition grammar:
+```langium
+grammar MultiDefinition
+
+entry DefinitionUnit:
+    (persons+=Person)*;
+
+Person:
+    'person' name=ID;
+
+hidden terminal WS: /\s+/;
+terminal ID: /[_a-zA-Z][\w_]*/;
+
+hidden terminal ML_COMMENT: /\/\*[\s\S]*?\*\//;
+hidden terminal SL_COMMENT: /\/\/[^\n\r]*/;
+```
+
+Our configuration grammar (note the import):
+```langium
+grammar MultiConfiguration
+
+import "multiple-languages-definition";
+
+entry ConfigurationUnit: 'I' 'am' who=[Person:ID] '.';
+```
+
+Our implementation grammar (note the import again):
+```langium
+grammar MultiImplementation
+
+import "multiple-languages-definition";
+
+entry ImplementationUnit:
+    (greetings+=Greeting)*;
+
+Greeting:
+    'Hello' person=[Person:ID] '!';
+```
+
+### Langium configuration
+
+Splitting the grammar alone is not sufficient to generate anything using the CLI. You need to change the `langium-config.json` in the root folder as well. Let's make it happen!
+
+The initial version of this file was:
+
+```js
+{
+    "projectName": "MultipleLanguages",
+    "languages": [{
+        "id": "multiple-languages",
+        "grammar": "src/language/multiple-languages.langium",
+        "fileExtensions": [".hello"],
+        "textMate": {
+            "out": "syntaxes/multiple-languages.tmLanguage.json"
+        },
+        "monarch": {
+            "out": "syntaxes/multiple-languages.monarch.ts"
+        }
+    }],
+    "out": "src/language/generated"
+}
+```
+
+The actual change is simple: triple the object in the `languages` list and fill in reasonable values, like here:
+
+```js
+{
+    "projectName": "MultipleLanguages",
+    "languages": [{
+        "id": "multiple-languages-configuration",
+        "grammar": "src/language/multiple-languages-configuration.langium",
+        "fileExtensions": [".me"],
+        "textMate": {
+            "out": "syntaxes/multiple-languages-configuration.tmLanguage.json"
+        },
+        "monarch": {
+            "out": "syntaxes/multiple-languages-configuration.monarch.ts"
+        }
+    }, {
+        "id": "multiple-languages-definition",
+        "grammar": "src/language/multiple-languages-definition.langium",
+        "fileExtensions": [".who"],
+        "textMate": {
+            "out": "syntaxes/multiple-languages-definition.tmLanguage.json"
+        },
+        "monarch": {
+            "out": "syntaxes/multiple-languages-definition.monarch.ts"
+        }
+    }, {
+        "id": "multiple-languages-implementation",
+        "grammar": "src/language/multiple-languages-implementation.langium",
+        "fileExtensions": [".hello"],
+        "textMate": {
+            "out": "syntaxes/multiple-languages-implementation.tmLanguage.json"
+        },
+        "monarch": {
+            "out": "syntaxes/multiple-languages-implementation.monarch.ts"
+        }
+    }],
+    "out": "src/language/generated"
+}
+```
+
+From now on you are able to run the Langium CLI using the npm scripts (`npm run langium:generate`). It will generate one file for the abstract syntax tree (AST) containing the concepts of all languages (so it is also a good idea to keep the names of these concepts disjoint).
+
+For the next step you need to run the Langium generator once:
+
+```sh
+npm run langium:generate
+```
+
+### Language module file
+
+The module file describes how your language services are built.
+After adding two more languages, some important classes get generated, and they need to be registered properly.
+
+1. Open the module file (`/src/language/multiple-languages-module.ts`).
+2. You will notice a broken import (which is expected: we renamed the grammars in the previous steps and derived new classes by code generation).
+3. Import the new generated modules instead.
+    Replace this import:
+    ```ts
+    import {
+        MultipleLanguagesGeneratedModule,
+        MultipleLanguagesGeneratedSharedModule
+    } from './generated/module.js';
+    ```
+    with the following:
+    ```ts
+    import {
+        MultiConfigurationGeneratedModule,
+        MultiDefinitionGeneratedModule,
+        MultiImplementationGeneratedModule,
+        MultipleLanguagesGeneratedSharedModule
+    } from './generated/module.js';
+    ```
+4. In the function `createMultipleLanguagesServices` you will now notice an error, because we removed the old module name in the previous step. The code there basically needs to be tripled. But before we do this, we need to define the new return type of `createMultipleLanguagesServices`. In the end this should lead to this definition:
+    ```ts
+    export function createMultipleLanguagesServices(context: DefaultSharedModuleContext): {
+        shared: LangiumSharedServices,
+        Configuration: MultipleLanguagesServices,
+        Definition: MultipleLanguagesServices,
+        Implementation: MultipleLanguagesServices
+    } {
+        const shared = inject(
+            createDefaultSharedModule(context),
+            MultipleLanguagesGeneratedSharedModule
+        );
+        const Configuration = inject(
+            createDefaultModule({ shared }),
+            MultiConfigurationGeneratedModule,
+            MultipleLanguagesModule
+        );
+        const Definition = inject(
+            createDefaultModule({ shared }),
+            MultiDefinitionGeneratedModule,
+            MultipleLanguagesModule
+        );
+        const Implementation = inject(
+            createDefaultModule({ shared }),
+            MultiImplementationGeneratedModule,
+            MultipleLanguagesModule
+        );
+        shared.ServiceRegistry.register(Configuration);
+        shared.ServiceRegistry.register(Definition);
+        shared.ServiceRegistry.register(Implementation);
+        registerValidationChecks(Configuration);
+        registerValidationChecks(Definition);
+        registerValidationChecks(Implementation);
+        return { shared, Configuration, Definition, Implementation };
+    }
+    ```
+
+After this step, Langium is set up correctly. But if you try to build now, the compiler will report some errors, because the old AST concepts no longer exist.
+
+> Be aware that we are using the same `MultipleLanguagesModule` in all three services, and these are three independent services! If you want to avoid this (because of duplicated state etc.), you should put some work into creating instances for each service.
+
+### Cleanup
+
+Let's clean up the error lines. Here are some general hints:
+
+* keep in mind that you are dealing with three file types now, namely `*.me`, `*.who` and `*.hello`
+  * you can distinguish them very easily by selecting the right sub service from the result object of `createMultipleLanguagesServices`, which is either `Configuration`, `Definition` or `Implementation`, but not `shared`
+  * all these services have a sub service with file extensions: `[Configuration,Definition,...].LanguageMetaData.fileExtensions: string[]`
+  * so, when you obtain documents from the `DocumentBuilder`, you can be sure that they are parsed by the matching language service
+  * to distinguish them on your own, use the AST functions for determining the root type, for example for the Configuration language use `isConfigurationUnit(document.parseResult.value)`
+
+### VSCode extension
+
+If you have a VSCode extension, you need to touch two files: `package.json` and `src/extension/main.ts`.
+
+#### File `package.json`
+
+In this file we define what services this extension will contribute to VSCode.
+
+Before the change, only one language and one grammar were defined:
+
+```js
+//...
+"contributes": {
+    "languages": [
+        {
+            "id": "multiple-languages",
+            "aliases": [
+                "Multiple Languages",
+                "multiple-languages"
+            ],
+            "extensions": [".hello"],
+            "configuration": "./language-configuration.json"
+        }
+    ],
+    "grammars": [
+        {
+            "language": "multiple-languages",
+            "scopeName": "source.multiple-languages",
+            "path": "./syntaxes/multiple-languages.tmLanguage.json"
+        }
+    ]
+},
+//...
+```
+
+After the change, we tripled the information. Be aware that the language IDs must match the IDs from the Langium configuration. Also make sure that the paths to the syntax files and the language configuration are correct.
+
+> For the language configuration for VSCode, we reused the old file three times. If you want a more precise configuration per language, you should also split this file. But for simplicity, let's use the same file for now.
+
+```js
+//...
+"contributes": {
+    "languages": [
+        {
+            "id": "multiple-languages-configuration",
+            "aliases": [
+                "Multiple Languages Configuration",
+                "multiple-languages-configuration"
+            ],
+            "extensions": [".me"],
+            "configuration": "./language-configuration.json"
+        }, {
+            "id": "multiple-languages-definition",
+            "aliases": [
+                "Multiple Languages Definition",
+                "multiple-languages-definition"
+            ],
+            "extensions": [".who"],
+            "configuration": "./language-configuration.json"
+        }, {
+            "id": "multiple-languages-implementation",
+            "aliases": [
+                "Multiple Languages Implementation",
+                "multiple-languages-implementation"
+            ],
+            "extensions": [".hello"],
+            "configuration": "./language-configuration.json"
+        }
+    ],
+    "grammars": [
+        {
+            "language": "multiple-languages-configuration",
+            "scopeName": "source.multiple-languages-configuration",
+            "path": "./syntaxes/multiple-languages-configuration.tmLanguage.json"
+        },
+        {
+            "language": "multiple-languages-definition",
+            "scopeName": "source.multiple-languages-definition",
+            "path": "./syntaxes/multiple-languages-definition.tmLanguage.json"
+        },
+        {
+            "language": "multiple-languages-implementation",
+            "scopeName": "source.multiple-languages-implementation",
+            "path": "./syntaxes/multiple-languages-implementation.tmLanguage.json"
+        }
+    ]
+},
+```
+
+#### File `src/extension/main.ts`
+
+And here is the extension file before the change:
+
+```ts
+// Options to control the language client
+const clientOptions: LanguageClientOptions = {
+    documentSelector: [{ scheme: 'file', language: 'multiple-languages' }]
+};
+```
+
+After the change, it should look like this (the language IDs should be the same as they are in the Langium configuration):
+
+```ts
+// Options to control the language client
+const clientOptions: LanguageClientOptions = {
+    documentSelector: [
+        { scheme: 'file', language: 'multiple-languages-configuration' },
+        { scheme: 'file', language: 'multiple-languages-definition' },
+        { scheme: 'file', language: 'multiple-languages-implementation' }
+    ]
+};
+```
+
+## Test the extension!
+
+Now everything should be executable. **Do not forget to build**!
+
+Let's run the extension and create some files in our workspace:
+
+### Definition `people.who`
+
+```
+person Markus
+person Michael
+person Frank
+```
+
+### Configuration `thats.me`
+
+```
+I am Markus.
+```
+
+### Implementation `greetings.hello`
+
+```
+Hello Markus!
+Hello Michael!
+```
+
+## Checklist
+
+You should now be able...:
+* to see proper syntax highlighting
+* to trigger auto completion for keywords
+* to jump to the definition by Cmd/Ctrl-clicking on a person's name
+
+## Add a validator (task)
+
+As promised, let's add a simple validation rule stating that you cannot greet yourself. To make it trigger, we enter our name in the `thats.me` file like we did in the previous step.
+
+Try to add the following code to our validator. This is meant as a task: try to find the missing pieces on your own :-).
+
+```ts
+checkNotGreetingYourself(greeting: Greeting, accept: ValidationAcceptor): void {
+    const document = getDocument(greeting);
+    const configFilePath = join(document.uri.fsPath, '..', 'thats.me'); // sibling file of the current document
+    const configDocument = this.documents.getOrCreateDocument(URI.file(configFilePath));
+    if (greeting.person.ref) {
+        if (configDocument && isConfigurationUnit(configDocument.parseResult.value)) {
+            if (configDocument.parseResult.value.who.ref === greeting.person.ref) { // greeted person is "me"
+                accept('warning', 'You cannot greet yourself 🙄!', { node: greeting, property: 'person' });
+            }
+        }
+    }
+}
+```
+
+After doing so, your name should display a warning, stating that you cannot greet yourself.
+
+## Troubleshooting
+
+In this section we will list common mistakes.
+
+* One prominent mistake is forgetting to build the Langium and TypeScript files before running the extension.
+
+* Since we are basically just copy-pasting the given configuration, be aware of what you are pasting. Make sure that the code still makes sense after copying. If something does not work, you probably forgot to adapt the pasted code.
+
+If you encounter any problems, we are happy to help on our [discussions](https://github.com/eclipse-langium/langium/discussions) page or in our [issue tracker](https://github.com/eclipse-langium/langium/issues).
+
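
The copy-paste mistakes mentioned in the troubleshooting list often show up as language IDs that differ between `langium-config.json`, `package.json` and `src/extension/main.ts`. The following is a minimal sketch of a consistency check for that case. It is not part of Langium or of the guide's project: `IdSources` and `findMismatchedIds` are hypothetical names, and the ID lists are typed in by hand here instead of being read from the three files.

```typescript
// Hypothetical helper (an assumption, not a Langium API): collects the language
// IDs used in the three places this guide touches and reports every ID that is
// missing from at least one of them.
interface IdSources {
    langiumConfig: string[];    // "id" values from langium-config.json
    packageJson: string[];      // "id" values from contributes.languages in package.json
    documentSelector: string[]; // "language" values from the documentSelector in main.ts
}

function findMismatchedIds(sources: IdSources): string[] {
    const lists = [sources.langiumConfig, sources.packageJson, sources.documentSelector];
    const mismatched: string[] = [];
    for (const list of lists) {
        for (const id of list) {
            // An ID is consistent only if every source lists it.
            const inAll = lists.every(other => other.indexOf(id) >= 0);
            if (!inAll && mismatched.indexOf(id) < 0) {
                mismatched.push(id);
            }
        }
    }
    return mismatched.sort();
}

const expected = [
    'multiple-languages-configuration',
    'multiple-languages-definition',
    'multiple-languages-implementation'
];

const mismatches = findMismatchedIds({
    langiumConfig: expected,
    packageJson: expected,
    // Typical copy-paste slip: the old ID was left in the first entry.
    documentSelector: [
        'multiple-languages',
        'multiple-languages-definition',
        'multiple-languages-implementation'
    ]
});

// Lists both the stale ID and the ID it should have been.
console.log(mismatches);
```

An empty result means all three places agree. In a real project the three lists would be extracted from the actual files, which this sketch deliberately leaves out.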