Semantic Code Intelligence Protocol (SCIP) reference

Descriptor

Name Type Description
name string
disambiguator string
suffix Suffix NOTE: If you add new fields here, make sure to update the prepareSlot() function responsible for parsing symbols.

Suffix

Number Name Description
0 UnspecifiedSuffix
1 Namespace Unit of code abstraction and/or namespacing.
1 Package Use Namespace instead.
2 Type
3 Term
4 Method
5 TypeParameter
6 Parameter
7 Meta Can be used for any purpose.
8 Local
9 Macro

Additional notes on Namespace:

Unit of code abstraction and/or namespacing.

NOTE: This corresponds to a package in Go and JVM languages.

Diagnostic

Represents a diagnostic, such as a compiler error or warning, which should be reported for a document.

Name Type Description
severity Severity Should this diagnostic be reported as an error, warning, info, or hint?
code string (optional) Code of this diagnostic, which might appear in the user interface.
message string Message of this diagnostic.
source string (optional) Human-readable string describing the source of this diagnostic, e.g. 'typescript' or 'super lint'.
repeated tags DiagnosticTag

Document

Document defines the metadata about a source file on disk.

Name Type Description
language string The string ID for the programming language this file is written in. The Language enum contains the names of most common programming languages. This field is typed as a string to permit any programming language, including ones that are not specified by the Language enum.
relative_path string (Required) Unique path to the text document.
repeated occurrences Occurrence Occurrences that appear in this file.
repeated symbols SymbolInformation Symbols that are "defined" within this document.
text string (optional) Text contents of this document. Indexers are not expected to include the text by default. It's preferable that clients read the text contents from the file system by resolving the absolute path from joining Index.metadata.project_root and Document.relative_path. This field was introduced to support SymbolInformation.signature_documentation, but it can be used for other purposes as well, for example testing or when working with virtual/in-memory documents.
position_encoding PositionEncoding Specifies the encoding used for source ranges in this Document.

Additional notes on relative_path:

(Required) Unique path to the text document.

  1. The path must be relative to the directory supplied in the associated Metadata.project_root.
  2. The path must not begin with a leading '/'.
  3. The path must point to a regular file, not a symbolic link.
  4. The path must use '/' as the separator, including on Windows.
  5. The path must be canonical; it cannot include empty components ('//'), or '.' or '..'.
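
Rules 1, 2, 4 and 5 above can be checked on the path string alone. The following is a minimal sketch, assuming a hypothetical helper named is_valid_relative_path that is not part of any SCIP library. Rule 3 (regular file, not a symbolic link) needs file-system access and is left out.

```python
def is_valid_relative_path(path: str) -> bool:
    """Check the purely syntactic rules for Document.relative_path.

    Hypothetical helper for illustration only. Rule 3 (regular file, not a
    symbolic link) requires file-system access and is not checked here.
    """
    if not path or path.startswith("/"):
        return False  # must be relative: no leading '/'
    if "\\" in path:
        # Assumption: a backslash is treated as a Windows separator,
        # which the rules forbid ('/' is the only separator).
        return False
    # Canonical form: no empty components ('//'), no '.' or '..'.
    return all(part not in ("", ".", "..") for part in path.split("/"))
```

An indexer could run such a check before emitting each Document, so that malformed paths fail early instead of surfacing in a consumer.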

Additional notes on symbols:

Symbols that are "defined" within this document.

This should include symbols which technically do not have any definition, but have a reference and are defined by some other symbol (see Relationship.is_definition).

Additional notes on position_encoding:

Specifies the encoding used for source ranges in this Document.

Usually, this will match the type used to index the string type in the indexer's implementation language in O(1) time.

  • For an indexer implemented in a JVM/.NET language or JavaScript/TypeScript, use UTF16CodeUnitOffsetFromLineStart.
  • For an indexer implemented in Python, use UTF32CodeUnitOffsetFromLineStart.
  • For an indexer implemented in Go, Rust or C++, use UTF8CodeUnitOffsetFromLineStart.

Index

Index represents a complete SCIP index for a workspace that is rooted at a single directory. An Index message payload can have a large memory footprint, so it's recommended to emit and consume an Index payload one field value at a time. To permit streaming consumption of an Index payload, the metadata field must appear at the start of the stream and must only appear once in the stream. Other field values may appear in any order.

Name Type Description
metadata Metadata Metadata about this index.
repeated documents Document Documents that belong to this index.
repeated external_symbols SymbolInformation (optional) Symbols that are referenced from this index but are defined in an external package (a separate Index message). Leave this field empty if you assume the external package will get indexed separately. If the external package won't get indexed for some reason then you can use this field to provide hover documentation for those external symbols.

Additional notes on external_symbols:

(optional) Symbols that are referenced from this index but are defined in an external package (a separate Index message). Leave this field empty if you assume the external package will get indexed separately. If the external package won't get indexed for some reason then you can use this field to provide hover documentation for those external symbols.

IMPORTANT: When adding a new field to Index here, add a matching function in IndexVisitor and update ParseStreaming.
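
The streaming recommendation for Index can be sketched by walking the top-level Protobuf fields one value at a time instead of decoding the whole message. The code below is an illustration only: the field numbers (metadata = 1, documents = 2, external_symbols = 3) are an assumption that should be checked against the schema in use, the helper names are hypothetical, and the payloads in the usage example are placeholder bytes rather than real Metadata or Document messages.

```python
from typing import Iterator, Tuple

# Assumption: top-level field numbers of Index; verify against your scip.proto.
METADATA, DOCUMENTS, EXTERNAL_SYMBOLS = 1, 2, 3
LENGTH_DELIMITED = 2  # Protobuf wire type used for embedded messages


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one base-128 varint; return (value, next position)."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7


def write_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_field(field_number: int, payload: bytes) -> bytes:
    """Encode one length-delimited field (tag, length, payload)."""
    tag = (field_number << 3) | LENGTH_DELIMITED
    return write_varint(tag) + write_varint(len(payload)) + payload


def iter_index_fields(buf: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (field number, raw payload) for each top-level field value.

    Raises ValueError unless metadata is the first field and is not repeated.
    """
    pos, seen = 0, 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x7
        if wire_type != LENGTH_DELIMITED:
            raise ValueError(f"unexpected wire type {wire_type}")
        length, pos = read_varint(buf, pos)
        if (seen == 0) != (field_number == METADATA):
            raise ValueError("metadata must be the first field and appear once")
        yield field_number, buf[pos:pos + length]
        pos += length
        seen += 1


# Usage: placeholder payloads stand in for serialized messages.
stream = (
    encode_field(METADATA, b"meta")
    + encode_field(DOCUMENTS, b"doc-a")
    + encode_field(DOCUMENTS, b"doc-b")
)
fields = list(iter_index_fields(stream))
```

A real consumer would read from a file incrementally and decode each payload with the generated Protobuf classes; the in-memory buffer here only keeps the sketch short.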

Metadata

Name Type Description
version ProtocolVersion Which version of this protocol was used to generate this index?
tool_info ToolInfo Information about the tool that produced this index.
project_root string URI-encoded absolute path to the root directory of this index. All documents in this index must appear in a subdirectory of this root directory.
text_document_encoding TextEncoding Text encoding of the source files on disk that are referenced from Document.relative_path. This value is unrelated to the Document.text field, which is a Protobuf string and hence must be UTF-8 encoded.
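
Clients that read source text from disk need to combine project_root with Document.relative_path. The following is a small sketch of that join, assuming project_root is a 'file://' URI with a POSIX-style path; the function name resolve_document_path and the example values are hypothetical.

```python
import posixpath
from urllib.parse import unquote, urlparse


def resolve_document_path(project_root: str, relative_path: str) -> str:
    """Join Metadata.project_root and Document.relative_path.

    Assumption: project_root is a 'file://' URI with a POSIX-style path.
    """
    parsed = urlparse(project_root)
    if parsed.scheme != "file":
        raise ValueError(f"unsupported project_root scheme: {parsed.scheme!r}")
    root = unquote(parsed.path)  # undo URI percent-encoding
    return posixpath.join(root, relative_path)


# Hypothetical values for illustration.
absolute = resolve_document_path("file:///home/user/my%20project", "src/main.py")
```

Because relative_path never starts with '/' and always uses '/' as the separator, a plain POSIX join is enough on the path side; Windows roots would need extra handling that this sketch does not attempt.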

Occurrence

Occurrence associates a source position with a symbol and/or highlighting information.

If possible, indexers should try to bundle logically related information across occurrences into a single occurrence to reduce payload sizes.

Name Type Description
repeated range int32 Half-open [start, end) range of this occurrence. Must be exactly three or four elements:
symbol string (optional) The symbol that appears at this position. See SymbolInformation.symbol for how to format symbols as strings.
symbol_roles int32 (optional) Bitset containing SymbolRoles in this occurrence. See SymbolRole's documentation for how to read and write this field.
repeated override_documentation string (optional) CommonMark-formatted documentation for this specific range. If empty, the SymbolInformation.documentation field is used instead. One example where this field might be useful is when the symbol represents a generic function (with abstract type parameters such as List<T>) and at this occurrence we know the exact values (such as List<String>).
syntax_kind SyntaxKind (optional) What syntax highlighting class should be used for this range?
repeated diagnostics Diagnostic (optional) Diagnostics that have been reported for this specific range.
repeated enclosing_range int32 (optional) Using the same encoding as the sibling range field, half-open source range of the nearest non-trivial enclosing AST node. This range must enclose the range field. Example applications that make use of the enclosing_range field:

Additional notes on range:

Half-open [start, end) range of this occurrence. Must be exactly three or four elements:

  • Four elements: [startLine, startCharacter, endLine, endCharacter]
  • Three elements: [startLine, startCharacter, endCharacter]. The end line is inferred to have the same value as the start line.

It is allowed for the range to be empty (i.e. start==end).

Line numbers and characters are always 0-based. Make sure to increment the line/character values before displaying them in an editor-like UI because editors conventionally use 1-based numbers.

The 'character' value is interpreted based on the PositionEncoding for the Document.
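
The three- and four-element forms can be normalized into one shape before use. This is a minimal sketch with hypothetical helper names (normalize_range, to_display); it also shows the 0-based to 1-based shift recommended above for editor-like UIs.

```python
from typing import List, Tuple


def normalize_range(rng: List[int]) -> Tuple[int, int, int, int]:
    """Return (start_line, start_char, end_line, end_char) for a SCIP range."""
    if len(rng) == 4:
        start_line, start_char, end_line, end_char = rng
    elif len(rng) == 3:
        start_line, start_char, end_char = rng
        end_line = start_line  # end line is inferred from the start line
    else:
        raise ValueError(f"range must have 3 or 4 elements, got {len(rng)}")
    return start_line, start_char, end_line, end_char


def to_display(rng: List[int]) -> str:
    """Format a range with 1-based numbers, as editors conventionally show them."""
    start_line, start_char, end_line, end_char = normalize_range(rng)
    return f"{start_line + 1}:{start_char + 1}-{end_line + 1}:{end_char + 1}"
```

For example, the three-element range [0, 6, 7] normalizes to (0, 6, 0, 7) and is displayed as "1:7-1:8".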

Historical note: the original draft of this schema had a Range message type with start and end fields of type Position, mirroring LSP. Benchmarks revealed that this encoding was inefficient and that we could reduce the total payload size of an index by 50% by using repeated int32 instead. The repeated int32 encoding is admittedly more awkward to work with in some programming languages but we hope the performance improvements make up for it.

Additional notes on override_documentation:

(optional) CommonMark-formatted documentation for this specific range. If empty, the SymbolInformation.documentation field is used instead. One example where this field might be useful is when the symbol represents a generic function (with abstract type parameters such as List<T>) and at this occurrence we know the exact values (such as List<String>).

This field can also be used for dynamically or gradually typed languages, which commonly allow for type-changing assignment.

Additional notes on enclosing_range:

(optional) Using the same encoding as the sibling range field, half-open source range of the nearest non-trivial enclosing AST node. This range must enclose the range field. Example applications that make use of the enclosing_range field:

  • Call hierarchies: to determine what symbols are referenced from the body of a function
  • Symbol outline: to display breadcrumbs from the cursor position to the root of the file
  • Expand selection: to select the nearest enclosing AST node.
  • Highlight range: to indicate the AST expression that is associated with a hover popover

For definition occurrences, the enclosing range should indicate the start/end bounds of the entire definition AST node, including documentation.

const n = 3
      ^ range
^^^^^^^^^^^ enclosing_range

/** Parses the string into something */
^ enclosing_range start --------------------------------------|
function parse(input string): string {                        |
         ^^^^^ range                                          |
    return input.slice(n)                                     |
}                                                             |
^ enclosing_range end <---------------------------------------|

Any attributes/decorators/attached macros should also be part of the enclosing range.

@cache
^ enclosing_range start---------------------|
def factorial(n):                           |
    return n * factorial(n-1) if n else 1   |
< enclosing_range end-----------------------|

For reference occurrences, the enclosing range should indicate the start/end bounds of the parent expression.

const a = a.b
            ^ range
          ^^^ enclosing_range
const b = a.b(41).f(42).g(43)
                  ^ range
          ^^^^^^^^^^^^^ enclosing_range
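
The requirement that enclosing_range must enclose range can be checked mechanically. The sketch below uses hypothetical helper names and assumes both ranges use the same position encoding; the sample coordinates are read off the diagrams above.

```python
from typing import List, Tuple


def as_four(rng: List[int]) -> Tuple[int, int, int, int]:
    """Expand a three-element range to (start_line, start_char, end_line, end_char)."""
    if len(rng) == 3:
        return rng[0], rng[1], rng[0], rng[2]
    if len(rng) == 4:
        return rng[0], rng[1], rng[2], rng[3]
    raise ValueError(f"range must have 3 or 4 elements, got {len(rng)}")


def encloses(enclosing_range: List[int], inner_range: List[int]) -> bool:
    """True if enclosing_range starts at or before, and ends at or after, inner_range."""
    outer, inner = as_four(enclosing_range), as_four(inner_range)
    # Tuples compare lexicographically: first by line, then by character.
    return outer[:2] <= inner[:2] and inner[2:] <= outer[2:]


# 'const a = a.b': range covers 'b', enclosing_range covers 'a.b'.
reference_ok = encloses([0, 10, 13], [0, 12, 13])
# 'parse' definition: doc comment on line 0, name on line 1, '}' on line 3.
definition_ok = encloses([0, 0, 3, 1], [1, 9, 14])
```

A consumer could use such a check to discard malformed enclosing_range values before building call hierarchies or breadcrumbs.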

Package

Unit of packaging and distribution.

NOTE: This corresponds to a module in Go and JVM languages.

Name Type Description
manager string
name string
version string

Relationship

Name Type Description
symbol string
is_reference bool When resolving "Find references", this field documents what other symbols should be included together with this symbol.
is_implementation bool Similar to is_reference but for "Find implementations". It's common for is_implementation and is_reference to both be true but it's not always the case. In the TypeScript example for is_reference, observe that Dog# has an is_implementation relationship with "Animal#" but not is_reference. This is because "Find references" on the "Animal#" symbol should not return "Dog#". We only want "Dog#" to return as a result for "Find implementations" on the "Animal#" symbol.
is_type_definition bool Similar to is_reference but for "Go to type definition".
is_definition bool Allows overriding the behavior of "Go to definition" and "Find references" for symbols which do not have a definition of their own or could potentially have multiple definitions.

Additional notes on is_reference:

When resolving "Find references", this field documents what other symbols should be included together with this symbol. For example, consider the following TypeScript code that defines two symbols Animal#sound() and Dog#sound():

interface Animal {
          ^^^^^^ definition Animal#
  sound(): string
  ^^^^^ definition Animal#sound()
}
class Dog implements Animal {
      ^^^ definition Dog#, relationships = [{symbol: "Animal#", is_implementation: true}]
  public sound(): string { return "woof" }
         ^^^^^ definition Dog#sound(), references_symbols = Animal#sound(), relationships = [{symbol: "Animal#sound()", is_implementation:true, is_reference: true}]
}
const animal: Animal = new Dog()
              ^^^^^^ reference Animal#
console.log(animal.sound())
                   ^^^^^ reference Animal#sound()

Doing "Find references" on the symbol Animal#sound() should return references to the Dog#sound() method as well. Vice-versa, doing "Find references" on the Dog#sound() method should include references to the Animal#sound() method as well.

Additional notes on is_definition:

Allows overriding the behavior of "Go to definition" and "Find references" for symbols which do not have a definition of their own or could potentially have multiple definitions.

For example, in a language with single inheritance and no field overriding, inherited fields can reuse the same symbol as the ancestor which declares the field. In such a situation, is_definition is not needed.

On the other hand, in languages with single inheritance and some form of mixins, you can use is_definition to relate the symbol to the matching symbol in ancestor classes, and is_reference to relate the symbol to the matching symbol in mixins.

NOTE: At the moment, due to limitations of the SCIP to LSIF conversion, only global symbols in an index are allowed to use is_definition. The relationship may not get recorded if either symbol is local.

Update registerInverseRelationships on adding a new field here.
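
The two-way "Find references" behavior described for is_reference can be sketched as a lookup over recorded relationships. This is only an illustration under stated assumptions: relationships are held in a plain dictionary keyed by the defining symbol, the function name find_references_group is hypothetical, and only one hop in each direction is followed.

```python
from typing import Dict, List, Set

# Maps a symbol to the relationships recorded on its SymbolInformation.
Relationships = Dict[str, List[dict]]


def find_references_group(symbol: str, relationships: Relationships) -> Set[str]:
    """Symbols whose occurrences should be returned for "Find references"."""
    group = {symbol}
    for source, rels in relationships.items():
        for rel in rels:
            if not rel.get("is_reference"):
                continue
            if source == symbol:
                group.add(rel["symbol"])  # forward direction
            elif rel["symbol"] == symbol:
                group.add(source)  # inverse direction
    return group


# Relationships from the Animal/Dog TypeScript example.
example: Relationships = {
    "Dog#": [{"symbol": "Animal#", "is_implementation": True}],
    "Dog#sound()": [
        {"symbol": "Animal#sound()", "is_implementation": True, "is_reference": True}
    ],
}
```

With this data, looking up either Animal#sound() or Dog#sound() yields both symbols, while Animal# stays on its own because Dog# only declares is_implementation.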

Symbol

Symbol is similar to a URI: it identifies a class, method, or a local variable. SymbolInformation contains rich metadata about symbols such as the docstring.

Symbol has a standardized string representation, which can be used interchangeably with Symbol. The syntax for Symbol is the following:

# (<x>)+ stands for one or more repetitions of <x>
# (<x>)? stands for zero or one occurrence of <x>
<symbol>               ::= <scheme> ' ' <package> ' ' (<descriptor>)+ | 'local ' <local-id>
<package>              ::= <manager> ' ' <package-name> ' ' <version>
<scheme>               ::= any UTF-8, escape spaces with double space. Must not be empty nor start with 'local'
<manager>              ::= any UTF-8, escape spaces with double space. Use the placeholder '.' to indicate an empty value
<package-name>         ::= same as above
<version>              ::= same as above
<descriptor>           ::= <namespace> | <type> | <term> | <method> | <type-parameter> | <parameter> | <meta> | <macro>
<namespace>            ::= <name> '/'
<type>                 ::= <name> '#'
<term>                 ::= <name> '.'
<meta>                 ::= <name> ':'
<macro>                ::= <name> '!'
<method>               ::= <name> '(' (<method-disambiguator>)? ').'
<type-parameter>       ::= '[' <name> ']'
<parameter>            ::= '(' <name> ')'
<name>                 ::= <identifier>
<method-disambiguator> ::= <simple-identifier>
<identifier>           ::= <simple-identifier> | <escaped-identifier>
<simple-identifier>    ::= (<identifier-character>)+
<identifier-character> ::= '_' | '+' | '-' | '$' | ASCII letter or digit
<escaped-identifier>   ::= '`' (<escaped-character>)+ '`', must contain at least one non-<identifier-character>
<escaped-character>    ::= any UTF-8, escape backticks with double backtick.
<local-id>             ::= <simple-identifier>

The list of descriptors for a symbol should together form a fully qualified name for the symbol. That is, it should serve as a unique identifier across the package. Typically, it will include one descriptor for every node in the AST (along the ancestry path) between the root of the file and the node corresponding to the symbol.

Local symbols MUST only be used for entities which are local to a Document, and cannot be accessed from outside the Document.
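
To make the grammar more concrete, here is a small sketch that formats a global symbol string from its parts. It is not the reference implementation: the function names are hypothetical, the scheme and package values in the usage example are made up, and local symbols are not handled.

```python
import re
from typing import List, Tuple

# <simple-identifier>: one or more of '_', '+', '-', '$', ASCII letters, digits.
SIMPLE_IDENTIFIER = re.compile(r"[A-Za-z0-9_+\-$]+")


def escape_name(name: str) -> str:
    """Emit <identifier>: backtick-escape names that are not simple identifiers."""
    if not name:
        raise ValueError("descriptor name must not be empty")
    if SIMPLE_IDENTIFIER.fullmatch(name):
        return name
    return "`" + name.replace("`", "``") + "`"


def escape_spaces(value: str) -> str:
    """Escape spaces by doubling them; '.' is the placeholder for an empty value."""
    return value.replace(" ", "  ") if value else "."


def format_descriptor(name: str, suffix: str, disambiguator: str = "") -> str:
    """Format one descriptor according to its suffix."""
    n = escape_name(name)
    if suffix == "method":
        return f"{n}({disambiguator})."
    if suffix == "type_parameter":
        return f"[{n}]"
    if suffix == "parameter":
        return f"({n})"
    terminators = {"namespace": "/", "type": "#", "term": ".", "meta": ":", "macro": "!"}
    return n + terminators[suffix]


def format_symbol(scheme: str, manager: str, package_name: str, version: str,
                  descriptors: List[Tuple[str, ...]]) -> str:
    """Format '<scheme> <manager> <package-name> <version> <descriptors>'."""
    if not scheme or scheme.startswith("local"):
        raise ValueError("scheme must not be empty nor start with 'local'")
    head = [scheme.replace(" ", "  ")] + [
        escape_spaces(part) for part in (manager, package_name, version)
    ]
    return " ".join(head) + " " + "".join(format_descriptor(*d) for d in descriptors)


# Hypothetical scheme and package values, for illustration only.
symbol = format_symbol(
    "my-scheme", "maven", "com.example", "1.0.0",
    [("com", "namespace"), ("example", "namespace"),
     ("MyClass", "type"), ("myMethod", "method", "+1")],
)
```

The descriptor part of the result, "com/example/MyClass#myMethod(+1).", is the shape used in the display_name example for SymbolInformation.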

Name Type Description
scheme string
package Package
repeated descriptors Descriptor

SymbolInformation

SymbolInformation defines metadata about a symbol, such as the symbol's docstring or what package it's defined in.

Name Type Description
symbol string Identifier of this symbol, which can be referenced from Occurrence.symbol. The string must be formatted according to the grammar in Symbol.
repeated documentation string (optional, but strongly recommended) The markdown-formatted documentation for this symbol. Use SymbolInformation.signature_documentation to document the method/class/type signature of this symbol. Due to historical reasons, indexers may include signature documentation in this field by rendering markdown code blocks. New indexers should only include non-code documentation in this field, for example docstrings.
repeated relationships Relationship (optional) Relationships to other symbols (e.g., implements, type definition).
kind Kind The kind of this symbol. Use this field instead of SymbolDescriptor.Suffix to determine whether something is, for example, a class or a method.
display_name string (optional) The name of this symbol as it should be displayed to the user. For example, the symbol "com/example/MyClass#myMethod(+1)." should have the display name "myMethod". The symbol field is not a reliable source of the display name for several reasons:
signature_documentation Document (optional) The signature of this symbol as it's displayed in API documentation or in hover tooltips. For example, a Java method that adds two numbers would have Document.language = "java" and Document.text = "void add(int a, int b)". The language and text fields are required while other fields such as Document.occurrences can be optionally included to support hyperlinking referenced symbols in the signature.
enclosing_symbol string (optional) The enclosing symbol if this is a local symbol. For non-local symbols, the enclosing symbol should be parsed from the symbol field using the Descriptor grammar.

Additional notes on display_name:

(optional) The name of this symbol as it should be displayed to the user. For example, the symbol "com/example/MyClass#myMethod(+1)." should have the display name "myMethod". The symbol field is not a reliable source of the display name for several reasons:

  • Local symbols don't encode the name.
  • Some languages have case-insensitive names, so the symbol is all-lowercase.
  • The symbol may encode names with special characters that should not be displayed to the user.

Additional notes on enclosing_symbol:

(optional) The enclosing symbol if this is a local symbol. For non-local symbols, the enclosing symbol should be parsed from the symbol field using the Descriptor grammar.

The primary use-case for this field is to allow local symbols to be displayed in a symbol hierarchy for API documentation. It's OK to leave this field empty for local variables since local variables usually don't belong in API documentation. However, if you wish to include a local symbol in the hierarchy, you can use enclosing_symbol to locate the "parent" or "owner" of this local symbol. For example, a Java indexer may choose to use local symbols for private class fields while providing an enclosing_symbol to reference the enclosing class to allow the field to be part of the class documentation hierarchy. From the perspective of an author of an indexer, the decision to use a local symbol or global symbol should be determined exclusively by whether the symbol is accessible outside the document, not by the capability to find the enclosing symbol.

Kind

(optional) Kind represents the fine-grained category of a symbol, suitable for presenting information about the symbol's meaning in the language.

For example:

  • A Java method would have the kind Method while a Go function would have the kind Function, even if the symbols for these use the same syntax for the descriptor SymbolDescriptor.Suffix.Method.
  • A Go struct has the symbol kind Struct while a Java class has the symbol kind Class even if they both have the same descriptor: SymbolDescriptor.Suffix.Type.

Since Kind is more fine-grained than Suffix:

  • If two symbols have the same Kind, they should share the same Suffix.
  • If two symbols have different Suffixes, they should have different Kinds.
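
The two rules above amount to saying that each Kind maps to a single Suffix. An indexer author could sanity-check emitted symbols with something like the following sketch, where the function name and the (kind, suffix) pair representation are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def kinds_with_conflicting_suffixes(
    symbols: Iterable[Tuple[str, str]]
) -> Dict[str, Set[str]]:
    """Given (kind name, suffix name) pairs, report kinds seen with several suffixes."""
    suffixes_by_kind: Dict[str, Set[str]] = defaultdict(set)
    for kind, suffix in symbols:
        suffixes_by_kind[kind].add(suffix)
    return {k: s for k, s in suffixes_by_kind.items() if len(s) > 1}


# Consistent: different kinds may share a suffix.
consistent = [("Method", "Method"), ("Function", "Method"),
              ("Struct", "Type"), ("Class", "Type")]
# Inconsistent: 'Class' appears with two different suffixes.
inconsistent = consistent + [("Class", "Term")]
```

An empty result means the emitted symbols respect the rule; any entry in the result points at a kind that was paired with more than one suffix.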

Number Name Description
0 UnspecifiedKind
66 AbstractMethod A method which may or may not have a body. For Java, Kotlin etc.
72 Accessor For Ruby's attr_accessor
1 Array
2 Assertion For Alloy
3 AssociatedType
4 Attribute For C++
5 Axiom For Lean
6 Boolean
7 Class
8 Constant
9 Constructor
62 Contract For Solidity
10 DataFamily For Haskell
73 Delegate For C# and F#
11 Enum
12 EnumMember
63 Error
13 Event
84 Extension For Dart
14 Fact For Alloy
15 Field
16 File
17 Function
18 Getter For 'get' in Swift, 'attr_reader' in Ruby
19 Grammar For Raku
20 Instance For Purescript and Lean
21 Interface
22 Key
23 Lang For Racket
24 Lemma For Lean
64 Library For Solidity
25 Macro
26 Method
74 MethodAlias For Ruby
27 MethodReceiver Analogous to 'ThisParameter' and 'SelfParameter', but for languages like Go where the receiver doesn't have a conventional name.
67 MethodSpecification Analogous to 'AbstractMethod', for Go.
28 Message For Protobuf
85 Mixin For Dart
65 Modifier For Solidity
29 Module
30 Namespace
31 Null
32 Number
33 Object
34 Operator
35 Package
36 PackageObject
37 Parameter
38 ParameterLabel
39 Pattern For Haskell's PatternSynonyms
40 Predicate For Alloy
41 Property
42 Protocol Analogous to 'Trait' and 'TypeClass', for Swift and Objective-C
68 ProtocolMethod Analogous to 'AbstractMethod', for Swift and Objective-C.
69 PureVirtualMethod Analogous to 'AbstractMethod', for C++.
43 Quasiquoter For Haskell
44 SelfParameter 'self' in Python, Rust, Swift etc.
45 Setter For 'set' in Swift, 'attr_writer' in Ruby
46 Signature For Alloy, analogous to 'Struct'.
75 SingletonClass For Ruby
76 SingletonMethod Analogous to 'StaticMethod', for Ruby.
77 StaticDataMember Analogous to 'StaticField', for C++
78 StaticEvent For C#
79 StaticField For C#
80 StaticMethod For Java, C#, C++ etc.
81 StaticProperty For C#, TypeScript etc.
82 StaticVariable For C, C++
48 String
49 Struct
47 Subscript For Swift
50 Tactic For Lean
51 Theorem For Lean
52 ThisParameter Method receiver in languages that use 'this', such as JavaScript, C++, Java etc.
53 Trait Analogous to 'Protocol' and 'TypeClass', for Rust, Scala etc.
70 TraitMethod Analogous to 'AbstractMethod', for Rust, Scala etc.
54 Type Data type definition for languages like OCaml which use type rather than separate keywords like struct and enum.
55 TypeAlias
56 TypeClass Analogous to 'Trait' and 'Protocol', for Haskell, Purescript etc.
71 TypeClassMethod Analogous to 'AbstractMethod', for Haskell, Purescript etc.
57 TypeFamily For Haskell
58 TypeParameter
59 Union For C, C++, Cap'n Proto
60 Value
61 Variable

Next = 86; Feel free to open a PR proposing new language-specific kinds.

ToolInfo

Name Type Description
name string Name of the indexer that produced this index.
version string Version of the indexer that produced this index.
repeated arguments string Command-line arguments that were used to invoke this indexer.

DiagnosticTag

Number Name Description
0 UnspecifiedDiagnosticTag
1 Unnecessary
2 Deprecated

Language

Language standardises names of common programming languages that can be used for the Document.language field. The primary purpose of this enum is to prevent a situation where a single programming language ends up with multiple string representations. For example, the C++ language uses the name "CPP" in this enum and other names such as "cpp" are incompatible. Feel free to send a pull-request to add missing programming languages.

Number Name Description
0 UnspecifiedLanguage
60 ABAP
96 Apex
49 APL
39 Ada
45 Agda
86 AsciiDoc
58 Assembly
66 Awk
68 Bat
81 BibTeX
34 C
59 COBOL
35 CPP C++ (the name "CPP" was chosen for consistency with LSP)
26 CSS
1 CSharp
8 Clojure
21 Coffeescript
9 CommonLisp
47 Coq
97 CUDA
3 Dart
57 Delphi
88 Diff
80 Dockerfile
50 Dyalog
17 Elixir
18 Erlang
42 FSharp
65 Fish
24 Flow
56 Fortran
91 Git_Commit
89 Git_Config
92 Git_Rebase
33 Go
98 GraphQL
7 Groovy
30 HTML
20 Hack
90 Handlebars
44 Haskell
46 Idris
72 Ini
51 J
75 JSON
6 Java
22 JavaScript
93 JavaScriptReact
76 Jsonnet
55 Julia
109 Justfile
4 Kotlin
83 LaTeX
48 Lean
27 Less
12 Lua
108 Luau
79 Makefile
84 Markdown
52 Matlab
110 Nickel https://nickel-lang.org/
77 Nix
41 OCaml
36 Objective_C
37 Objective_CPP
99 Pascal
19 PHP
70 PLSQL
13 Perl
67 PowerShell
71 Prolog
100 Protobuf
15 Python
54 R
11 Racket
14 Raku
62 Razor
102 Repro Internal language for testing SCIP
85 ReST
16 Ruby
40 Rust
61 SAS
29 SCSS
43 SML
69 SQL
28 Sass
5 Scala
10 Scheme
64 ShellScript Bash
78 Skylark
107 Slang
95 Solidity
106 Svelte
2 Swift
101 Tcl
73 TOML
82 TeX
103 Thrift
23 TypeScript
94 TypeScriptReact
104 Verilog
105 VHDL
63 VisualBasic
25 Vue
53 Wolfram
31 XML
32 XSL
74 YAML
38 Zig

NextLanguage = 111;

Steps to add a new language:

  1. Copy-paste the "NextLanguage = N" line above.
  2. Increment "NextLanguage = N" to "NextLanguage = N+1".
  3. Replace "NextLanguage = N" with the name of the new language.
  4. Move the new language to the correct line above using alphabetical order.
  5. (optional) Add a brief comment behind the language if the name is not self-explanatory.

PositionEncoding

Encoding used to interpret the 'character' value in source ranges.

Number Name Description
0 UnspecifiedPositionEncoding Default value. This value should not be used by new SCIP indexers so that a consumer can process the SCIP index without ambiguity.
1 UTF8CodeUnitOffsetFromLineStart The 'character' value is interpreted as an offset in terms of UTF-8 code units (i.e. bytes).
2 UTF16CodeUnitOffsetFromLineStart The 'character' value is interpreted as an offset in terms of UTF-16 code units (each is 2 bytes).
3 UTF32CodeUnitOffsetFromLineStart The 'character' value is interpreted as an offset in terms of UTF-32 code units (each is 4 bytes).

Additional notes on UTF8CodeUnitOffsetFromLineStart:

The 'character' value is interpreted as an offset in terms of UTF-8 code units (i.e. bytes).

Example: For the string "🚀 Woo" in UTF-8, the bytes are [240, 159, 154, 128, 32, 87, 111, 111], so the offset for 'W' would be 5.

Additional notes on UTF16CodeUnitOffsetFromLineStart:

The 'character' value is interpreted as an offset in terms of UTF-16 code units (each is 2 bytes).

Example: For the string "🚀 Woo", the UTF-16 code units are ['\ud83d', '\ude80', ' ', 'W', 'o', 'o'], so the offset for 'W' would be 3.

Additional notes on UTF32CodeUnitOffsetFromLineStart:

The 'character' value is interpreted as an offset in terms of UTF-32 code units (each is 4 bytes).

Example: For the string "🚀 Woo", the UTF-32 code units are ['🚀', ' ', 'W', 'o', 'o'], so the offset for 'W' would be 2.
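
The three examples above can be reproduced with a short sketch. The function name character_offset and the short encoding labels are hypothetical; the input is a Python string index, which counts code points.

```python
def character_offset(line: str, index: int, encoding: str) -> int:
    """Offset of line[index] from the line start, in code units of `encoding`.

    `encoding` is one of "utf8", "utf16" or "utf32" (labels chosen for this sketch).
    """
    prefix = line[:index]
    if encoding == "utf8":
        return len(prefix.encode("utf-8"))  # 1 code unit = 1 byte
    if encoding == "utf16":
        return len(prefix.encode("utf-16-le")) // 2  # 1 code unit = 2 bytes
    if encoding == "utf32":
        return len(prefix)  # Python strings are indexed by code point
    raise ValueError(f"unknown encoding: {encoding}")


line = "\U0001F680 Woo"  # rocket emoji, space, then "Woo"
w_index = line.index("W")
offsets = {enc: character_offset(line, w_index, enc) for enc in ("utf8", "utf16", "utf32")}
```

The resulting offsets for 'W' are 5, 3 and 2 for UTF-8, UTF-16 and UTF-32 respectively, matching the examples for each enum value.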

ProtocolVersion

Number Name Description
0 UnspecifiedProtocolVersion

Severity

Number Name Description
0 UnspecifiedSeverity
1 Error
2 Warning
3 Information
4 Hint

SymbolRole

SymbolRole declares what "role" a symbol has in an occurrence. A role is encoded as a bitset where each bit represents a different role. For example, to determine if the Import role is set, test whether the second bit of the enum value is set. In pseudocode, this can be implemented with the logic: const isImportRole = (role.value & SymbolRole.Import.value) > 0.

Number Name Description
0 UnspecifiedSymbolRole This case is not meant to be used; it only exists to avoid an error from the Protobuf code generator.
1 Definition Is the symbol defined here? If not, then this is a symbol reference.
2 Import Is the symbol imported here?
4 WriteAccess Is the symbol written here?
8 ReadAccess Is the symbol read here?
16 Generated Is the symbol in generated code?
32 Test Is the symbol in test code?
64 ForwardDefinition Is this a signature for a symbol that is defined elsewhere?

Additional notes on ForwardDefinition:

Is this a signature for a symbol that is defined elsewhere?

Applies to forward declarations for languages like C, C++ and Objective-C, as well as val declarations in interface files in languages like SML and OCaml.
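
Reading and writing the Occurrence.symbol_roles bitset can be sketched with Python's enum.IntFlag, mirroring the numbers in the SymbolRole table. The has_role helper is a hypothetical name used for illustration.

```python
from enum import IntFlag


class SymbolRole(IntFlag):
    """Mirror of the SymbolRole values listed in the table."""
    UnspecifiedSymbolRole = 0
    Definition = 1
    Import = 2
    WriteAccess = 4
    ReadAccess = 8
    Generated = 16
    Test = 32
    ForwardDefinition = 64


def has_role(symbol_roles: int, role: SymbolRole) -> bool:
    """Test whether the bit for `role` is set in an Occurrence.symbol_roles value."""
    return (symbol_roles & role) != 0


# Writing: combine roles with bitwise OR before storing them as an int32.
roles = int(SymbolRole.Definition | SymbolRole.Import)

# Reading: test individual bits.
is_import = has_role(roles, SymbolRole.Import)
is_read = has_role(roles, SymbolRole.ReadAccess)
```

Here roles is 3 (bits 1 and 2), so the Import test succeeds and the ReadAccess test fails. An occurrence without the Definition bit is a reference.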

SyntaxKind

Number Name Description
0 UnspecifiedSyntaxKind
1 Comment Comment, including comment markers and text
2 PunctuationDelimiter ; . ,
3 PunctuationBracket (), {}, [] when used syntactically
4 Keyword if, else, return, class, etc.
4 IdentifierKeyword
5 IdentifierOperator +, *, etc.
6 Identifier non-specific catch-all for any identifier not better described elsewhere
7 IdentifierBuiltin Identifiers builtin to the language: min, print in Python.
8 IdentifierNull Identifiers representing null-like values: None in Python, nil in Go.
9 IdentifierConstant xyz in const xyz = "hello"
10 IdentifierMutableGlobal var X = "hello" in Go
11 IdentifierParameter Parameter definition and references
12 IdentifierLocal Identifiers for variable definitions and references within a local scope
13 IdentifierShadowed Identifiers that shadow other identifiers in an outer scope
14 IdentifierNamespace Identifier representing a unit of code abstraction and/or namespacing.
14 IdentifierModule
15 IdentifierFunction Function references, including calls
16 IdentifierFunctionDefinition Function definition only
17 IdentifierMacro Macro references, including invocations
18 IdentifierMacroDefinition Macro definition only
19 IdentifierType non-builtin types
20 IdentifierBuiltinType builtin types only, such as str for Python or int in Go
21 IdentifierAttribute Python decorators, C-like attributes
22 RegexEscape \b
23 RegexRepeated *, +
24 RegexWildcard .
25 RegexDelimiter (, ), [, ]
26 RegexJoin |
27 StringLiteral Literal strings: "Hello, world!"
28 StringLiteralEscape non-regex escapes: "\t", "\n"
29 StringLiteralSpecial datetimes within strings, special words within a string, {} in format strings
30 StringLiteralKey "key" in { "key": "value" }, useful for example in JSON
31 CharacterLiteral 'c' or similar, in languages that differentiate strings and characters
32 NumericLiteral Literal numbers, both floats and integers
33 BooleanLiteral true, false
34 Tag Used for XML-like tags
35 TagAttribute Attribute name in XML-like tags
36 TagDelimiter Delimiters for XML-like tags

Additional notes on IdentifierNamespace:

Identifier representing a unit of code abstraction and/or namespacing.

NOTE: This corresponds to a package in Go and JVM languages, and a module in languages like Python and JavaScript.

TextEncoding

Number Name Description
0 UnspecifiedTextEncoding
1 UTF8
2 UTF16