Adding languages

Submodules

To add a new language, first add its tree-sitter grammar as a Git submodule by running

git submodule add -f <repository> helix-syntax/languages/tree-sitter-<name>

For example, to add tree-sitter-ocaml you would run

git submodule add -f https://github.com/tree-sitter/tree-sitter-ocaml helix-syntax/languages/tree-sitter-ocaml

Make sure the submodule is shallow by running

git config -f .gitmodules submodule.helix-syntax/languages/tree-sitter-<name>.shallow true

Alternatively, add shallow = true to the submodule's entry in .gitmodules by hand.
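The two submodule commands above can be combined in a small helper. This is a minimal sketch, not something that ships with the repository: the names add_grammar, run and DRY_RUN are hypothetical, and it assumes it is invoked from the repository root.

```shell
#!/bin/sh
# Hypothetical helper, not part of the repository.
# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}

# add_grammar <name> <repository-url>
# Adds the grammar as a submodule, then marks it shallow in .gitmodules.
add_grammar() {
    name=$1
    repo=$2
    path="helix-syntax/languages/tree-sitter-$name"

    run git submodule add -f "$repo" "$path" &&
        run git config -f .gitmodules "submodule.$path.shallow" true
}
```

Setting DRY_RUN=1 before calling add_grammar ocaml https://github.com/tree-sitter/tree-sitter-ocaml prints the two git commands without touching the repository, which is a cheap way to check the paths first.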

languages.toml

Next, you need to add the language to the languages.toml found in the root of the repository. This languages.toml file is included at compilation time, and is distinct from the languages.toml file in the user's configuration directory.

These are the available keys and descriptions for the file.

Key                  Description
name                 The name of the language
scope                A string like source.js that identifies the language. Currently, we strive to match the scope names used by popular TextMate grammars and by the Linguist library. Usually source.<name>, or text.<name> in case of markup languages
injection-regex      Regex pattern that will be tested against a language name in order to determine whether this language should be used for a potential language injection site
file-types           The filetypes of the language, for example ["yml", "yaml"]
shebangs             The interpreters from the shebang line, for example ["sh", "bash"]
roots                A set of marker files to look for when trying to find the workspace root. For example Cargo.lock, yarn.lock
auto-format          Whether to autoformat this language when saving
diagnostic-severity  Minimal severity of diagnostic for it to be displayed. (Allowed values: Error, Warning, Info, Hint)
comment-token        The token to use as a comment-token
indent               The indent to use. Has sub keys tab-width and unit
config               Language server configuration
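As an illustration, an entry built from the keys above might look like the one appended by the snippet below. The language "mylang" and every value are hypothetical, and the [[language]] table header is an assumption about how existing entries in the file are laid out, so compare it against a neighbouring entry before committing. The snippet is meant to be run from the repository root.

```shell
# Append a hypothetical entry for "mylang" to languages.toml.
# All values are illustrative; the [[language]] header is assumed to
# match the layout of the existing entries in the file.
cat >> languages.toml <<'EOF'

[[language]]
name = "mylang"
scope = "source.mylang"
injection-regex = "mylang"
file-types = ["mylang"]
shebangs = ["mylang"]
roots = []
auto-format = false
diagnostic-severity = "Warning"
comment-token = "#"
indent = { tab-width = 4, unit = "    " }
EOF
```

The config key is left out here because its contents depend on the language server in use.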

Queries

For a language to get syntax highlighting, indentation, and other features, you have to add queries. Add a directory for your language with the path runtime/queries/<name>/. The tree-sitter website gives more info on how to write queries.
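Setting up the query directory can be sketched as follows. The helper name init_queries is hypothetical, highlights.scm is assumed as the file name for highlight queries (the usual tree-sitter convention), and the single pattern is only a placeholder: the node name comment must exist in your grammar for it to match anything.

```shell
#!/bin/sh
# Hypothetical helper: create runtime/queries/<name>/ with a starter
# highlights.scm. Run from the repository root.
init_queries() {
    name=$1
    dir="runtime/queries/$name"
    mkdir -p "$dir"

    # Leave an existing query file untouched.
    [ -e "$dir/highlights.scm" ] && return 0

    # One illustrative pattern; "comment" is an assumed node name.
    printf '%s\n' '(comment) @comment' > "$dir/highlights.scm"
}
```

Calling init_queries mylang creates runtime/queries/mylang/highlights.scm if it does not already exist.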

NOTE: When evaluating queries, the first matching query takes precedence, which is different from other editors like Neovim, where the last matching query supersedes the ones before it. See this issue for an example.

Common Issues

  • If you get errors when building after switching branches, you may have to remove or update tree-sitter submodules. You can update submodules by running

    git submodule sync; git submodule update --init
    
  • Do not use the --remote flag when updating submodules. To remove stale submodules, look inside .gitmodules and delete any submodule directories that are not listed in it.

  • If a parser is segfaulting or you want to remove it, make sure to remove both the submodule and the compiled parser in runtime/grammar/<name>.so.

  • The indents query is indents.toml, not indents.scm. See this issue for more information.
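For the stale-submodule case in the list above, the comparison against .gitmodules can be sketched as a small script. The function name list_stale_grammars is hypothetical, and the default directory is assumed to be the helix-syntax/languages path used by the submodule commands earlier on this page. It only lists candidates and deletes nothing.

```shell
#!/bin/sh
# Hypothetical helper: print directories under the grammar directory that
# are not registered as submodule paths in .gitmodules.
# Run from the repository root; review the output before deleting anything.
list_stale_grammars() {
    dir=${1:-helix-syntax/languages}

    # Submodule paths recorded in .gitmodules, one per line.
    registered=$(git config -f .gitmodules --get-regexp '^submodule\..*\.path$' |
        cut -d' ' -f2-)

    for d in "$dir"/*/; do
        [ -d "$d" ] || continue
        d=${d%/}
        # Print the directory unless it exactly matches a registered path.
        printf '%s\n' "$registered" | grep -qxF -- "$d" || printf '%s\n' "$d"
    done
}
```

Running list_stale_grammars from the repository root prints one directory per line; an empty result means every directory there is still listed in .gitmodules.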