To add a new language, you need to add a `language` entry to the
[`languages.toml`][languages.toml] found in the root of the repository;
this `languages.toml` file is included at compilation time, and is
distinct from the `languages.toml` file in the user's [configuration
directory](../configuration.md).
```toml
[[language]]
name = "mylang"
scope = "scope.mylang"
injection-regex = "^mylang$"
file-types = ["mylang", "myl"]
comment-token = "#"
indent = { tab-width = 2, unit = " " }
```
or you can manually add `shallow = true` to `.gitmodules`.
These are the available keys and descriptions for the file.
## languages.toml
| Key | Description |
| ---- | ----------- |
| `name` | The name of the language |
| `scope` | A string like `source.js` that identifies the language. Currently, we strive to match the scope names used by popular TextMate grammars and by the Linguist library. Usually `source.<name>` or `text.<name>` in case of markup languages |
| `injection-regex` | regex pattern that will be tested against a language name in order to determine whether this language should be used for a potential [language injection][treesitter-language-injection] site. |
| `file-types` | The filetypes of the language, for example `["yml", "yaml"]`. Extensions and full file names are supported. |
| `shebangs` | The interpreters from the shebang line, for example `["sh", "bash"]` |
| `roots` | A set of marker files to look for when trying to find the workspace root. For example `Cargo.lock`, `yarn.lock` |
| `auto-format` | Whether to autoformat this language when saving |
| `diagnostic-severity` | Minimal severity of diagnostic for it to be displayed. (Allowed values: `Error`, `Warning`, `Info`, `Hint`) |
| `comment-token` | The token to use as a comment-token |
| `indent` | The indent to use. Has sub keys `tab-width` and `unit` |
| `config` | Language server configuration |
| `grammar` | The tree-sitter grammar to use (defaults to the value of `name`) |
Next, you need to add the language to the [`languages.toml`][languages.toml] found in the root of
the repository; this `languages.toml` file is included at compilation time, and
is distinct from the `language.toml` file in the user's [configuration
directory](../configuration.md).
## Grammar configuration
These are the available keys and descriptions for the file.
If a tree-sitter grammar is available for the language, add a new `grammar`
| scope | A string like `source.js` that identifies the language. Currently, we strive to match the scope names used by popular TextMate grammars and by the Linguist library. Usually `source.<name>` or `text.<name>` in case of markup languages |
| injection-regex | regex pattern that will be tested against a language name in order to determine whether this language should be used for a potential [language injection][treesitter-language-injection] site. |
| file-types | The filetypes of the language, for example `["yml", "yaml"]` |
| shebangs | The interpreters from the shebang line, for example `["sh", "bash"]` |
| roots | A set of marker files to look for when trying to find the workspace root. For example `Cargo.lock`, `yarn.lock` |
| auto-format | Whether to autoformat this language when saving |
| diagnostic-severity | Minimal severity of diagnostic for it to be displayed. (Allowed values: `Error`, `Warning`, `Info`, `Hint`) |
| comment-token | The token to use as a comment-token |
| indent | The indent to use. Has sub keys `tab-width` and `unit` |
| config | Language server configuration |
| --- | ----------- |
| `name` | The name of the tree-sitter grammar |
| `path` | A path within the grammar directory which should be built. Some grammar repositories host multiple grammars (for example `tree-sitter-typescript` and `tree-sitter-ocaml`) in subdirectories. This key is used to point `hx --build-grammars` to the correct path for compilation. When ommitted, the root of the grammar directory is used |
| `source` | The method of fetching the grammar - a table with a schema defined below |
Where `source` is a table with either these keys when using a grammar from a
git repository:
| Key | Description |
| --- | ----------- |
| `git` | A git remote URL from which the grammar should be cloned |
| `rev` | The revision (commit hash or tag) which should be fetched |
Or a `path` key with an absolute path to a locally available grammar directory.
## Queries
@ -51,18 +74,14 @@ gives more info on how to write queries.
> NOTE: When evaluating queries, the first matching query takes
precedence, which is different from other editors like neovim where
the last matching query supercedes the ones before it. See
the last matching query supersedes the ones before it. See
[this issue][neovim-query-precedence] for an example.
## Common Issues
- If you get errors when building after switching branches, you may have to remove or update tree-sitter submodules. You can update submodules by running
```sh
git submodule sync; git submodule update --init
```
- Make sure to not use the `--remote` flag. To remove submodules look inside the `.gitmodules` and remove directories that are not present inside of it.
- If you get errors when running after switching branches, you may have to update the tree-sitter grammars. Run `hx --fetch-grammars` to fetch the grammars and `hx --build-grammars` to build any out-of-date grammars.
- If a parser is segfaulting or you want to remove the parser, make sure to remove the submodule *and* the compiled parser in `runtime/grammar/<name>.so`
- If a parser is segfaulting or you want to remove the parser, make sure to remove the compiled parser in `runtime/grammar/<name>.so`
- The indents query is `indents.toml`, *not*`indents.scm`. See [this](https://github.com/helix-editor/helix/issues/114) issue for more information.
@ -4,10 +4,28 @@ Language-specific settings and settings for particular language servers can be c
Changes made to the `languages.toml` file in a user's [configuration directory](./configuration.md) are merged with helix's defaults on start-up, such that a user's settings will take precedence over defaults in the event of a collision. For example, the default `languages.toml` sets rust's `auto-format` to `true`. If a user wants to disable auto-format, they can change the `languages.toml` in their [configuration directory](./configuration.md) to make the rust entry read like the example below; the new key/value pair `auto-format = false` will override the default when the two sets of settings are merged on start-up:
```
```toml
# in <config_dir>/helix/languages.toml
[[language]]
name = "rust"
auto-format = false
```
## Tree-sitter grammars
Tree-sitter grammars can also be configured in `languages.toml`:
If a user has a `languages.toml`, only the grammars in that `languages.toml` are evaluated when running `hx --fetch-grammars` and `hx --build-grammars`. Otherwise the [default `languages.toml`](https://github.com/helix-editor/helix/blob/master/languages.toml) is used.