Adding indent queries
Helix uses tree-sitter to correctly indent new lines. This requires a tree-
sitter grammar and an indent.scm
query file placed in runtime/queries/ {language}/indents.scm
. The indentation for a line is calculated by traversing
the syntax tree from the lowest node at the beginning of the new line (see
Indent queries). Each of these nodes contributes to the total
indent when it is captured by the query (in what way depends on the name of
the capture.
Note that it matters where these added indents begin. For example, multiple indent level increases that start on the same line only increase the total indent level by 1. See Capture types.
Indent queries
When Helix is inserting a new line through o
, O
, or <ret>
, to determine
the indent level for the new line, the query in indents.scm
is run on the
document. The starting position of the query is the end of the line above where
a new line will be inserted.
For o
, the inserted line is the line below the cursor, so that starting
position of the query is the end of the current line.
#![allow(unused)] fn main() { fn need_hero(some_hero: Hero, life: Life) -> { matches!(some_hero, Hero { // ←─────────────────╮ strong: true,//←╮ ↑ ↑ │ fast: true, // │ │ ╰── query start │ sure: true, // │ ╰───── cursor ├─ traversal soon: true, // ╰──────── new line inserted │ start node }) && // │ // ↑ │ // ╰───────────────────────────────────────────────╯ some_hero > life } }
For O
, the newly inserted line is the current line, so the starting position
of the query is the end of the line above the cursor.
#![allow(unused)] fn main() { fn need_hero(some_hero: Hero, life: Life) -> { // ←─╮ matches!(some_hero, Hero { // ←╮ ↑ │ strong: true,// ↑ ╭───╯ │ │ fast: true, // │ │ query start ─╯ │ sure: true, // ╰───┼ cursor ├─ traversal soon: true, // ╰ new line inserted │ start node }) && // │ some_hero > life // │ } // ←──────────────────────────────────────────────╯ }
From this starting node, the syntax tree is traversed up until the root node. Each indent capture is collected along the way, and then combined according to their capture types and scopes to a final indent level for the line.
Capture types
@indent
(default scopetail
): Increase the indent level by 1. Multiple occurrences in the same line do not stack. If there is at least one@indent
and one@outdent
capture on the same line, the indent level isn't changed at all.@outdent
(default scopeall
): Decrease the indent level by 1. The same rules as for@indent
apply.@indent.always
(default scopetail
): Increase the indent level by 1. Multiple occurrences on the same line do stack. The final indent level is@indent.always
–@outdent.always
. If an@indent
and an@indent.always
are on the same line, the@indent
is ignored.@outdent.always
(default scopeall
): Decrease the indent level by 1. The same rules as for@indent.always
apply.@align
(default scopeall
): Align everything inside this node to some anchor. The anchor is given by the start of the node captured by@anchor
in the same pattern. Every pattern with an@align
should contain exactly one@anchor
. Indent (and outdent) for nodes below (in terms of their starting line) the@align
node is added to the indentation required for alignment.@extend
: Extend the range of this node to the end of the line and to lines that are indented more than the line that this node starts on. This is useful for languages like Python, where for the purpose of indentation some nodes (like functions or classes) should also contain indented lines that follow them.@extend.prevent-once
: Prevents the first extension of an ancestor of this node. For example, in Python a return expression always ends the block that it is in. Note that this only stops the extension of the next@extend
capture. If multiple ancestors are captured, only the extension of the innermost one is prevented. All other ancestors are unaffected (regardless of whether the innermost ancestor would actually have been extended).
@indent
/ @outdent
Consider this example:
#![allow(unused)] fn main() { fn shout(things: Vec<Thing>) { // ↑ // ├───────────────────────╮ indent level // @indent ├┄┄┄┄┄┄┄┄┄┄┄┄┄┄ // │ let it_all = |out| { things.filter(|thing| { // │ 1 // ↑ ↑ │ // ├───────────────────────┼─────┼┄┄┄┄┄┄┄┄┄┄┄┄┄┄ // @indent @indent │ // │ 2 thing.can_do_with(out) // │ })}; // ├┄┄┄┄┄┄┄┄┄┄┄┄┄┄ //↑↑↑ │ 1 } //╰┼┴──────────────────────────────────────────────┴┄┄┄┄┄┄┄┄┄┄┄┄┄┄ // 3x @outdent }
((block) @indent)
["}" ")"] @outdent
Note how on the second line, we have two blocks begin on the same line. In this
case, since both captures occur on the same line, they are combined and only
result in a net increase of 1. Also note that the closing }
s are part of the
@indent
captures, but the 3 @outdent
s also combine into 1 and result in that
line losing one indent level.
@extend
/ @extend.prevent-once
For an example of where @extend
can be useful, consider Python, which is
whitespace-sensitive.
]
(parenthesized_expression)
(function_definition)
(class_definition)
] @indent
class Hero:
def __init__(self, strong, fast, sure, soon):# ←─╮
self.is_strong = strong # │
self.is_fast = fast # ╭─── query start │
self.is_sure = sure # │ ╭─ cursor │
self.is_soon = soon # │ │ │
# ↑ ↑ │ │ │
# │ ╰──────╯ │ │
# ╰─────────────────────╯ │
# ├─ traversal
def need_hero(self, life): # │ start node
return ( # │
self.is_strong # │
and self.is_fast # │
and self.is_sure # │
and self.is_soon # │
and self > life # │
) # ←─────────────────────────────────────────╯
Without braces to catch the scope of the function, the smallest descendant of the cursor on a line feed ends up being the entire inside of the class. Because of this, it will miss the entire function node and its indent capture, leading to an indent level one too small.
To address this case, @extend
tells helix to "extend" the captured node's span
to the line feed and every consecutive line that has a greater indent level than
the line of the node.
(parenthesized_expression) @indent
]
(function_definition)
(class_definition)
] @indent @extend
class Hero:
def __init__(self, strong, fast, sure, soon):# ←─╮
self.is_strong = strong # │
self.is_fast = fast # ╭─── query start ├─ traversal
self.is_sure = sure # │ ╭─ cursor │ start node
self.is_soon = soon # │ │ ←───────────────╯
# ↑ ↑ │ │
# │ ╰──────╯ │
# ╰─────────────────────╯
def need_hero(self, life):
return (
self.is_strong
and self.is_fast
and self.is_sure
and self.is_soon
and self > life
)
Furthermore, there are some cases where extending to everything with a greater
indent level may not be desirable. Consider the need_hero
function above. If
our cursor is on the last line of the returned expression.
class Hero:
def __init__(self, strong, fast, sure, soon):
self.is_strong = strong
self.is_fast = fast
self.is_sure = sure
self.is_soon = soon
def need_hero(self, life):
return (
self.is_strong
and self.is_fast
and self.is_sure
and self.is_soon
and self > life
) # ←─── cursor
#←────────── where cursor should go on new line
In Python, the are a few tokens that will always end a scope, such as a return
statement. Since the scope ends, so should the indent level. But because the
function span is extended to every line with a greater indent level, a new line
would just continue on the same level. And an @outdent
would not help us here
either, since it would cause everything in the parentheses to become outdented
as well.
To help, we need to signal an end to the extension. We can do this with
@extend.prevent-once
.
(parenthesized_expression) @indent
]
(function_definition)
(class_definition)
] @indent @extend
(return_statement) @extend.prevent-once
@indent.always
/ @outdent.always
As mentioned before, normally if there is more than one @indent
or @outdent
capture on the same line, they are combined.
Sometimes, there are cases when you may want to ensure that every indent capture is additive, regardless of how many occur on the same line. Consider this example in YAML.
- foo: bar
# ↑ ↑
# │ ╰─────────────── start of map
# ╰───────────────── start of list element
baz: quux # ←─── cursor
# ←───────────── where the cursor should go on a new line
garply: waldo
- quux:
bar: baz
xyzzy: thud
fred: plugh
In YAML, you often have lists of maps. In these cases, the syntax is such that
the list element and the map both start on the same line. But we really do want
to start an indentation for each of these so that subsequent keys in the map
hang over the list and align properly. This is where @indent.always
helps.
((block_sequence_item) @item @indent.always @extend
(#not-one-line? @item))
((block_mapping_pair
key: (_) @key
value: (_) @val
(#not-same-line? @key @val)
) @indent.always @extend
)
Predicates
In some cases, an S-expression cannot express exactly what pattern should be matched.
For that, tree-sitter allows for predicates to appear anywhere within a pattern,
similar to how #set!
declarations work:
(some_kind
(child_kind) @indent
(#predicate? arg1 arg2 ...)
)
The number of arguments depends on the predicate that's used.
Each argument is either a capture (@name
) or a string ("some string"
).
The following predicates are supported by tree-sitter:
-
#eq?
/#not-eq?
: The first argument (a capture) must/must not be equal to the second argument (a capture or a string). -
#match?
/#not-match?
: The first argument (a capture) must/must not match the regex given in the second argument (a string).
Additionally, we support some custom predicates for indent queries:
-
#not-kind-eq?
: The kind of the first argument (a capture) must not be equal to the second argument (a string). -
#same-line?
/#not-same-line?
: The captures given by the 2 arguments must/must not start on the same line. -
#one-line?
/#not-one-line?
: The captures given by the fist argument must/must span a total of one line.
Scopes
Added indents don't always apply to the whole node. For example, in most cases when a node should be indented, we actually only want everything except for its first line to be indented. For this, there are several scopes (more scopes may be added in the future if required):
tail
: This scope applies to everything except for the first line of the captured node.all
: This scope applies to the whole captured node. This is only different fromtail
when the captured node is the first node on its line.
For example, imagine we have the following function
#![allow(unused)] fn main() { fn aha() { // ←─────────────────────────────────────╮ let take = "on me"; // ←──────────────╮ scope: │ let take = "me on"; // ├─ "tail" ├─ (block) @indent let ill = be_gone_days(1 || 2); // │ │ } // ←───────────────────────────────────┴──────────┴─ "}" @outdent // scope: "all" }
We can write the following query with the #set!
declaration:
((block) @indent
(#set! "scope" "tail"))
("}" @outdent
(#set! "scope" "all"))
As we can see, the "tail" scope covers the node, except for the first line.
Everything up to and including the closing brace gets an indent level of 1.
Then, on the closing brace, we encounter an outdent with a scope of "all", which
means the first line is included, and the indent level is cancelled out on this
line. (Note these scopes are the defaults for @indent
and @outdent
—they are
written explicitly for demonstration.)