* significantly improve treesitter performance while editing large files
* Apply stylistic suggestions from code review
Co-authored-by: Michael Davis <mcarsondavis@gmail.com>
* use PartialEq and Hash instead of a freestanding function
Co-authored-by: Michael Davis <mcarsondavis@gmail.com>
The current `:tree-sitter-subtree` has a bug for field-names when the
field name belongs to an unnamed child node. Take this ruby example:
    def self.method_name
      true
    end
The subtree given by tree-sitter-cli is:
    (singleton_method [2, 0] - [4, 3]
      object: (self [2, 4] - [2, 8])
      name: (identifier [2, 9] - [2, 20])
      body: (body_statement [3, 2] - [3, 6]
        (true [3, 2] - [3, 6])))
But the `:tree-sitter-subtree` output was:
    (singleton_method
      object: (self)
      body: (identifier)
      (body_statement (true)))
The `singleton_method` rule defines the `name` and `body` fields in an
unnamed helper rule `_method_rest` and the old implementation of
`pretty_print_tree_impl` would pass the `field_name` down from the
named `singleton_method` node.
To fix it we switch to the [TreeCursor] API which is recommended by
the tree-sitter docs for traversing the tree. `TreeCursor::field_name`
accurately determines the field name for the current cursor position
even when the node is unnamed.
[TreeCursor]: https://docs.rs/tree-sitter/0.20.9/tree_sitter/struct.TreeCursor.html
This fixes an edge case for completing shellwords. With a file
"a b.txt" in the current directory, the sequence `:open a\<tab>`
will result in the prompt containing `:open aa\ b.txt`. The length of
the input to trim when inserting a completion was calculated from the
word after it had been parsed by shellwords and then re-escaped in a
separate operation. That round trip is lossy: in this case it loses the
trailing backslash.
The fix provided here refactors shellwords to track both the _words_
(shellwords with quotes and escapes resolved) and the _parts_ (chunks
of the input which turned into each word, with separating whitespace
removed). When calculating how much of the input to delete when
replacing with the completion item, we now use the length of the last
part.
This also allows us to eliminate the duplicate work done in the
`ends_with_whitespace` check.
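A minimal sketch of the words/parts idea, assuming a heavily simplified
parser that only understands backslash escapes and whitespace separators
(no quoting). The names `Shellwords`, `parse` and `completion_trim_len`
are hypothetical and are not taken from the actual implementation:

```rust
/// Simplified, hypothetical stand-in for the shellwords parser.
#[derive(Debug, Default, PartialEq)]
struct Shellwords {
    /// Words with escapes resolved.
    words: Vec<String>,
    /// Raw chunks of the input that produced each word.
    parts: Vec<String>,
    /// True when the input ends with unescaped whitespace.
    ends_with_whitespace: bool,
}

fn parse(input: &str) -> Shellwords {
    let mut result = Shellwords::default();
    let mut word = String::new();
    let mut part = String::new();
    let mut chars = input.chars();

    while let Some(c) = chars.next() {
        result.ends_with_whitespace = false;
        if c == '\\' {
            // The raw part always keeps the backslash. The word only gets
            // the escaped character, so a trailing backslash is dropped
            // from the word but preserved in the part.
            part.push(c);
            if let Some(escaped) = chars.next() {
                part.push(escaped);
                word.push(escaped);
            }
        } else if c.is_whitespace() {
            // Unescaped whitespace ends the current word, if there is one.
            result.ends_with_whitespace = true;
            if !part.is_empty() {
                result.words.push(std::mem::take(&mut word));
                result.parts.push(std::mem::take(&mut part));
            }
        } else {
            part.push(c);
            word.push(c);
        }
    }
    if !part.is_empty() {
        result.words.push(word);
        result.parts.push(part);
    }
    result
}

/// Number of chars to delete from the end of the input before inserting
/// a completion item: the length of the last raw part, or nothing if the
/// input already ends with a separator.
fn completion_trim_len(input: &str) -> usize {
    let parsed = parse(input);
    if parsed.ends_with_whitespace {
        return 0;
    }
    parsed.parts.last().map_or(0, |part| part.chars().count())
}
```

With the input `open a\`, the last word is `a` but the last part is `a\`,
so two characters are trimmed rather than one.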
* add tree-sitter match limit to avoid slowdowns for larger files
Affects all tree-sitter queries and should speed up both
syntax highlighting and text object queries.
This has been shown to fix significant slowdowns with textobjects
for Rust files as small as 3k LOC.
* Apply suggestions from code review
Co-authored-by: Blaž Hrastnik <blaz@mxxn.io>
`deepest_preceding` is known to be a descendant of `node`. Repeated
calls of `Node::parent` _should_ eventually turn `deepest_preceding`
into `node`, but when the node is errored (the tree contains a syntax
error), `Node::parent` returns None.
In the typescript case:
    if(true) &&true
    // ^ press enter here
The tree is:
    (program [0, 0] - [1, 0]
      (if_statement [0, 0] - [0, 15]
        condition: (parenthesized_expression [0, 2] - [0, 8]
          (true [0, 3] - [0, 7]))
        consequence: (expression_statement [0, 8] - [0, 15]
          (binary_expression [0, 8] - [0, 15]
            left: (identifier [0, 8] - [0, 8])
            right: (true [0, 11] - [0, 15])))))
`node` is the `program` node and `deepest_preceding` is the
`binary_expression`. The tree is errored on the `binary_expression`
node with `(MISSING identifier [0, 8] - [0, 8])`.
In the C++ case:
    ; <<
    // press enter after the ';'
The tree is:
    (translation_unit [0, 0] - [1, 0]
      (expression_statement [0, 0] - [0, 1])
      (ERROR [0, 1] - [0, 4]
        (identifier [0, 1] - [0, 1])))
`node` is the `translation_unit` and `deepest_preceding` is the `ERROR`
node.
In both cases, `Node::parent` on the errored node returns None.
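The text above does not spell out the fix itself. One defensive pattern,
sketched here on a toy tree rather than the tree-sitter API, is to stop
climbing as soon as the parent lookup returns `None` instead of assuming
the target ancestor is always reached. The `parents` slice and the
`climb_to` function are hypothetical:

```rust
/// Toy tree: `parents[i]` is the parent of node `i`, or `None`.
/// A `None` on a non-root node models the errored-node case described
/// above. This is NOT the tree-sitter API.
///
/// Returns the nodes visited and whether `target` was actually reached.
fn climb_to(parents: &[Option<usize>], start: usize, target: usize) -> (Vec<usize>, bool) {
    let mut visited = vec![start];
    let mut current = start;
    while current != target {
        match parents[current] {
            Some(parent) => {
                visited.push(parent);
                current = parent;
            }
            // No parent available: bail out rather than unwrap or spin.
            None => return (visited, false),
        }
    }
    (visited, true)
}
```

Here node 0 plays the role of `node` and the start index plays the role of
`deepest_preceding`; a caller can fall back to some default when the
second tuple element is `false`.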
This changes the completion items to be rendered with shellword
escaping, so a file `a b.txt` is rendered as `a\ b.txt`, which matches
how it must be typed in the prompt.
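As a rough illustration of the rendering described above, assuming that
only spaces need a backslash (the real escaping rules may cover more
characters), a hypothetical helper could look like this:

```rust
/// Hypothetical helper: render a completion item the way it would have
/// to be typed in the prompt. Assumption: only spaces are escaped.
fn escape_for_prompt(item: &str) -> String {
    item.replace(' ', "\\ ")
}
```

Under that assumption `a b.txt` becomes `a\ b.txt`, while names without
spaces are returned unchanged.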
8584b38cfb switched to shellwords for
completion in command-mode. That switch changed the conditions for
choosing whether to complete the command or use the command's completer.
This change processes the input as shellwords up-front and uses
shellword logic about whitespace to determine whether the command
or argument should be completed.
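A small sketch of that decision, assuming plain whitespace splitting in
place of real shellword parsing (so escaped or quoted whitespace is not
handled). `CompletionTarget` and `completion_target` are hypothetical
names:

```rust
#[derive(Debug, PartialEq)]
enum CompletionTarget<'a> {
    /// Still typing the command name itself.
    Command(&'a str),
    /// The command is complete; `partial` is the argument typed so far.
    Argument { command: &'a str, partial: &'a str },
}

fn completion_target(input: &str) -> CompletionTarget<'_> {
    let ends_with_whitespace = input.ends_with(char::is_whitespace);
    let mut words = input.split_whitespace();
    let command = match words.next() {
        Some(command) => command,
        None => return CompletionTarget::Command(""),
    };
    match words.last() {
        // Trailing whitespace means a new, empty argument has started.
        _ if ends_with_whitespace => CompletionTarget::Argument { command, partial: "" },
        Some(partial) => CompletionTarget::Argument { command, partial },
        // A single word with no trailing whitespace is the command.
        None => CompletionTarget::Command(command),
    }
}
```

In this sketch `op` completes the command, while `open ` and `open fi`
both hand over to the argument completer.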
The change in d801a6693c to search for
suffixes in `file-types` is too permissive: files like the tutor or
`*.txt` files are now mistakenly interpreted as R or Perl,
respectively.
This change introduces an explicit syntax for a `file-types` entry that
matches by suffix:
```toml
file-types = [{ suffix = ".git/config" }]
```
File-type detection now searches for any non-suffix patterns first, and
then searches for suffixes only among the `file-types` entries that are
explicitly marked as suffixes.
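The two-pass lookup can be sketched as follows. The `FileType` and
`Language` types and the `detect` function are hypothetical
simplifications for illustration, not the editor's actual data model:

```rust
use std::path::Path;

/// Hypothetical mirror of a `file-types` entry.
enum FileType {
    /// Plain entry: matches the extension or the exact file name.
    Extension(String),
    /// Explicit `{ suffix = "..." }` entry: matches the end of the path.
    Suffix(String),
}

struct Language {
    name: &'static str,
    file_types: Vec<FileType>,
}

fn detect(languages: &[Language], path: &Path) -> Option<&'static str> {
    let file_name = path.file_name().and_then(|name| name.to_str());
    let extension = path.extension().and_then(|ext| ext.to_str());

    // First pass: non-suffix entries only.
    let by_name = languages.iter().find(|lang| {
        lang.file_types.iter().any(|file_type| match file_type {
            FileType::Extension(ext) => {
                file_name == Some(ext.as_str()) || extension == Some(ext.as_str())
            }
            FileType::Suffix(_) => false,
        })
    });

    // Second pass: only entries explicitly marked as suffixes.
    let by_suffix = || {
        languages.iter().find(|lang| {
            lang.file_types.iter().any(|file_type| match file_type {
                FileType::Suffix(suffix) => path
                    .to_str()
                    .map_or(false, |p| p.ends_with(suffix.as_str())),
                FileType::Extension(_) => false,
            })
        })
    };

    by_name.or_else(by_suffix).map(|lang| lang.name)
}
```

With an `r` extension entry and a `.git/config` suffix entry, a file
named `tutor` no longer matches anything, while `repo/.git/config` is
still picked up by the suffix pass.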
Just like for grammars we currently force a lower-case of the name for
some actions (like filesystem lookup). To make this consistent and less
surprising for users, we remove this lower-casing here.
Note: naming both the language and the grammar in lower-case is still
preferred.
Signed-off-by: Christian Speich <cspeich@emlix.com>
* Fix shellwords delimiter handling
This allows commands such as `:set statusline.center ["file-type"]` to
work. Previously, the quotes within the list broke parsing.
Also added a test to ensure correct behavior.
* Rename Delimiter -> OnWhitespace
* Fix test::print for Unicode
The print function did not generate correct translations when
the input contained Unicode (non-ASCII) characters. This is due to its
use of `String::len`, which gives the length in bytes, not chars.
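The byte/char mismatch is easy to reproduce with the standard library
alone. The helper below is a hypothetical illustration (it is not the
test helper itself) that converts the byte index returned by `str::find`
into a char offset:

```rust
/// Char offset of the first occurrence of `marker` in `text`.
/// `str::find` returns a byte index, so it is converted by counting the
/// chars in the prefix before it.
fn char_offset_of(text: &str, marker: char) -> Option<usize> {
    let byte_index = text.find(marker)?;
    Some(text[..byte_index].chars().count())
}
```

For `"h\u{e9}llo|"` the marker `|` sits at byte index 6 but char offset 5,
because U+00E9 takes two bytes in UTF-8.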
* Fix multi-code point auto pairs
The current code for auto pairs counts offsets by summing the
lengths of the open and closing chars with `char::len_utf8`. Unfortunately,
this yields bytes, and the offset needs to be in chars.
Additionally, a preexisting bug was discovered: the selection was not
computed correctly when the cursor met all of these conditions:
1. it is a single grapheme in width
2. that grapheme is more than one char
3. its direction is backwards
4. it is a secondary range
In this case, the offset was not being added to the anchor. This is now
fixed.
* migrate auto pairs tests to integration
* review comments
* feat(syntax): add strategy to associate file to language through pattern
File path will match if it ends with any of the file types provided in the config.
Also used this feature to add support for the .git/config and .ssh/config files
* Add /etc/ssh/ssh_config to languages.toml
* cargo xtask docgen
* Update languages.md
* Update languages.md
* Update book/src/languages.md
Co-authored-by: Ivan Tham <pickfire@riseup.net>
* Update book/src/languages.md
Co-authored-by: Ivan Tham <pickfire@riseup.net>
Info logs don't show up in the log file by default, but this line
should: failures to load tree-sitter parser objects are useful errors.
A parser might fail to load if it is misconfigured
(https://github.com/helix-editor/helix/pull/4303#discussion_r996448543)
or if the file does not exist.
This changes the behavior of operations like `]f`/`[f` to set the
direction of the new range to the direction of the action.
The original behavior was to always use the head of the next function.
This is inconsistent with the behavior of goto_next_paragraph and makes
it impossible to create extend variants of the textobject motions.
This causes a behavior change when there are nested functions. In the
parent commit, repeated uses of `]f` select every function in the file,
even nested ones. With this commit, nested functions are skipped.
The original behavior can still be emulated by using the
`ensure_selections_forward` (A-:) command between invocations of `]f`.
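To make the direction rule concrete, here is a toy model of a range with
an anchor and a head. The `Direction` and `Range` types and the
`range_for_motion` function are hypothetical and only illustrate the
idea that the head ends up on the side the motion travels toward:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Direction {
    Forward,
    Backward,
}

/// Toy range: `anchor` stays put, `head` is the moving end.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Range {
    anchor: usize,
    head: usize,
}

/// Build the range for a textobject spanning `start..end`, oriented in
/// the direction of the motion that selected it.
fn range_for_motion(start: usize, end: usize, direction: Direction) -> Range {
    match direction {
        Direction::Forward => Range { anchor: start, head: end },
        Direction::Backward => Range { anchor: end, head: start },
    }
}
```

In this model a forward motion such as `]f` leaves the head at the end of
the function, and a backward motion such as `[f` leaves it at the start.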
* Split helix_core::find_root and helix_loader::find_local_config_dirs
The documentation of find_root described the following priority for
detecting a project root:
- Top-most folder containing a root marker in current git repository
- Git repository root if no marker detected
- Top-most folder containing a root marker if no git repository detected
- Current working directory as fallback
The commit contained in https://github.com/helix-editor/helix/pull/1249
extracted and changed the implementation of find_root in find_root_impl,
actually reversing its result order (since that is the order that made
sense for the local configuration merge, from innermost to outermost
ancestors).
Since the two uses of find_root_impl have different requirements (and
it's not a matter of reversing the order of results since, e.g., the top
repository dir should be used by find_root only if there's no marker in
other dirs), this PR splits the implementation into two
specialized functions.
In doing so, find_root_impl is removed and the implementation is moved
back into find_root, bringing it closer to the documented behaviour and
making it easier to verify that it is actually correct.
* helix-core: remove Option from find_root return type
It always returns some result, so Option is not needed
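The documented priority order can be sketched on an in-memory list of
ancestor directories, ordered from the current working directory outward.
The `Dir` struct and this `find_root` signature are assumptions made for
illustration; the real function works on the filesystem:

```rust
/// One ancestor directory; the slice passed to `find_root` is ordered
/// from the current working directory outward.
struct Dir {
    path: &'static str,
    has_root_marker: bool,
    is_git_root: bool,
}

fn find_root(ancestors: &[Dir]) -> &'static str {
    let mut top_marker = None;
    for dir in ancestors {
        if dir.has_root_marker {
            // Later (outer) directories overwrite earlier ones, so this
            // ends up holding the top-most marker seen so far.
            top_marker = Some(dir.path);
        }
        if dir.is_git_root {
            // Stop at the repository boundary: prefer the top-most
            // marker inside the repository, else the repository root.
            return top_marker.unwrap_or(dir.path);
        }
    }
    // No git repository: top-most marker, else the working directory.
    top_marker
        .or_else(|| ancestors.first().map(|dir| dir.path))
        .unwrap_or(".")
}
```

Because every branch yields a path, the sketch returns a plain value
rather than an `Option`.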