The change in d801a6693c to search for
suffixes in `file-types` is too permissive: files like the tutor or
`*.txt` files are now mistakenly interpreted as R or perl,
respectively.
This change changes the syntax for specifying a file-types entry that
matches by suffix:
```toml
file-types = [{ suffix = ".git/config" }]
```
And changes the file-type detection to first search for any non-suffix
patterns and then search for suffixes only with the file-types entries
marked explicitly as suffixes.
Just like for grammars we currently force a lower-case of the name for
some actions (like filesystem lookup). To make this consistent and less
surprising for users, we remove this lower-casing here.
Note: it is still the preferred way to name both language and grammar in
lower-case
Signed-off-by: Christian Speich <cspeich@emlix.com>
* feat(syntax): add strategy to associate file to language through pattern
File path will match if it ends with any of the file types provided in the config.
Also used this feature to add support for the .git/config and .ssh/config files
* Add /etc/ssh/ssh_config to languages.toml
* cargo xtask docgen
* Update languages.md
* Update languages.md
* Update book/src/languages.md
Co-authored-by: Ivan Tham <pickfire@riseup.net>
* Update book/src/languages.md
Co-authored-by: Ivan Tham <pickfire@riseup.net>
Co-authored-by: Ivan Tham <pickfire@riseup.net>
Info logs don't show up in the log file by default, but this line
should: failures to load tree-sitter parser objects are useful errors.
A parser might fail to load it is misconfigured
(https://github.com/helix-editor/helix/pull/4303#discussion_r996448543)
or if the file does not exist.
* Fix nondeterministic highlighting
This is done by prefering matches in the begining, ie for
`keyword.function`, `keyword` is a better match than `function`.
* Use all positions and not just leftmost
Fixes possible edgecase with something like `function.method.builtin`
and the queries `function.builtin` and `function.method`
* Switch to bitmask for slightly better performance
* Make matches from the start of string
Also change comments to match new behaviour
* Change default formatter for any language
* Fix clippy error
* Close stdin for Stdio formatters
* Better indentation and pattern matching
* Return Result<Option<...>> for fn format instead of Option
* Remove unwrap for stdin
* Handle FormatterErrors instead of Result<Option<...>>
* Use Transaction instead of LspFormatting
* Use Transaction directly in Document::format
* Perform stdin type formatting asynchronously
* Rename formatter.type values to kebab-case
* Debug format for displaying io::ErrorKind (msrv fix)
* Solve conflict?
* Use only stdio type formatters
* Remove FormatterType enum
* Remove old comment
* Check if the formatter exited correctly
* Add formatter configuration to the book
* Avoid allocations when writing to stdin and formatting errors
* Remove unused import
Co-authored-by: Gokul Soumya <gokulps15@gmail.com>
* allows passing extra formatting options to LSPs
- adds optional field 'format' to [[language]] sections in 'languages.toml'
- passes specified options the LSPs via FormattingOptions
* cleaner conversion of formatting properties
* move formatting options inside lsp::Client
* cleans up formatting properties merge
* add reflow command
Users need to be able to hard-wrap text for many applications, including
comments in code, git commit messages, plaintext documentation, etc. It
often falls to the user to manually insert line breaks where appropriate
in order to hard-wrap text.
This commit introduces the "reflow" command (both in the TUI and core
library) to automatically hard-wrap selected text to a given number of
characters (defined by Unicode "extended grapheme clusters"). It handles
lines with a repeated prefix, such as comments ("//") and indentation.
* reflow: consider newlines to be word separators
* replace custom reflow impl with textwrap crate
* Sync reflow command docs with book
* reflow: add default max_line_len language setting
Co-authored-by: Vince Mutolo <vince@mutolo.org>
* log textobject query construction errors
The current behavior is that invalid queries are discarded silently
which makes it difficult to debug invalid textobjects (either invalid
syntax or an update may have come through that changed the valid set
of nodes).
* fix golang textobject query
`method_spec_list` used to be a named node but was removed (I think
for Helix, it was when updated to pull in the support for generics).
Instead of a named node for the list of method specs we have a bunch
of `method_spec` children nodes now. We can match on the set of them
with a `+` wildcard.
Example go for this query:
type Shape interface {
area() float64
perimeter() float64
}
Which is parsed as:
(source_file
(type_declaration
(type_spec
name: (type_identifier)
type: (interface_type
(method_spec
name: (field_identifier)
parameters: (parameter_list)
result: (type_identifier))
(method_spec
name: (field_identifier)
parameters: (parameter_list)
result: (type_identifier))))))
* Add runtime language configuration (#1794)
* Add set-language typable command to change the language of current buffer.
* Add completer for available language options.
* Update set-language to refresh language server as well
* Add language id based config lookup on `syntax::Loader`.
* Add `Document::set_language3` to set programming language based on language
id.
* Update `Editor::refresh_language_server` to try language detection only if
language is not already set.
* Remove language detection from Editor::refresh_language_server
* Move document language detection to where the scratch buffer is saved.
* Rename Document::set_language3 to Document::set_language_by_language_id.
* Remove unnecessary clone in completers::language
* WIP: Rework indentation system
* Add ComplexNode for context-aware indentation (including a proof of concept for assignment statements in rust)
* Add switch statements to Go indents.toml (fixes the second half of issue #1523)
Remove commented-out code
* Migrate all existing indentation queries.
Add more options to ComplexNode and use them to improve C/C++ indentation.
* Add comments & replace Option<Vec<_>> with Vec<_>
* Add more detailed documentation for tree-sitter indentation
* Improve code style in indent.rs
* Use tree-sitter queries for indentation instead of TOML config.
Migrate existing indent queries.
* Add documentation for the new indent queries.
Change xtask docgen to look for indents.scm instead of indents.toml
* Improve code style in indent.rs.
Fix an issue with the rust indent query.
* Move indentation test sources to separate files.
Add `#not-kind-eq?`, `#same-line?` and `#not-same-line` custom predicates.
Improve the rust and c indent queries.
* Fix indent test.
Improve rust indent queries.
* Move indentation tests to integration test folder.
* Improve code style in indent.rs.
Reuse tree-sitter cursors for indentation queries.
* Migrate HCL indent query
* Replace custom loading in indent tests with a designated languages.toml
* Update indent query file name for --health command.
* Fix single-space formatting in indent queries.
* Add explanation for unwrapping.
Co-authored-by: Triton171 <triton0171@gmail.com>
This avoids costly conversions via byte_to_char (which are then
reversed back into bytes internally in Ropey).
Reduces time spent in slice/byte_to_char from ~24% to ~5%.
This is a rather large refactor that moves most of the code for
loading, fetching, and building grammars into a new helix-loader
module. This works well with the [[grammars]] syntax for
languages.toml defined earlier: we only have to depend on the types
for GrammarConfiguration in helix-loader and can leave all the
[[language]] entries for helix-core.
The vision with 'use-grammars' is to allow the long-requested feature
of being able to declare your own set of grammars that you would like.
A simple schema with only/except grammar names controls the list
of grammars that is fetched and built. It does not (yet) control which
grammars may be loaded at runtime if they already exist.
This is not strictly speaking necessary. tree_sitter_library was used by
just one grammar: llvm-mir-yaml, which uses the yaml grammar. This will
make the language more consistent, though. Each language can explicitly
say that they use Some(grammar), defaulting when None to the grammar that
has a grammar_id matching the language's language_id.
helix-syntax mostly existed for the sake of the build task which
checks and compiles the submodules. Since we won't be relying on
that process anymore, it doesn't end up making much sense to have
a very thin crate just for some functions that we could port to
helix-core.
The remaining build-related code is moved to helix-term which will
be able to provide grammar builds through the --build-grammars CLI
flag.
Here we add syntax to the languages.toml languge
[[grammar]]
name = "<name>"
source = { .. }
Which can be used to specify a tree-sitter grammar separately of
the language that defines it, and we make this distinction for
two reasons:
* In later commits, we will separate this code from helix-core
and bring it to a new helix-loader crate. Using separate schemas
for language and grammar configurations allows for a nice divide
between the types needed to be declared in helix-loader and in
helix-core/syntax
* Two different languages may use the same grammar. This is currently
the case with llvm-mir-yaml and yaml. We could accomplish a config
that works for this with just `[[languages]]`, but it gets a bit
dicey with languages depending on one another. If you enable
llvm-mir-yaml and disable yaml, does helix still need to fetch and
build tree-sitter-yaml? It could be a matter of interpretation.
* Move runtime file location definitions to core
* Add basic --health command
* Add language specific --health
* Show summary for all langs with bare --health
* Use TsFeature from xtask for --health
* cargo fmt
Co-authored-by: Blaž Hrastnik <blaz@mxxn.io>
Treesitter captures can contain multiple nodes like so:
```
(line_comment)+ @comment
```
This would match each line in a comment as a separate
`@comment` capture when what we actually want is the
whole set of contiguous `line_comment` nodes to be
captured under the `@comment` capture. This commit enables
this behaviour.
* impl auto pairs config
Implements configuration for which pairs of tokens get auto completed.
In order to help with this, the logic for when *not* to auto complete
has been generalized from a specific hardcoded list of characters to
simply testing if the next/prev char is alphanumeric.
It is possible to configure a global list of pairs as well as at the
language level. The language config will take precedence over the
global config.
* rename AutoPair -> Pair
* clean up insert_char command
* remove Rc
* remove some explicit cloning with another impl
* fix lint
* review comments
* global auto-pairs = false takes precedence over language settings
* make clippy happy
* print out editor config on startup
* move auto pairs accessor into Document
* rearrange auto pair doc comment
* use pattern in Froms
* Add injection regex for more languages
To support embedding them in other languages like markdown.
* Add llvm-mir highlighting
LLVM Machine IR is dumped as yaml files that can embed LLVM IR and
Machine IR.
To support this, add a llvm-mir-yaml language that uses the yaml
parser, but uses different injections to highlight IR and MIR.
* Update submodule with fixed multiline comments
Co-authored-by: Blaž Hrastnik <blaz@mxxn.io>
* feat(lsp): configurable diagnostic severity
Allow severity of diagnostic messages to be configured.
E.g. allow turning of Hint level diagnostics.
Fixes: https://github.com/helix-editor/helix/issues/1007
* Use language_config() method
* Add documentation for diagnostic_severity
* Use unreachable for unknown severity level
* fix: documentation for diagnostic_severity config
* Add treesitter textobject queries
Only for Go, Python and Rust for now.
* Add tree-sitter textobjects
Only has functions and class objects as of now.
* Fix tests
* Add docs for tree-sitter textobjects
* Add guide for creating new textobject queries
* Add parameter textobject
Only parameter.inside is implemented now, parameter.around
will probably require custom predicates akin to nvim' `make-range`
since we want to select a trailing comma too (a comma will be
an anonymous node and matching against them doesn't work similar
to named nodes)
* Simplify TextObject cell init
* allow language.config (in languages.toml) to be passed in as a toml object
* Change config field for languages from json string to toml object
* remove indents on languages.toml config
* fix: remove patch version from serde_json import in helix-core
* Use same tree-sitter-zig as upstream/master