The new `RopeSliceExt::ceil_char_boundary` from the parent commits can
be used to implement `RopeSliceExt::byte_to_next_char` when used with
`RopeSlice::byte_to_char`. That function had only one caller and that
caller will eventually disappear when we switch to Ropey v2 and drop
character indexing, so we can drop `byte_to_next_char` now and replace
its caller with `byte_to_char` plus `ceil_char_boundary`.
This change keeps the unit tests for `byte_to_next_char` and checks them
against a polyfill of `byte_to_char` plus `ceil_char_boundary` to ensure
that `byte_to_next_char`'s intended behavior is not changed.
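
The polyfill can be sketched on a plain `&str` rather than a rope. This is
only an illustration under that assumption: the free functions below are
stand-ins written for this example and are not the `RopeSliceExt` or
`RopeSlice` methods themselves.

```rust
/// Stand-in for `ceil_char_boundary`: round `byte_idx` up to the nearest
/// character boundary, clamping to the end of the text.
fn ceil_char_boundary(text: &str, byte_idx: usize) -> usize {
    let mut idx = byte_idx.min(text.len());
    // `text.len()` is always a boundary, so this loop terminates.
    while !text.is_char_boundary(idx) {
        idx += 1;
    }
    idx
}

/// Stand-in for `byte_to_char`: the number of characters that end at or
/// before `byte_idx`, which must already lie on a character boundary.
fn byte_to_char(text: &str, byte_idx: usize) -> usize {
    text[..byte_idx].chars().count()
}

/// The polyfill: snap the byte index forward to a boundary, then convert
/// that boundary to a character index.
fn byte_to_next_char(text: &str, byte_idx: usize) -> usize {
    byte_to_char(text, ceil_char_boundary(text, byte_idx))
}
```

A byte index in the middle of a multi-byte character maps to the index of
the following character, while an index already on a boundary is
converted unchanged.
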
This is a good example use case for the `floor_char_boundary` and
`ceil_char_boundary` functions added in the parent commit. In the
single-width, single-selection case in `goto_file` we cap the search
to either the current line or 1000 bytes before or after the cursor
(whichever limit is reached first). That byte index might not lie on a
character boundary so it needs to be adjusted to either the prior or
later boundary.
These functions mimic `str::floor_char_boundary` and
`str::ceil_char_boundary` (currently unstable under
`round_char_boundary`). They're useful for correcting a byte index
which may not lie on a character boundary. For example you might limit
a search within a slice to some fixed number of bytes. The fixed number
might not lie on a boundary though so it needs to be corrected to
either the earlier (floor) or later (ceil) boundary.
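
As a rough sketch of the idea on a plain `&str` (not the rope-based
versions), the two roundings can be built on the stable
`str::is_char_boundary`. The `search_window` helper and its `limit`
parameter are hypothetical, and flooring the start while ceiling the end
is one possible choice assumed here.

```rust
/// Round `byte_idx` down to the nearest character boundary.
fn floor_char_boundary(text: &str, byte_idx: usize) -> usize {
    let mut idx = byte_idx.min(text.len());
    // Index 0 is always a boundary, so this loop terminates.
    while !text.is_char_boundary(idx) {
        idx -= 1;
    }
    idx
}

/// Round `byte_idx` up to the nearest character boundary.
fn ceil_char_boundary(text: &str, byte_idx: usize) -> usize {
    let mut idx = byte_idx.min(text.len());
    // `text.len()` is always a boundary, so this loop terminates.
    while !text.is_char_boundary(idx) {
        idx += 1;
    }
    idx
}

/// Hypothetical helper: limit a search to `limit` bytes on either side of
/// `cursor`, widening the window to the surrounding character boundaries
/// so that slicing cannot panic.
fn search_window(text: &str, cursor: usize, limit: usize) -> &str {
    let start = floor_char_boundary(text, cursor.saturating_sub(limit));
    let end = ceil_char_boundary(text, cursor.saturating_add(limit));
    &text[start..end]
}
```

Slicing at the raw `cursor - limit` or `cursor + limit` offsets would
panic whenever either one lands inside a multi-byte character, which is
the case these roundings guard against.
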
This change adds tree-sitter grammar caching to the Check, Lints and
Docs jobs, which all previously downloaded grammars in the `helix-term`
build script fresh per job. This should increase reliability and mitigate the
effects of an ongoing SourceHut outage
(<https://status.sr.ht/issues/2025-01-23-git.sr.ht-ddos/>).
This is also a nice speed boost for these jobs:
| Job name | Example time before | Example time after |
|--- |--- |--- |
| Check | 2m20s | 47s |
| Lints | 2m56s | 1m10s |
| Docs | 4m56s | 2m35s |
The `Name` variant's inner type can be switched to `RopeSlice` since
the parent commit removed the usage of `&str`. In doing this we need to
switch from a regular `Regex` to a `rope::Regex`, which is mostly a
matter of renaming the type.
The `Filename` and `Shebang` variants can also switch to `RopeSlice`
which avoids allocations in cases where the text doesn't reside on
different chunks of the rope. Previously `Filename`'s `Cow` was always
the owned variant because of the conversion to a `PathBuf`.
This splits `InjectionLanguageMarker::Name` into two variants: one that
performs the previous behavior (using the language configurations'
`injection_regex` fields and performing a match) and a new variant that
looks up directly by `language_id` with equality.
The old variant is used when capturing the injection language like we
do in the markdown queries for codefences. That captured text is part of
the document being highlighted so we might need a regex to recognize a
language like JavaScript as either "js" or "javascript". But the text
passed in the `(#set! injection.language "name")` property can be
looked up directly. This property is in the query code so there's no
need to be flexible in what we accept: we can require that the
`(#set! injection.language ..)` properties refer to languages by their
configured ID. This should save a noticeable amount of work for the
common case of injections: `(#set! injection.language)` is used much
more often than `@injection.language`.
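
The difference between the two lookups can be sketched as follows. This
is a simplified illustration, not the actual types: the variant names
`LanguageId` and `Match`, the trimmed-down `LanguageConfig`, and the
plain function pointer standing in for a compiled `injection_regex` are
all assumptions made for the example.

```rust
/// Hypothetical, trimmed-down language configuration.
struct LanguageConfig {
    language_id: &'static str,
    /// Stand-in for a compiled `injection_regex` (assumption: the real
    /// field holds a regex, not a function pointer).
    injection_matcher: Option<fn(&str) -> bool>,
}

/// Hypothetical variant names for the two halves of the split.
enum InjectionLanguageMarker<'a> {
    /// Text from `(#set! injection.language "name")`: compared for
    /// equality against the configured language ID.
    LanguageId(&'a str),
    /// Text captured by `@injection.language`: tested against each
    /// language's injection matcher.
    Match(&'a str),
}

/// Stand-in for a regex that accepts either spelling of JavaScript.
fn matches_javascript(text: &str) -> bool {
    matches!(text, "js" | "javascript")
}

fn example_configs() -> Vec<LanguageConfig> {
    vec![
        LanguageConfig {
            language_id: "javascript",
            injection_matcher: Some(matches_javascript as fn(&str) -> bool),
        },
        LanguageConfig {
            language_id: "rust",
            injection_matcher: None,
        },
    ]
}

fn language_for_marker<'c>(
    configs: &'c [LanguageConfig],
    marker: &InjectionLanguageMarker<'_>,
) -> Option<&'c LanguageConfig> {
    match marker {
        // Direct lookup: a string comparison per language.
        InjectionLanguageMarker::LanguageId(id) => {
            configs.iter().find(|c| c.language_id == *id)
        }
        // Flexible lookup: run every language's matcher over the text.
        InjectionLanguageMarker::Match(text) => configs
            .iter()
            .find(|c| c.injection_matcher.map_or(false, |m| m(*text))),
    }
}
```

With these stand-ins, a captured "js" resolves to the `javascript`
configuration through the matcher, while the ID lookup only accepts the
exact configured ID and never runs a matcher.
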