Bike 1.7 Preview (92)

jessegrosjean · December 5, 2022, 10:09pm

Changed to wavy spelling, grammar, and correction underlines
Changed to flash instead of slide caret for text replacements
Fixed view update when learning and ignoring spelling errors
Fixed crash when typing word breaking character mid-word
Fixed crash when pasting text that matches existing text
Fixed crash when inserting combining characters
Fixed drawing to better include diacritics

I still need to fix bugs in the code that adds/removes/updates spelling and grammar underlines…

This release mostly fixes crashes and bugs that came up recently. Please test it out if you were affected by any of the bugs or crashes!

To get preview releases through Bike’s software update select: Bike > Preferences > General > Include “preview” releases.

Bike Releases

complexpoint · December 5, 2022, 10:52pm

Composed niqqud vowels now typing well, and fully visible with the System font and standard row settings.

Thank you !

The only unfamiliar element that remains in editing these composed characters is the scope of backspace.

What I’m used to with such characters (e.g. in Microsoft Word, Visual Studio Code, Mellel, and here in Discourse) is that after adding a vowel (or other secondary element) to the composition, backspace will remove only the dot(s), rather than the whole consonant + dot composite.

So, for example, with the Hebrew - QWERTY IME:

keyboard K might type the consonant: כ
tapping the ~ key might then add a tiny dot (or dagesh) inside it: כּ
and backspace would now remove just the dot: כ

In build 92, backspace seems to remove both vowel and consonant, so that if we change our mind about the dot, we have to retype the consonant too.

( Not sure if the same pattern typically applies to other Unicode composites )

Another example, still with Hebrew - QWERTY IME

Typing the 'a' key gives: א
a following ⌥5 adds a pair of dots below: אֵ
and backspace removes just those dots: א

jessegrosjean · December 5, 2022, 11:39pm

There must be some text editing rule that I don’t understand

My general understanding and implementation always deletes back one grapheme (character) at a time. I thought that was the general rule. For example, macOS editors generally delete a character and its accent (such as ê) all at once. I recently added Control-Delete, which deletes one Unicode code point at a time instead of one grapheme at a time.

So this brings me to learning time… why is אֵ treated differently in all editors? And how do I determine which combining characters should be deleted individually with standard delete and which should only be deleted individually with Control-Delete?

Thanks for help!
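
To make the two delete behaviours described above concrete, here is a minimal Python sketch. It is not Bike's implementation: the function names are hypothetical, and "grapheme" is approximated as one base character plus any trailing combining marks (general categories Mn/Mc/Me), which is much simpler than the full segmentation rules in UAX #29.

```python
import unicodedata


def is_mark(ch: str) -> bool:
    # Assumption: treat any character in general category M* as a combining mark.
    return unicodedata.category(ch).startswith("M")


def delete_back_code_point(text: str) -> str:
    # Remove only the last code point (the Control-Delete style behaviour).
    return text[:-1]


def delete_back_cluster(text: str) -> str:
    # Skip back over trailing combining marks, then remove the base as well,
    # so the base and its marks disappear together.
    i = len(text)
    while i > 0 and is_mark(text[i - 1]):
        i -= 1
    return text[:max(i - 1, 0)]


kaf_dagesh = "\u05DB\u05BC"  # kaf followed by dagesh
print(len(delete_back_cluster(kaf_dagesh)))     # base and dot both removed
print(len(delete_back_code_point(kaf_dagesh)))  # only the dot removed
```

With the kaf + dagesh example from this thread, the cluster version leaves nothing behind, while the code point version leaves the bare consonant, which is the behaviour complexpoint describes in other editors.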

complexpoint · December 5, 2022, 11:44pm

Am I right in thinking that whereas the Hebrew consonants + vowels are always compositions, the roman letter + diacritic forms generally tend to have an uncomposed unicode position of their own (even if it may also be possible to derive them by composition) ?

jessegrosjean · December 5, 2022, 11:54pm

I’m not really sure

I do know that I can have:

ẽ: U+0065, U+0303
כּ: U+05DB, U+05BC

And all text editors seem to know to treat the first case as a single unit when deleting, and the second case as two parts when deleting. I’m not sure how they know to do it.
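
For anyone who wants to inspect the two sequences, this small Python sketch lists each code point with its Unicode name, general category, and canonical combining class, using only the standard `unicodedata` module. The helper name `describe` is made up for illustration.

```python
import unicodedata


def describe(text: str):
    # One tuple per code point: (U+XXXX, Unicode name, category, combining class).
    return [
        (
            f"U+{ord(ch):04X}",
            unicodedata.name(ch),
            unicodedata.category(ch),
            unicodedata.combining(ch),
        )
        for ch in text
    ]


for sample in ("e\u0303", "\u05DB\u05BC"):
    for row in describe(sample):
        print(row)
    print()
```

At this level both samples look alike: a base letter followed by a nonspacing mark (category Mn) with a non-zero combining class. So the difference in delete behaviour does not seem to come from these per-character properties alone.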

jessegrosjean · December 6, 2022, 12:06am

I think maybe my misunderstanding is thinking that there is a single kind of grapheme cluster? But I guess there are different kinds, and maybe “extended” grapheme clusters need to be handled differently. I’ll look into it tomorrow.

complexpoint · December 6, 2022, 9:05am

Perhaps the picture is that:

where there is a single unicode point corresponding to the outcome of a composition (mainly Roman glyphs with diacritics, like the ẽ case), these editors are performing canonical composition, so that
by the time backspace is tapped, the character behind it is often (esp Roman diacritic cases) no longer a composite anyway …

i.e. normalization to available non-composites is happening upstream by default in TextEdit etc, so the backspace logic doesn’t have anything to think about – it always removes the more recent component of a composition.

“Unicode normalization form” in the Apple interfaces ?

e.g. perhaps:

CFStringNormalizationForm | Apple Developer Documentation

which seems to support:

Normalization Form C (NFC)
(Canonical Decomposition, followed by Canonical Composition)

as defined in:

UAX #15: Unicode Normalization Forms
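
The normalization idea can be tried out with a short Python sketch. It only illustrates the hypothesis (normalize to NFC, then delete the last code point); it does not claim that TextEdit or any other editor actually works this way, and the function names are hypothetical.

```python
import unicodedata


def nfc_code_points(text: str):
    # Code points of the text after Normalization Form C.
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


def backspace_after_nfc(text: str) -> str:
    # Hypothetical rule: normalize first, then drop the last code point.
    return unicodedata.normalize("NFC", text)[:-1]


samples = {
    "e + tilde": "e\u0303",
    "kaf + dagesh": "\u05DB\u05BC",
    "alef + tsere": "\u05D0\u05B5",
}
for label, text in samples.items():
    print(label, nfc_code_points(text), len(backspace_after_nfc(text)))
```

Under NFC the Latin pair collapses to the single precomposed code point U+1EBD, so one backspace removes the whole letter. The Hebrew pairs stay as two code points, so one backspace removes only the point. A precomposed kaf with dagesh does exist (U+FB3B), but it is excluded from canonical composition, so NFC never produces it.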

jessegrosjean · December 9, 2022, 1:12pm

I think these issues are resolved in Bike 1.7 Preview (93)