RTF Copy doesn't preserve accents


#1

Dear all,
In FT v. 2.1, when I “copy as Rich text” some French text containing accents, the accented characters aren’t preserved. For example:
"FoldingText est un éditeur de textes. "
is rendered, once copied as RTF and pasted, as:
“FoldingText est un éditeur de textes.”

Could you look into what’s wrong? I’m pretty sure that this problem was fixed in FT v. 2.0.
thanks!
best,
ph


#2

I’m seeing the same here. There seems to have been an accidental loss of UTF-8 in the conversion to RTF, and a switch to something that AppleScript reports as class ut16.

The plain copy puts UTF-8 into the clipboard,

but in the conversion it’s as if UTF-8 wasn’t specified for an invocation of textutil or a Cocoa equivalent.
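
The symptom described here is what you would expect when UTF-8 bytes are read back in a legacy single-byte encoding. A minimal Python sketch of that effect follows. MacRoman is assumed as the fallback encoding purely for illustration (the thread does not confirm which encoding the converter actually falls back to), and `misread_utf8` is a hypothetical helper name:

```python
# Illustration only: MacRoman is an ASSUMED fallback encoding,
# not something the thread confirms.
def misread_utf8(text: str, wrong_encoding: str = "mac_roman") -> str:
    """Encode text as UTF-8, then decode those bytes with the wrong encoding."""
    return text.encode("utf-8").decode(wrong_encoding)


original = "éditeur"
garbled = misread_utf8(original)
print(garbled)  # √©diteur

# Reversing the mistake (re-encode, then decode as UTF-8) recovers the text,
# which shows that only the declared encoding was wrong, not the bytes.
repaired = garbled.encode("mac_roman").decode("utf-8")
print(repaired == original)  # True
```

Plain ASCII passes through such a misreading unchanged, which would explain why unaccented English text looks fine.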

je suis éditeur של משפחה ->

the clipboard as record

->

"je suis éditeur de ???©???—?•?ª
"

For an interim fix, if you want to convert the clipboard from HTML to RTF yourself, you could use something like:

do shell script "echo " & quoted form of (the clipboard as text) & "  | textutil -format html -convert rtf -inputencoding UTF-8 -stdin -stdout | pbcopy -Prefer rtf"

(Note the -inputencoding UTF-8 switch)
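
For anyone who would rather script this without going through `echo` and shell quoting, the same pipeline can be sketched in Python. The textutil and pbcopy flags are copied from the one-liner above; the function name `html_to_rtf_clipboard` is hypothetical, and the sketch assumes macOS, where both tools are available:

```python
import subprocess
import sys

# Flags copied from the thread's one-liner; "-inputencoding UTF-8" is the key one.
TEXTUTIL_CMD = [
    "textutil", "-format", "html", "-convert", "rtf",
    "-inputencoding", "UTF-8", "-stdin", "-stdout",
]
PBCOPY_CMD = ["pbcopy", "-Prefer", "rtf"]


def html_to_rtf_clipboard(html: str) -> None:
    """Convert an HTML string to RTF and put the result on the clipboard."""
    if sys.platform != "darwin":
        raise RuntimeError("textutil and pbcopy are macOS tools")
    # Pass the HTML to textutil as UTF-8 bytes on stdin, matching -inputencoding.
    rtf = subprocess.run(
        TEXTUTIL_CMD, input=html.encode("utf-8"),
        stdout=subprocess.PIPE, check=True,
    ).stdout
    # Hand the RTF bytes to pbcopy unchanged.
    subprocess.run(PBCOPY_CMD, input=rtf, check=True)
```

Calling `html_to_rtf_clipboard("<p>je suis <b>éditeur</b></p>")` should leave RTF on the clipboard. Feeding the HTML through stdin as bytes avoids any surprises from how `echo` treats backslashes or leading dashes.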

More generally, of course, you can get RTF (with all UTF-8 characters) by combining FT with Brett Terpstra’s Marked.


#3

Dear Jesse,
thanks for your reply; I hope this issue will be fixed in an upcoming version of FT!
best
ph


#4

Hi,

Thanks for the report and details! :slight_smile: I’ve added an issue here to track and fix this.

Regards,
Mutahhir


#5

Hi @Dessus_Ph,

I looked into the issue, and it is due to the MMD processing library we’re using. I updated the library in the hope that this would fix the issue, but that didn’t work. The bug report for this is at https://github.com/mutahhir/foldingtext-issues/issues/13; feel free to leave a comment or any new findings there. For now, the status is that I don’t think this will be fixed in the upcoming versions. Sorry :disappointed:

Regards,
Mutahhir


#6

hi Hayat,
thanks for letting me know.
Since lots of people use diacritical marks, the RTF-based copy/paste function is useless for them. Shouldn’t you consider removing it from the menu until this is fixed?
best
ph


#7

Hi,

I think it still has its uses for people not using diacritics, and the normal copy preserves the diacritics, as does copy as HTML. I’m not sure I would remove it.

Regards,
Mutahhir


#8

Yes, useful for English people :wink: The normal copy doesn’t preserve any formatting (bold, titles), and the HTML copy inserts ugly tags. What a pity! I’m a bit disappointed by this. Hoping that you’ll fix this problem asap.


#9

Out of interest – is that Fletcher Penney’s MMD suite?

( It seems unusually anglo-centric for any library not to support UTF-8 these days : - )


#10

Yes, I think it’s the same. The problem isn’t that it’s Anglo-centric, just that the RTF export isn’t as complete as the HTML export. I’ll be looking for alternative libraries for this, though.

Regards,
Mutahhir


#11

Got it …

I guess one approach (for the app or for users) might then be to use the fuller HTML export, and pipe it through textutil to get the RTF, with something in the shell like:

echo ...    | textutil -format html -convert rtf -inputencoding UTF-8 -stdin -stdout | pbcopy -Prefer rtf

?


#12

AppleScript to convert HTML in clipboard to RTF:

do shell script "echo " & quoted form of (the clipboard as text) & "  | textutil -format html -convert rtf -inputencoding UTF-8 -stdin -stdout | pbcopy -Prefer rtf"

Keyboard Maestro macro to replace ⌘^C in FT2 (with an action which copies as RTF with non-English content preserved):


#13

PS - not sure if this is still the case, but I notice that at one stage MMD itself used an HTML -> textutil -> RTF route.

http://fletcherpenney.net/2011/05/mmd_3.0_and_rtf

If it is still doing that, it may simply need to use textutil's

-inputencoding UTF-8

switch.

Rob


#14

This bug also affects the General Punctuation Unicode block, so it affects every language written in a Roman script. Does this make it a higher priority?
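
To make the point about General Punctuation concrete, here is a small Python sketch that lists which characters in a string fall in that block (U+2000 to U+206F). The helper name `general_punctuation_chars` and the sample sentence are made up for illustration:

```python
# U+2000..U+206F is the Unicode "General Punctuation" block.
GENERAL_PUNCTUATION = range(0x2000, 0x2070)


def general_punctuation_chars(text: str) -> list:
    """Return the characters of text that lie in the General Punctuation block."""
    return [ch for ch in text if ord(ch) in GENERAL_PUNCTUATION]


# Hypothetical sample: ordinary English with typographic quotes and a dash.
sample = "I’m seeing the same here – “FoldingText”"
print(general_punctuation_chars(sample))  # ['’', '–', '“', '”']
```

Typographic quotes and dashes live in this block, so English text written with smart punctuation would be affected as well.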


#15

Hi @alanreedwrite,

The problem here isn’t one of priority; it’s that we’re using an external library which doesn’t support Unicode properly. I’ll keep a lookout for a fix, but I can’t say when, or if, I’ll be able to fix this.

Regards,
Mutahhir