I am writing a REPL for the Lambda Calculus and I incorporated linenoise-swift to provide line editing functionality. Unfortunately, the Greek letter lambda (λ) is encoded in UTF-8 as two bytes: CE BB. linenoise-swift handles input one byte at a time and tries to split the λ. The same problem occurs for any Unicode code point that takes more than one byte to store in UTF-8, i.e. everything except 7-bit US ASCII.
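To make the byte-splitting problem concrete, here is a minimal sketch of how a byte-at-a-time reader could regroup its input into whole code points by inspecting the lead byte. The function names `utf8SequenceLength(leadByte:)` and `splitIntoCodePoints(_:)` are hypothetical and are not part of linenoise-swift.

```swift
// Number of bytes in the UTF-8 sequence introduced by `lead`, or nil if
// `lead` is a continuation byte or not a valid lead byte.
// (Hypothetical helper, not part of linenoise-swift.)
func utf8SequenceLength(leadByte lead: UInt8) -> Int? {
    switch lead {
    case 0x00...0x7F: return 1   // 0xxxxxxx: 7-bit ASCII
    case 0xC2...0xDF: return 2   // 110xxxxx: e.g. λ (CE BB)
    case 0xE0...0xEF: return 3   // 1110xxxx
    case 0xF0...0xF4: return 4   // 11110xxx
    default:          return nil // 10xxxxxx continuation or invalid lead
    }
}

// Groups raw bytes into one String per code point instead of one per byte.
func splitIntoCodePoints(_ bytes: [UInt8]) -> [String] {
    var result: [String] = []
    var i = 0
    while i < bytes.count {
        // An unexpected byte is consumed on its own; String(decoding:as:)
        // repairs it to U+FFFD rather than crashing.
        let length = utf8SequenceLength(leadByte: bytes[i]) ?? 1
        let end = min(i + length, bytes.count)
        result.append(String(decoding: bytes[i..<end], as: UTF8.self))
        i = end
    }
    return result
}

let lambdaBytes = Array("λ".utf8)                    // the two bytes CE BB
let pieces = splitIntoCodePoints(Array("aλb".utf8))  // three pieces, not four
```

A reader built this way would hand the editor one complete λ rather than two stray bytes.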
How to Reproduce
Run the linenoiseDemo command line app. Type in a few characters and then a λ. The cursor will be repositioned at the start of the line and garbage appended to the end of the line. Here is an example:
Type 'exit' to quit
gdggfdsgdsλ
utput: gdggfdsgdsλ
?
If you are having trouble producing a λ from your keyboard, the problem still manifests if you copy-paste it from the text of this issue.
Further Information
I made an attempt to fix the issue myself. You can see my attempt here. The patch is a lot bigger than you might expect because adding support for multibyte UTF-8 exposes another more subtle bug.
Consider the following code in class EditLine:

    func insertCharacter(_ char: Character) {
        let origLoc = location
        let origEnd = buffer.endIndex
        buffer.insert(char, at: location)
        location = buffer.index(after: location)
        if origLoc == origEnd {
            location = buffer.endIndex
        }
    }
The Apple documentation for insert(_:at:) says:

    Calling this method invalidates any existing indices for use with this string.
This means that location, origLoc and origEnd are all invalid after the insert. If it's a single-byte character we get away with it. If not, location ends up as a garbage value and causes a process abort when it is next used. I ended up changing the type of buffer to [Character] and the type of location to Int as the easy way out.
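For illustration, here is a minimal sketch of the two ways around the invalidated indices. It is not the actual patch: `CharacterLine` and `insert(_:into:at:)` are hypothetical names. The first variant mirrors the [Character] and Int approach described above. The second keeps String but records the cursor as an offset before mutating, then derives a fresh index afterwards.

```swift
// Variant 1: a [Character] buffer with an Int cursor.
// Integer offsets stay meaningful across insertions.
final class CharacterLine {
    private(set) var buffer: [Character] = []
    private(set) var location = 0

    func insertCharacter(_ char: Character) {
        buffer.insert(char, at: location)
        location += 1
    }

    func moveLeft() {
        if location > 0 { location -= 1 }
    }

    var text: String { String(buffer) }
}

// Variant 2: keep String, but never reuse an index across a mutation.
func insert(_ char: Character, into buffer: inout String, at location: inout String.Index) {
    // Capture the cursor as a character offset while the index is still valid.
    let offset = buffer.distance(from: buffer.startIndex, to: location)
    buffer.insert(char, at: location)
    // Rebuild the index from the mutated string.
    location = buffer.index(buffer.startIndex, offsetBy: offset + 1)
}

let line = CharacterLine()
line.insertCharacter("a")
line.insertCharacter("b")
line.moveLeft()
line.insertCharacter("λ")   // inserted between "a" and "b"

var str = "ab"
var cursor = str.index(after: str.startIndex)
insert("λ", into: &str, at: &cursor)
```

The second variant costs an O(n) walk per insertion, which is unlikely to matter for a single edited line.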
NB I can give you a pull request or a patch, if it helps, but it hasn't been extensively tested and probably still breaks with composed characters e.g. emoji.
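A short sketch of why composed characters are likely to remain a problem with a [Character] buffer: if the base character and a combining mark arrive as separate elements, the array counts two cursor positions while String (and the terminal) treat them as one. The variable names here are illustrative only.

```swift
let base: Character = "e"
let mark: Character = "\u{301}"          // COMBINING ACUTE ACCENT

// Stored separately, as a code-point-at-a-time reader would deliver them.
let asArray: [Character] = [base, mark]

// String merges the pair into a single grapheme cluster.
let asString = String(asArray)

let arrayPositions = asArray.count               // cursor positions in the array
let stringPositions = asString.count             // user-perceived characters
let scalarCount = asString.unicodeScalars.count  // Unicode code points
let byteCount = asString.utf8.count              // UTF-8 bytes
```

An Int cursor into the array can therefore disagree with what is shown on screen unless incoming combining marks are merged into the preceding element.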
Unfortunately, as you found, LineNoise doesn't support UTF-8 (I don't believe that the original LineNoise library does either, but I could be wrong). I'd be happy to accept a pull request with the added functionality, but it would probably be best if UTF-8 support was enabled with a flag, as not all terminals support it.