Consider this Python 3 code:

# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

import pyte

if __name__ == "__main__":
    emoji_string = "☁️"
    # Hex dump of the UTF-8 bytes of the input string.
    print(emoji_string.encode("utf-8").hex())
    print("---")

    screen = pyte.Screen(80, 24)
    stream = pyte.Stream(screen)
    stream.feed(emoji_string)

    # Hex dump of the first three cells of the first screen line.
    for character in screen.display[0][:3]:
        print(character.encode("utf-8").hex())
emoji_string contains one grapheme cluster, which a terminal or editor displays as a single character: ☁️
The emoji is displayed as a single glyph, but it consists of two code points: U+2601 (e29881 in UTF-8) and U+FE0F (efb88f in UTF-8).
pyte seems to drop the second part of the cluster (or perhaps everything after the first code point),
so the output of the program looks like this:
e29881efb88f
---
e29881
20
20
We see that efb88f was dropped, and immediately after e29881, spaces follow (20).
Is it a bug in pyte or is it expected behaviour?
Maybe, I've missed some configuration mode?
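To make the two parts of the cluster explicit, here is a small sketch that uses only the standard library. The helper name `describe` is made up for this illustration; it lists the code point, UTF-8 bytes, Unicode name, and general category of each character in the string.

```python
import unicodedata


def describe(text):
    """Return (code point, UTF-8 hex, Unicode name, category) per character."""
    return [
        (
            "U+{:04X}".format(ord(ch)),
            ch.encode("utf-8").hex(),
            unicodedata.name(ch, "<unnamed>"),
            unicodedata.category(ch),
        )
        for ch in text
    ]


# Same string as in the report, written with escapes to avoid ambiguity.
emoji_string = "\u2601\ufe0f"
for row in describe(emoji_string):
    print(*row)
```

The second code point is a variation selector with general category Mn (nonspacing mark), so it takes no cell of its own on screen. Whether that is why it gets lost is only a guess here, not something the output above confirms.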
I have written a small workaround for this problem. It works fine for me, but I don't think it is a good solution for this bug.
This is how I do it:
import grapheme  # third-party package used to split text into grapheme clusters


def _fix_graphemes(text):
    """
    Extract long grapheme sequences that can't be handled
    by pyte correctly because of the bug pyte#131.

    Graphemes are omitted and replaced with placeholders,
    and returned as a list.

    Return:
        text_without_graphemes, graphemes
    """
    output = ""
    graphemes = []

    for gra in grapheme.graphemes(text):
        if len(gra) > 1:
            # Multi-code-point cluster: keep it aside, emit a placeholder.
            character = "!"
            graphemes.append(gra)
        else:
            character = gra
        output += character

    return output, graphemes
I extract the graphemes before rendering, like this:
text, graphemes = _fix_graphemes(text)
and then after rendering I put them back.
It works like it should, but I am not sure that this method is (1) general enough and (2) good for pyte, because it introduces a new dependency: grapheme.
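On the dependency concern: a rough approximation of the splitting step is possible with the standard library alone. The sketch below is not a full implementation of Unicode grapheme segmentation and is not a drop-in replacement for the grapheme package. `rough_clusters` is a hypothetical name, and it only glues nonspacing or enclosing marks (which includes variation selectors) and zero-width-joiner sequences onto the preceding character.

```python
import unicodedata

ZWJ = "\u200d"  # zero width joiner, used in multi-part emoji sequences


def rough_clusters(text):
    """Split text into approximate grapheme clusters (simplified rules)."""
    clusters = []
    for ch in text:
        attach = bool(clusters) and (
            unicodedata.category(ch) in ("Mn", "Me")  # marks, variation selectors
            or ch == ZWJ                              # joiner sticks to the left
            or clusters[-1].endswith(ZWJ)             # char right after a joiner
        )
        if attach:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


print([c.encode("utf-8").hex() for c in rough_clusters("a\u2601\ufe0fb")])
```

Cases such as regional indicator pairs, Hangul syllable sequences, and emoji skin tone modifiers are not covered, so this only helps when the input is known to be simple.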