Interactive REPL does not support unicode/utf8
Copy the following and paste it at the REPL:
z = 'ÇØĚ'
In paste mode it works, but at the normal REPL prompt the string contents disappear.
[shared] [webassembly] pyexec_event_repl_process_char unable to understand unicode
Checks
- I agree to follow the MicroPython Code of Conduct to ensure a safe and respectful space for everyone.
- I've searched for existing issues matching this bug, and didn't find any.
Port, board and/or hardware
webassembly, linux shell
MicroPython version
latest
Reproduction
Open a MicroPython REPL or visit this page (which is half patched, but not fully): https://webreflection.github.io/coincident/test/micropython.html
Then try typing the following into it:
print("µpython")
On a native shell you'll see python instead of µpython. On the Web REPL you see even less, because the count goes off due to replProcessChar (even the Asyncify one). And this is just the tip of the iceberg ... now try a combined emoji:
print("👩❤️👨")
... see emptiness or awkward results ...
Most emoji are simply broken out of the box unless you read them via input(...):
fam = input("> ")
# type 👩❤️👨
print(fam) # 👩❤️👨
fam # '\U0001f469\u200d\u2764\ufe0f\u200d\U0001f468'
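For reference, a minimal sketch of why this sequence is hard on anything that reads one byte at a time: the escapes echoed above are six code points, and each of them takes several bytes in UTF-8. This is plain Python and assumes nothing about the REPL internals.

```python
# The family emoji from above, written as the escapes the REPL echoed back.
fam = '\U0001f469\u200d\u2764\ufe0f\u200d\U0001f468'

# Number of UTF-8 bytes each code point needs.
sizes = [len(ch.encode('utf-8')) for ch in fam]

print(len(fam))    # 6 code points
print(sizes)       # [4, 3, 3, 3, 3, 4]
print(sum(sizes))  # 20 bytes in total

# Even the simpler case is multi-byte: 'µ' (U+00B5) is two bytes.
mu_bytes = 'µ'.encode('utf-8')
print(mu_bytes)    # b'\xc2\xb5'
```

So a single visible glyph can arrive at the prompt as 20 separate bytes, and losing or miscounting any of them corrupts the whole line.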
Coincidentally, if you explicitly enter "REPL paste mode" (\5), you can paste anything you like, then exit (\4), and all the pasted code is processed without issues, just like in the input(...) case.
Related PR that fixes at least the output side of affairs: https://github.com/pyscript/pyscript/pull/2018. It cannot fix what users type on the terminal, though, because replProcessChar misses chars in the process (and yes, it has no linebuffer, but it's the same with linebuffer; to me the issue is within the code behind replProcessChar).
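A hypothetical model of this failure mode, not the actual MicroPython code: if a per-character handler only accepts printable ASCII bytes, every byte of a multi-byte UTF-8 sequence is silently dropped, which would match the python output seen on the native shell. The names `lossy_process` and `utf8_process` are made up for this sketch. The second one still receives one byte at a time, but buffers through Python's standard incremental decoder until a full code point is available.

```python
import codecs

def lossy_process(data: bytes) -> str:
    """Hypothetical handler: keeps only printable ASCII, one byte at a time."""
    return ''.join(chr(b) for b in data if 32 <= b <= 126)

def utf8_process(data: bytes) -> str:
    """Feeds one byte at a time, but buffers incomplete UTF-8 sequences."""
    decoder = codecs.getincrementaldecoder('utf-8')()
    out = []
    for b in data:
        # decode() returns '' until the bytes seen so far form a full code point
        out.append(decoder.decode(bytes([b])))
    return ''.join(out)

typed = 'print("µpython")'.encode('utf-8')
print(lossy_process(typed))  # print("python")
print(utf8_process(typed))   # print("µpython")
```

Whether the real handler filters, miscounts, or mis-echoes the bytes is an assumption here. The sketch only shows that byte-at-a-time input is workable as long as continuation bytes are accumulated before the character is acted on.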
Expected behaviour
If I type the following in the REPL, I expect things to just work and output the correct result:
print("µpython")
# µpython
print("👩❤️👨")
# 👩❤️👨
Observed behaviour
If I type the following in the REPL, this happens instead:
print("µpython")
# python or thon
print("👩❤️👨")
# ... nothing, awkward state
Additional Information
Pinging @dpgeorge as I've already done on Discord, but this looks and feels like a broader issue with the REPL, because it's possible to reproduce it via the native Linux port.