mpremote cat hangs on Windows interactive console with Unicode content
Port, board and/or hardware
Host Windows (any hardware)
MicroPython version
- mpremote 1.27.0
Reproduction
- On Windows, have a MicroPython device with a file containing Unicode text: `/unicode_test/नेपाली_नाम.txt`
- Run `mpremote cat` interactively in PowerShell (NOT piped or captured): `mpremote cat :/unicode_test/नेपाली_नाम.txt`
- The command hangs indefinitely. Even with `PYTHONIOENCODING=utf-8` set, it still hangs.
- Press Ctrl-C to interrupt.
Observations
The same command works correctly when:
- Output is piped: `mpremote cat :file.txt > output.txt`
- Output is captured by subprocess: `subprocess.run([...], capture_output=True)`
- Running on Linux (even in WSL2 on the same machine)
The hang ONLY occurs when stdout is a real Windows console (isatty=True).
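
To confirm which of these cases applies on a given machine, a small diagnostic along the following lines can be run once interactively and once with output redirected. This is only a sketch and not part of mpremote; `describe_stdout` is a hypothetical helper name.

```python
import sys


def describe_stdout(stream=sys.stdout):
    """Collect the stdout properties that differ between console and piped runs."""
    return {
        "isatty": stream.isatty(),
        "encoding": getattr(stream, "encoding", None),
        "errors": getattr(stream, "errors", None),
        # Type of the underlying binary layer (None if the stream has none).
        "buffer_type": type(getattr(stream, "buffer", None)).__name__,
    }


if __name__ == "__main__":
    # Report on stderr so that redirecting stdout does not hide the result.
    for key, value in describe_stdout().items():
        print(f"{key}: {value}", file=sys.stderr)
```

Comparing the report from an interactive PowerShell session with the report from a run whose stdout is redirected to a file should show whether the two situations differ in more than `isatty`.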
Expected behaviour
$ mpremote cat :/unicode_test/नेपाली_नाम.txt
Nepali Name - नेपाली नाम
This file tests Nepali (Devanagari) script.
Script: Devanagari (Nepali variant)
Vowels: अ आ इ ई उ ऊ ऋ ए ऐ ओ औ
Consonants: क ख ग घ ङ च छ ज झ ञ ट ठ ड ढ ण त थ द ध न
...
For test files, see https://github.com/Josverl/unicode_mpy:
https://github.com/Josverl/unicode_mpy/tree/main/test_data/South_Asian_Indic
### Observed behaviour
The command hangs indefinitely when run interactively on Windows with Unicode content. Ctrl-C shows the following traceback:
Traceback (most recent call last):
File "...\mpremote\main.py", line 614, in main
handler_func(state, args)
File "...\mpremote\commands.py", line 421, in do_filesystem
state.transport.fs_printfile(path)
File "...\mpremote\transport.py", line 129, in fs_printfile
self.exec(cmd, data_consumer=stdout_write_bytes)
File "...\mpremote\transport_serial.py", line 309, in exec
ret, ret_err = self.exec_raw(command, data_consumer=data_consumer)
File "...\mpremote\transport_serial.py", line 296, in exec_raw
return self.follow(timeout, data_consumer)
File "...\mpremote\transport_serial.py", line 204, in follow
data = self.read_until(1, b"\x04", timeout=timeout, data_consumer=data_consumer)
File "...\mpremote\transport_serial.py", line 146, in read_until
data_consumer(new_data)
File "...\mpremote\transport.py", line 36, in stdout_write_bytes
sys.stdout.buffer.flush()
KeyboardInterrupt
### Additional Information
**Initial hypothesis (DISPROVEN):** The hang was suspected to be in `stdout_write_bytes()`:
```python
import sys

def stdout_write_bytes(b):
    sys.stdout.buffer.write(b)
    sys.stdout.buffer.flush()  # <-- Suspected to hang
```

**Test result:** A standalone Python script calling `sys.stdout.buffer.write()` + `flush()` with identical Unicode content does NOT hang.

**Actual cause:** The hang is NOT in CPython's stdout handling. The issue is elsewhere in mpremote, possibly:
- Different code path when `stdout.isatty()` is True vs False
- Threading/synchronization issues between reader/writer threads
- Windows console API interaction issues
- Serial/socket transport blocking behavior
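
For reference, the standalone test described above can be sketched roughly as follows. This is a reconstruction under assumptions, not the exact script that was run: the sample text is one line from the expected output, the `out` parameter is a hypothetical addition so the function can be tested without a console, and the chunk size of 8 bytes is arbitrary.

```python
import sys

# One line from the expected output; the real test file is longer.
SAMPLE = "Nepali Name - नेपाली नाम\n"


def stdout_write_bytes(b, out=None):
    """Write raw bytes to a binary stream (default: sys.stdout.buffer) and flush."""
    out = sys.stdout.buffer if out is None else out
    out.write(b)
    out.flush()


if __name__ == "__main__":
    payload = SAMPLE.encode("utf-8")
    # Feed the payload in small chunks, roughly as serial data might arrive.
    # With 8-byte chunks a boundary falls inside a multi-byte character.
    for i in range(0, len(payload), 8):
        stdout_write_bytes(payload[i:i + 8])
```

Writing in chunks rather than in one call may matter for the comparison: a single `write()` of the whole file never hands the console a partial UTF-8 sequence, whereas data read from a serial port can be split at any byte.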
Possibly Related Issues
- https://github.com/micropython/micropython/issues/15228 - Unable to print unicode characters when running repl with mpremote
mpremote fails with UnicodeEncodeError on Windows legacy consoles (cp1252)
Port, board and/or hardware
Windows (any hardware) - affects mpremote tool when run on Windows with legacy console (cp1252 or similar encoding).
MicroPython version
- mpremote 1.27.0
- Python 3.11.9 / 3.13.1 (host)
- Tested against MicroPython 1.27.0 RP2, ESP32
Reproduction
This issue only occurs with legacy Windows consoles that use cp1252 or similar encodings. Modern terminals (Windows Terminal, VS Code) use UTF-8 by default and are not affected.
To reproduce, you must use a legacy console:
- Open legacy cmd.exe (not Windows Terminal) or configure PowerShell to use cp1252:

      # Force legacy encoding in Python
      $env:PYTHONIOENCODING = "cp1252"

- Create test files with non-ASCII characters:

      mkdir unicode_test
      echo "test" > "unicode_test\Владимир_Петров.txt"

- Use mpremote to copy:

      mpremote cp -rv unicode_test :

Alternatively, the issue can be demonstrated with this Python snippet:
import sys
sys.stdout.reconfigure(encoding='cp1252')
print('Владимир_Петров.txt') # Raises UnicodeEncodeError
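
The same failure can be shown without touching `sys.stdout`, which makes it easier to see that the codec rather than the console is responsible. The sketch below is illustrative only; `try_encode` is a hypothetical helper, not an mpremote function.

```python
name = "Владимир_Петров.txt"


def try_encode(text, encoding, errors="strict"):
    """Return the encoded bytes, or the UnicodeEncodeError raised by the codec."""
    try:
        return text.encode(encoding, errors)
    except UnicodeEncodeError as exc:
        return exc


print(repr(try_encode(name, "cp1252")))             # a UnicodeEncodeError instance
print(repr(try_encode(name, "cp1252", "replace")))  # b'????????_??????.txt'
print(repr(try_encode(name, "utf-8")))              # encodes without error
```

With `errors="replace"` the Cyrillic letters are lost but no exception is raised, which is why an error handler alone only hides the problem, while switching to UTF-8 preserves the filename.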
Expected behaviour
mpremote cp should successfully copy files with Unicode characters in filenames on Windows. The tool should handle all Unicode characters in console output without raising encoding errors.
CPython's print() should use UTF-8 or properly handle the console encoding.
Observed behaviour
The command fails immediately with:
UnicodeEncodeError: 'charmap' codec can't encode characters in position X-Y: character maps to <undefined>
The file is not copied.
Additional Information
Workaround
Set the PYTHONIOENCODING environment variable before running mpremote:
$env:PYTHONIOENCODING = "utf-8"
mpremote connect COM3 cp -r . :
Or add to PowerShell $PROFILE for persistence:
$env:PYTHONIOENCODING = "utf-8"
$env:PYTHONUTF8 = "1"
Suggested Fix
Force UTF-8 encoding in mpremote on Windows:
import sys
import io
if sys.platform == 'win32':
    sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8', errors='replace')
    sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8', errors='replace')
This would be a minimal change in the mpremote initialization code.
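
A possible variant of the same idea, shown here only as a sketch, is to reconfigure the existing streams in place with `TextIOWrapper.reconfigure()` (available since Python 3.7) instead of wrapping `sys.stdout.buffer` a second time. `force_utf8` is a hypothetical helper name, and whether this is appropriate for mpremote is an assumption that would need testing on the affected consoles.

```python
import sys


def force_utf8(stream):
    """Switch a text stream to UTF-8 in place, if it supports reconfigure()."""
    # Streams that have been replaced by other objects (for example by an
    # IDE or a test harness) may not offer reconfigure(); leave those alone.
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is not None:
        reconfigure(encoding="utf-8", errors="replace")
    return stream


if sys.platform == "win32":
    force_utf8(sys.stdout)
    force_utf8(sys.stderr)
```

This keeps `sys.stdout` and `sys.__stdout__` as the same object and avoids having two text wrappers share one underlying buffer.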