Issue #18659 · PR #18670
Likely Duplicate · high · value 0.274
QUERY · ISSUE

mpremote cat hangs on Windows interactive console with Unicode content

open · by Josverl · opened 2026-01-07 · updated 2026-01-09
bug · tools · unicode

Port, board and/or hardware

Host Windows (any hardware)

MicroPython version

  • mpremote 1.27.0

Reproduction

  1. On Windows, have a MicroPython device with a file containing Unicode text:

    /unicode_test/नेपाली_नाम.txt
    
  2. Run mpremote cat interactively in PowerShell (NOT piped or captured):

    mpremote cat :/unicode_test/नेपाली_नाम.txt
    
  3. The command hangs indefinitely. Even with PYTHONIOENCODING=utf-8 set, it still hangs.

  4. Press Ctrl-C to interrupt.

Observations

The same command works correctly when:

  • Output is piped: mpremote cat :file.txt > output.txt
  • Output is captured by subprocess: subprocess.run([...], capture_output=True)
  • Running on Linux (even in WSL2 on the same machine)

The hang ONLY occurs when stdout is a real Windows console (isatty=True).
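
To confirm which case applies on a given machine, a small diagnostic along these lines can report how Python sees stdout. This is a sketch only; the helper name `describe_stream` is hypothetical and not part of mpremote.

```python
import sys


def describe_stream(stream):
    """Return a short summary of how Python sees an output stream."""
    return {
        "isatty": stream.isatty(),
        "encoding": getattr(stream, "encoding", None),
        "platform": sys.platform,
    }


if __name__ == "__main__":
    # Report on stderr so the result stays visible when stdout is redirected.
    print(describe_stream(sys.stdout), file=sys.stderr)
```

Running it once interactively and once with `> output.txt` should show `isatty` flipping between the two cases described above.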

Expected behaviour

$ mpremote cat :/unicode_test/नेपाली_नाम.txt

Nepali Name - नेपाली नाम

This file tests Nepali (Devanagari) script.

Script: Devanagari (Nepali variant)

Vowels: अ आ इ ई उ ऊ ऋ ए ऐ ओ औ
Consonants: क ख ग घ ङ च छ ज झ ञ ट ठ ड ढ ण त थ द ध न
...



For test files, see [Josverl/unicode_mpy](https://github.com/Josverl/unicode_mpy).

https://github.com/Josverl/unicode_mpy/tree/main/test_data/South_Asian_Indic


### Observed behaviour

The command hangs indefinitely when run interactively on Windows with Unicode content. Ctrl-C shows the following traceback:

    Traceback (most recent call last):
      File "...\mpremote\main.py", line 614, in main
        handler_func(state, args)
      File "...\mpremote\commands.py", line 421, in do_filesystem
        state.transport.fs_printfile(path)
      File "...\mpremote\transport.py", line 129, in fs_printfile
        self.exec(cmd, data_consumer=stdout_write_bytes)
      File "...\mpremote\transport_serial.py", line 309, in exec
        ret, ret_err = self.exec_raw(command, data_consumer=data_consumer)
      File "...\mpremote\transport_serial.py", line 296, in exec_raw
        return self.follow(timeout, data_consumer)
      File "...\mpremote\transport_serial.py", line 204, in follow
        data = self.read_until(1, b"\x04", timeout=timeout, data_consumer=data_consumer)
      File "...\mpremote\transport_serial.py", line 146, in read_until
        data_consumer(new_data)
      File "...\mpremote\transport.py", line 36, in stdout_write_bytes
        sys.stdout.buffer.flush()
    KeyboardInterrupt


### Additional Information


**Initial hypothesis (DISPROVEN):** The hang was suspected to be in `stdout_write_bytes()`:
```python
def stdout_write_bytes(b):
    sys.stdout.buffer.write(b)
    sys.stdout.buffer.flush()  # <-- Suspected to hang
```

Test result: A standalone Python script calling sys.stdout.buffer.write() + flush() with identical Unicode content does NOT hang.
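
A standalone check of this kind might look like the sketch below. It is not the reporter's actual script: the helper name `write_bytes` and the sample text (borrowed from the expected output) are assumptions for illustration.

```python
import sys

# Sample text borrowed from the expected output; any Unicode text would do.
SAMPLE = "Nepali Name - नेपाली नाम\n".encode("utf-8")


def write_bytes(stream, data):
    """Write raw bytes to a binary stream, then flush it."""
    stream.write(data)
    stream.flush()


if __name__ == "__main__":
    # Run interactively in PowerShell (not piped) to mirror the failing case.
    write_bytes(sys.stdout.buffer, SAMPLE)
```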

Actual cause: The hang is NOT in CPython's stdout handling. The issue is elsewhere in mpremote, possibly:

  • Different code path when stdout.isatty() is True vs False
  • Threading/synchronization issues between reader/writer threads
  • Windows console API interaction issues
  • Serial/socket transport blocking behavior

Possibly Related Issues

  • https://github.com/micropython/micropython/issues/15228 - Unable to print unicode characters when running repl with mpremote

CANDIDATE · PULL REQUEST

Fix multiple unicode issues in mpremote.

closed · by Josverl · opened 2026-01-11 · updated 2026-02-20
tools · unicode

Summary

This pull request addresses multiple issues related to the handling of special characters and Unicode in the mpremote tool. It fixes the escaping of quotes in filenames, ensures proper parsing of filenames containing equals signs, and resolves Unicode encoding errors on Windows consoles. These improvements enhance the usability of mpremote when dealing with diverse file names and character sets.

  • Unicode-safe Windows console output: Detects modern consoles, sets UTF-8 code pages, wraps stdout/stderr when needed, and uses raw UTF-8 writes when possible; legacy consoles now handle split UTF-8 sequences safely.
  • Robust stdout handling: Buffers partial UTF-8 sequences and strips CTRL-D without losing characters, improving REPL/output correctness for multibyte text.
  • Safer path quoting: Filesystem commands now use repr-based quoting so filenames with quotes, backslashes, or Unicode work correctly (including equals-sign parsing).
  • CLI parsing fix: Command expansion no longer misinterprets arguments containing =, preventing unexpected-argument errors.
  • Transport write safety: Converts strings to UTF-8 bytes before writing, avoiding encoding errors when writing Unicode content to a host folder mounted with mpremote mount <folder>.
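
The idea of buffering partial UTF-8 sequences can be illustrated with Python's incremental decoder. This is a minimal sketch of the general technique, not the code in this PR; the class name `Utf8ChunkWriter` and its methods are hypothetical.

```python
import codecs


class Utf8ChunkWriter:
    """Decode byte chunks incrementally so split UTF-8 sequences are held back."""

    def __init__(self, write_text):
        self._write_text = write_text
        # The incremental decoder keeps trailing partial bytes in its own state.
        self._decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")

    def feed(self, chunk):
        # Strip the CTRL-D (0x04) marker; it never occurs inside a multibyte
        # UTF-8 sequence, so removing it cannot corrupt a character.
        text = self._decoder.decode(chunk.replace(b"\x04", b""))
        if text:
            self._write_text(text)

    def close(self):
        # Flush the remainder; an incomplete tail is replaced with U+FFFD.
        text = self._decoder.decode(b"", final=True)
        if text:
            self._write_text(text)
```

Because only complete characters are handed to `write_text`, a serial read that ends in the middle of a Devanagari character does not produce a decoding error or a stray replacement character.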

Fixes: #13055
Fixes: #15228
Fixes: #18658
Fixes: #18657
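
As an illustration of the repr-based quoting mentioned in the summary, the sketch below builds a device-side command string in two ways. Both function names are hypothetical, and the generated snippet is an assumption about what such a command could look like, not mpremote's actual implementation.

```python
def build_cat_command(path):
    """Build Python source that prints a file, quoting the path with repr()."""
    # {path!r} yields a valid Python string literal: quotes and backslashes
    # are escaped, and characters such as '=' need no special treatment.
    return f"with open({path!r}) as f:\n    print(f.read(), end='')"


def naive_cat_command(path):
    """Fragile variant: breaks as soon as the path contains a single quote."""
    return f"with open('{path}') as f:\n    print(f.read(), end='')"


if __name__ == "__main__":
    print(build_cat_command("it's =a\\b नाम.txt"))
```

The repr-based version always compiles, whereas the naive version produces a syntax error for a filename such as `it's.txt`.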

Testing

  • New test support: Adds ramdisk helper, enhanced test runner (device selection, skip handling), and Unicode/special-character coverage in the mpremote test suite.
  • New unicode tests have been added to the mpremote test suite, covering the scenarios mentioned in the issues.
    The current CI setup does not provide for Windows testing, so manual verification was essential for confirming the fixes.

Manual testing was conducted on Windows (pwsh, cmd.exe, MinGW) and on Linux in WSL2.

Manual testing was needed because the bash-based testing framework does not support Windows; it ensured that the fixes work as intended across different environments.
In addition, the mpremote REPL and console cannot be covered by the bash test suite.

I have started work on a pytest configuration for mpremote that adds the ability to test the REPL and console, parts that the bash suite is unable to test at all.
That will be submitted in a separate PR once it is stable across multiple platforms.

Trade-offs and Alternatives

The changes made do not introduce significant trade-offs. The improvements in Unicode handling may slightly increase the complexity of the code, but they are necessary for robust functionality. Alternative approaches were considered, such as using different quoting mechanisms, but the current solutions provide the best balance of compatibility and simplicity.
