QUERY · ISSUE

mpremote mount fails reading binary file

open · by peterhinch · opened 2024-05-22 · updated 2024-05-22

bug, tools

Checks

I agree to follow the MicroPython Code of Conduct to ensure a safe and respectful space for everyone.
I've searched for existing issues matching this bug, and didn't find any.

Port, board and/or hardware

RP2, Pyboard 1.1

MicroPython version

MicroPython v1.22.0 on 2023-12-27; Raspberry Pi Pico with RP2040

Reproduction

Create a file rats15.py on the PC:

import os
fn = "delete_me"
with open(fn, "wb") as f:
    f.write(b"hello\n\xde\xad\xbe\xef")
with open(fn, "rb") as f:
    print(f.readline())
    print(f.read(4))
os.unlink(fn)

Run mpremote

$ mpremote mount .

At the REPL issue

import rats15

Expected behaviour

>>> import rats15
b'hello\n'
b'\xde\xad\xbe\xef'
>>>

This is the output when the script is run under CPython, under the Unix build, or locally on a MicroPython target.

Observed behaviour

When run as described above via mpremote mount .:

MicroPython v1.22.2 on 2024-02-22; Raspberry Pi Pico with RP2040
Type "help()" for more information.
>>> import rats15
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "rats15.py", line 7, in <module>
  File "<stdin>", line 147, in readline
TypeError: unsupported types for __add__: 'str', 'bytes'
>>>

Additional Information

mpremote is v1.22.0. The fault occurs in readline(). It first became evident when accessing a PGM graphics file, which contains four lines of \n-terminated ASCII text followed by binary data.
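
The traceback is consistent with a line accumulator that starts as a str and then appends the bytes returned by a binary read. The sketch below reproduces that failure mode under CPython. `readline_str_acc` and `readline_bytes_acc` are hypothetical helpers written for illustration; they are not the code in mpremote's filesystem hook, which may differ.

```python
import io

def readline_str_acc(f):
    # Hypothetical accumulator that assumes text mode: it starts from a str.
    line = ""
    while True:
        c = f.read(1)
        line += c  # raises TypeError when c is bytes
        if c in ("\n", ""):
            return line

def readline_bytes_acc(f):
    # Hypothetical variant: the accumulator takes its type from the first
    # read, so it works for both text and binary streams.
    c = f.read(1)
    newline = "\n" if isinstance(c, str) else b"\n"
    line = c
    while c and c != newline:
        c = f.read(1)
        line += c
    return line

data = b"hello\n\xde\xad\xbe\xef"
try:
    readline_str_acc(io.BytesIO(data))
except TypeError as exc:
    print("str accumulator failed:", type(exc).__name__)

f = io.BytesIO(data)
print(readline_bytes_acc(f))
print(f.read(4))
```

The second helper is only one possible shape for a fix; choosing the accumulator type from the file's open mode would be another.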

CANDIDATE · ISSUE

mpremote mount: file write of non-byte arrays fails due to incorrect length calculation

open · by arachsys · opened 2025-07-11 · updated 2025-09-05

bug, tools

Port, board and/or hardware

rp2 port on an RP2350

MicroPython version

MicroPython v1.25.0-162.gf8fe70505 on 2025-06-03; Raspberry Pi Pico2 with RP2350

(Reproduced on a variety of other versions from released v1.25.0 to current git HEAD.)

Reproduction

Writing an array whose element type is larger than one byte does not work properly (junk is leaked to the console) when the file is in a host directory mounted using mpremote mount:

# mpremote mount /tmp repl
Local directory /tmp is mounted at /remote
Connected to MicroPython at /dev/ttyACM0
Use Ctrl-] or Ctrl-x to exit this shell
>>> from array import array
>>> h = array('h', 0x4041 for _ in range(50))
>>> with open('testfile', 'wb') as f:
...     f.write(h)
...
A@A@A@A@A@A@A@A@A@A@A@A@A@A@A@A@A@A@A@A@A@A@A@A@A@50
>>>

Expected behaviour

This should work fine. It works with a write to the 'real' filesystem on the device, and when writing a bytearray() or array('b') to /remote mounted over mpremote:

>>> b = array('b', 0x40 for _ in range(50))
>>> with open('testfile', 'wb') as f:
...     f.write(b)
...
50
>>> h = array('h', 0x4041 for _ in range(50))
>>> with open('/not-remote', 'wb') as f:
...     f.write(h)
...
100
>>>

Even f.write(bytes(h)) over mpremote mount works okay:

>>> with open('testfile2', 'wb') as f:
...     f.write(bytes(h))
...
100
>>>

Note in these examples that f.write(h) correctly returns the number of bytes (100), not the number of items (50).

Observed behaviour

Junk is leaked to the interactive session, as shown above, and f.write(h) returns 50 (the number of items), not 100 (the number of bytes that should have been written).

Additional Information

I think this happens because the array (implementing the buffer protocol) is passed directly to RemoteCommand.wr_bytes() by RemoteFile.write() in tools/mpremote/mpremote/transport_serial.py. This looks like:

    def wr_bytes(self, b):
        self.wr_s32(len(b))
        self.fout.write(b)

But for array.array, which correctly implements the buffer protocol, len(b) is the length in items, not the length in bytes, so the length prefix will be wrong when the item size is larger than 1.

In our original example, the length-prefix is (int32_t) 50 then 50 16-bit words are written, for a total of 100 bytes instead of 50 bytes. The extra 50 bytes leak to the console.
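
The framing mismatch described above can be reproduced on the host under CPython. This sketch assumes a 4-byte little-endian length prefix and uses a hypothetical `wr_bytes_sim` / `rd_bytes_sim` pair to stand in for the two ends of the link; the real wire format used by mpremote may differ.

```python
import io
import struct
from array import array

def wr_bytes_sim(out, b):
    # Mirrors the quoted wr_bytes: the prefix is len(b), the payload is the
    # raw buffer. The "<i" prefix encoding is an assumption.
    out.write(struct.pack("<i", len(b)))
    out.write(b)

def rd_bytes_sim(inp):
    # Hypothetical reader: trusts the prefix and consumes exactly that many bytes.
    (n,) = struct.unpack("<i", inp.read(4))
    return inp.read(n)

h = array("h", (0x4041 for _ in range(50)))
link = io.BytesIO()
wr_bytes_sim(link, h)
link.seek(0)
payload = rd_bytes_sim(link)
leftover = link.read()  # bytes the reader never asked for
print(len(payload), len(leftover))  # prints: 50 50
```

In this model the 50 leftover bytes stay in the stream after the command completes, which matches the junk seen at the REPL.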

Is there an easy way to get the correct byte length of a buffer from MicroPython without allocating or copying? I know len(bytes(b)) would work, but that copies a potentially large buffer. I'm not sure what's guaranteed to be available and what features might be optional and compiled out on some ports. Presumably there must be a defined, always-available way to get either the item size or the total size from Python, or a way to cast to a memoryview() in bytes rather than a memoryview() in items?
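
For comparison, under CPython a memoryview gives the byte length without copying the data. Whether `nbytes`, `itemsize`, or `cast` exist on a given MicroPython build is not established here and would need checking, so treat this as a host-side sketch only. `byte_len` is a hypothetical helper.

```python
from array import array

def byte_len(buf):
    # memoryview wraps the buffer without copying; nbytes counts bytes, not items.
    return memoryview(buf).nbytes

h = array("h", (0x4041 for _ in range(50)))
print(len(h), byte_len(h))  # prints: 50 100

# A byte-wise view of the same memory, again without copying.
as_bytes = memoryview(h).cast("B")
print(len(as_bytes))  # prints: 100
```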

Do we also have a similar issue with readinto()'s handling of the capacity of the destination buffer, where len(buf) is sent as part of CMD_READ? We might be reading to a word array, for example.

    def readinto(self, buf):
        c = self.cmd
        c.begin(CMD_READ)
        c.wr_s8(self.fd)
        c.wr_s32(len(buf))
        n = c.rd_bytes(buf)
        c.end()
        return n
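
To illustrate the readinto() concern: if the host returns at most the requested number of bytes, then requesting len(buf) for a word array would fill only half of it. The sketch below models that under CPython. `host_read_sim` and `readinto_sim` are hypothetical stand-ins, and the assumption about how the host treats the requested length is not confirmed by the code quoted above.

```python
from array import array

def host_read_sim(data, requested):
    # Hypothetical host side: returns at most `requested` bytes.
    return data[:requested]

def readinto_sim(buf, data, requested):
    # Copy the host's reply into the start of buf, byte by byte.
    chunk = host_read_sim(data, requested)
    view = memoryview(buf).cast("B")
    view[:len(chunk)] = chunk
    return len(chunk)

data = bytes(range(100))

by_items = array("h", [0] * 50)
n_items = readinto_sim(by_items, data, len(by_items))  # requests 50

by_bytes = array("h", [0] * 50)
n_bytes = readinto_sim(by_bytes, data, memoryview(by_bytes).nbytes)  # requests 100

print(n_items, n_bytes)  # prints: 50 100
```

Here the item-count request leaves the last 25 elements of the array untouched, even though 100 bytes of data were available.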

Code of Conduct

Yes, I agree
