#3469 · Issue #2734
Related · high · value 2.100
QUERY · ISSUE

68c28174 breaks utf8 decode with ignore

open · by ryannathans · opened 2017-12-05 · updated 2026-01-16
bug · py-core · unicode

b'\xff\xfe'.decode('utf8', 'ignore')

used to work, but now raises a UnicodeError exception because of commit https://github.com/micropython/micropython/commit/68c28174d0e0ec3f6b1461aea3a0b6a1b84610bb

Tested on STM32

On CPython this still works as expected:

>>> b'\xff\xfe'.decode('utf8', 'ignore')
''
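
For reference, here is a short sketch of how CPython's standard error handlers treat the same two bytes. It assumes CPython 3 and makes no claim about what any particular MicroPython build does; the variable names are illustrative only.

```python
data = b'\xff\xfe'  # neither byte can start a valid UTF-8 sequence

# 'ignore' silently drops undecodable bytes.
ignored = data.decode('utf8', 'ignore')

# 'replace' substitutes U+FFFD for each undecodable byte.
replaced = data.decode('utf8', 'replace')

# 'strict' (the default) raises UnicodeDecodeError,
# which is a subclass of UnicodeError.
try:
    data.decode('utf8')
    strict_raised = False
except UnicodeError:
    strict_raised = True

print(repr(ignored), repr(replaced), strict_raised)
```

Catching UnicodeError rather than UnicodeDecodeError keeps the check usable on interpreters that only expose the base class.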

CANDIDATE · ISSUE

UnicodeDecodeError not raised when expected in bytes.decode()

closed · by hiway · opened 2016-12-28 · updated 2017-09-06
bug

What is expected:

Python 3.5.2 / 3.6.0:

>>> bytes.decode(b"\xa1\x80", 'utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa1 in position 0: invalid start byte

What is happening:

MicroPython v1.8.6-260-gafc5063-dirty on 2016-12-28; darwin version:

>>> bytes.decode(b"\xa1\x80", 'utf-8')
'\u0840'

I am porting umsgpack (a small pure-Python msgpack library; the 'u' is unrelated to upy) to MicroPython, and this particular test fails because MicroPython behaves differently from CPython. The same discrepancy may surface elsewhere as a surprise when programs continue where they should fail.
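
CPython's "invalid start byte" message follows from the UTF-8 bit layout: 0xa1 has the form 10xxxxxx, which marks a continuation byte, so it cannot begin a sequence. A minimal sketch of that check, together with a probe for whether strict decoding rejects the input, might look like the following. The helper names `is_continuation_byte` and `decode_strict_raises` are hypothetical and are not part of umsgpack or MicroPython.

```python
def is_continuation_byte(b):
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx.
    return (b & 0xC0) == 0x80


def decode_strict_raises(data):
    # True if strict UTF-8 decoding rejects the input.
    try:
        data.decode('utf-8')
    except UnicodeError:
        return True
    return False


sample = b"\xa1\x80"
print(is_continuation_byte(sample[0]))
print(decode_strict_raises(sample))
```

A port could run a probe like `decode_strict_raises` in its test suite to detect whether the interpreter validates UTF-8 input, instead of relying on the decode result alone.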
