QUERY · ISSUE
68c28174 breaks utf8 decode with ignore
bug, py-core, unicode
b'\xff\xfe'.decode('utf8', 'ignore')
used to work, but it now raises a UnicodeError exception since commit https://github.com/micropython/micropython/commit/68c28174d0e0ec3f6b1461aea3a0b6a1b84610bb
Tested on STM32
On CPython this still works as expected:
>>> b'\xff\xfe'.decode('utf8', 'ignore')
''
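A minimal regression check for the behaviour described above, written against CPython semantics. The variable names are illustrative only and do not come from the MicroPython test suite; the sketch assumes the port is expected to match CPython here.

```python
# Input from the report: two bytes that are never valid in UTF-8.
data = b'\xff\xfe'

# errors='ignore' should drop undecodable bytes instead of raising.
ignored = data.decode('utf8', 'ignore')

# Valid bytes around an invalid one should survive the decode.
mixed = b'a\xffb'.decode('utf8', 'ignore')

# Strict mode (the default) should still raise on the same input.
try:
    data.decode('utf8')
    strict_raised = False
except UnicodeError:  # UnicodeDecodeError is a subclass on CPython
    strict_raised = True

print(repr(ignored), repr(mixed), strict_raised)
```

On CPython this prints `'' 'ab' True`; after the commit above, the report indicates that the first `decode` call raises on MicroPython instead.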
CANDIDATE · ISSUE
UnicodeDecodeError not raised when expected in bytes.decode()
bug
What is expected:
Python 3.5.2 / 3.6.0:
>>> bytes.decode(b"\xa1\x80", 'utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa1 in position 0: invalid start byte
What is happening:
MicroPython v1.8.6-260-gafc5063-dirty on 2016-12-28; darwin version:
>>> bytes.decode(b"\xa1\x80", 'utf-8')
'\u0840'
I am porting umsgpack (a small pure-Python msgpack library; the 'u' is not related to upy) to MicroPython, and this particular test fails because MicroPython behaves differently from CPython. The same difference may cause surprises elsewhere, since programs will continue running on invalid input when they should fail.
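The failing check can be reduced to something like the sketch below. `decode_raises` is a hypothetical helper written for this report, not part of umsgpack or MicroPython, and the expected results are those of CPython.

```python
def decode_raises(data, encoding='utf-8'):
    """Return True if strict decoding of `data` raises a Unicode error."""
    try:
        data.decode(encoding)
    except UnicodeError:  # covers UnicodeDecodeError on CPython
        return True
    return False


# 0xa1 is a continuation byte, so it cannot start a UTF-8 sequence.
invalid_rejected = decode_raises(b"\xa1\x80")

# Plain ASCII must decode without error.
valid_rejected = decode_raises(b"abc")

print(invalid_rejected, valid_rejected)
```

On CPython this prints `True False`. On the MicroPython build above, the first call would presumably return `False`, because the decode yields '\u0840' instead of raising.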