#7886 · Issue #657
Related · high · value 2.871
QUERY · ISSUE

Discussion of Python 3.7 support

open · by mattytrentini · opened 2021-10-08 · updated 2025-10-03
py-core

This issue is intended to track the status of Python 3.7 core features as implemented by MicroPython. Not all of these changes should necessarily be implemented in MicroPython, but documenting their status is important.

Python 3.7.0 (final) was released on 27 June 2018. The features for 3.7 are defined in PEP 537, and an explanation of the changes can be found in What's New in Python 3.7.

  • PEP 538 - Coercing the legacy C locale to a UTF-8 based locale
  • PEP 539 - A New C-API for Thread-Local Storage in CPython
  • PEP 540 - UTF-8 mode
  • PEP 552 - Deterministic pyc
  • PEP 553 - Built-in breakpoint()
  • PEP 557 - Data Classes
  • PEP 560 - Core support for typing module and generic types
  • PEP 562 - Module __getattr__ and __dir__; see partial implementation (__getattr__): 454cca6016afc96deb6d1ad5d1b3553ab9ad18dd
  • PEP 563 - Postponed Evaluation of Annotations
  • PEP 564 - Time functions with nanosecond resolution; see partial implementation: d4b61b00172ccc231307e3ef33f66f28cb6b051f
  • PEP 565 - Show DeprecationWarning in __main__
  • PEP 567 - Context Variables
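One way to keep a status list like this honest is to probe the running interpreter instead of relying on memory. The following is a minimal sketch, assuming the script is run on the interpreter under test; the function name probe_pep_features and the labels are illustrative, and only a handful of the PEPs above can be detected this way. It was written against CPython, so some probes may need adjusting on a MicroPython port.

```python
import builtins
import time


def probe_pep_features():
    """Map a short feature label to True/False for the running interpreter."""
    results = {
        # PEP 553: breakpoint() should be reachable as a builtin.
        "PEP 553 breakpoint()": hasattr(builtins, "breakpoint"),
        # PEP 564: nanosecond variants of the time functions.
        "PEP 564 time.time_ns()": hasattr(time, "time_ns"),
    }
    # PEP 557 and PEP 567 are exposed as importable modules.
    for label, module_name in (
        ("PEP 557 dataclasses", "dataclasses"),
        ("PEP 567 contextvars", "contextvars"),
    ):
        try:
            __import__(module_name)
            results[label] = True
        except ImportError:
            results[label] = False
    # PEP 562: a module-level __getattr__ is consulted for missing names.
    probe_module = type(builtins)("probe_module")
    probe_module.__getattr__ = lambda name: "fallback:" + name
    try:
        found = probe_module.missing == "fallback:missing"
    except AttributeError:
        found = False
    results["PEP 562 module __getattr__"] = found
    return results


for label, supported in sorted(probe_pep_features().items()):
    print(label, "yes" if supported else "no")
```

A probe like this reports only presence, not completeness, so a "yes" for a partially implemented PEP still needs a manual note.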

Other language changes

  • async and await are now reserved keywords
  • dict objects must preserve insertion-order
  • More than 255 arguments can now be passed to a function, and a function can now have more than 255 parameters
  • bytes.fromhex() and bytearray.fromhex() now ignore all ASCII whitespace, not only spaces
  • str, bytes, and bytearray gained support for the new isascii() method, which can be used to test if a string or bytes contain only the ASCII characters
  • ImportError now displays module name and module __file__ path when from ... import ... fails
  • Circular imports involving absolute imports with binding a submodule to a name are now supported
  • object.__format__(x, '') is now equivalent to str(x) rather than format(str(self), '')
  • In order to better support dynamic creation of stack traces, types.TracebackType can now be instantiated from Python code, and the tb_next attribute on tracebacks is now writable
  • When using the -m switch, sys.path[0] is now eagerly expanded to the full starting directory path, rather than being left as the empty directory (which allows imports from the current working directory at the time when an import occurs)
  • The new -X importtime option or the PYTHONPROFILEIMPORTTIME environment variable can be used to show the timing of each module import
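Several of the language changes above are observable from plain Python, so they can be spot-checked with a few lines. This is a rough sketch, not a conformance test: the helper names are made up for illustration, and the dict check only shows that one small insertion sequence was preserved, which is weaker than the guarantee itself.

```python
def check_dict_order():
    """Spot-check that a dict iterates in insertion order."""
    keys = ["zeta", "alpha", "mid", "beta"]
    d = {}
    for key in keys:
        d[key] = None
    return list(d) == keys


def check_fromhex_whitespace():
    """bytes.fromhex() should skip tabs and newlines, not only spaces."""
    try:
        return bytes.fromhex("de\tad\nbe ef") == b"\xde\xad\xbe\xef"
    except (ValueError, AttributeError):
        return False


def check_isascii():
    """str, bytes and bytearray should all expose isascii()."""
    return all(hasattr(t, "isascii") for t in (str, bytes, bytearray))


print("dict order preserved:", check_dict_order())
print("fromhex ignores ASCII whitespace:", check_fromhex_whitespace())
print("isascii() available:", check_isascii())
```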

Changes to MicroPython built-in modules

  • asyncio (many, may need a separate ticket)
  • gc - New features: gc.freeze(), gc.unfreeze(), gc.get_freeze_count()
  • math - math.remainder() added to implement IEEE 754-style remainder
  • re - A number of tidy up features including better support for splitting on empty strings and copy support for compiled expressions and match objects
  • sys - sys.breakpointhook() added; sys.get_coroutine_origin_tracking_depth() and sys.set_coroutine_origin_tracking_depth() added.
  • time - Mostly updates to support nanosecond resolution in PEP 564, see above.

(Changes to non-built-in modules will need to be documented elsewhere.)
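The module-level additions listed above can be checked in the same spirit. The sketch below lists which of the named attributes are absent and shows the semantics of the IEEE 754-style remainder with a deliberately naive fallback. The names MODULE_FEATURES, missing_module_features and naive_remainder are hypothetical, and the fallback is an assumption-laden illustration: it relies on round() using round-half-to-even and can lose precision for large or extreme inputs, unlike a real math.remainder() implementation.

```python
import gc
import math
import sys

# (module, attribute) pairs taken from the list above.
MODULE_FEATURES = (
    (gc, "freeze"),
    (gc, "unfreeze"),
    (gc, "get_freeze_count"),
    (math, "remainder"),
    (sys, "breakpointhook"),
    (sys, "get_coroutine_origin_tracking_depth"),
    (sys, "set_coroutine_origin_tracking_depth"),
)


def missing_module_features():
    """Return dotted names of the listed attributes that are absent."""
    return [
        module.__name__ + "." + name
        for module, name in MODULE_FEATURES
        if not hasattr(module, name)
    ]


def naive_remainder(x, y):
    """x - n*y, where n is x/y rounded to nearest (ties go to even)."""
    return x - round(x / y) * y


print("missing:", missing_module_features())
print(naive_remainder(7, 2))   # 7/2 = 3.5 rounds to 4, so 7 - 8
print(naive_remainder(10, 4))  # 10/4 = 2.5 rounds to 2, so 10 - 8
```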

CANDIDATE · ISSUE

Unicode support and PEP 393

closed · by Rosuav · opened 2014-06-03 · updated 2014-06-28

Opening this as a discussion issue, so it can all be kept track of.

Python 3.3's str type supports the full Unicode range, with semantics defined by PEP 393 http://www.python.org/dev/peps/pep-0393/ (although some of the details there are CPython-specific). Currently, micropython pretends that strings are bytes, C-style, and will output them to a console without modification - so, for instance, a Unix console will interpret "\xC3\xBD" as U+00FD LATIN SMALL LETTER Y WITH ACUTE. (I have no idea what embedded devices do, but presumably it's ASCII-compatible or this issue would have come up long ago.)
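The byte/character distinction described above can be seen directly in CPython. This small sketch, run on CPython 3 rather than MicroPython, shows that the two bytes C3 BD are a single code point once decoded as UTF-8, which is what a UTF-8 console does when it receives them unmodified.

```python
import unicodedata

raw = b"\xC3\xBD"           # two bytes, as they would be written to the console
text = raw.decode("utf-8")  # what a UTF-8 terminal makes of them

print(len(raw), len(text))      # byte count versus character count
print(hex(ord(text)))           # the single code point
print(unicodedata.name(text))   # its Unicode name
```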

Ideally and ultimately, micropython should support all of Unicode. The advantages to the language are huge (if you need me to elaborate, I can do so); in brief, Python 3 forces everyone to be correct. Correctness in Unicode is on par with correctness in memory management; it has some costs, but we willingly pay those costs as the price of guaranteeing that we won't leak memory or have buffer overruns.

But if that can't be done, or can't be done immediately, I'd like to see some means of catching problems before they happen; for instance, documenting that all encodings used MUST be ASCII-compatible, and raising an exception if a str has any character >127 in it.
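The stopgap suggested above, rejecting any str that contains a character above 127, could look roughly like the following in Python terms. This is only an illustration of the check itself: the function name require_ascii and the choice of ValueError are assumptions, and the real guard would live in the C implementation of the str type.

```python
def require_ascii(text):
    """Return text unchanged, or raise if any character is above U+007F."""
    for index, ch in enumerate(text):
        if ord(ch) > 127:
            raise ValueError(
                "non-ASCII character U+%04X at index %d" % (ord(ch), index)
            )
    return text


print(require_ascii("spam"))
try:
    require_ascii("sp\xe4m")
except ValueError as exc:
    print("rejected:", exc)
```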

I've had a bit of a look at objstr.c, and it seems that the character/byte equivalence is, unfortunately, endemic. Not only is the representation all byte-based, but helpers like is_ws() are defined by ASCII. (In CPython, "spam\xA0spam\u3000spam".split() == ["spam","spam","spam"], because U+00A0 and U+3000 are flagged whitespace.) This could be changed, but it will likely mean significant changes, and will almost certainly result in code size increases; although one of the beauties of PEP 393 strings is that, for ASCII-only strings (and even Latin-1 strings), the string in memory is no larger than it would be if stored as bytes (modulo the two-bit flag in the header, stating what the size is).
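Both points in the paragraph above can be illustrated from CPython. The sketch below first reproduces the whitespace example, then models the PEP 393 idea of choosing one storage width per string from its highest code point. The helper pep393_char_width is a hypothetical model for illustration only; it is not how objstr.c works, and it ignores header details such as the separate ASCII flag.

```python
def pep393_char_width(text):
    """Bytes per character a PEP 393 style representation would pick."""
    highest = max((ord(ch) for ch in text), default=0)
    if highest < 0x100:
        return 1  # ASCII and Latin-1 fit in one byte per character
    if highest < 0x10000:
        return 2  # rest of the Basic Multilingual Plane
    return 4      # anything beyond U+FFFF


# U+00A0 and U+3000 both count as whitespace for str.split() in CPython.
print("spam\xA0spam\u3000spam".split())

for sample in ("spam", "sp\xe4m", "spam\u3000spam", "spam\U0001F600"):
    print(ascii(sample), pep393_char_width(sample))
```

The design point is that the width is chosen once per string, so indexing stays O(1) while ASCII-only text costs no more than a byte string.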

The most important question is, how much do other parts of the code dip into strings, and therefore how much impact will a change of internal representation have? I tried adding an arbitrary member to the structure, and it seemed to compile okay, and there don't seem to be any other files referencing the structure directly.

How do you feel about me doing up some approximation of PEP 393 into objstr.c? It'd be a fairly significant change.
