#7905 · Issue #657
QUERY · ISSUE

Discussion of Python 3.9 support

open · by mattytrentini · opened 2021-10-15 · updated 2025-10-03
enhancement · py-core

This issue is intended to track the status of Python 3.9 core features as implemented by MicroPython.

Python 3.9.0 (final) was released on 5 October 2020. The features for 3.9 are defined in PEP 596, and a detailed description of the changes can be found in What's New in Python 3.9.

  • PEP 584, union operators added to dict;
  • PEP 585, type hinting generics in standard collections;
  • PEP 614, relaxed grammar restrictions on decorators;
  • PEP 616, string methods to remove prefixes and suffixes;
  • PEP 593, flexible function and variable annotations;
  • PEP 573, fast access to module state from methods of C extension types;
  • PEP 617, CPython now uses a new parser based on PEG;
  • PEP 615, the IANA Time Zone Database is now present in the standard library in the zoneinfo module;
  • PEP 602, CPython adopts an annual release cycle. Instead of an annual cycle, aiming for a two-month release cycle.
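
For reference, a minimal sketch of what three of the user-visible PEPs above (584, 616 and 585) look like on CPython 3.9 or later. The variable and function names are made up for illustration, and nothing here says whether MicroPython accepts this code; that is what this issue tracks.

```python
# Requires CPython 3.9 or later. All names below are illustrative only.

# PEP 584: union operators on dict.
defaults = {"baud": 9600, "parity": "N"}
overrides = {"baud": 115200}
merged = defaults | overrides    # right-hand operand wins on key clashes
defaults |= {"stop_bits": 1}     # in-place variant

# PEP 616: strip a prefix or suffix only if it is actually present.
name = "test_uart.py".removeprefix("test_").removesuffix(".py")

# PEP 585: built-in collections can be subscripted in annotations.
def total(values: list[int]) -> int:
    return sum(values)
```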

Other language changes

  • __import__() now raises ImportError instead of ValueError; Done, see 53519e322a5a0bb395676cdaa132f5e82de22909
  • Python now gets the absolute path of the script filename specified on the command line (ex: python3 script.py): the __file__ attribute of the __main__ module became an absolute path, rather than a relative path.
  • By default, for best performance, the errors argument is only checked at the first encoding/decoding error and the encoding argument is sometimes ignored for empty strings.
  • "".replace("", s, n) now returns s instead of an empty string for all non-zero n. It is now consistent with "".replace("", s).
  • Any valid expression can now be used as a decorator. Previously, the grammar was much more restrictive.
  • Parallel running of aclose() / asend() / athrow() is now prohibited, and ag_running now reflects the actual running status of the async generator.
  • Unexpected errors in calling the __iter__ method are no longer masked by TypeError in the in operator and functions contains(), indexOf() and countOf() of the operator module.
  • Unparenthesized lambda expressions can no longer be the expression part in an if clause in comprehensions and generator expressions.
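
Two of the changes above are easy to check from a script. The sketch below assumes CPython 3.9 or later; `log_calls`, `decorators` and `double` are hypothetical names used only for the example.

```python
# "".replace("", s, n) now returns s for any non-zero n,
# matching "".replace("", s).
replaced_limited = "".replace("", "x", 1)
replaced_unlimited = "".replace("", "x")

# Any valid expression can be used as a decorator.
def log_calls(func):
    # Count how many times the wrapped function is called.
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

decorators = [log_calls]

@decorators[0]    # a subscript expression; rejected by the pre-3.9 grammar
def double(x):
    return 2 * x

result = double(21)
```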

Changes to MicroPython built-in modules

  • asyncio
    • Due to significant security concerns, the reuse_address parameter of asyncio.loop.create_datagram_endpoint() is no longer supported
    • Added a new coroutine shutdown_default_executor() that schedules a shutdown for the default executor that waits on the ThreadPoolExecutor to finish closing. Also, asyncio.run() has been updated to use the new coroutine.
    • Added asyncio.PidfdChildWatcher, a Linux-specific child watcher implementation that polls process file descriptors
    • Added a new coroutine asyncio.to_thread()
    • When cancelling the task due to a timeout, asyncio.wait_for() will now wait until the cancellation is complete also in the case when timeout is <= 0, like it does with positive timeouts.
    • asyncio now raises TypeError when calling incompatible methods with an ssl.SSLSocket socket
  • gc
    • Garbage collection does not block on resurrected objects
    • Added a new function gc.is_finalized() to check if an object has been finalized by the garbage collector
  • math
    • Expanded the math.gcd() function to handle multiple arguments. Formerly, it only supported two arguments.
    • Added math.lcm(): return the least common multiple of specified arguments
    • Added math.nextafter(): return the next floating-point value after x towards y
    • Added math.ulp(): return the value of the least significant bit of a float
  • os
    • Exposed the Linux-specific os.pidfd_open() and os.P_PIDFD
    • The os.unsetenv() function is now also available on Windows
    • The os.putenv() and os.unsetenv() functions are now always available
    • Added os.waitstatus_to_exitcode() function: convert a wait status to an exit code
  • random
    • Added a new random.Random.randbytes() method: generate random bytes
  • sys
    • Added a new sys.platlibdir attribute: name of the platform-specific library directory
    • Previously, sys.stderr was block-buffered when non-interactive. Now stderr defaults to always being line-buffered.
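
As a quick way to test the math and random additions listed above, here is a sketch that runs on CPython 3.9 or later. It only shows the CPython behaviour; whether a given MicroPython port provides these functions is not assumed.

```python
import math
import random

# math.gcd() accepts more than two arguments.
common_divisor = math.gcd(12, 18, 24)

# math.lcm(): least common multiple of the given arguments.
common_multiple = math.lcm(4, 6, 10)

# math.nextafter(x, y): the next float after x in the direction of y.
next_up = math.nextafter(1.0, 2.0)

# math.ulp(x): value of the least significant bit of the float x.
spacing = math.ulp(1.0)

# random.Random.randbytes(n): n random bytes.
rng = random.Random(0)
payload = rng.randbytes(4)
```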

(Changes to non-built-in modules will need to be documented elsewhere.)

CANDIDATE · ISSUE

Unicode support and PEP 393

closed · by Rosuav · opened 2014-06-03 · updated 2014-06-28

Opening this as a discussion issue, so it can all be kept track of.

Python 3.3's str type supports the full Unicode range, with semantics defined by PEP 393 (http://www.python.org/dev/peps/pep-0393/), although some of the details there are CPython-specific. Currently, MicroPython pretends that strings are bytes, C-style, and will output them to a console without modification - so, for instance, a Unix console will interpret "\xC3\xBD" as U+00FD LATIN SMALL LETTER Y WITH ACUTE. (I have no idea what embedded devices do, but presumably it's ASCII-compatible or this issue would have come up long ago.)
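
To make the "\xC3\xBD" example concrete, this is roughly how CPython 3 separates the two views. It is a sketch of the CPython behaviour for comparison, not of anything in MicroPython's source.

```python
# The two bytes a UTF-8 console receives for U+00FD.
raw = b"\xC3\xBD"

# Decoded as UTF-8 they are a single code point.
text = raw.decode("utf-8")

# In CPython the str literal "\xC3\xBD" is two code points (U+00C3, U+00BD),
# which only map back onto the same two bytes under Latin-1.
literal = "\xC3\xBD"
literal_as_latin1 = literal.encode("latin-1")

byte_count = len(raw)
char_count = len(text)
```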

Ideally and ultimately, MicroPython should support all of Unicode. The advantages to the language are huge (if you need me to elaborate, I can do so); in brief, Python 3 forces everyone to be correct. Correctness in Unicode is on par with correctness in memory management; it has some costs, but we willingly pay those costs as the price of guaranteeing that we won't leak memory or have buffer overruns.

But if that can't be done, or can't be done immediately, I'd like to see some means of catching problems before they happen; for instance, documenting that all encodings used MUST be ASCII-compatible, and raising an exception if a str has any character >127 in it.
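
The stop-gap check could look something like the sketch below, written in Python for clarity. `require_ascii` is a hypothetical helper, not an existing MicroPython function, and the real check would presumably live in the C string code.

```python
def require_ascii(s):
    """Return s unchanged, or raise if any character is above 127.

    Hypothetical helper illustrating the proposed interim check.
    """
    for index, ch in enumerate(s):
        if ord(ch) > 127:
            raise ValueError(
                "non-ASCII character U+%04X at index %d" % (ord(ch), index)
            )
    return s
```

With this, `require_ascii("spam")` passes the string through, while a string containing U+00FD raises ValueError.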

I've had a bit of a look at objstr.c, and it seems that the character/byte equivalence is, unfortunately, endemic. Not only is the representation all byte-based, but helpers like is_ws() are defined by ASCII. (In CPython, "spam\xA0spam\u3000spam".split() == ["spam","spam","spam"], because U+00A0 and U+3000 are flagged whitespace.) This could be changed, but it will likely mean significant changes, and will almost certainly result in code size increases; although one of the beauties of PEP 393 strings is that, for ASCII-only strings (and even Latin-1 strings), the string in memory is no larger than it would be if stored as bytes (modulo the two-bit flag in the header, stating what the size is).
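
Both points can be observed from CPython itself. The sketch below assumes CPython 3.3 or later; the exact numbers from `sys.getsizeof()` are implementation details, so only their ordering is meaningful here.

```python
import sys

# U+00A0 and U+3000 count as whitespace for str.split().
parts = "spam\xA0spam\u3000spam".split()

# PEP 393: storage per character depends on the widest code point present.
ascii_size = sys.getsizeof("a" * 100)             # 1 byte per character
bmp_size = sys.getsizeof("\u3000" * 100)          # 2 bytes per character
astral_size = sys.getsizeof("\U0001F600" * 100)   # 4 bytes per character
```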

The most important question is, how much do other parts of the code dip into strings, and therefore how much impact will a change of internal representation have? I tried adding an arbitrary member to the structure, and it seemed to compile okay, and there don't seem to be any other files referencing the structure directly.

How do you feel about me doing up some approximation of PEP 393 into objstr.c? It'd be a fairly significant change.
