Change to `mp_handle_pending' includes possible MICROPY_EVENT_POLL_HOOK and downstream code breakage
Port, board and/or hardware
all
MicroPython version
1.27+
Reproduction
For long-running operations in user C modules (e.g. a busy wait on an e-ink display update) I've been using the pattern:

```c
extern void mp_handle_pending(bool);

mp_handle_pending(true);
```
I tried to build against 1.27 today, and ran into this build error:

```
modules/c/ssd1680/ssd1680.cpp.o: in function `pimoroni::SSD1680::busy_wait()':
modules/c/ssd1680/ssd1680.cpp:54:(.text._ZN8pimoroni7SSD16809busy_waitEv+0x10): undefined reference to `mp_handle_pending'
```
This looks like it was introduced by https://github.com/micropython/micropython/commit/c57aebf790c40125b663231ec4307d2a3f3cf193
That commit seems to assume that `mp_handle_pending` is only ever called inline via `runtime.h`, i.e. from code that does:

```c
#include "runtime.h"

mp_handle_pending(true);
```
This assumption doesn't hold in all cases: for example, the same "undefined reference" error should occur if `MICROPY_EVENT_POLL_HOOK` is invoked from Zephyr code:
https://github.com/micropython/micropython/blob/26c16969ab954db4d8d79bed154e3d45c12c087f/ports/zephyr/mpconfigport.h#L171-L172
But I guess `runtime.h` is implicitly available there, which would make the `extern` line redundant; in any case this pattern should probably be updated wherever it occurs.
Either way, the inline wrapper added to preserve the old `mp_handle_pending` behaviour does not cover all possible cases, and broke my code between 1.26 and 1.27.
I expect some breakage, though, so I'd propose that the implicit "example" uses in the MicroPython codebase (there are a few things like `MICROPY_EVENT_POLL_HOOK`) are updated, and that this issue should, with any luck, steer any future encounters (I'll probably forget and run into this again) toward the correct usage.
Expected behaviour
Observed behaviour
Additional Information
No, I've provided everything above.
Code of Conduct
Yes, I agree
py/scheduler: Allow C scheduler callbacks to re-queue themselves.
Summary
Currently, if a scheduler callback queues a new C scheduled callback, that callback will run in the same schedule pass. This means that if a scheduler callback re-queues itself, it can "infinite loop" inside the scheduler.
This change means that any call to mp_handle_pending() only processes the callbacks which were queued when mp_handle_pending() started running. Therefore a callback which re-queues itself will not run again until the next time mp_handle_pending() is called.
This is an alternative to #17248 and should remove the need for #17264. Fixes #17246.
This does create potential for some additional callback latency (i.e. if an interrupt fires while a different scheduler callback is running, the interrupt callback is now deferred until the next scheduler run). However the worst-case scheduler latency remains similar (interrupt fires at end of one scheduler pass, has to wait until the next one). I think most MicroPython systems only sparsely call C scheduler callbacks, so this shouldn't be noticeable in most cases.
There is also potential for a misbehaving callback that re-schedules itself continuously to degrade performance of MicroPython (as the callback runs continuously, with some allowance for Python code to run), whereas before it would have locked up MicroPython entirely, which is more obviously broken.
This work was funded through GitHub Sponsors.
Testing
- I adapted the test case @andrewleech added in #17248 for the new behaviour and resubmitted it here, so the coverage test now includes C scheduler callbacks (testing the new logic).
- Manually ran the rp2 unit tests on an RP2_PICO board. As this is a tickless port, I felt it had the most potential for an interrupt timing bug to surface due to this change. All passed.
Trade-offs and Alternatives
- It is possible to add manual anti-recursion checks in callbacks (the approach in #17264), but that can get fiddly quickly.