Priority inversion and rp2 port
The entry/exit to a critical section on the rp2 port is currently:
// TODO need to look and see if these could/should be spinlock/mutex
#define MICROPY_BEGIN_ATOMIC_SECTION() save_and_disable_interrupts()
#define MICROPY_END_ATOMIC_SECTION(state) restore_interrupts(state)
The save_and_disable_interrupts function is an inline function coming from the Pico SDK, and this uses cpsid i to disable all interrupts. Unfortunately, this causes priority inversion on the RP2040 because of the XIP flash code, delaying high priority ISRs by many microseconds, When MICROPY_BEGIN_ATOMIC_SECTION appears in code that isn't marked time critical (i.e. will be in external flash and not RAM) there's a high chance of XIP cache misses and the CPU stalls while the code is fetched - and while there can be an ISR pending, causing all ISRs to have a latency as if they are in XIP flash even if placed in RAM.
The fix for this requires a revisit of the interrupt priority model and use of the NVIC priority system. There should be at least two priorities of interrupt, with high priority interrupts for hardware device drivers that are not delayed by MICROPY_BEGIN_ATOMIC_SECTION and MICROPY_END_ATOMIC_SECTION pairs (that essentially run above MicroPython). Then there should be a lower interrupt priority level for other things (timers, Python functions for ISRs, etc.). The atomic section should then use PRIMASK and not cpsid.
An alternative approach is to put code in an atomic section into RAM as time critical. Maybe the RAM penalty isn't too bad (I haven't looked).
This issue is unique to the RP2040 because of the external serial flash and cache architecture, but this seems to be more common so perhaps it will become a core MicroPython issue in the future.
rp2: fix atomic section
This is an attempt to fix #12980 and #13288.
There are two fixes to the MICROPY_BEGIN_ATOMIC_SECTION/ MICROPY_END_ATOMIC_SECTION on the rp2 port:
Use a recursive mutex to prevent deadlocks on the same thread, in the case an IRQ races against the thread level (on the same core) when acquiring the atomic section (mutex may be taken but IRQs not yet disabled, then the IRQ waits forever trying to acquire the mutux).Unlock the mutex if it was locked, not ifcore1_entryis still non-null, to fix the case where the second core finishes (and setscore1_entrytoNULL) while the first core is in the middle of an atomic operation.
Edit: the two changes are now:
- split out multicore lockout handling so it's only done when doing a flash erase/write
- change the ATOMIC_SECTION macros so they use new mutex enter/exit functions that also disable/restore interrupts (atomically).