tests: thread/thread_gc1.py intermittent failure on CI
The thread_gc1.py test fails intermittently on CI, producing False instead
of the expected True. This is the single biggest contributor to CI flakiness
on master, accounting for ~62 of 103 failed runs over 14 months (575 runs
sampled). In a 20-run window with available logs, it failed in
settrace_stackless (6 times) and coverage (3 times). The test was already
excluded from the macos, qemu_mips, qemu_arm, and qemu_riscv64 jobs prior to
PR #18861.
The test spawns threads that perform garbage collection and checks a
boolean result. The failure pattern suggests a race condition in the GC
or in thread interaction rather than a test logic issue, meaning the test
appears to be correctly detecting a real bug.
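For readers unfamiliar with this kind of test, the general shape can be sketched as below. This is not the source of thread_gc1.py: it is a minimal illustration written for CPython's `threading` module (the MicroPython test would use `_thread`), and the names `worker`, `run_check`, `N_THREADS`, and `N_ITERATIONS` are hypothetical.

```python
import gc
import threading

N_THREADS = 4        # hypothetical values, chosen only for illustration
N_ITERATIONS = 50


def worker(n, results, lock):
    # Long-lived buffer that must survive every collection below.
    data = bytearray(range(256))
    scratch = 8 * [0]
    total = 0
    for i in range(n):
        # Create short-lived garbage, then force a collection while
        # other threads are doing the same thing.
        total += sum(scratch)
        scratch = 8 * [i]
        gc.collect()
    # The buffer should be unchanged if the GC handled this thread correctly.
    ok = all(i == b for i, b in enumerate(data))
    with lock:
        results.append(ok)


def run_check():
    results = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(N_ITERATIONS, results, lock))
        for _ in range(N_THREADS)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # A single boolean summarises the run, as in the test described above.
    return len(results) == N_THREADS and all(results)


print(run_check())
```

A test of this shape reports False only if a thread's live data was corrupted or a thread failed to finish, which is why an intermittent False points at the runtime rather than at the test logic.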
Estimated per-execution failure rate: ~1.3% across the 8 CI jobs that
run it.
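The ~1.3% figure is consistent with the numbers quoted above if one assumes that each of the 575 sampled runs executed the test once in each of the 8 jobs. That assumption is a reconstruction, not something stated in the analysis; the sketch below only reproduces the arithmetic.

```python
# Figures taken from the issue text.
failures_attributed = 62   # failed runs attributed to thread_gc1.py
runs_sampled = 575         # CI runs sampled over 14 months
jobs_running_test = 8      # CI jobs that run the test

# Assumption: one execution of the test per job per sampled run.
executions = runs_sampled * jobs_running_test
failure_rate = failures_attributed / executions

print(f"{executions} executions, failure rate {failure_rate:.1%}")
```

Under that assumption there are 4600 executions, and 62 / 4600 rounds to 1.3%.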
PR #18861 now ignores this failure in CI so it doesn't block other work,
but the underlying issue should be fixed.
See analysis: https://gist.github.com/andrewleech/5686ed5242e0948d8679c432579e002e
tests: thread stress tests intermittent failures under QEMU (stress_aes, stress_recurse, stress_schedule)
Three thread stress tests fail intermittently under QEMU emulation on CI:
thread/stress_aes.py: times out on QEMU ARM/MIPS/RISCV64. Execution
time approaches or exceeds the configured timeout (70-180s depending on
arch). Observed 7 times in a 20-run log window. Attributed to ~28 of 103
failed runs over 14 months. On RISCV64 it's excluded entirely because it
takes ~180s against a 200s timeout.
thread/stress_recurse.py: was already excluded from qemu_mips,
qemu_arm, and qemu_riscv64 with "is flaky" comments. There are no direct
log observations in the sample window because the exclusion predates the
analysis period.
thread/stress_schedule.py: crashed once (expected PASS, got
CRASH) on qemu_riscv64 in the 20-run window. The frequency is low, but a
crash rather than a timeout suggests a real issue.
These may be QEMU-specific timing/emulation issues rather than bugs in
MicroPython's threading, but the crash in stress_schedule suggests that at
least some of them are genuine bugs.
PR #18861 now ignores these failures in CI. stress_aes.py is additionally
excluded on RISCV64 to avoid burning ~180s of CI time on each timeout.
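To put the RISCV64 numbers in perspective, the margin between a ~180s run and a 200s timeout can be worked out as follows. This is only arithmetic on the two figures quoted in this issue; the helper names `timeout_headroom` and `slowdown_to_timeout` are hypothetical.

```python
def timeout_headroom(run_seconds, timeout_seconds):
    """Fraction of the timeout budget left unused by a typical run."""
    return (timeout_seconds - run_seconds) / timeout_seconds


def slowdown_to_timeout(run_seconds, timeout_seconds):
    """Relative slowdown at which a typical run would hit the timeout."""
    return timeout_seconds / run_seconds - 1


run_s, timeout_s = 180, 200  # approximate figures from the issue text

headroom = timeout_headroom(run_s, timeout_s)
slowdown = slowdown_to_timeout(run_s, timeout_s)
print(f"headroom {headroom:.0%}, times out if {slowdown:.0%} slower")
```

With roughly 10% headroom, a run only needs to be about 11% slower than usual to time out, which is plausible variation on a shared CI runner under emulation.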
See analysis: https://gist.github.com/andrewleech/5686ed5242e0948d8679c432579e002e