tests: thread/thread_gc1.py intermittent failure on CI
The thread_gc1.py test fails intermittently on CI, producing False instead
of the expected True. It is the single biggest contributor to CI flakiness
on master, accounting for ~62 of 103 failed runs over 14 months (575 runs sampled).
In a 20-run window with available logs, the failure was observed in the
settrace_stackless job (6 times) and the coverage job (3 times). The test
was already excluded from the macos, qemu_mips, qemu_arm, and qemu_riscv64
jobs prior to PR #18861.
The test spawns threads that perform garbage collection and checks a
boolean result. The failure pattern suggests a race condition in the GC
or in its interaction with threads, not a test logic issue; that is, the
test is correctly detecting a real bug.
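
For readers unfamiliar with this kind of test, the general pattern can be sketched as below. This is a hypothetical illustration written for CPython's `threading` module, not the contents of thread_gc1.py. The function name, the thread and iteration counts, and the "buffer still intact" check are all assumptions made for the example.

```python
import gc
import threading


def run_gc_threads(n_threads=4, n_iterations=10):
    """Hypothetical sketch: each thread forces GC while holding live data."""
    lock = threading.Lock()
    results = []

    def worker():
        # Live data that must survive every collection (assumed check).
        data = bytearray(range(256))
        for _ in range(n_iterations):
            # Touch the buffer, then force a collection from this thread.
            for i in range(len(data)):
                data[i] = data[i]
            gc.collect()
        intact = list(data) == list(range(256))
        with lock:
            results.append(intact)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # The single boolean a test of this shape would report.
    return len(results) == n_threads and all(results)


print(run_gc_threads())
```

In a test of this shape, a False result means at least one thread saw its live data change across a collection, which would point at the collector or its locking and not at the test logic.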
Estimated per-execution failure rate: ~1.3% across the 8 CI jobs that
run it.
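
One way to arrive at a figure close to this estimate is sketched below, using only the numbers quoted above. It assumes that every sampled run executed the test once in each of the 8 jobs and that each attributed failed run corresponds to one failing execution. The linked analysis may compute the rate differently.

```python
# Figures quoted in the issue text.
failed_runs_attributed = 62  # ~62 of 103 failed runs
runs_sampled = 575
jobs_running_test = 8

# Assumption: one execution of the test per job per sampled run,
# and one failing execution per attributed failed run.
executions = runs_sampled * jobs_running_test
per_execution_rate = failed_runs_attributed / executions

# Chance that at least one of the 8 jobs fails in a given run,
# assuming failures are independent across jobs.
per_run_rate = 1 - (1 - per_execution_rate) ** jobs_running_test

print(f"per-execution failure rate: {per_execution_rate:.2%}")
print(f"per-run failure rate:       {per_run_rate:.2%}")
```

Under these assumptions, a per-execution rate of just over 1% compounds to roughly one failing run in ten, which is consistent with this test dominating the failed-run count.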
PR #18861 now ignores this failure in CI so it doesn't block other work,
but the underlying issue should be fixed.
See analysis: https://gist.github.com/andrewleech/5686ed5242e0948d8679c432579e002e
tests: cmdline/repl_lock.py and repl_cont.py intermittent failures
Two REPL tests fail intermittently on CI:
cmdline/repl_lock.py: fails on QEMU ARM and RISCV64. The expected
output shows >>> micropython.heap_lock() but the actual output drops
the >>> prompt prefix. Observed 3 times in 20 runs with logs. This is
a REPL prompt timing issue under QEMU emulation.
cmdline/repl_cont.py: fails on macOS. The expected and actual output
differ in how a quote is escaped in REPL continuation prompts
("'" vs '\''). Observed once in 20 runs.
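
The two quote forms in the repl_cont.py mismatch are different spellings of the same one-character string, so the difference is textual only. The sketch below illustrates this, assuming both forms are read as Python string literals. It does not say which form is the expected one, since the text above does not state that.

```python
import ast

# The two spellings as they would appear in the test output.
double_quoted = '"\'"'    # the text "'"
escaped_single = "'\\''"  # the text '\''

# Evaluate each spelling as a Python string literal.
a = ast.literal_eval(double_quoted)
b = ast.literal_eval(escaped_single)

print(double_quoted == escaped_single)  # compares the raw text
print(a == b)                           # compares the values they denote
```

A byte-for-byte comparison of expected and actual output fails on this difference even though both forms denote the same value.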
The macOS job has historically been the second most failure-prone job
(4.3% failure rate, 25 failures over 14 months) with all failures
attributed to REPL-related issues. The August 2025 spike (11 macOS
failures) correlates with the GitHub Actions macOS 15 runner migration.
PR #18861 now ignores these failures in CI.
See analysis: https://gist.github.com/andrewleech/5686ed5242e0948d8679c432579e002e