#18870 · Issue #18866
Related · high · value 5.115
QUERY · ISSUE

tests: thread/stress_heap.py intermittent failure on macOS

open · by andrewleech · opened 2026-02-25 · updated 2026-03-19
tests

The stress_heap.py test was excluded from the macOS CI job with an "is
flaky" comment before the analysis period. Because it was already
excluded, no direct log observations are available, but it is listed in
FLAKY_TESTS restricted to the darwin platform.

PR #18861 now handles this via ignore-on-failure instead of exclusion,
so the test runs and its output is visible but doesn't block CI.
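
A rough sketch of the difference between exclusion and ignore-on-failure, for illustration only. This is not the code from PR #18861: only the `FLAKY_TESTS` name comes from the text above, while the dictionary layout, the `classify` helper, and the `"ignored"` label are hypothetical.

```python
import sys

# Hypothetical registry: test path -> platforms where a failure is tolerated.
# None means "tolerated on every platform". The layout is an assumption.
FLAKY_TESTS = {
    "thread/stress_heap.py": ("darwin",),
    "thread/thread_gc1.py": None,
}


def classify(test, passed, platform=sys.platform):
    """Return 'pass', 'fail', or 'ignored' for a single test result."""
    if passed:
        return "pass"
    if test in FLAKY_TESTS:
        platforms = FLAKY_TESTS[test]
        if platforms is None or platform in platforms:
            # The test still ran and its output is still reported,
            # but the failure does not fail the CI job.
            return "ignored"
    return "fail"


print(classify("thread/stress_heap.py", passed=False, platform="darwin"))
print(classify("thread/stress_heap.py", passed=False, platform="linux"))
```

Under this sketch an excluded test would never reach `classify` at all, whereas an ignored one is executed and logged, which is what makes future failures observable.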

See analysis: https://gist.github.com/andrewleech/5686ed5242e0948d8679c432579e002e

CANDIDATE · ISSUE

tests: thread/thread_gc1.py intermittent failure on CI

open · by andrewleech · opened 2026-02-25 · updated 2026-03-19
tests

The thread_gc1.py test fails intermittently on CI, producing False
instead of True. This is the single biggest contributor to CI flakiness
on master, accounting for ~62 of the 103 failed runs among 575 runs
sampled over 14 months.

Observed in the settrace_stackless (6 times) and coverage (3 times) jobs
in a 20-run window with available logs. The test was already excluded
from the macos, qemu_mips, qemu_arm, and qemu_riscv64 jobs prior to
PR #18861.

The test spawns threads that perform garbage collection and checks a
boolean result. The failure pattern suggests a race condition in the GC
or in thread interaction rather than a test logic issue; the test is
correctly detecting a real bug.

Estimated per-execution failure rate: ~1.3% across the 8 CI jobs that
run it.
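
The ~1.3% figure can be reproduced from the numbers quoted above. The sketch below assumes that every sampled run executes the test once in each of the 8 jobs and that each attributed failed run corresponds to one failing execution; neither assumption is stated in the issue, and the variable names are illustrative.

```python
# Figures quoted in the issue text.
failed_runs_attributed = 62  # failed runs attributed to thread_gc1.py
runs_sampled = 575
jobs_running_test = 8

# Assumption: one execution of the test per job per run.
executions = runs_sampled * jobs_running_test
per_execution_rate = failed_runs_attributed / executions

# Chance that at least one of the 8 executions in a run fails,
# assuming the executions fail independently of each other.
per_run_rate = 1 - (1 - per_execution_rate) ** jobs_running_test

print(f"per-execution failure rate: {per_execution_rate:.2%}")
print(f"per-run failure rate:       {per_run_rate:.2%}")
```

Under these assumptions the per-execution rate comes out near 1.3%, and the implied per-run rate is roughly 10%, in the same range as the observed 62 of 575 runs.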

PR #18861 now ignores this failure in CI so it doesn't block other work,
but the underlying issue should be fixed.

See analysis: https://gist.github.com/andrewleech/5686ed5242e0948d8679c432579e002e
