crash in btree module
Port, board and/or hardware
unix coverage build
MicroPython version
MicroPython v1.26.0-preview.387.g67acac257f.dirty on 2025-07-19; linux [GCC 12.2.0] version
Reproduction
Run the following script on the unix coverage build:
import btree, io
N = 62
db = btree.open(io.BytesIO(), pagesize=512)
e = b"a" * 78
for i in range(N):
db[b"thekey{}".format(i)] = e + str(i)
for i in range(N):
db[b"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" + str(i)] = e + str(i)
The script is a hand-minimized version of a longer script found by automated fuzzing. Small changes to the final long key string can make the script succeed, or fail with a MemoryError instead of crashing.
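Since the outcome depends on the exact key, a small harness helps when comparing variants of the script. The following is a minimal sketch for CPython on the host, not part of the original report. The binary path, the `classify` and `run_script` names, and the assumption that an uncaught MemoryError is printed to stderr with a non-zero exit status are all mine.

```python
import signal
import subprocess

# Assumption: location of the unix coverage binary; adjust to the local build tree.
MICROPYTHON = "./build-coverage/micropython"


def classify(returncode, stderr):
    """Map a finished process to 'ok', 'memory-error', 'signal:<NAME>' or 'error'."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # subprocess reports death by signal N as return code -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = str(-returncode)
        return "signal:" + name
    if "MemoryError" in stderr:
        return "memory-error"
    return "error"


def run_script(path, binary=MICROPYTHON, timeout=30):
    """Run one candidate script under the given binary and classify the result."""
    proc = subprocess.run(
        [binary, path],
        capture_output=True,
        text=True,
        errors="replace",  # fuzzed scripts may emit non-UTF-8 bytes
        timeout=timeout,
    )
    return classify(proc.returncode, proc.stderr)
```

Calling `run_script("repro.py")` on each variant (the file name is a placeholder) separates clean runs, MemoryError exits and signal deaths without reading the output by hand.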
Expected behaviour
The script completes. It produces no useful output.
Observed behaviour
The script crashes.
With added instrumentation, I was able to trace the problem back to the use of an undefined value in __bt_split:
188 case P_BLEAF:
189 bl = GETBLEAF(rchild, 0);
190 VALGRIND_CHECK_VALUE_IS_DEFINED(bl->ksize);
191 nbytes = NBINTERNAL(bl->ksize);
192 if (t->bt_pfx && !(bl->flags & P_BIGKEY) &&
==2795256== Uninitialised byte(s) found during client check request
==2795256== at 0x6424C1: __bt_split (bt_split.c:191)
==2795256== by 0x635343: __bt_put (bt_put.c:202)
==2795256== by 0x471C19: btree_subscr (modbtree.c:328)
==2795256== by 0x3BB297: mp_obj_subscr (obj.c:575)
==2795256== by 0x446E75: mp_execute_bytecode (vm.c:503)
==2795256== by 0x3DA130: fun_bc_call (objfun.c:295)
==2795256== by 0x39F14F: mp_call_function_n_kw (runtime.c:727)
==2795256== by 0x3A41DB: mp_call_function_0 (runtime.c:701)
==2795256== by 0x658253: execute_from_lexer (main.c:162)
==2795256== by 0x658346: do_file (main.c:311)
==2795256== by 0x65A865: main_ (main.c:741)
==2795256== by 0x65ADE9: main (main.c:494)
==2795256== Address 0x55d164f is 463 bytes inside a block of size 560 alloc'd
==2795256== at 0x48417B4: malloc (vg_replace_malloc.c:381)
==2795256== by 0x649EAF: mpool_bkt (mpool.c:335)
==2795256== by 0x64B846: mpool_new (mpool.c:124)
==2795256== by 0x632B96: __bt_new (bt_page.c:96)
==2795256== by 0x63F954: bt_page (bt_split.c:371)
==2795256== by 0x640BC6: __bt_split (bt_split.c:110)
==2795256== by 0x635343: __bt_put (bt_put.c:202)
==2795256== by 0x471C19: btree_subscr (modbtree.c:328)
==2795256== by 0x3BB297: mp_obj_subscr (obj.c:575)
==2795256== by 0x446E75: mp_execute_bytecode (vm.c:503)
==2795256== by 0x3DA130: fun_bc_call (objfun.c:295)
==2795256== by 0x39F14F: mp_call_function_n_kw (runtime.c:727)
==2795256== Uninitialised value was created by a heap allocation
==2795256== at 0x48417B4: malloc (vg_replace_malloc.c:381)
==2795256== by 0x649EAF: mpool_bkt (mpool.c:335)
==2795256== by 0x64B846: mpool_new (mpool.c:124)
==2795256== by 0x62E481: nroot (bt_open.c:360)
==2795256== by 0x63094B: __bt_open (bt_open.c:306)
==2795256== by 0x47179E: mod_btree_open (modbtree.c:416)
==2795256== by 0x3DB1BF: fun_builtin_var_call (objfun.c:118)
==2795256== by 0x39F14F: mp_call_function_n_kw (runtime.c:727)
==2795256== by 0x39FC01: mp_call_method_n_kw (runtime.c:743)
==2795256== by 0x43EDFB: mp_execute_bytecode (vm.c:1068)
==2795256== by 0x3DA130: fun_bc_call (objfun.c:295)
==2795256== by 0x39F14F: mp_call_function_n_kw (runtime.c:727)
Additional Information
The cause of the bug is almost certainly in the btree submodule, but as I'm not 100% sure, I chose to file the issue here.
Code of Conduct
Yes, I agree
Crash compiling(?) unusual code
Port, board and/or hardware
unix port, coverage, x86_64
MicroPython version
MicroPython v1.26.0-preview.521.g658a2e3dbd on 2025-08-02; linux [GCC 12.2.0] version
Reproduction
Run micropython with a snippet of unusual code:
$ ./build-coverage/micropython -c 'ans = (-1) ** 2.3; aa'
Segmentation fault
Expected behaviour
A NameError, because aa is not defined
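For comparison, the two statements can be checked separately on an interpreter that behaves as expected. This is a sketch that I ran mentally against CPython semantics only. On a MicroPython build without complex support the first statement may raise instead, which is an assumption on my part.

```python
# First statement: negative base with a non-integer float exponent.
ans = (-1) ** 2.3
print(type(ans).__name__)  # CPython yields a complex number here

# Second statement: evaluating an undefined name must raise NameError.
try:
    aa  # deliberately undefined
except NameError:
    outcome = "NameError"
else:
    outcome = "no exception"
print(outcome)
```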
Observed behaviour
A segfault
Additional Information
The stack trace is corrupt. UBSan and ASan both failed to give more useful information.
Program received signal SIGSEGV, Segmentation fault.
0x0000555555756530 in mp_state_ctx ()
(gdb) where
#0 0x0000555555756530 in mp_state_ctx ()
#1 0x0000000000000000 in ?? ()
Valgrind produced multiple diagnostics, beginning with this one, which looks relevant:
==1669951== Invalid write of size 8
==1669951== at 0x16C3B4: nlr_jump (nlrx64.c:104)
==1669951== by 0x1B23DE: fun_bc_call (objfun.c:352)
==1669951== by 0x19E61E: mp_call_function_n_kw (runtime.c:727)
==1669951== by 0x1A0DEA: mp_call_function_0 (runtime.c:701)
==1669951== by 0x264DB8: execute_from_lexer (main.c:162)
==1669951== by 0x264E67: do_str (main.c:315)
==1669951== by 0x2658D3: main_ (main.c:656)
==1669951== by 0x26619F: main (main.c:494)
==1669951== Address 0x1ffefff888 is on thread 1's stack
==1669951== 232 bytes below stack pointer
I suspect that an nlr_buf_t registered inside fold_constants is somehow still in play later, when the NameError is raised:
Breakpoint 1, nlr_push (nlr=nlr@entry=0x7fffffffdb10) at ../../py/nlrx64.c:55
55 unsigned int nlr_push(nlr_buf_t *nlr) {
(gdb) where
#0 nlr_push (nlr=nlr@entry=0x7fffffffdb10) at ../../py/nlrx64.c:55
#1 0x00005555556b0ae5 in execute_from_lexer (source_kind=source_kind@entry=1,
source=0x7fffffffe1a9, input_kind=input_kind@entry=MP_PARSE_FILE_INPUT,
is_repl=is_repl@entry=false) at main.c:123
#2 0x00005555556b0e68 in do_str (str=<optimized out>) at main.c:315
#3 0x00005555556b18d4 in main_ (argc=argc@entry=3, argv=argv@entry=0x7fffffffddd8)
at main.c:656
#4 0x00005555556b21a0 in main (argc=3, argv=0x7fffffffddd8) at main.c:494
(gdb) c
Continuing.
Breakpoint 1, nlr_push (nlr=nlr@entry=0x7fffffffd900) at ../../py/nlrx64.c:55
55 unsigned int nlr_push(nlr_buf_t *nlr) {
(gdb) where
#0 nlr_push (nlr=nlr@entry=0x7fffffffd900) at ../../py/nlrx64.c:55
#1 0x00005555555c5ea3 in binary_op_maybe (op=op@entry=MP_BINARY_OP_POWER,
lhs=0xffffffffffffffff, rhs=0x7ffff7c491e0, res=res@entry=0x7fffffffd998)
at ../../py/parse.c:672
#2 0x00005555555c6d42 in fold_constants (parser=parser@entry=0x7fffffffda30,
rule_id=rule_id@entry=42 '*', num_args=2) at ../../py/parse.c:780
#3 0x00005555555c6ac2 in push_result_rule (parser=parser@entry=0x7fffffffda30, src_line=1,
rule_id=rule_id@entry=42 '*', num_args=<optimized out>) at ../../py/parse.c:1033
#4 0x00005555555c86b7 in mp_parse (lex=lex@entry=0x7ffff7c48bc0,
input_kind=input_kind@entry=MP_PARSE_FILE_INPUT) at ../../py/parse.c:1263
#5 0x00005555556b0b5c in execute_from_lexer (source_kind=source_kind@entry=1,
source=<optimized out>, input_kind=input_kind@entry=MP_PARSE_FILE_INPUT,
is_repl=is_repl@entry=false) at main.c:147
#6 0x00005555556b0e68 in do_str (str=<optimized out>) at main.c:315
#7 0x00005555556b18d4 in main_ (argc=argc@entry=3, argv=argv@entry=0x7fffffffddd8)
at main.c:656
#8 0x00005555556b21a0 in main (argc=3, argv=0x7fffffffddd8) at main.c:494
(gdb) c
Continuing.
Breakpoint 1, nlr_push (nlr=nlr@entry=0x7fffffffd990) at ../../py/nlrx64.c:55
55 unsigned int nlr_push(nlr_buf_t *nlr) {
(gdb) where
#0 nlr_push (nlr=nlr@entry=0x7fffffffd990) at ../../py/nlrx64.c:55
#1 0x00005555556218fa in mp_execute_bytecode (code_state=code_state@entry=0x7fffffffda20,
inject_exc=<optimized out>, inject_exc@entry=0x0) at ../../py/vm.c:301
#2 0x00005555555fe288 in fun_bc_call (self_in=0x7ffff7c48be0, n_args=0, n_kw=0, args=0x0)
at ../../py/objfun.c:295
#3 0x00005555555ea61f in mp_call_function_n_kw (fun_in=0x7ffff7c48be0,
n_args=n_args@entry=0, n_kw=n_kw@entry=0, args=args@entry=0x0) at ../../py/runtime.c:727
#4 0x00005555555ecdeb in mp_call_function_0 (fun=<optimized out>) at ../../py/runtime.c:701
#5 0x00005555556b0db9 in execute_from_lexer (source_kind=source_kind@entry=1,
source=<optimized out>, input_kind=input_kind@entry=MP_PARSE_FILE_INPUT,
is_repl=is_repl@entry=false) at main.c:162
#6 0x00005555556b0e68 in do_str (str=<optimized out>) at main.c:315
#7 0x00005555556b18d4 in main_ (argc=argc@entry=3, argv=argv@entry=0x7fffffffddd8)
at main.c:656
#8 0x00005555556b21a0 in main (argc=3, argv=0x7fffffffddd8) at main.c:494
(gdb) c
Continuing.
Breakpoint 2, nlr_jump (val=0x7ffff7c48ba0) at ../../py/nlrx64.c:103
103 MP_NORETURN void nlr_jump(void *val) {
(gdb) p mp_thread_get_state ()->nlr_top
$3 = (nlr_buf_t *) 0x7fffffffd990
(gdb) c
Continuing.
Breakpoint 2, nlr_jump (val=val@entry=0x7ffff7c48ba0) at ../../py/nlrx64.c:103
103 MP_NORETURN void nlr_jump(void *val) {
(gdb) p mp_thread_get_state ()->nlr_top
$4 = (nlr_buf_t *) 0x7fffffffd900
(gdb) where
#0 nlr_jump (val=val@entry=0x7ffff7c48ba0) at ../../py/nlrx64.c:103
#1 0x00005555555fe3df in fun_bc_call (self_in=<optimized out>, n_args=0, n_kw=0, args=0x0)
at ../../py/objfun.c:352
#2 0x00005555555ea61f in mp_call_function_n_kw (fun_in=0x7ffff7c48be0,
n_args=n_args@entry=0, n_kw=n_kw@entry=0, args=args@entry=0x0) at ../../py/runtime.c:727
#3 0x00005555555ecdeb in mp_call_function_0 (fun=<optimized out>) at ../../py/runtime.c:701
#4 0x00005555556b0db9 in execute_from_lexer (source_kind=source_kind@entry=1,
source=<optimized out>, input_kind=input_kind@entry=MP_PARSE_FILE_INPUT,
is_repl=is_repl@entry=false) at main.c:162
#5 0x00005555556b0e68 in do_str (str=<optimized out>) at main.c:315
#6 0x00005555556b18d4 in main_ (argc=argc@entry=3, argv=argv@entry=0x7fffffffddd8)
at main.c:656
#7 0x00005555556b21a0 in main (argc=3, argv=0x7fffffffddd8) at main.c:494
Notice that nlr_top at the last nlr_jump (0x7fffffffd900) is the nlr_buf_t that was pushed inside binary_op_maybe, called from fold_constants, even though those frames are no longer on the stack.
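The observation above can be reproduced in a toy model of the handler chain. This is only an illustration in Python, not MicroPython's implementation. The class and function names are hypothetical, and the idea that a push is left without its matching pop on some return path is an assumption about the cause, not a confirmed diagnosis.

```python
class NlrBuf:
    """Stand-in for an nlr_buf_t that lives in some function's stack frame."""

    def __init__(self, owner):
        self.owner = owner       # function whose frame holds the buffer
        self.prev = None         # next-older handler in the chain
        self.frame_alive = True  # set to False once the owner has returned


class NlrState:
    """Stand-in for the per-thread state that holds nlr_top."""

    def __init__(self):
        self.top = None

    def push(self, buf):
        buf.prev = self.top
        self.top = buf

    def pop(self):
        self.top = self.top.prev


def folding_with_early_return(state):
    """Hypothetical bug: registers a handler, then returns without popping it."""
    buf = NlrBuf("binary_op_maybe")
    state.push(buf)
    buf.frame_alive = False  # the function returns; its frame is gone


state = NlrState()
outer = NlrBuf("execute_from_lexer")
state.push(outer)

folding_with_early_return(state)  # parse phase leaves a stale entry behind

vm = NlrBuf("mp_execute_bytecode")
state.push(vm)
state.pop()  # first jump: the VM's own handler is consumed

# The second jump would now target the stale buffer.
stale = state.top
print(stale.owner, stale.frame_alive)
```

In the model, the second jump lands on a buffer whose frame has already been torn down. That matches the pattern in the gdb session, where the first nlr_jump sees 0x7fffffffd990 and the second sees 0x7fffffffd900.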
This crash was found with AFLplusplus and minimized manually.
Code of Conduct
Yes, I agree