Segmentation Fault in mp_asm_base_get_cur_to_write_bytes (x64 native emitter)
Port, board and/or hardware
unix
MicroPython version
v1.27.0 and master branch
Issue Report
Description
We discovered a segmentation fault in MicroPython. The crash occurs in mp_asm_base_get_cur_to_write_bytes while compiling a specific Python script with the x64 native emitter.
The ASAN report indicates a READ memory access violation at an invalid address (0x0000bfff800a), suggesting corruption of the assembler state or an invalid pointer calculation during the code emission phase.
Environment
- OS: Linux x86_64
- Compiler: gcc 11.5.0
- Tools: AddressSanitizer
- Affected Version: master branch
- Build Configuration:
make CFLAGS_EXTRA="-fsanitize=address --param asan-use-after-return=0" \
LDFLAGS_EXTRA="-fsanitize=address --param asan-use-after-return=0" \
CC=gcc STRIP= -j$(nproc)
Vulnerability Details
- Target: MicroPython (Unix Port)
- Vulnerability Type: Segmentation Fault (READ access violation)
- Function: mp_asm_base_get_cur_to_write_bytes
- Location: py/asmbase.c:70
- Root Cause Analysis: The crash happens during the compilation phase (mp_compile), specifically when the native emitter is processing a return value (emit_native_return_value). The stack trace shows the flow:
#0 mp_asm_base_get_cur_to_write_bytes
#1 asm_x64_get_cur_to_write_bytes
#2 asm_x64_write_byte_2
#3 asm_x64_mov_mem64_to_r64
#4 emit_native_mov_reg_const
The function mp_asm_base_get_cur_to_write_bytes likely reads the current code buffer pointer or size limit from the mp_asm_base_t structure. The invalid address 0x0000bfff800a suggests that either the structure pointer itself is corrupted or one of its member pointers (such as code_base) was calculated incorrectly while compiling the input script. This affects the x64 native/viper code generation path.
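One way to sanity-check the reported address: if the faulting access was ASan's own shadow-memory lookup rather than the application read itself (an assumption; the report does not say which), the application pointer can be recovered by inverting the default x86_64 Linux shadow mapping, shadow = (addr >> 3) + 0x7fff8000. A minimal sketch; the function and constant names are illustrative, not part of any tool:

```python
# Default AddressSanitizer shadow mapping on x86_64 Linux:
#   shadow = (addr >> 3) + 0x7fff8000
ASAN_SHADOW_OFFSET = 0x7FFF8000
ASAN_SHADOW_SCALE = 3


def app_range_from_shadow(shadow_addr):
    """Return the (first, last) application addresses covered by one shadow byte."""
    first = (shadow_addr - ASAN_SHADOW_OFFSET) << ASAN_SHADOW_SCALE
    return first, first + (1 << ASAN_SHADOW_SCALE) - 1


# Assumption: the address in the report is a shadow address.
first, last = app_range_from_shadow(0x0000BFFF800A)
print(hex(first), hex(last))  # 0x200000050 0x200000057
```

Under that assumption, the emitter dereferenced a pointer near 0x200000050, which does not resemble the heap, stack, or code addresses seen elsewhere in the report.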
Reproduce
- Build MicroPython with gcc and AddressSanitizer enabled.
- Run the resulting micropython binary on the PoC input.
Proof of Concept:
def f():
    try:
        try:0
        finally:()
    except:()
@micropython.viper
def f():
a**0
try:0
except:()
a=0
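
The reproduction steps can also be scripted. The sketch below is a hypothetical helper, not part of MicroPython: it runs a binary on a script and treats the run as a crash if the process was killed by a signal or if an AddressSanitizer error banner appears on stderr. An ASan build usually exits with a normal non-zero status after printing its report, so both conditions are checked.

```python
import subprocess


def looks_like_crash(returncode, stderr_text):
    """Heuristic: killed by a signal, or ASan printed an error report."""
    killed_by_signal = returncode < 0  # subprocess reports -N for signal N
    asan_report = "ERROR: AddressSanitizer" in stderr_text
    return killed_by_signal or asan_report


def run_poc(binary, script, timeout=30):
    """Run `binary script` and return (crashed, returncode)."""
    proc = subprocess.run(
        [binary, script],
        capture_output=True,
        text=True,
        errors="replace",  # tolerate non-UTF-8 bytes in the report
        timeout=timeout,
    )
    return looks_like_crash(proc.returncode, proc.stderr), proc.returncode


# Example call (both paths are placeholders):
#   run_poc("./build-standard/micropython", "poc.py")
```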
ASAN report
==36122==ERROR: AddressSanitizer: SEGV on unknown address 0x0000bfff800a (pc 0x55fb9e917cc8 bp 0x000000000048 sp 0x7ffcb8c34f00 T0)
==36122==The signal is caused by a READ memory access.
#0 0x55fb9e917cc8 in mp_asm_base_get_cur_to_write_bytes ../../py/asmbase.c:70
#1 0x55fb9e917ea4 in asm_x64_get_cur_to_write_bytes ../../py/asmx64.c:125
#2 0x55fb9e917ea4 in asm_x64_write_byte_2 ../../py/asmx64.c:136
#3 0x55fb9e9184c7 in asm_x64_mov_mem64_to_r64 ../../py/asmx64.c:318
#4 0x55fb9e9201fa in emit_native_mov_reg_const ../../py/emitnative.c:332
#5 0x55fb9e9201fa in emit_native_return_value ../../py/emitnative.c:2904
#6 0x55fb9e90f8a4 in compile_scope ../../py/compile.c:3190
#7 0x55fb9e915184 in mp_compile_to_raw_code ../../py/compile.c:3598
#8 0x55fb9e915184 in mp_compile ../../py/compile.c:3693
#9 0x55fb9ea1d262 in parse_compile_execute ../../shared/runtime/pyexec.c:120
#10 0x55fb9ea157e0 in do_file /src/repro/micropython/ports/unix/main.c:269
#11 0x55fb9ea157e0 in main_ /src/repro/micropython/ports/unix/main.c:692
#12 0x7fcf58a401c9 (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9)
#13 0x7fcf58a4028a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a)
#14 0x55fb9e8f2fd4 in _start (/src/repro/micropython/ports/unix/build-standard/micropython+0x84fd4)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ../../py/asmbase.c:70 in mp_asm_base_get_cur_to_write_bytes
==36122==ABORTING
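For triage, the symbolized frames can be pulled out of a report like the one above mechanically. The following is a minimal sketch with an illustrative regular expression; it assumes the `#N 0xADDR in FUNCTION LOCATION` frame layout shown in this report and skips frames that have no function name.

```python
import re

# Matches frames such as:
#   #0 0x55fb9e917cc8 in mp_asm_base_get_cur_to_write_bytes ../../py/asmbase.c:70
FRAME_RE = re.compile(r"#(\d+)\s+0x[0-9a-f]+\s+in\s+(\S+)\s+(\S+)")


def parse_frames(report):
    """Return a list of (frame_index, function, location) tuples."""
    frames = []
    for line in report.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append((int(m.group(1)), m.group(2), m.group(3)))
    return frames


sample = """\
#0 0x55fb9e917cc8 in mp_asm_base_get_cur_to_write_bytes ../../py/asmbase.c:70
#1 0x55fb9e917ea4 in asm_x64_get_cur_to_write_bytes ../../py/asmx64.c:125
#12 0x7fcf58a401c9 (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9)
"""

for index, function, location in parse_frames(sample):
    print(index, function, location)
```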
What does this issue allow an attacker to do?
Denial of Service (DoS). By supplying a Python script that utilizes native code generation (e.g., via decorators or specific syntax), an attacker can crash the MicroPython compiler process. In a scenario where MicroPython allows users to upload and compile code (e.g., WebREPL or a multi-user environment), this leads to a service outage.
How does the attacker exploit this issue?
The attacker provides a crafted Python script that triggers an edge case in the x64 native emitter. The triggering code sits inside a function decorated with @micropython.native or @micropython.viper (the PoC uses @micropython.viper). The stack trace (emit_native_return_value) suggests the crash is triggered while generating assembly for a return statement, possibly involving a constant load or memory reference for which the assembler calculates an invalid address.
Code of Conduct
Yes, I agree
Binary operations on undefined variables crash the native emitter.
Port, board and/or hardware
unix
MicroPython version
MicroPython v1.24.0-preview.206.ge9814e987.dirty on 2024-08-16; linux [GCC 14.2.1] version
Reproduction
Run the following code:
@micropython.native
def f():
    a += 0  # Or anything else, really
f()
Expected behaviour
A NameError exception should be raised. (CPython raises UnboundLocalError, a subclass of NameError, because the augmented assignment makes a a local variable of f.)
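For reference, the expected behaviour can be checked on CPython by running the same function without the decorator. This is a minimal sketch of that comparison, not a MicroPython test case:

```python
def f():
    a += 0  # `a` is local to f because it is assigned in the function body


try:
    f()
except NameError as exc:
    # On CPython the concrete type is UnboundLocalError, a NameError subclass.
    print(type(exc).__name__, isinstance(exc, NameError))
```

On CPython this prints `UnboundLocalError True`, so a handler written for NameError catches it.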
Observed behaviour
This is the backtrace of the crash when running the interpreter under GDB:
Program received signal SIGSEGV, Segmentation fault.
mp_obj_get_type (o_in=o_in@entry=0x0) at ../../py/obj.c:61
61 return o->type;
(gdb) bt
#0 mp_obj_get_type (o_in=o_in@entry=0x0) at ../../py/obj.c:61
#1 0x0000555555591333 in mp_binary_op (op=MP_BINARY_OP_INPLACE_ADD, lhs=0x0, rhs=0x1) at ../../py/runtime.c:630
#2 0x00007ffff7fc0053 in ?? ()
#3 0x00007ffff7a06440 in ?? ()
#4 0x00007ffff7fc006d in ?? ()
#5 0x00007fffffff9610 in ?? ()
#6 0x0000000000000003 in ?? ()
#7 0x00007fffffff9760 in ?? ()
#8 0x0000000000000000 in ?? ()