#17924 · PR #3799
Related · high · value 0.863
QUERY · ISSUE

Clang undefined behavior sanitizer diagnostics (mostly uninteresting??)

open · by jepler · opened 2025-08-15 · updated 2025-08-15
bug

Port, board and/or hardware

unix port, coverage build, x86_64 linux, clang-19

MicroPython version

v1.27.0-preview-15-g744270ac1b

Reproduction

Perform the undefined behavior sanitizer build but with CC=clang, then try doing pretty much anything (such as starting micropython to the REPL).

Expected behaviour

It works and is essentially free of undefined behavior diagnostics.

Observed behaviour

Several classes of diagnostic appear almost immediately.

I investigated two main classes of diagnostic:

  • Applying zero offsets to NULL pointers
  • Calling functions without exactly matching prototypes

Here's an example of each kind:

../../py/map.c:193:37: runtime error: applying zero offset to null pointer
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../py/map.c:193:37 
../../py/stream.c:60:28: runtime error: call to function vfs_posix_file_write through pointer to incorrect function type 'unsigned long (*)(void *, void *, unsigned long, int *)'
/home/jepler/src/micropython/ports/unix/../../extmod/vfs_posix_file.c:129: note: vfs_posix_file_write defined here

These are both classes of "technically forbidden per the C specification but work fine almost always in practice".

The first can be avoided with an extra guard check, but at the possible cost of code size. For example,

-    const mp_obj_t *kwargs = args + n_args;
+    const mp_obj_t *kwargs = args ? args + n_args : NULL;

As discussed in the old sanitizer threads, I think this specific behavior (NULL+0 is NULL) is set to become defined in a future C standard.
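The guard in the diff above can be sketched in isolation. This is only an illustration: `offset_or_null` is a hypothetical helper, not a MicroPython function, and `int` stands in for `mp_obj_t`.

```c
#include <stddef.h>

/* Hypothetical helper (not part of MicroPython): return base + n, but
 * skip the pointer arithmetic entirely when base is NULL, so the
 * "applying zero offset to null pointer" check has nothing to flag. */
static const int *offset_or_null(const int *base, size_t n) {
    return base ? base + n : NULL;
}
```

The extra branch is where the possible code-size cost comes from: every call site that previously compiled to a single add now also carries a test.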

The second is harder to resolve. For instance, it technically means that the trick of calling either a read or a write function through a function pointer with the read type is incorrect (the prototypes differ only in whether the data argument is const):

    if (flags & MP_STREAM_RW_WRITE) {
        io_func = (io_func_t)stream_p->write;
    } else {
        io_func = stream_p->read;
    }
... mp_uint_t out_sz = io_func(stream, buf, size, errcode); ...
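One way to make such a call well-typed, sketched here under assumptions, is a small adapter whose prototype matches the pointer type exactly and which forwards to the const-taking function. All `demo_*` names and the simplified `io_func_t` are hypothetical stand-ins, not the MicroPython definitions, and the adapter costs an extra function, so this is not necessarily a change the project would want.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the read-style function pointer type. */
typedef size_t (*io_func_t)(void *obj, void *buf, size_t size, int *errcode);

static char demo_store[16];

/* Read-style function: fills the caller's buffer. */
static size_t demo_read(void *obj, void *buf, size_t size, int *errcode) {
    (void)obj;
    if (size > sizeof(demo_store)) { *errcode = 1; return (size_t)-1; }
    memcpy(buf, demo_store, size);
    return size;
}

/* Write-style function: the data argument is const, so its type
 * differs from io_func_t. */
static size_t demo_write(void *obj, const void *buf, size_t size, int *errcode) {
    (void)obj;
    if (size > sizeof(demo_store)) { *errcode = 1; return (size_t)-1; }
    memcpy(demo_store, buf, size);
    return size;
}

/* Adapter with exactly the io_func_t prototype; the void* to
 * const void* conversion happens in an ordinary, well-typed call. */
static size_t demo_write_adapter(void *obj, void *buf, size_t size, int *errcode) {
    return demo_write(obj, buf, size, errcode);
}

/* Select the function without casting between function pointer types. */
static size_t demo_rw(int is_write, void *buf, size_t size, int *errcode) {
    io_func_t io_func = is_write ? demo_write_adapter : demo_read;
    return io_func(NULL, buf, size, errcode);
}
```

Because both branches now yield a genuine `io_func_t`, the indirect call goes through a pointer of the callee's real type.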

I didn't find a fine grained method to turn off these diagnostics. For instance, the first one is under the general umbrella of "pointer overflow" checks, which includes actual overflow in pointer arithmetic like uint32_t *ptr; ptr[large] when large * sizeof(uint32_t) makes the address wrap around.

Additional Information

I was interested in clang ubsan because the AFLplusplus fuzzer can be run in a mode where it treats sanitizer diagnostics as crashes. However, it defaulted to using clang rather than gcc, and that is how I discovered that clang's sanitizer really doesn't like the current state of micropython, so the fuzzer can't make any interesting findings.

Oh, here's a bonus that I found when preparing this issue. It occurs when building an empty list (and, probably, tuple). It arises because unsigned subtraction is being used where the intent is to grow the stack by one element. Technically it is an overflowed subtraction, so it is undefined behavior, but it is not interesting. More uninteresting signed overflows appear in vm.c, and touching any of them is likely to cause code growth without benefit.

Starting program: /home/jepler/src/micropython/ports/unix/build-coverage/micropython -c '[]'
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
../../py/vm.c:832:24: runtime error: subtraction of unsigned offset from 0x7fffffffd920 overflowed to 0x7fffffffd928
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../py/vm.c:832:24 
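The pattern described above can be illustrated with a small sketch. It is not the actual vm.c code: `adjust_sp_signed` is a hypothetical helper and `int` stands in for the stack element type. When the count is unsigned and zero, `count - 1` wraps to a huge value and the sanitizer sees an overflowing subtraction from the pointer; computing the delta as a signed value expresses "grow by one" directly.

```c
#include <stddef.h>

/* Hypothetical helper: collapse n_args stack entries into one result
 * slot. With n_args == 0 the delta is -1, so the pointer moves up by
 * one element instead of relying on unsigned wraparound. */
static int *adjust_sp_signed(int *sp, size_t n_args) {
    ptrdiff_t delta = (ptrdiff_t)n_args - 1;
    return sp - delta;
}
```

Whether a change like this alters the generated code on a given target would need to be measured.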


CANDIDATE · PULL REQUEST

emitbc: Avoid undefined behavior calling memset()

closed · by jepler · opened 2018-05-19 · updated 2025-03-27

When micropython is built with 'clang -fsanitize=undefined', a diagnostic like the following will occur:

$ UBSAN_OPTIONS=abort_on_error=1 ./micropython_fuzzing  -c 'print(1)'
../../py/emitbc.c:319:16: runtime error: null pointer passed as argument 1, which is declared to never be null
/usr/include/string.h:62:62: note: nonnull attribute specified here
Aborted

Traditionally, memset(NULL, value, 0) has been accepted without causing problems. However, it is not standards-compliant behavior; and for instance Ted Unangst of the OpenBSD project notes that "A smart C compiler may observe a call to memcpy, flag both pointers as valid, and then delete any null checks. Forwards and backwards."
https://www.tedunangst.com/flak/post/zero-size-objects

Since micropython is using -fdelete-null-pointer-checks ("enabled by default on most targets") and it is probably giving good code size improvements, we have to pay a modest price and add a few checks.
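The kind of check being described can be sketched as follows. `clear_bytes` is a hypothetical helper used only for illustration; it is not the change made in emitbc.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: only call memset() when there is something to
 * clear, so a NULL pointer paired with a zero length never reaches a
 * parameter that is declared nonnull. */
static void clear_bytes(void *dest, size_t len) {
    if (len != 0) {
        memset(dest, 0, len);
    }
}
```

With this guard, `clear_bytes(NULL, 0)` is a no-op rather than a call the compiler may use to assume the pointer is valid.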
