Issue #17924 · PR #17414
Related · high · value 1.439
QUERY · ISSUE

Clang undefined behavior sanitizer diagnostics (mostly uninteresting??)

open · by jepler · opened 2025-08-15 · updated 2025-08-15
bug

Port, board and/or hardware

unix port, coverage build, x86_64 linux, clang-19

MicroPython version

v1.27.0-preview-15-g744270ac1b

Reproduction

Perform the undefined behavior sanitizer build, but with CC=clang, then try doing pretty much anything (such as starting micropython to the REPL).

Expected behaviour

It works and is essentially free of undefined behavior diagnostics.

Observed behaviour

Several classes of diagnostic appear almost immediately.

I investigated two main classes of diagnostic:

  • Applying zero offsets to NULL pointers
  • Calling functions without exactly matching prototypes

Here's an example of each kind:

../../py/map.c:193:37: runtime error: applying zero offset to null pointer
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../py/map.c:193:37 
../../py/stream.c:60:28: runtime error: call to function vfs_posix_file_write through pointer to incorrect function type 'unsigned long (*)(void *, void *, unsigned long, int *)'
/home/jepler/src/micropython/ports/unix/../../extmod/vfs_posix_file.c:129: note: vfs_posix_file_write defined here

These are both classes of "technically forbidden per the C specification but work fine almost always in practice".

The first can be avoided with an extra guard check, but at the possible cost of code size. For example,

-    const mp_obj_t *kwargs = args + n_args;
+    const mp_obj_t *kwargs = args ? args + n_args : NULL;

As discussed in the old sanitizer threads, I think this specific behavior is set to become defined (NULL+0 is NULL) in a future C standard.
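The guard in the diff above can be tried in isolation with a minimal sketch. The names obj_t and kwargs_start are hypothetical stand-ins chosen for illustration, not MicroPython's own definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for MicroPython's mp_obj_t; illustration only. */
typedef void *obj_t;

/* Return a pointer to the keyword arguments that follow n_args positional
 * arguments. When args is NULL (no arguments at all), skip the pointer
 * arithmetic so that no offset is ever applied to a null pointer. */
static const obj_t *kwargs_start(const obj_t *args, size_t n_args) {
    return args ? args + n_args : NULL;
}
```

The extra branch is exactly the code-size cost mentioned above; whether it matters depends on how hot the call site is and how many such sites exist.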

The second is harder to resolve. For instance, this technically means the trick of calling either a read or write func through a function pointer with the read type is incorrect (the prototypes differ only by whether the data argument is const):

    if (flags & MP_STREAM_RW_WRITE) {
        io_func = (io_func_t)stream_p->write;
    } else {
        io_func = stream_p->read;
    }
... mp_uint_t out_sz = io_func(stream, buf, size, errcode); ...
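One way to keep every call type-correct is to branch at the call rather than at the pointer assignment, so each function is invoked through a pointer of its own declared type. The sketch below uses simplified, hypothetical types (stream_proto_t, stream_rw_once, STREAM_RW_WRITE and the demo callbacks are all invented for illustration) and is not the actual py/stream.c code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the stream protocol types. */
typedef size_t (*read_fn_t)(void *self, void *buf, size_t size, int *errcode);
typedef size_t (*write_fn_t)(void *self, const void *buf, size_t size, int *errcode);

typedef struct {
    read_fn_t read;
    write_fn_t write;
} stream_proto_t;

#define STREAM_RW_WRITE 1u

/* Call read or write through a pointer of its own declared type, so the
 * call never goes through a mismatched function type. */
static size_t stream_rw_once(const stream_proto_t *p, void *self, void *buf,
                             size_t size, int *errcode, unsigned flags) {
    if (flags & STREAM_RW_WRITE) {
        return p->write(self, buf, size, errcode);
    }
    return p->read(self, buf, size, errcode);
}

/* Demo callbacks with exactly the prototypes declared above. */
static size_t demo_read(void *self, void *buf, size_t size, int *errcode) {
    (void)self;
    (void)errcode;
    memset(buf, 'r', size); /* pretend we read `size` bytes of 'r' */
    return size;
}

static size_t demo_write(void *self, const void *buf, size_t size, int *errcode) {
    (void)self;
    (void)buf;
    (void)errcode;
    return size; /* pretend every byte was written */
}
```

The trade-off is two call sites instead of one, which may cost a few bytes of code; that is the same size concern raised for the null-pointer guard.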

I didn't find a fine-grained method to turn off these diagnostics. For instance, the first one is under the general umbrella of "pointer overflow" checks, which also covers actual overflow in pointer arithmetic, like uint32_t *ptr; ptr[large] when large * sizeof(uint32_t) makes the address wrap around.
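A coarser knob that does exist is the no_sanitize function attribute, which accepts the same check names as -fsanitize=. The sketch below assumes a compiler that supports the attribute (recent clang and gcc do); the macro name NO_SANITIZE_PTR_OVERFLOW and the function skip_items are hypothetical. It silences the whole pointer-overflow check for one function, including genuine wraparound, so it does not provide the per-kind split described above.

```c
#include <stddef.h>

/* Hypothetical helper macro: expands to the attribute when available,
 * and to nothing on compilers without it. */
#if defined(__has_attribute)
#  if __has_attribute(no_sanitize)
#    define NO_SANITIZE_PTR_OVERFLOW __attribute__((no_sanitize("pointer-overflow")))
#  endif
#endif
#ifndef NO_SANITIZE_PTR_OVERFLOW
#  define NO_SANITIZE_PTR_OVERFLOW
#endif

/* Pointer arithmetic in this function is not instrumented by the
 * pointer-overflow check, so no diagnostic is reported from here. */
NO_SANITIZE_PTR_OVERFLOW
static const int *skip_items(const int *items, size_t n) {
    return items + n;
}
```

The attribute only stops the diagnostic; it does not change what the C standard says about the arithmetic itself.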

Additional Information

I was interested in clang ubsan because the AFLplusplus fuzzer can be run in a mode where it treats sanitizer diagnostics as crashes. However, it defaulted to using clang rather than gcc, which is how I discovered that clang's sanitizer really doesn't like the current state of micropython, so the fuzzer can't make any interesting findings.

Oh, here's a bonus that I found when preparing this issue. It occurs when building an empty list (and, probably, tuple). It arises because unsigned subtraction is being used where the intent is to grow the stack by an element. Technically it is an overflowed subtraction, so it is undefined behavior, but not an interesting one. More uninteresting signed overflows appear in vm.c, and touching any of them is likely to cause code growth without benefit.

Starting program: /home/jepler/src/micropython/ports/unix/build-coverage/micropython -c '[]'
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
../../py/vm.c:832:24: runtime error: subtraction of unsigned offset from 0x7fffffffd920 overflowed to 0x7fffffffd928
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../py/vm.c:832:24 
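To illustrate the mechanism, assume the flagged expression has the shape sp -= n - 1 with an unsigned n (an assumption; the vm.c line is not quoted here). With n == 0, n - 1 wraps to the largest unsigned value, and subtracting that from the pointer wraps around to sp + 1, which matches the 8-byte difference in the log on a 64-bit target. A sketch of the same adjustment done in a signed type, using the hypothetical names obj_t and adjust_sp:

```c
#include <stddef.h>

/* Hypothetical stand-in for MicroPython's mp_obj_t; illustration only. */
typedef void *obj_t;

/* Move the stack pointer down by (n - 1) slots. The offset is computed as a
 * signed value, so n == 0 yields -1 and the pointer moves up by one slot
 * without any unsigned wraparound in the pointer arithmetic. */
static obj_t *adjust_sp(obj_t *sp, size_t n) {
    return sp - ((ptrdiff_t)n - 1);
}
```

Whether such a rewrite changes the generated code was not measured here, so the code-growth concern above still stands as an open question.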

Code of Conduct

Yes, I agree

CANDIDATE · PULL REQUEST

unix: Introduce sanitize_undefined variant, use during CI.

closed · by jepler · opened 2025-06-02 · updated 2025-06-19
port-unix

Summary

gcc's "undefined behavior" sanitizer can catch a range of misbehaviors at runtime that normally go unnoticed. These include integer and pointer operations that are "undefined" per the relevant C specification. Over time, most of these problems have been fixed through other PRs, but if micropython doesn't have regular CI-time checks, regressions are inevitable.

This PR fixes current undefined behavior detected under gcc 12.2.0 (debian stable/bookworm) on an x64 system, then enables it during a new unix "coverage-like" build.

Testing

I built and ran the unix tests locally, iterating until there were no remaining diagnostics.

Trade-offs and Alternatives

#15303 is an alternate implementation, but it's become stale and did not cleanly merge with current micropython. That's why I'm re-opening my version of the changes.

Not all gcc sanitizers can be enabled simultaneously. So, a choice has to be made (mainly between -fsanitize=undefined and -fsanitize=memory). I chose the undefined checker, but implemented it so that an override of the makefile variable is possible.

-fsanitize=memory may be error free now and can be added as a separate PR.

I read that a future C specification will make, e.g., memset(NULL, 0, 0) (setting zero bytes of a NULL pointer) not-undefined-behavior. For this reason, and because my inspection didn't find any current incorrect optimizations due to -fdelete-null-pointer-checks [for sites that hit UB sanitizer messages], I chose to simplify this PR by disabling the use of nonnull-pointer annotations by the sanitizer.

Different gcc versions might differ too much in what they diagnose, as any diagnostic will cause "make test" to fail due to the unexpected output.

Pattern-based tests might inadvertently skip over output text that comes from the sanitizer, giving false negatives.
