QUERY · ISSUE

Emitted code epilogue block contains a superfluous jump to the next available code address.

open · by agatti · opened 2025-01-17 · updated 2025-01-17

bug

Port, board and/or hardware

mpy-cross built from git commit 6db29978ac8954f3686f9eb59dd71b55c3495456 (current master)

MicroPython version

MicroPython v1.25.0-preview.216.g6db29978a.dirty on 2025-01-17; mpy-cross emitting mpy v6.3

Reproduction

Run mpy-cross -X emit=native -march=debug tests/basics/0prelim.py
Look at the last part of the output for a jump to label_0.

This applies to any supported architecture with a native emitter, but -march=debug is used in the command line to make the problem visible without using a disassembler.

Expected behaviour

No redundant jump (one whose target is the very next instruction) should be emitted at the end of a block.

Observed behaviour

At the end of an emitted block of code this can be seen (taken from tests/basics/0prelim.py):

    ...
    jump(label_0)
label(label_0)
EXIT(0)

This also applies to functions that need a more involved cleanup procedure:

(tests/basics/async_for.py):

    ...
    jump(label_0)
    dead_code load(r_temp0, r_fun_table, 0)
    dead_code store(r_temp0, r_local2, 5)
    dead_code mov_reg_imm(r_temp0, 40=0x28)
    dead_code add(r_temp0, r_local2)
    dead_code store(r_temp0, r_local2, 2)
    dead_code mov_reg_imm(r_temp0, 0=0x0)
    dead_code mov_local_reg(local_3, r_temp0)
    dead_code jump(label_0)
label(label_0)
    call_ind(nlr_pop)
    mov_reg_local(r_ret, local_3)
EXIT(0)

(tests/basics/array_micropython.py):

    ...
    mov_local_reg(local_3, r_ret)
    jump(label_0)
label(label_0)
    mov_reg_local(r_arg1, local_6)
    call_ind(native_swap_globals)
    call_ind(nlr_pop)
    mov_reg_local(r_ret, local_3)
EXIT(0)
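
The pattern in the listings above can be spotted mechanically. The following is a minimal sketch, not part of MicroPython: `find_redundant_jumps` is a hypothetical helper that scans text in the `-march=debug` format shown above and reports every `jump(label_N)` whose next live line is `label(label_N)`. It assumes that lines prefixed with `dead_code` are suppressed by the emitter and occupy no space in the final code, so they are skipped when looking for the "next" line.

```python
import re

# Patterns for the textual form shown in the listings above.
_JUMP_RE = re.compile(r"^\s*jump\((\w+)\)\s*$")
_LABEL_RE = re.compile(r"^\s*label\((\w+)\)\s*$")


def find_redundant_jumps(listing):
    """Return line indices of jumps that target the next live line."""
    # Assumption: "dead_code" lines are not emitted, so ignore them.
    live = [
        (i, line)
        for i, line in enumerate(listing.splitlines())
        if not line.strip().startswith("dead_code")
    ]
    redundant = []
    for (i, cur), (_, nxt) in zip(live, live[1:]):
        jump = _JUMP_RE.match(cur)
        label = _LABEL_RE.match(nxt)
        if jump and label and jump.group(1) == label.group(1):
            redundant.append(i)
    return redundant


sample = """\
    mov_local_reg(local_3, r_ret)
    jump(label_0)
label(label_0)
    mov_reg_local(r_arg1, local_6)
EXIT(0)"""

print(find_redundant_jumps(sample))  # [1]
```

A check like this only flags the symptom in debug output; where the redundant jump should be avoided inside the emitter is a separate question.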

Additional Information

This is not specific to mpy-cross: the same issue occurs when a block of code is emitted at runtime.

CANDIDATE · PULL REQUEST

py/emit: Improve the logic to detect and eliminate dead code

merged · by dpgeorge · opened 2022-06-17 · updated 2022-06-20

py-core

The existing dead-code detection logic, which relied on last_emit_was_return_value, was not very good.

This new logic tracks when an unconditional jump/raise occurs in the emitted code stream (bytecode or native machine code) and suppresses all subsequent code, until a label is assigned. This eliminates a lot of cases of dead code, with relatively simple logic.
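
As a rough illustration of the idea described above, here is a toy model in Python. It is not the code from this PR: the class `ToyEmitter`, the `suppress` attribute and the instruction names are all made up for the sketch. An unconditional jump, raise or return turns suppression on, and assigning a label turns it off again.

```python
class ToyEmitter:
    """Toy instruction stream with dead-code suppression (illustrative only)."""

    def __init__(self):
        self.code = []
        self.suppress = False  # hypothetical flag: True while code is unreachable

    def emit(self, op, unconditional=False):
        if self.suppress:
            return  # nothing can reach this instruction, so drop it
        self.code.append(op)
        if unconditional:
            # Control never falls through a jump/raise/return.
            self.suppress = True

    def label(self, name):
        # A label is a potential jump target, so code is reachable again.
        self.suppress = False
        self.code.append(name + ":")


e = ToyEmitter()
e.emit("jump L1", unconditional=True)
e.emit("load_const None")             # dropped
e.emit("return", unconditional=True)  # dropped
e.label("L1")
e.emit("load_const 1")
e.emit("return", unconditional=True)

print(e.code)  # ['jump L1', 'L1:', 'load_const 1', 'return']
```

The toy treats every label as reachable, which keeps the logic simple at the cost of missing labels that nothing jumps to. It also leaves the `jump L1` that targets the immediately following label in place, since that jump is emitted before suppression begins.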

This PR has the following code size change:

   bare-arm:   -16 -0.028%
minimal x86:   -60 -0.036%
   unix x64:  -368 -0.070%
unix nanbox:   -80 -0.017%
      stm32:  -204 -0.052% PYBV10
     cc3200:    +0 +0.000%
    esp8266:  -228 -0.033% GENERIC
      esp32:  -224 -0.015% GENERIC[incl -40(data)]
     mimxrt:  -192 -0.054% TEENSY40
 renesas-ra:  -200 -0.032% RA6M2_EK
        nrf:   +28 +0.015% pca10040
        rp2:  -256 -0.050% PICO
       samd:   -12 -0.009% ADAFRUIT_ITSYBITSY_M4_EXPRESS

Generated bytecode is also sometimes smaller (that's the whole point!). For example, when compiling all of uasyncio, this new optimisation reduces the total by 13 bytes (from 8464 down to 8451 for the sum of all uasyncio .mpy files).

One example of an optimisation is when there is a raise at the end of a function. In that case it no longer emits a redundant return None at the end of the function (saving 2 bytes).
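
For concreteness, a function of the following shape is the kind of case meant here. The function name and body are invented for illustration and are not taken from the PR or its tests.

```python
def require_positive(x):
    if x > 0:
        return x
    raise ValueError("x must be positive")
    # Control can never reach this point, so an implicit "return None"
    # placed here by the compiler would be dead code.


print(require_positive(3))  # 3
```

Every path through the function ends in either the explicit `return` or the `raise`, so a trailing `return None` can never execute.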

This also uncovered a latent bug in the VM which is fixed here.