Obscure bug with locals not captured by function
This bug is filed for the record. It exists because a compiled bytecode function captures only its globals, not its locals. The only way I can make the bug appear (i.e. the only time a function needs to know the locals of the context it was compiled in) is to use the builtin compile function and then execute the precompiled code within another module, as follows.
Put this in t1.py:
x = 1
import t2
c = compile('print(x)', 'me', 'exec')
t2.C().ex(c)
and this in t2.py:
x = 2
c = compile('print(x)', 'me2', 'exec')
class C:
    x = 3
    exec('print(x)')
    exec(c)
    def ex(self, c):
        exec(c)
Then run micropython t1.py and check the output against CPython's.
The above code tests more than just this bug, so that one can see exactly which context is used where.
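For reference, here is a single-file sketch of the name lookup that CPython performs in the three contexts exercised above. It is not the test case itself: the dictionaries t1_globals, t2_globals and class_locals and the helper run are hypothetical stand-ins for the two module namespaces and the class body, passed to exec explicitly.

```python
import contextlib
import io

# A code object carries no namespace of its own; the namespaces are supplied
# by whoever calls exec on it.
code = compile('print(x)', 'me', 'exec')

def run(code, globals_dict, locals_dict=None):
    """Execute code with explicit namespaces and return what it printed."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        if locals_dict is None:
            exec(code, globals_dict)
        else:
            exec(code, globals_dict, locals_dict)
    return buf.getvalue().strip()

t1_globals = {'x': 1}     # stands in for the module namespace of t1.py
t2_globals = {'x': 2}     # stands in for the module namespace of t2.py
class_locals = {'x': 3}   # stands in for the body of class C

# Module level: locals and globals are the same dict.
out_module = run(code, t1_globals)
# Class body: x is found in the locals before the globals are consulted.
out_class = run(code, t2_globals, class_locals)
# Method ex: the locals hold only self and c, so x falls through to the globals.
out_method = run(code, t2_globals, {'self': None, 'c': code})

print(out_module, out_class, out_method)
```

Under CPython the lookup goes locals first, then globals, then builtins, so the same code object yields a different value depending only on the namespaces it is executed with.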
GC can reclaim bytecode while it's executing
In some rare cases it is possible that the GC reclaims bytecode while it is executing. It seems that this bug has existed for a very long time but only recently surfaced.
In particular it surfaced at this commit: 05fe66f68a1cf1b7587c55149472ab7bca843631
One can test the bug by running the tests/basics/string_format2.py script with the full test enabled (change int_tests to True to enable all tests). Using the x86-64 version of the unix port, this test will randomly fail after a GC (mostly due to "bytecode not implemented" exceptions).
The problem (at least with the x86-64 unix version; no other port seems to show the bug) arises through the following sequence:
- the code is compiled and mp_compile returns a function object (a pointer) corresponding to the outer module's code
- this single object pointer is the only reference to the whole compiled bytecode and is stored in a register
- this pointer is passed to mp_call_function_0 for execution, via a register (still the only copy of the pointer)
- the mp_code_state_t structure is set up for execution of the bytecode, and it contains a pointer to the interior of the bytecode, not the start
- the only pointer to the start of the bytecode is the one in the function object (type mp_obj_fun_bc_t)
- the C compiler is smart enough to realise that the function object is no longer needed and so doesn't keep a pointer to it (it was only ever in a register, and that register is reused)
- the bytecode is executed by mp_execute_bytecode
- at this point there are no remaining pointers to the start of the bytecode (nor to the function object for the outer module)
- when the GC runs it reclaims the bytecode memory
- bytecode execution fails because the reclaimed memory is reused and overwritten
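The sequence above can be illustrated with a toy model. This is not MicroPython's collector: the heap is a plain dict, the addresses are made up, and the function collect is hypothetical. It assumes, as the steps above imply, that a pointer keeps a block alive only when it points at the start of the allocation.

```python
# Toy model only: a heap maps each allocation's start address to its size,
# and roots are plain integers found in registers or on the stack.

def collect(heap, roots):
    """Return the set of start addresses that survive a collection.

    Assumption: a root keeps a block alive only if it equals the block's
    start address; pointers into the interior of a block are ignored.
    """
    return {start for start in heap if start in roots}

BYTECODE_START = 0x1000        # hypothetical address of the bytecode block
heap = {BYTECODE_START: 64}    # one 64-byte allocation holding the bytecode

# The code state holds a pointer into the interior of the bytecode.
ip = BYTECODE_START + 12

# Only the interior pointer is still live: the bytecode block is reclaimed.
survivors_without_start = collect(heap, roots={ip})

# If a pointer to the start is also still live, the block survives.
survivors_with_start = collect(heap, roots={ip, BYTECODE_START})

print(survivors_without_start, survivors_with_start)
```

In this model the block survives only while some live location still holds its start address, which is exactly the reference that is lost once the register holding the function object is reused.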