Towards the possibility of a precise garbage collector
It seems that many approaches to the further evolution of GC in uPy depend on being able to do precise GC, i.e. to tell which fields within an object are pointers and which are not. To achieve this, each allocated memory block has to be explicitly typed (and from the type, the internal layout can be inferred). A way to achieve this while staying compatible with conservative GC, and without large code changes, is to have two sets of memory allocation routines: one set deals with uPy objects (which already have a type header as a structure member) and the other deals with "raw" memory allocations (which will need a type header added implicitly for precise GC, and nothing added for conservative GC).
We currently have about a dozen functions for memory allocation. The above means doubling them. Another approach would be to reserve all current functions for uPy objects and add just 2 functions for "raw" memory: m_malloc() (which will use an implicit "all pointers inside" type) and m_malloc_typed(), which takes an explicit type for the allocation.
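To make the two "raw" entry points more concrete, here is a minimal sketch in C. Everything beyond the names m_malloc() and m_malloc_typed() is an assumption made for illustration: the GC_PRECISE switch, the raw_type_t descriptor, the hidden raw_header_t, and the helpers m_raw_type_of() and m_raw_free() are hypothetical, and the system malloc() stands in for the uPy heap.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical build switch: 1 = precise GC, 0 = conservative GC. */
#ifndef GC_PRECISE
#define GC_PRECISE 1
#endif

/* Hypothetical layout descriptor for a raw block. */
typedef struct raw_type {
    const char *name;
    int all_pointers; /* nonzero: every word may be a pointer */
} raw_type_t;

const raw_type_t raw_type_all_pointers = { "all_pointers", 1 };
const raw_type_t raw_type_no_pointers = { "no_pointers", 0 };

#if GC_PRECISE
/* Hidden header placed in front of the payload. The union with
 * max_align_t (C11) keeps the payload aligned for any type. */
typedef union raw_header {
    const raw_type_t *type;
    max_align_t align;
} raw_header_t;
#endif

/* Raw allocation with an explicit type. */
void *m_malloc_typed(size_t num_bytes, const raw_type_t *type) {
#if GC_PRECISE
    raw_header_t *hdr = malloc(sizeof(raw_header_t) + num_bytes);
    if (hdr == NULL) {
        return NULL;
    }
    hdr->type = type;
    return hdr + 1; /* the caller sees only the payload */
#else
    (void)type; /* conservative GC: nothing is added */
    return malloc(num_bytes);
#endif
}

/* Raw allocation with the implicit "all pointers inside" type. */
void *m_malloc(size_t num_bytes) {
    return m_malloc_typed(num_bytes, &raw_type_all_pointers);
}

#if GC_PRECISE
/* How a precise collector could recover the type of a raw block. */
const raw_type_t *m_raw_type_of(const void *payload) {
    return ((const raw_header_t *)payload - 1)->type;
}
#endif

/* Release a raw block, accounting for the hidden header if present. */
void m_raw_free(void *payload) {
    if (payload == NULL) {
        return;
    }
#if GC_PRECISE
    free((raw_header_t *)payload - 1);
#else
    free(payload);
#endif
}
```

With GC_PRECISE set to 0 both functions collapse to a plain allocation, which is the "nothing added for conservative GC" case. A real header would likely be smaller than max_align_t on a microcontroller; the union is used here only to keep the sketch portable.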
Other thoughts?
Add syntax to hint the lifetime of a variable to the GC
On a PC, memory is almost a non-issue, but with uPy running on microcontrollers, heap issues (fragmentation, garbage collection runtime overhead) must sometimes (often?) be taken into consideration by the .py author.
#2057: Failure to allocate when free memory is very large - severe fragmentation?
There are efforts to ameliorate some heap issues: issue #3586 introduces the concept of a long- vs short-lived heap to reduce fragmentation.
However, these efforts leave out a critical piece of information that affects how well this approach works: the context of the code in the .py file.
My personal example:
# busObj1, busObj2 allocate memory on heap and return it as mp_obj
long_term_msgs = []
isDone = False
while isDone is False:
    # m1 - stored "long-term"
    m1 = busObj1.AllocateAndReturn()
    # m2 - only exists for duration of while loop; if only there was a way to signal
    # this to the memory-management system so it could better manage the heap.
    m2 = busObj2.AllocateAndReturn()
    isDone = IsLoopDone(m2)
    long_term_msgs.append(m1)
The author knows that m2 is a "temp", so it would be nice to be able to tell the memory management system this so it can make use of the information.
Yes, .AllocateAndReturn() could be modified to take an argument indicating the temporary nature, but that seems messy: every function that allocates memory in C, even ones that are not obvious to a uPy-only author, would need this "isTemp" flag added.
A better way is to introduce some syntax to indicate this.
I'm not an expert on Py syntax.
long_term_msgs = []
isDone = False
while isDone is False:
    # m1 - stored "long-term"
    m1 = busObj1.AllocateAndReturn()
    # m2 - only exists for duration of while loop
    # something like this - this is the "signal" that "m2" is meant to be short-term
    @micropython.temp:
        # m2 - only exists for duration of while loop
        m2 = busObj2.AllocateAndReturn()
    isDone = IsLoopDone(m2)
    long_term_msgs.append(m1)
And how would it be implemented?
I'm thinking: @micropython.temp serves as a context that results in a C call to switch a malloc function pointer between
+#define m_new(type, num) ((type*)(m_malloc(sizeof(type) * (num), false)))
and
+#define m_new_ll(type, num) ((type*)(m_malloc(sizeof(type) * (num), true)))
See commit link in #3586
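A rough sketch of what "switching a malloc function pointer" could look like in C follows. It is only an illustration under stated assumptions: the two-argument m_malloc() here is a counting stand-in for the one used by the macros above, and current_alloc, temp_enter(), temp_exit() and m_new_ctx() are hypothetical names that do not come from #3586. Treating long-lived as the default outside a temp block is also an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Counters so the effect of the switch is observable in this sketch. */
size_t short_lived_allocs = 0;
size_t long_lived_allocs = 0;

/* Stand-in for the two-argument m_malloc() used by the macros above;
 * it only records the requested lifetime class. */
void *m_malloc(size_t num_bytes, bool long_lived) {
    if (long_lived) {
        long_lived_allocs++;
    } else {
        short_lived_allocs++;
    }
    return malloc(num_bytes);
}

typedef void *(*alloc_fn_t)(size_t);

void *alloc_short_lived(size_t num_bytes) {
    return m_malloc(num_bytes, false);
}

void *alloc_long_lived(size_t num_bytes) {
    return m_malloc(num_bytes, true);
}

/* Hypothetical: the allocator used when the caller states no lifetime.
 * Assumption: long-lived outside a temp block. */
alloc_fn_t current_alloc = alloc_long_lived;

/* Hypothetical hook called on entering a temp block: switch to the
 * short-lived allocator and return the previous one for restoring. */
alloc_fn_t temp_enter(void) {
    alloc_fn_t previous = current_alloc;
    current_alloc = alloc_short_lived;
    return previous;
}

/* Hypothetical hook called on leaving a temp block. */
void temp_exit(alloc_fn_t previous) {
    current_alloc = previous;
}

/* Hypothetical variant of m_new that follows the current context. */
#define m_new_ctx(type, num) ((type *)current_alloc(sizeof(type) * (num)))
```

Returning the previous pointer from temp_enter() lets temp blocks nest and lets the exit path restore the right allocator even when the block is left early. C code that allocates through m_new_ctx() would then pick up the hint without every function needing an "isTemp" argument.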