Descriptor attribute access on a class doesn't call __get__
CPython calls __get__ with obj as None when a descriptor is accessed on the class itself rather than an instance. MicroPython returns the descriptor instance instead of calling __get__.
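A minimal sketch of the difference described above. The names `Desc` and `Holder` are made up for illustration. Under CPython, the class-level access goes through `__get__` with `obj` set to `None`; per this report, MicroPython would instead return the `Desc` instance itself for `Holder.attr`.

```python
class Desc:
    """Illustrative descriptor that reports how __get__ was called."""
    def __get__(self, obj, objtype=None):
        return ("__get__ called", obj, objtype)


class Holder:
    attr = Desc()


# Access on the class: CPython calls Desc.__get__(None, Holder).
class_result = Holder.attr

# Access on an instance: obj is the instance itself.
inst = Holder()
inst_result = inst.attr

print(class_result[0], class_result[1])  # obj is None for class access
print(inst_result[1] is inst)            # obj is the instance here
```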
Optimise instance load/store/delete by skipping special accessors when possible
This PR aims to optimise load/store/delete of attributes in user-defined classes by not looking up special accessors (property, __get__, __delete__, __set__, __setattr__ and __getattr__) if they are guaranteed not to exist in the class.
Currently, if you do my_obj.foo() then the runtime has to do a few checks to see if foo is a property or has __get__, and if so delegate the call. Similarly, a store such as my_obj.foo = 1 first has to check if foo is a property or has __set__ defined on it.
Doing all those checks each and every time an attribute is accessed has a performance penalty. This PR eliminates the checks in cases where they are guaranteed to always fail, i.e. when no attributes are properties or have any special accessor methods defined on them.
To make this guarantee it checks all attributes of a user-defined class when it is first created. If any of the attributes of the user class are properties or have special accessors, or any of the base classes of the user class have them, then it sets a flag in the class to indicate that special accessors must be checked for. Then in the load/store/delete code it checks this flag to see if it can take the short cut and optimise the lookup.
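The class-creation scan described above can be modelled roughly in Python. This is only a sketch of the idea, not the C code in this PR: `has_special_accessors` and the constant names are hypothetical. The model also has to skip plain functions and dunder entries, because CPython implements ordinary methods via `__get__` too.

```python
import types

ACCESSOR_METHODS = ("__get__", "__set__", "__delete__")
CLASS_HOOKS = ("__getattr__", "__setattr__")
# CPython binds ordinary methods through __get__ as well; this model ignores
# them, since the scan only concerns properties and user-defined accessors.
PLAIN_CALLABLES = (types.FunctionType, staticmethod, classmethod)


def has_special_accessors(cls):
    """Hypothetical model of the scan: True means the slow path is needed."""
    for klass in cls.__mro__:  # the class itself plus all of its bases
        if klass is object:
            continue
        namespace = vars(klass)
        if any(hook in namespace for hook in CLASS_HOOKS):
            return True
        for name, value in namespace.items():
            # Simplification: dunder entries (e.g. __dict__) are not scanned.
            if name.startswith("__") or isinstance(value, PLAIN_CALLABLES):
                continue
            if isinstance(value, property):
                return True
            if any(hasattr(type(value), m) for m in ACCESSOR_METHODS):
                return True
    return False


class Plain:
    x = 1

    def method(self):
        return self.x


class WithProp:
    @property
    def p(self):
        return 2


class Child(WithProp):  # inherits the need for checks from its base
    pass


print(has_special_accessors(Plain))
print(has_special_accessors(WithProp))
print(has_special_accessors(Child))
```

In this model, the result would be computed once per class and stored as the flag that the load/store/delete code consults.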
Code size increase with this PR is:
bare-arm: +16
minimal x86: +32
unix x64: +200
unix nanbox: +256
stm32: +80
cc3200: +80
esp8266: +436
esp32: +108
Bare-arm and minimal increase because of the introduction of a flags entry in mp_obj_type_t, which will be generally useful for other things in the future as well. esp8266 increases by more than the others because it has overhead for loading non-32-bit values from RAM.
With this PR performance increases by about 6%, which is quite an improvement. More importantly, MICROPY_PY_DESCRIPTORS can now be enabled without any additional overhead (see related discussion in #3644).
Performance tests that were done:
- on unix x86-64, pystone improved by about 5%
- on pyboard, pystone improved by about 6.5%, from 1683 up to 1794
- on pyboard, bm_chaos (from CPython benchmark suite) improved by about 5%
- on esp32, pystone improved by about 30%, but caching effects probably play a role here
- on esp32, bm_chaos improved by about 11%
One important downside of this PR: because the checks for properties and special accessors are done when a user-defined class is created, the optimisation will fail if classes have properties or special methods added after they are created. For example:
class A:
    pass
def getter(self):
    print('get')
    return 1
A.foo = property(getter) # dynamically add property to the class
print(A().foo) # won't work with this PR, just returns a property object
It doesn't work because properties/special accessors are only checked for during class creation. It would be possible to check, whenever an attribute is added to a class, whether it is a property/special accessor, but that only solves part of the problem: there's still the chance that a base class is changed dynamically, and the derived class has no way of knowing this.
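The base-class case can be illustrated with a short sketch (`Base` and `Derived` are made-up names). CPython handles it correctly. Under the scheme described here, the flag for `Derived` would presumably already have been computed before the property was added to `Base`, so the lookup would skip the property check.

```python
class Base:
    pass


class Derived(Base):
    pass


def getter(self):
    return 1


# The property is added to the base class after Derived already exists,
# so a scan done when Derived was created could not have seen it.
Base.foo = property(getter)

value = Derived().foo
print(value)  # CPython prints 1
```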
In summary: this PR gives a substantial performance boost and allows descriptors to be enabled at no cost to classes that don't use them. But is it worth giving up the ability to dynamically add properties/special accessors to a class (does anyone ever do that)?