unix-ffi re module throws when looking at match containing optional groups
Problem description
When a regex contains optional capture groups, for example (a)?b, the PCREMatch.group() method throws an overflow error for any such group that did not take part in the match. I believe this is because PCRE represents the offsets of an unmatched group as SIZE_MAX and re isn't checking for that.
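The suspected failure mode can be sketched in plain Python. This is only an illustration under stated assumptions, not the actual unix-ffi re code: the helper name group_from_ovector and the flat list of offsets are hypothetical, and the sentinel is written out for a 64-bit size_t (PCRE2 calls this all-ones value PCRE2_UNSET).

```python
# Hypothetical sketch; names and data layout are assumptions, not the real re.py.
PCRE2_UNSET = (1 << 64) - 1  # all-ones size_t, assuming a 64-bit platform


def group_from_ovector(subject, ovector, n, default=None):
    """Return group n of a match, or `default` if the group did not participate."""
    start = ovector[2 * n]
    end = ovector[2 * n + 1]
    if start == PCRE2_UNSET:
        # Unset group: return the default instead of using the sentinel offsets.
        return default
    return subject[start:end]


# Offsets as they might look for r'(a)?b' matched against 'b'.
ovector = [0, 1, PCRE2_UNSET, PCRE2_UNSET]
whole = group_from_ovector("b", ovector, 0)
optional = group_from_ovector("b", ovector, 1)
print(whole, optional)
```

Without the sentinel check, the all-ones offset would be handed on as an ordinary index, which would be consistent with the overflow reported here.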
Additional fallout
The unix-ffi json library requires the unix-ffi re library, and currently cannot parse numbers unless they have an integer part, a fractional part, and an exponential part; instead, it throws this same error.
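To show why number parsing is affected, here is a small sketch using a pattern of the shape commonly used for JSON numbers. The exact pattern in the unix-ffi json library is an assumption here and may differ; the point is only that the fraction and exponent are optional groups, so a plain integer leaves two groups unmatched.

```python
import re

# Assumed pattern shape: integer part, optional fraction, optional exponent.
NUMBER_RE = re.compile(r"(-?(?:0|[1-9]\d*))(\.\d+)?([eE][-+]?\d+)?")


def number_parts(text):
    """Split a JSON-style number into (integer, fraction, exponent) groups."""
    m = NUMBER_RE.match(text)
    return m.groups()


int_only = number_parts("42")    # fraction and exponent groups do not participate
full = number_parts("-1.5e3")    # all three groups participate
print(int_only, full)
```

On CPython the unmatched groups come back as None, which is the behaviour a json parser built on such a pattern would rely on.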
To reproduce
>>> import re
>>> r = re.compile(r'(a)?b')
>>> m = r.match('b')
>>> (m.group(0), m.group(1))
Expected (CPython and the MicroPython built-in re library):
('b', None)
Actual (MicroPython unix-ffi re library):
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/micropython/re.py", line 65, in group
OverflowError: overflow converting long int to machine word
unix-ffi/re: Fix OverflowError in re.groups().
Adjust the re.groups() methods to properly handle the PCRE2_UNSET value for unmatched optional groups.
This change prevents the OverflowError when calling groups() on a match in which an optional group did not participate.
The return value now matches CPython's.
A test case for an empty string match has been added to verify expected behavior.
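As a rough illustration of the behaviour such tests can pin down, the checks below follow CPython semantics. They are a sketch only, not the test file added in this change, and the function name check_optional_groups is hypothetical.

```python
import re


def check_optional_groups():
    """Collect group results for matches with unmatched optional groups."""
    results = {}

    # Optional group does not participate in the match.
    m = re.match(r"(a)?b", "b")
    results["unmatched_group"] = m.group(1)
    results["unmatched_groups"] = m.groups()

    # Optional group participates.
    m = re.match(r"(a)?b", "ab")
    results["matched_groups"] = m.groups()

    # Empty string match: group 0 is empty, group 1 is unset.
    m = re.match(r"(a)?", "")
    results["empty_group0"] = m.group(0)
    results["empty_groups"] = m.groups()
    return results


results = check_optional_groups()
print(results)
```

Run against both CPython and the unix-ffi re module, the two sets of results should agree once unset groups are handled.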
Fixes micropython/micropython#18877
Thanks for this, the fix looks good!
Can I suggest adding more tests, ie:
That will test group() behaviour and default value.
Sure, happy to add.
I noticed that the readme mentions 'unsupported' for the unix-ffi folder, and I do not think these modules are included in CI testing either.
So AFAICT the tests need to be run manually, but they are still useful for validating.
I think it is run under CI, see tools/ci.sh:ci_package_tests_run.
@Josverl are you able to add those few tests I list above?
Sorry, got sidetracked.
You are correct that the tests are run in CI.
Added the tests as requested.