Exposing number of matches in a match object?
The match objects returned by re.match and re.search internally store the number of match groups:
>>> r = re.compile('hello my ([^ ]+) is ([^ .]+).')
>>> m = r.search('hello my name is lars')
>>> m
<match num=3>
This value isn't currently exposed to Python code. The workaround for iterating through the available groups, which we see in some of the tests, is sort of clunky:
def print_groups(match):
    print('----')
    try:
        i = 0
        while True:
            print(match.group(i))
            i += 1
    except IndexError:
        pass
I've read through the contributor guidelines, and since a workaround exists I understand that simply exposing the match count may not be a priority. Would you accept a documentation PR that includes some variant of the above example in the ure module documentation?
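One possible variant for the docs, as a rough sketch: it wraps the same IndexError probing in a helper that returns the group count, so callers can use an ordinary range loop. The names count_groups and all_groups are made up for illustration and are not part of re or ure. It assumes only that match.group(i) raises IndexError for an out-of-range index, as the workaround above relies on.

```python
import re


def count_groups(match):
    """Return the number of groups in `match`, counting group 0.

    Hypothetical helper: probes match.group(n) until it raises IndexError.
    """
    n = 0
    while True:
        try:
            match.group(n)
        except IndexError:
            return n
        n += 1


def all_groups(match):
    """Return every group of `match` as a list, group 0 first."""
    return [match.group(i) for i in range(count_groups(match))]


m = re.search(r'my ([^ ]+) is ([^ ]+)', 'hello my name is lars')
for text in all_groups(m):
    print(text)
```

The probing costs one extra failed group() call per match, which is probably acceptable for documentation purposes but is still a workaround rather than a substitute for exposing the count directly.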
unix-ffi re module throws when looking at match containing optional groups
Problem description
When a regex contains optional capture groups, for example (a)?b, the PCREMatch.group() method raises an OverflowError for any group that did not participate in the match. I believe this is because PCRE represents a nonexistent group as SIZE_MAX and re isn't checking for that.
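To make the suspected missing check concrete, here is a minimal sketch in plain Python. It is not the actual re.py code: the function name group_from_offsets, the flat list of (start, end) offset pairs, and the 64-bit value used for SIZE_MAX are all assumptions made for illustration.

```python
# Assumption: size_t is 64 bits wide, so SIZE_MAX is 2**64 - 1.
SIZE_MAX = (1 << 64) - 1


def group_from_offsets(subject, offsets, n):
    """Return group `n` of a match, or None if the group did not participate.

    `offsets` is a hypothetical flat list [start0, end0, start1, end1, ...]
    in which an unset group is marked with SIZE_MAX for both entries.
    """
    start = offsets[2 * n]
    end = offsets[2 * n + 1]
    if start == SIZE_MAX or end == SIZE_MAX:
        # Unset group: report None instead of converting the sentinel.
        return None
    return subject[start:end]


# Matching (a)?b against 'b': group 0 spans [0, 1), group 1 is unset.
offsets = [0, 1, SIZE_MAX, SIZE_MAX]
print(group_from_offsets('b', offsets, 0), group_from_offsets('b', offsets, 1))
```

If the diagnosis is right, a guard like this before the offsets are converted or used for slicing would give the ('b', None) result shown under "Expected" below.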
Additional fallout
The unix-ffi json library requires the unix-ffi re library, and currently cannot parse numbers unless they have an integer part, a fractional part, and an exponent part; for any other number it throws this same error.
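For illustration, a number pattern of the following shape shows why this happens. The pattern is an assumed example, not necessarily the one the json library uses: the fractional and exponent parts are optional capture groups, so for most inputs at least one group is unset and should come back as None.

```python
import re

# Illustrative pattern (an assumption): integer part, then optional
# fractional part, then optional exponent part.
NUMBER = re.compile(r'(-?\d+)(\.\d+)?([eE][-+]?\d+)?')

for text in ('1', '1.5', '1e3', '1.5e3'):
    m = NUMBER.match(text)
    # Read each group by index, as a parser would.
    print(text, (m.group(1), m.group(2), m.group(3)))
```

Only the last input sets all three groups, which matches the observation that only numbers with all three parts parse without hitting the error.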
To reproduce
>>> import re
>>> r = re.compile(r'(a)?b')
>>> m = r.match('b')
>>> (m.group(0), m.group(1))
Expected (CPython, and the MicroPython built-in re library):
('b', None)
Actual (unix-ffi re library):
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/micropython/re.py", line 65, in group
OverflowError: overflow converting long int to machine word