QUERY · ISSUE
ure: character classes not working in sets
enhancement, extmod
All of these assert statements fail, but I don't think they should:
import ure
assert ure.match('[\\w]', "a") is not None
assert ure.match('[\\s]', " ") is not None
assert ure.match('[\\d]', "1") is not None
assert ure.match('[\\w.]*$', "boot.py") is not None
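A possible workaround, until class escapes are accepted inside sets, is to spell the classes out as explicit ranges. This is only a sketch: the names `re_mod`, `WORD`, `DIGIT` and `SPACE` are made up for illustration, the ranges are ASCII-only approximations of `\w`, `\d` and `\s`, and it has been checked only against CPython's `re` (the fallback import), not against a `ure` build.

```python
try:
    import ure as re_mod  # MicroPython
except ImportError:
    import re as re_mod   # CPython fallback, used here as the reference

# Hypothetical helper constants: ASCII-only stand-ins for the class escapes.
WORD = "A-Za-z0-9_"    # approximates \w
DIGIT = "0-9"          # approximates \d
SPACE = " \t\n\r"      # approximates \s (literal whitespace characters)

# The same four checks as above, with the escapes expanded by hand.
assert re_mod.match("[" + WORD + "]", "a") is not None
assert re_mod.match("[" + SPACE + "]", " ") is not None
assert re_mod.match("[" + DIGIT + "]", "1") is not None
assert re_mod.match("[" + WORD + ".]*$", "boot.py") is not None
```

Whether `ure` accepts these expanded sets is an assumption; it relies only on plain ranges and literal characters inside `[...]`.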
CANDIDATE · ISSUE
"ure" incorrectly handles escaped '-' and trailing literal '-' in character classes
extmod
According to the CPython docs, in a character class:
If - is escaped (e.g. [a\-z]) or if it’s placed as the first or last character (e.g. [a-]), it will match a literal '-'.
The first assertion statement below succeeds, but the next four fail. As indicated, some of the failures are AssertionErrors (incorrect result was returned) and some are ValueErrors (input was rejected, no value returned).
import ure
# Passing:
assert ure.compile(r'[-a]').split('foo-bar') == ['foo', 'b', 'r']
# AssertionError:
assert ure.compile(r'[a\-x]').split('foo-bar') == ['foo', 'b', 'r']
assert ure.compile(r'[\-ax]').split('foo-bar') == ['foo', 'b', 'r']
# ValueError: Error in regex
assert ure.compile(r'[ax\-]').split('foo-bar') == ['foo', 'b', 'r']
assert ure.compile(r'[a-]').split('foo-bar') == ['foo', 'b', 'r']
All 5 of these work fine in CPython with re and in MicroPython with re-pcre.
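For reference, the CPython side of that comparison can be reproduced with the short script below. It uses only the standard `re` module; the names `PATTERNS`, `EXPECTED` and `results` are just for illustration. It says nothing about `re-pcre` on MicroPython, which would need to be checked separately.

```python
import re

# The five character classes from the report, each containing a literal '-'.
PATTERNS = [r'[-a]', r'[a\-x]', r'[\-ax]', r'[ax\-]', r'[a-]']
EXPECTED = ['foo', 'b', 'r']

# Split the same input with every pattern and collect the results.
results = {p: re.compile(p).split('foo-bar') for p in PATTERNS}

for pattern, parts in results.items():
    # Each pattern should split on both '-' and 'a'.
    assert parts == EXPECTED, (pattern, parts)
```

Running the same loop with `ure` in place of `re` should show which patterns return a wrong result and which raise ValueError.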