#7108 · Issue #3178
Related · high · value 1.370
QUERY · ISSUE

ure: character classes not working in sets

open · by Gattag · opened 2021-04-09 · updated 2024-09-13
enhancement · extmod

All of these assert statements fail, but I don't think they should:

import ure
assert ure.match('[\\w]', "a") is not None
assert ure.match('[\\s]', " ") is not None
assert ure.match('[\\d]', "1") is not None
assert ure.match('[\\w.]*$', "boot.py") is not None
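
Until class escapes are accepted inside sets, one possible workaround is to rewrite them as explicit ranges before compiling. The sketch below is only an illustration: `expand_class_escapes` and `EXPANSIONS` are hypothetical names, not part of `ure`, the expansions are ASCII-only (narrower than CPython's Unicode-aware `\w`, `\d`, `\s`), and the output has been checked only against CPython's `re`, not against `ure` itself.

```python
import re  # CPython's re, used here only to check the rewritten patterns

# Hypothetical ASCII-only expansions for class escapes inside a set.
EXPANSIONS = {
    "d": "0-9",
    "w": "a-zA-Z0-9_",
    "s": " \t\n\r\f\v",
}


def expand_class_escapes(pattern):
    """Rewrite \\d, \\w and \\s inside [...] sets as explicit ranges.

    Limitation: a literal ']' placed first in a set (e.g. '[]a]') is
    not handled; it is treated as the end of the set.
    """
    out = []
    in_set = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if in_set and nxt in EXPANSIONS:
                # Inside a set: substitute the explicit range.
                out.append(EXPANSIONS[nxt])
            else:
                # Outside a set, or any other escape: keep as written.
                out.append(ch + nxt)
            i += 2
            continue
        if ch == "[" and not in_set:
            in_set = True
        elif ch == "]" and in_set:
            in_set = False
        out.append(ch)
        i += 1
    return "".join(out)


rewritten = expand_class_escapes('[\\w.]*$')
print(rewritten)
print(re.match(rewritten, "boot.py") is not None)
```

Escapes outside a set are left untouched, so a pattern such as `\d+` passes through unchanged.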
CANDIDATE · ISSUE

"ure" incorrectly handles escaped '-' and trailing literal '-' in character classes

closedby alex-robbinsopened 2017-06-29updated 2019-10-18
extmod

According to the CPython docs, in a character class,

If - is escaped (e.g. [a\-z]) or if it’s placed as the first or last character (e.g. [a-]), it will match a literal '-'.


The first assertion statement below succeeds, but the next four fail. As indicated, some of the failures are AssertionErrors (incorrect result was returned) and some are ValueErrors (input was rejected, no value returned).

import ure

# Passing:
assert ure.compile(r'[-a]').split('foo-bar') == ['foo', 'b', 'r']

# AssertionError:
assert ure.compile(r'[a\-x]').split('foo-bar') == ['foo', 'b', 'r']
assert ure.compile(r'[\-ax]').split('foo-bar') == ['foo', 'b', 'r']

# ValueError: Error in regex
assert ure.compile(r'[ax\-]').split('foo-bar') == ['foo', 'b', 'r']
assert ure.compile(r'[a-]').split('foo-bar') == ['foo', 'b', 'r']

All 5 of these work fine in CPython with re and in MicroPython with re-pcre.
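
For comparison, the same five patterns can be run through CPython's `re` module to confirm the reference behaviour described above. This is a minimal sketch for CPython only; the names `patterns` and `results` are illustrative.

```python
import re

# The five character classes from the report, each containing a literal '-'.
patterns = [r'[-a]', r'[a\-x]', r'[\-ax]', r'[ax\-]', r'[a-]']

# Split the same input with each pattern and collect the results.
results = {p: re.compile(p).split('foo-bar') for p in patterns}

for p, parts in results.items():
    print(p, parts)
```

Under CPython, every pattern splits on both `-` and `a`, so each result is `['foo', 'b', 'r']`.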
