Ability to query minimum/maximum length of regex #513
Comments
Do you expect the minimum and maximum to be accurate? That would be difficult if there were references to capture groups or calls.
Accurate in the sense that all possible matches will fall into this range, even if no match with that length actually exists, yes. And preferably, of course, the 0/1+ distinction for the minimum should be fully accurate. For our use case it would actually be fine if this is only correctly supported for purely regular syntax, i.e. no backreferences or non-regular extensions like nested calls.
The minimum length is already available internally, except that it doesn't include references or calls (it assumes that they have zero length). The maximum length is more of a problem...
I too would love this functionality in a dependable way, in my case for the
For the lark parsing library we use the (sadly private) stdlib `re._parser` library to query the minimum and maximum length of a regex: https://github.com/lark-parser/lark/blob/942366b49247e996e387cb901ed96c7d861382a0/lark/utils.py#L132-L156
As can be seen from the snippet, since we also support using `regex` instead of `re`, we need to take special care when encountering regex-specific syntax, like nested sets or category patterns. The only value that needs to be correct is whether the minimum length is 0 or greater than 0, since we depend on regular expressions being non-empty in a few places.

It would be nice if there was a way to query the minimum and maximum match size from a compiled regex object. The stdlib `re` module is lower priority since there is at least a way to access this information reliably, but I am probably also going to make a request there.
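For readers who do not want to follow the link, here is a minimal sketch of the kind of workaround described above. It is not the lark code itself. The names `regex_width` and `compile_fn` are made up for illustration, and it relies on the private `re._parser` module (`sre_parse` before Python 3.11), which may change without notice. When the stdlib parser rejects the pattern, the sketch only determines whether the empty string matches and reports the maximum as unbounded.

```python
import re
from typing import Callable, Tuple

try:
    import re._parser as sre_parser  # Python 3.11+; private module, may change
except ImportError:
    import sre_parse as sre_parser  # Python 3.10 and earlier


def regex_width(pattern: str, compile_fn: Callable = re.compile) -> Tuple[int, int]:
    """Return a (min, max) bound on the match length of `pattern`.

    `compile_fn` is a hypothetical hook: pass `regex.compile` here when the
    pattern uses syntax that only the third-party `regex` module understands.
    """
    try:
        # The stdlib parser computes the width directly from the parse tree.
        lo, hi = sre_parser.parse(pattern).getwidth()
        return int(lo), int(hi)
    except re.error:
        # The stdlib parser does not know this syntax. Fall back to the one
        # fact we really need: can the pattern match the empty string?
        matches_empty = compile_fn(pattern).match("") is not None
        lo = 0 if matches_empty else 1
        # The maximum is unknown here, so report it as unbounded.
        return lo, int(sre_parser.MAXREPEAT)


print(regex_width(r"a{2,5}"))
print(regex_width(r"ab?c"))
```

The fallback branch is deliberately coarse: it only answers the 0 versus 1+ question for the minimum, which is the distinction this issue cares about most. A public query method on compiled pattern objects would make both branches unnecessary.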