Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

It's impossible to create a regex that is MULTILINE but not DOT_ALL #14869

Closed
ralsina opened this issue Aug 6, 2024 · 2 comments · Fixed by #14870
Closed

It's impossible to create a regex that is MULTILINE but not DOT_ALL #14869

ralsina opened this issue Aug 6, 2024 · 2 comments · Fixed by #14870

Comments

@ralsina
Copy link
Contributor

ralsina commented Aug 6, 2024

Bug Report

The expected behaviour when using a MULTILINE regex like #.*$ is that it will match a whole line that begins with #

Because crystal conflates MULTILINE and DOT_ALL, that regex will, when set to MULTILINE, match also any subsequent lines until the end of the file.

Regex.new("#.*$", Regex::Options::MULTILINE).match("# this should be matched\nthis shouldn't")

=> Regex::MatchData("# this should be matched\nthis shouldn't")

To keep backwards compatibility, the behaviour of MULTILINE in crystal should not be changed, but I propose a change that adds a new MULTILINE_ONLY flag that just doesn't set DOT_ALL when creating the underlying PCRE2 object.

@ralsina ralsina added the kind:bug A bug in the code. Does not apply to documentation, specs, etc. label Aug 6, 2024
@Blacksmoke16 Blacksmoke16 added kind:feature topic:stdlib:text and removed kind:bug A bug in the code. Does not apply to documentation, specs, etc. labels Aug 6, 2024
@ralsina
Copy link
Contributor Author

ralsina commented Aug 6, 2024

PR: #14870

@Blacksmoke16 Blacksmoke16 linked a pull request Aug 6, 2024 that will close this issue
@oprypin
Copy link
Member

oprypin commented Aug 8, 2024

Some previous discussion was here: #8062

Indeed it's sad that DOTALL is bunched together with MULTILINE.

/(?m)#.*$/ achieves the desired behavior of just MULTILINE, by the way - confusingly it's totally different from /#.*$/m

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants
@ralsina @oprypin @Blacksmoke16 and others