Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regex: support duplicated named captures #5061

Conversation

makenowjust
Copy link
Contributor

@makenowjust makenowjust commented Sep 30, 2017

Currently we cannot use duplicated named captures by default:

/(?<foo>a)|(?<foo>b)/ # compile error!

But we can use (?J) for this:

/(?J)(?<foo>a)|(?<foo>b)/ # it can compile, but...

Unfortunately, Regex::MatchData does not support duplicated named captures, so this code is broken:

"a" =~ /(?J)(?<foo>a)|(?<foo>b)/
pp $~["foo"] # => nil (!!?)

This PR fixes Regex::MatchData to support duplicated named captures, and Regexp accepts duplicated named captures by default. Duplicated named captures are useful when writing complex regexp. Thank you.

@makenowjust makenowjust force-pushed the feature/regexp/duplicated-named-capture branch from a8a3f63 to d7dcfee Compare September 30, 2017 01:10
@makenowjust makenowjust force-pushed the feature/regexp/duplicated-named-capture branch from d7dcfee to 84725bd Compare September 30, 2017 01:31
@makenowjust
Copy link
Contributor Author

makenowjust commented Sep 30, 2017

Current crystal compiler cannot compile regexp containing duplicated named captures, so CI is failed. Should I replace regexp literal to Regex.new "..." as work around?

@makenowjust makenowjust changed the title Regexp: support duplicated named captures Regex: support duplicated named captures Sep 30, 2017
@makenowjust
Copy link
Contributor Author

I should fix this document also.

@makenowjust
Copy link
Contributor Author

#5061 (comment) I want some opinion to this.

/cc @asterite @mverzilli @RX14

@RX14
Copy link
Contributor

RX14 commented Oct 1, 2017

Yeah, using Regex.new with a note to switch it back on the next release is a good idea.

@makenowjust
Copy link
Contributor Author

@RX14 fixed this.

private def named_capture_number(group_name)
first = Pointer(UInt8).null
last = Pointer(UInt8).null
name_entry_size = LibPCRE.get_stringtable_entries(@code, group_name, pointerof(first), pointerof(last))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can out first, out last be used here?

Copy link
Member

@asterite asterite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a tiny request to use out if possible, otherwise this is good, thank you!

@@ -157,10 +157,24 @@ class Regex
# "Crystal".match(/r(?<ok>ys)/).not_nil!["ok"]? # => "ys"
# "Crystal".match(/r(?<ok>ys)/).not_nil!["ng"]? # => nil
# ```
#
# When there are capture groups having same name, it returns the lastest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be "last". There is another place where this should be changed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes...

@asterite asterite added this to the Next milestone Oct 14, 2017
@asterite asterite merged commit c1e8b70 into crystal-lang:master Oct 14, 2017
@makenowjust makenowjust deleted the feature/regexp/duplicated-named-capture branch November 3, 2017 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants