Skip to content
This repository has been archived by the owner on Jan 29, 2020. It is now read-only.

Nested attribute selectors #21

Merged

Conversation

MidnightDesign
Copy link
Contributor

@MidnightDesign MidnightDesign commented Apr 5, 2018

6ae9938/2.7.0 introduced a bug where nested attribute selectors (div[class="foo"] div[class="bar"]) aren't transformed to XPath properly anymore. Does anyone have an idea how we could fix this?

@MidnightDesign
Copy link
Contributor Author

I've added a commit with an attempt to fix the problem. I don't know if it's ideal though.

@@ -88,7 +88,7 @@ public static function cssToXpath($path)

// Arbitrary attribute value contains whitespace
$path = preg_replace_callback(
'/\[\S+["\'](.+)["\']\]/',
'/\[\S+["\']([\w\s]+)["\']\]/',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... what about .+? (non greedy)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that (first), but it didn't work. Maybe [^"\']? The tests would pass, but that also seems pretty fragile to me.

Copy link
Member

@michalbundyra michalbundyra Apr 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I had a look on it, and .+? doesn't work because \S+ is greedy. So I would suggest then:

'/\[\S+?["\'](.+?)["\']\]/'

or with modifier U (see: http://php.net/manual/en/reference.pcre.pattern.modifiers.php):

'/\[\S+["\'](.+)["\']\]/U'

@MidnightDesign
Copy link
Contributor Author

MidnightDesign commented Apr 6, 2018

@webimpress I've updated the PR with your first suggestion. Thank you for your help!

@@ -88,7 +88,7 @@ public static function cssToXpath($path)

// Arbitrary attribute value contains whitespace
$path = preg_replace_callback(
'/\[\S+["\'](.+)["\']\]/',
'/\[\S+?["\'](.+?)["\']\]/',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thinking about this regexp here, and I think we should use backreference here to match match always "..." or '...' and not "...' or '...". But it's separate issue/improvement ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think that may be necessary. The problem is that if you have a ", and then a ' later, this will match... which will give you an incomplete match for the attribute value. Consider something like this: "javascript:alert('foo')". The above will return "javascript:alert'` as the match. So a backreference is definitely needed here to ensure that we search until we find a matching closing character.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weierophinney It is, yes. But that has nothing to with this PR. That problem already existed and should probably be fixed in a separate PR. If you want it to be fixed right here, you're welcome to update this one. I'm not good enough with regular expressions.

@weierophinney weierophinney merged commit 00a36de into zendframework:master Apr 9, 2018
weierophinney added a commit that referenced this pull request Apr 9, 2018
We need to match the initial quote, and then all characters up to but
not including a matching final quote, taking into account escaped quotes
as well.

This patch uses a technique learned from http://blog.stevenlevithan.com/archives/match-quoted-string
to do so, and updates existing regular expressions, including the one
modified for #21, to properly match quoted strings, introducing several
new test cases to verify the changes.
weierophinney added a commit that referenced this pull request Apr 9, 2018
weierophinney added a commit that referenced this pull request Apr 9, 2018
weierophinney added a commit that referenced this pull request Apr 9, 2018
@weierophinney
Copy link
Member

Thanks, @MidnightDesign! When merging, I added more tests, and included changes to ensure we properly identify quoted strings based opening quote.

@weierophinney weierophinney mentioned this pull request Apr 9, 2018
1 task
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants