Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/4343 add ignore case to list and dictionary comparisons #4690

Conversation

MobyNL
Copy link
Contributor

@MobyNL MobyNL commented Mar 14, 2023

As mentioned in issue #4343, this is my first time contributing. I would love to have feedback and I'm still working on complying to coding principles

Copy link
Member

@pekkaklarck pekkaklarck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Good work if this is your first contribution. There are pretty serious issues in the code especially related to normalizing values. I added added separate comments about that to few places, but they apply to rest of the code as well. I'll review the PR in more detail once these issues are fixed.

src/robot/libraries/Collections.py Outdated Show resolved Hide resolved
src/robot/libraries/Collections.py Outdated Show resolved Hide resolved
src/robot/libraries/Collections.py Outdated Show resolved Hide resolved
src/robot/libraries/Collections.py Outdated Show resolved Hide resolved
@MobyNL
Copy link
Contributor Author

MobyNL commented Mar 15, 2023

Thanks for the feedback @pekkaklarck. Some nice learnings and certainly good suggestions/improvements. I'm gonna work on it!

@MobyNL MobyNL changed the title Feature/4343 add ignore case to liss and dictionary comparisons Feature/4343 add ignore case to list and dictionary comparisons Mar 16, 2023
@MobyNL
Copy link
Contributor Author

MobyNL commented Mar 16, 2023

I updated and implemented some of the feedback mentioned above. Some additional comments/questions from my part:

  1. _normalize is now a function in both classes _List and _Dictionary. Considering the function is identical in both, we could consider making a general helper for this. I was thinking how to best do this? There already is a normalizing.py in utils, which has a different approach than how it is now used in Collections.
  2. I added a general function in _Dictionary, to check whether to ignore_case for key and/or value. I used a dictionary to return the values, because I did not always need both (and always returning both would sometimes result in unused variables). It is also possible to use multiple functions and split the variables, but I thought it was a nice approach. Maybe someone has additional comments.
  3. I added some tests to check the sad flow when ignore_case=False (or empty), but we could also consider changing the test data for some of the existing tests of the base functions. However, that would mean you are starting to test 2 things at once for those testcases.
    For example: the test Dictionary Should Contain Item currently looks at a key x, while the dictionary has keys a, b and 3. If we changed x to A, we should also get a Fail on the same testcase. However, if something would be wrong in ignore_case, the test might wrongfully pass, compare to using an entirely different key.

I also tried to uphold the PEP-8 standards, and also used pylint with an adjusted max 88 characters per line. So hopefully it's a bit closed to the standards (if not fully compliant) than before

@MobyNL MobyNL requested a review from pekkaklarck March 18, 2023 11:41
@MobyNL MobyNL marked this pull request as ready for review March 24, 2023 09:23
@MobyNL MobyNL marked this pull request as draft March 24, 2023 09:24
@MobyNL MobyNL marked this pull request as ready for review April 13, 2023 21:27
@MobyNL MobyNL marked this pull request as draft April 25, 2023 17:55
@MobyNL MobyNL marked this pull request as ready for review April 25, 2023 18:22
Copy link
Member

@pekkaklarck pekkaklarck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very good otherwise, but all keywords handling normalization separately adds lot of code even though there's a common _normalize helper. The Normalizer class I propose ought to remove most of that complexity.

I'm planning to create RF 6.1 rc 1 on Monday, June 5. There's not much time to get this PR merged before that, but because this change only affects Collections and ought to be pretty safe in general, I'm willing to make an exception and merge this after the release candidate is out. The final release is targeted for Monday, June 12, so there's not too much time for that either.

Notice also that there have been some changes to Collections after this PR was created. As the result there are some conflicts, but they ought to be easy to fix.

Let me know here or on Slack if you need any help!

atest/robot/standard_libraries/collections/list.robot Outdated Show resolved Hide resolved
atest/testdata/standard_libraries/collections/list.robot Outdated Show resolved Hide resolved
src/robot/libraries/Collections.py Outdated Show resolved Hide resolved
src/robot/libraries/Collections.py Outdated Show resolved Hide resolved
src/robot/libraries/Collections.py Outdated Show resolved Hide resolved
src/robot/libraries/Collections.py Outdated Show resolved Hide resolved
src/robot/libraries/Collections.py Outdated Show resolved Hide resolved
src/robot/libraries/Collections.py Outdated Show resolved Hide resolved
src/robot/libraries/Collections.py Outdated Show resolved Hide resolved
src/robot/libraries/Collections.py Outdated Show resolved Hide resolved
@MobyNL
Copy link
Contributor Author

MobyNL commented Jun 15, 2023

Thanks @pekkaklarck. I was a bit too busy to work on this but will take a look at the feedback in the coming days!

@MobyNL MobyNL requested a review from pekkaklarck June 22, 2023 20:32
…o feature/4343-add-ignore-case-to-liss-and-dictionary-comparisons
…parisons' of https://github.com/MobyNL/robotframework into feature/4343-add-ignore-case-to-liss-and-dictionary-comparisons
@MobyNL
Copy link
Contributor Author

MobyNL commented Oct 12, 2023

@pekkaklarck I updated this, as I saw it had merge conflicts with the current version. And just a reminder that it's still open

src/robot/libraries/Collections.py Outdated Show resolved Hide resolved
src/robot/libraries/Collections.py Outdated Show resolved Hide resolved
@@ -333,22 +340,26 @@ def list_should_not_contain_duplicates(self, list_, msg=None):

This keyword works with all iterables that can be converted to a list.
The original iterable is never altered.

The ignore_case argument can be used to make comparison case-insensitive.
See the Ignore case section for more details. It is new in Robot Framework 6.2.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same (last comment, more in code)

@MobyNL
Copy link
Contributor Author

MobyNL commented Oct 12, 2023

@pekkaklarck I updated this, as I saw it had merge conflicts with the current version. And just a reminder that it's still open

Maybe it could be added to version 7.0?

@pekkaklarck
Copy link
Member

I enabled running this PR on GitHub Actions and one test failed:

FAIL: Robot.Variables.Using Dict Variables.Multiple dict variables with same names multiple times
Status of 'Multiple dict variables with same names multiple times' should have been PASS but it was FAIL.

Error message:
Following keys have different values:
Key d: 4 != 1
Key a: 1 != 2
Key b: 2 != 3
Key c: 3 != 4

That test uses Dictionaries Should Be Equal with dicts that have items in different orders. I found where the bug is in the code and will add a comment there.

In addition to fixing the issue, it's a good idea to add a new test for this library that validates that Dictionary Should Be Equal doesn't require dicts to have same order. In practice this should succeed:

${dict1} =    Create Dictionary    a=1    b=2    c=3    d=4    e=5
${dict2} =    Create Dictionary    d=4    b=2    e=5    a=1    c=3
Dictionaries Should Be Equal    ${dict1}    ${dict2}

We could also enhance the keyword to actually check the order if both dicts are OrderedDict instances. That's relevant related to this PR and a separate issue and PR should be submitted instead.

@pekkaklarck
Copy link
Member

As I commented issue #4343, I won't have time for a proper review right now (and the aforementioned bug anyway needs to be fixed before merging this), but I added the issue to RF 7.0 scope that it won't be forgotten.

try:
assert_equal(dict1[key], dict2[key], msg=f'Key {key}')
assert_equal(dict1[key1], dict2[key2], msg=f'Key {key1}')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation isn't correct and causes a failure in an unrelated suite that uses Dictionaries Should Be Equal with dicts having items in different order. The old for key in keys approach avoided the issue and should be restored. normalize can then be used with each value like assert_equal(normalize(dict1[key]), ...).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed this, by resetting it to for key in keys, so this behaviour is fixed. Now differently ordered dictionaries also work. I also added some tests for this. I did this by change it to for key in keys:

I do however now encounter a problem at various tests because of key errors, that I haven't yet solved. Those can be solved by changing it to for key in normalize(keys):, but this poses a new problem as keys is list-like, meaning it will always normalize, unless ignore_case=False. I will work on this some more when I have time tomorrow

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have now also fixed the error mentioned above in a very ugly way. I will commit it for now but put in some more time tomorrow to see if a cleaner solution is possible

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, if you have {'A': 1, 'B': 2} and {'a': 1, 'b': 2}, keys are equal case-insensitively, but comparing values by iterating over keys of either of the dicts and doing d1[k] == d2[k] fails. Keys are returned by the method comparing keys, and it could possibly return a list of two-tuples containing matching keys of both of the dicts.

Copy link
Member

@pekkaklarck pekkaklarck Nov 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Things get more complicated if you have {'a': 1, 'A': 2} and {'a': 2, 'A': 1}. Keys are equal and I guess we can consider values to match case-insensitively as well. Value comparison will fail, however, if comparison is done using key pairs [('a', 'a'), ('A', 'A')].

This gets even more "fun" with dicts having keys like 'abc', 'Abc', 'aBc', 'abC', 'ABc', 'AbC', 'aBC', 'ABC' so that they have same values but with different keys. I have and idea how to handle this in ganeric manner, but I'm currently on mobile and writing code examples is too hard. I probably should test my idea first as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made an example using keys = [(key1, key2) for key1, key2 in zip(dict1.keys(), dict2.keys()) if normalize(key1) == normalize(key2)]
It works, however, now a new challenge rises when amount of keys is not equal. So, still ongoing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I found it. In dictionary_should_contain_sub_dictionary we also create a set of keys. This should return the couples as well

Copy link
Contributor Author

@MobyNL MobyNL Nov 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't solve the problem yet for keys with double names. But I think that might be something that can't be solved. Because you don't know the actual pairs. If i create 2 dicts:

dict1 = {'abc': 3, 'AbC':4}
dict2 = {'ABC':4, 'Abc': 3}

It is now impossible to check if this is correct. I have 2 keys that are 3 and 2 that are 4, but i don't actually know if they are similar. Especially if they are also in a different order.

I think that if you have identically named keys (except for case), that you shouldn't be able to use ignore_case. So we could add an error message when multiple keys are present.

Copy link
Member

@pekkaklarck pekkaklarck Nov 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to test it, but I believe this algorithm would solve the above problem:

  1. Collect all values with matching keys to lists. In the above case we'd get lists [3, 4] and [4, 3].
  2. Validate that lists have same items.This wouldn't be that easy, because we couldn't simply sort lists (they could be unsortable) nor could we use sets (values could be unhashable).

Even though we could make the above work, I'm not sure is it worth the effort or a good idea in general. Making it an error to have multiple keys that normalize to the same value could be better.

Copy link
Contributor Author

@MobyNL MobyNL Nov 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have now added an error handling in _keys_should_be_equal. I used:
len(set(normalize(dict1).keys()))!=len(dict1.keys())
I think this is valid, since normalize changes dict_like objects to a dictionary. So I can use sets too, after the normalizing

@MobyNL
Copy link
Contributor Author

MobyNL commented Nov 11, 2023

I also ran the atests locally for collections and using_dict_variables that had a fail earlier. They all still pass, as well as the newly created ones.

@MobyNL
Copy link
Contributor Author

MobyNL commented Nov 11, 2023

I also ran the atests locally for collections and using_dict_variables that had a fail earlier. They all still pass, as well as the newly created ones.

Now the rest has also run. Everything except for the Telnet testcases passed. But those have never worked locally. I guess I need some installation to have those working too. So unless there are changes in the OS, or python versions I expect all to pass

@pekkaklarck
Copy link
Member

Making it an error if dicts have duplicate keys after normalization is a good solution for handling cases like {'a': 1, 'A': 2} vs {'A': 1, 'a': 2}. There are some style issues in the code, but it's probably easiest if I merge this PR and fix those myself. Thanks a lot for contribution @MobyNL!

@pekkaklarck pekkaklarck merged commit ea16315 into robotframework:master Nov 17, 2023
@MobyNL MobyNL deleted the feature/4343-add-ignore-case-to-liss-and-dictionary-comparisons branch November 17, 2023 12:32
@pekkaklarck
Copy link
Member

While making the small enhancements I mentioned above, I noticed Dictionary Should Contain Item doesn't work correctly in all cases. The problem occurs, for example, with a dictionary {'X': 1} when using key x or X because the keyword tries to get a normalized key from the original dictionary before normalizing the value like normalize(dictionary[normalize(key)]). Trying to fix by normalizing the whole dictionary like normalize(dictionary)[normalize(key)] fixes this situation, but breaks if someone has used normalize=key or normalize=value.

I believe a proper fix requires adding separate normalize_key and normalize_value methods to Normalizer to allow getting the compared value like normalizer.normalize_dict(dictionary)[normalizer.normalize_key(key)] and then normalizing the expected value like normalizer.normalize_value(value). I've already done this locally and it seems to work. I'll commit it along with other cleanup I mentioned above (and other cleanup I do this somewhat old library at the same time).

@pekkaklarck
Copy link
Member

As I commented #4343, this library now has some keywords that accept case_insensitive and others that accept ignore_case. The latter is used also by BuiltIn and I like it more also otherwise. I'll change the others to accept it as well, but for backwards compatibility reasons case_insensitive should still work for the time being.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants