Skip to content
This repository has been archived by the owner on Jul 9, 2024. It is now read-only.

Support multiple scope prefixes #54

Merged

Conversation

JohanLorenzo
Copy link
Contributor

Heavily inspired by mozilla-releng/signingscript#70

@JohanLorenzo JohanLorenzo changed the title Multiple scope support Support multiple scope prefixes Nov 28, 2018
@coveralls
Copy link

coveralls commented Nov 28, 2018

Coverage Status

Coverage remained the same at 100.0% when pulling f5a9c1d on JohanLorenzo:multiple-scope-support into 5e686f6 on mozilla-releng:master.

CHANGELOG.md Show resolved Hide resolved
examples/config.example.json Show resolved Hide resolved
def _get_scope_prefixes(context):
prefixes = context.config['taskcluster_scope_prefixes']
return [
prefix if prefix.endswith(':') else '{}:'.format(prefix)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to be explicit about this? E.g., if the prefix doesn't end with :, then stop the process and inform the user of the "one true format ™️"?

Don't get me wrong, with CLI and other business, this is exactly the sort of thing that can be automatically fixed! However, since this will be hard-coded in config, perhaps it would be better to enforce the structure so that different projects don't have different config conventions.

This would also help with maintainability - if we're automatically fixing and transforming configuration options, then we have additional behaviour that:

  • might break
  • might inter-operate in a weird way (e.g. multiple options that have magic fixing going on)
  • make future changes more difficult, because now the scope of backwards-compatibility is greater
  • other tools that also interact with this config now need to implement the same config-fixing logic

At the same time, this is pretty minimal, and not that big of a deal. Just 🍎 for 💭

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good thinking! I remember adding this auto-correction because there was discrepancy among the workers. The real fix is probably to not sanitize it here.

Let's fix this in a followup.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#56

for prefix in prefixes:
if all(scope.startswith(prefix) for scope in scopes):
break
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation is goofy here - shouldn't the else be aligned with the if?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

w o a w, nice


def _check_scopes_exist_and_all_have_the_same_prefix(scopes, prefixes):
for prefix in prefixes:
if all(scope.startswith(prefix) for scope in scopes):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't there only be one scope at this point? If not, it's going to be an error-case anyways once we do get_single_item_from_sequence(...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Fixed

@@ -8,15 +8,49 @@


def extract_android_product_from_scopes(context):
prefix = context.config['taskcluster_scope_prefix']
prefixes = _get_scope_prefixes(context)
scopes = _extract_scopes_from_unique_prefix(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand why this is a separate function - since we still want a single scope, why can't we keep the old behaviour of using condition in get_single_item_from_sequence(...)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're right. I don't remember why I made these functions this way, in the first place. However, they are too convoluted for this use case. Thank you pointing this out!

Copy link
Contributor

@mitchhentges mitchhentges left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥇

context.task['scopes'],
condition=lambda scope: scope.startswith(prefix),
all_scopes_starting_with_all_prefixes,
condition=lambda _: True, # it must just contain 1 single item
Copy link
Contributor

@mitchhentges mitchhentges Dec 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, as a reader, the important part here is that there should be a single item - unfortunately, this constraint is encoded within a very large call to get_single_item_from_sequence (it seems like the important part is the condition=lambda _: True, which also isn't super clear to the next reader of the code).
I understand that we follow a similar pattern of get_single_item_from_sequence elsewhere, but perhaps this is more cleanly solved with an assert? assert len(all_scopes_starting_with_all_prefixes) == 1 🤔

I'm not super spooked by this being in the PR, but I just want to bring it up for discussion :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree. The condition makes this call odd. The reason why get_single_item_from_sequence() was created is because too many *scripts were reimplementing the same logic. For instance, get_single_item_from_sequence() errors out whenever len(all_scopes_starting_with_all_prefixes) != 1. I'm not a huge fan of putting this logic back in pushapkscript. How about:

def extract_android_product_from_scopes(context):
    prefixes = _get_scope_prefixes(context)

    scope, _ = get_single_item_from_sequence(
        sequence=[(scope, prefix) for scope in context.task['scopes'] for prefix in prefixes],
        condition=lambda scope_then_prefix: scope_then_prefix[0].startswith(scope_then_prefix[1]),
        ErrorClass=TaskVerificationError,
        no_item_error_message='No scope starting with any of these prefixes {} found'.format(prefixes),
        too_many_item_error_message='More than one scope matching these prefixes {} found'.format(prefixes),
    )

    android_product = scope.split(':')[-1]

    return android_product

? Does this look still readable/not too hacky to you?

Side note regarding assert statements: they usually get optimized away in production (docs). The way to error our would be to raise exceptions.

Copy link
Contributor Author

@JohanLorenzo JohanLorenzo Dec 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmmm, I wonder if I shouldn't change https://github.com/mozilla-releng/scriptworker/blob/931a23b64e40cd913a629b65f6e9ade989a33765/scriptworker/utils.py#L689 to expand sequences within the sequence.

For instance:

def extract_android_product_from_scopes(context):
    # ...
    scope, _ = get_single_item_from_sequence(
        sequence=[(scope, prefix) for scope in context.task['scopes'] for prefix in prefixes],
        condition=lambda scope, prefix: scope.startswith(prefix),
        # ...
    )
    # ...

Edit: Eh, that'll be a breaking change. It's feasible but it'd require a little investigation to locate where nested sequences are already used

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_single_item_from_sequence is bugging me in a weird way - I feel like it's not very pythonic. It seems like it's generalizing some straightforward code that has the potential to have a lot of options - already, it's getting parameters like append_sequence_to_error_message, which is a bit of a code smell, IMHO.

If we start adding features like automatically expanding nested sequences (and putting an option behind that, too), then we have some fallout:

  • Complexity of get_single_item_from_sequence increases dramatically
  • As a consumer of get_single_item_from_sequence, if I'm confused by the result and I investigate the implementation, I see a lot of code (and, since there's a lot of boolean options, only ~half the code is relevant for my current use case)

My theory is that get_single_item_from_sequence was created to enhance reusability, but I feel like it's just increasing complexity and lowering readability. Besides, since it's so general, almost all of its behaviour needs to be passed in anyways (filtering logic, the exact error message for each case, etc). Is it worth it, and (:apple: for :thought_balloon:) what would we lose if we inlined the code in each case? So, here, it would be:

def extract_android_product_from_scopes(context):
    prefixes = _get_scope_prefixes(context)
    scopes = context.task['scopes']
    filtered_scopes = [scope for scope in scopes for prefix in prefixes if scope.startswith(prefix)]

    if len(filtered_scopes) == 0:
        raise TaskVerificationError('No scope starting with any of these prefixes: {}. Given scopes: {}'
                                    .format(prefixes, scopes))
    if len(filtered_scopes) > 1:
        raise TaskVerificationError('More than one scope found. Given scopes: {}'
                                    .format(scopes))

    android_product = filtered_scopes[0].split(':')[-1]

    return android_product

Note the benefits of de-generalizing this:

  • The "empty" error message is improved. At the end of the line, where we dump the "given" section, we don't show all the (scope, prefix) combinations, but rather just the list of scopes. This is good, since we're already dumping the list of prefixes earlier in the error message
  • Complexity is lowered, we don't need to mentally omit all the if that don't apply to us - we can just remove them from the code.
  • Readability is enhanced, since all the behaviour is presented to us, and none is hidden in a helper function.

Sorry for the essay! Feel free to merge this PR whenever, it's definitely 👌. This conversation is almost entirely a "🍎 for 💭"-er

Copy link
Contributor Author

@JohanLorenzo JohanLorenzo Dec 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your arguments make sense. I don't have a lot of counter-arguments left 😅. Let's ask Aki, what he thinks of this. I opened #60 to do so. In the meantime, I'm merging the PR as is [EDIT] with a variation of the latest example of get_single_item_from_sequence(). The reason of this change was a merge conflict on this line 5e686f6#diff-bd46f0eb1ee0ec91a456778ef84c2554R20. We need to know was the retained prefix.

pushapkscript/test/test_task.py Show resolved Hide resolved
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants