Add a fix to enable passing empty cifmw_run_id #2589

Merged

Conversation

frenzyfriday
Collaborator

@frenzyfriday frenzyfriday commented Dec 9, 2024

TP: ci-framework-testproject 934
../controller-0/ci-framework-data/parameters/baremetal-info.yml

@github-actions github-actions bot marked this pull request as draft December 9, 2024 10:29

github-actions bot commented Dec 9, 2024

Thanks for the PR! ❤️
I'm marking it as a draft. Once you're happy with it merging and the PR is passing CI, click the "Ready for review" button below.

@hjensas
Contributor

hjensas commented Dec 13, 2024

I am in favor of this change; I never understood why the run_id was added in the first place. The original Jira does not explain why it is needed ...

I don't see how the TP linked tests this change?
The nodes there still have the run_id?

    compute-nph0j7zf-1:
        hostname: compute-nph0j7zf-1

The run_id is causing us trouble in adoption; ref: https://github.com/openstack-k8s-operators/data-plane-adoption/blob/main/scenarios/uni01alpha/config_download.yaml#L83-L92

I am only aware of a single place where cifmw_run_id is used in ci-framework-jobs: scenarios/baremetal/bgp/bgp_dt02/03-custom-architecture.yaml. The reason for its use there is https://issues.redhat.com/browse/OSPRH-10923, the same issue we have in adoption.

  • Do we need this run_id at all? Is it of any use, or is it just causing trouble?

Adding @eduolivares since he created OSPRH-10923.

@frenzyfriday
Collaborator Author

Yeah, sorry. The TP needed another Depends-On, in which I set the run ID to '': https://gitlab.cee.redhat.com/ci-framework/ci-framework-jobs/-/merge_requests/1397
I am trying to check whether passing a blank run ID causes issues in the jobs. Otherwise I will prepare a patch that reverses the run ID creation logic: the run ID is not created by default, and a job that needs it appended to its resources can set a flag that creates it.
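
To make the effect of an empty run ID concrete, here is a minimal sketch. It is not the ci-framework implementation: the helper name `node_hostname` is hypothetical, and the `<type>-<run_id>-<index>` layout is only an assumption inferred from the `compute-nph0j7zf-1` hostname quoted earlier in this thread.

```python
def node_hostname(node_type: str, index: int, run_id: str = "") -> str:
    """Build a hostname from its parts (hypothetical helper, not ci-framework code).

    Empty parts are skipped, so an empty run_id leaves no stray "-" behind.
    """
    parts = [node_type, run_id, str(index)]
    return "-".join(part for part in parts if part)


# With a run ID, the hostname matches the shape seen in the TP output.
print(node_hostname("compute", 1, "nph0j7zf"))  # compute-nph0j7zf-1

# With an empty run ID, the hostname is fixed across runs.
print(node_hostname("compute", 1, ""))  # compute-1
```

Under these assumptions, the reversed logic described above would amount to defaulting `run_id` to an empty string and letting individual jobs opt in to a generated value.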

@hjensas
Contributor

hjensas commented Dec 13, 2024

ok, sounds good.

I think it would be a good idea to also explore removing the run_id entirely: it adds complexity and makes the code harder to maintain and change. But that should be a follow-up, after reversing the logic, if no one screams that it caused problems ...

@eduolivares
Contributor

@frenzyfriday, thanks for this PR. It looks good to me and I think this is the proper solution for OSPRH-10923.
Let's wait for the TP results.

@hjensas, according to the description of #2224, with this ID we could "use the Zuul job ID as the identifier, making things slightly easier to follow across the various runs". But to be honest, I have never done this.

@hjensas
Contributor

hjensas commented Dec 13, 2024

Hmm, I managed to find a Jira that may explain why we have this: OSPRH-7616
We have shared storage arrays in testing, so multiple jobs use the same storage array for cinder. These storage arrays get confused if multiple hosts with the same hostname connect. (So my idea to completely remove the run_id should be scrapped ...)
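
The collision described here can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the helper names, the two run IDs, and the `<type>-<run_id>-<index>` hostname layout are hypothetical and not taken from ci-framework or from OSPRH-7616.

```python
from collections import Counter


def hostnames(node_type, count, run_id=""):
    """Hostnames one job would register (hypothetical naming scheme)."""
    prefix = f"{node_type}-{run_id}-" if run_id else f"{node_type}-"
    return [f"{prefix}{i}" for i in range(count)]


def duplicates(*jobs):
    """Hostnames that more than one job would present to a shared array."""
    counts = Counter(name for job in jobs for name in job)
    return sorted(name for name, seen in counts.items() if seen > 1)


# Two jobs with fixed hostnames collide on every node.
print(duplicates(hostnames("compute", 2), hostnames("compute", 2)))
# ['compute-0', 'compute-1']

# Two jobs with distinct (made-up) run IDs do not collide.
print(duplicates(hostnames("compute", 2, "aaaa1111"),
                 hostnames("compute", 2, "bbbb2222")))
# []
```

This is why scenarios that share a storage array may still need a per-run identifier even if other scenarios use fixed hostnames.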

Contributor

@hjensas hjensas left a comment

/lgtm

Collaborator

@lewisdenny lewisdenny left a comment

/approve

lgtm: I can see in the D/S TP that the nodes are generated without a run_id and without errors.

@pablintino
Collaborator

/approve
@hjensas @eduolivares @frenzyfriday Good catch on that Jira! IMHO, we should stick to fixed hostnames and add the random UUID only in scenarios that really need it.

Contributor

openshift-ci bot commented Dec 16, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lewisdenny, pablintino

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [lewisdenny,pablintino]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@pablintino
Collaborator

/lgtm

@openshift-merge-bot openshift-merge-bot bot merged commit 2d92ec8 into openstack-k8s-operators:main Dec 16, 2024
5 checks passed
5 participants