-
Notifications
You must be signed in to change notification settings - Fork 6.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Gather just the necessary facts #5955
Gather just the necessary facts #5955
Conversation
Hi @vrlo. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
651f79e
to
1a6d2ee
Compare
Can ansible combine everything into one task ?
|
/ok-to-test |
This would work partially, because the task with subset |
Great, so you can combine the other two tasks into one, which would slightly reduce the size of the playbook? |
1a6d2ee
to
395a7c8
Compare
/approve |
/assign @mattymo |
cluster.yml
Outdated
gather_subset: '!all,!min,network,hardware' | ||
filter: "{{ item }}" | ||
loop: | ||
- ansible_distribution_major_version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looping will take 8x longer than just gathering all facts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why ?
the problem, as I understand it, is that on a node with running kubernetes and lots of pods, the size of facts is 30 megabytes, and the ansible loads this cache very slowly for each play
so with this task fact cache size is 10kb
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ansible is slow when handling the large number of facts. This is a one time operation that ensures that the hostvars are kept small, speeding up all the other tasks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and it was 3MB -> 30kB in our case :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The number of loop iterations here is constant and the gathers will occur across all available forks in parallel so I don't have a problem with it.
A previous method of looping through hosts to gather facts with delegation had n^2 iterations which were not ran in parallel. That method could easily take 8 times longer than a gather all so I understand the apprehension with a loop. However, this patch does not have that problem since each setup iteration works on as many nodes as allowed by forks.
395a7c8
to
85ba346
Compare
Sorry @vrlo cancelled your pipeline because you need to rebase as tests name changed on master |
85ba346
to
6156830
Compare
I'm not an Ansible facts expert, but this looks reasonable. Since the same task is copied in a couple of playbooks, would it be possible to move that to a role or a task file? |
Yes, I wholeheartedly agree. The logical place would be something that's called only once, eg. bootstrap-os. However it's not called for all hosts in scale.yml (maybe that could be changed?). |
9d78743
to
2e3a644
Compare
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: chadswen, LuckySB, vrlo The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
I had been testing something extremely similar to this while trying to scale to 200 node clusters in a single graceful upgrade build, so I can vouch for the approach. With this in place along with removing a few other bottlenecks, most of which coincidentally also got upstream variants merged this week, build time is greatly reduced at scale. Thanks for the patch! |
* Gather just the necessary facts * Move fact gathering to separate playbook.
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
Ansible is slow when there are many facts. This PR reduces the number of facts to necessary minimum by only gathering the ones that are needed.
Which issue(s) this PR fixes:
Fixes #5954
Also fixes #5927 as a side effect
Special notes for your reviewer:
Does this PR introduce a user-facing change?: