Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fleet] RBAC - Make upgrade agent APIs space aware #190069

Merged

Conversation

jillguyonnet
Copy link
Contributor

@jillguyonnet jillguyonnet commented Aug 7, 2024

Summary

Relates to #185040

This PR makes the following Fleet agents API space aware (behind useSpaceAwareness feature flag):

  • POST /agents/{agentId}/reassign
  • POST /agents/{agentId}/upgrade
  • POST /agents/bulk_reassign
  • POST /agents/bulk_upgrade
  • POST /agents/{agentId}/actions/{actionId}/cancel

While working on that last endpoint, I noticed and fixed an error in the documentation.

Checklist

@jillguyonnet jillguyonnet self-assigned this Aug 7, 2024
@jillguyonnet jillguyonnet added the Team:Fleet Team label for Observability Data Collection Fleet team label Aug 7, 2024
@obltmachine
Copy link

🤖 GitHub comments

Expand to view the GitHub comments

Just comment with:

  • /oblt-deploy : Deploy a Kibana instance using the Observability test environments.
  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@jillguyonnet jillguyonnet added the release_note:skip Skip the PR/issue when compiling release notes label Aug 7, 2024
@jillguyonnet jillguyonnet requested a review from nchaulet August 7, 2024 15:05
@jillguyonnet
Copy link
Contributor Author

/ci

@jillguyonnet jillguyonnet marked this pull request as ready for review August 8, 2024 08:03
@jillguyonnet jillguyonnet requested review from a team as code owners August 8, 2024 08:03
@elasticmachine
Copy link
Contributor

Pinging @elastic/fleet (Team:Fleet)

@kibanamachine
Copy link
Contributor

Flaky Test Runner Stats

🎉 All tests passed! - kibana-flaky-test-suite-runner#6716

[✅] x-pack/test/fleet_api_integration/config.space_awareness.ts: 200/200 tests passed.

see run history

@@ -167,6 +169,8 @@ export async function upgradeBatch(

const actionId = options.actionId ?? uuidv4();
const total = options.total ?? givenAgents.length;
const currentNameSpace = getCurrentNamespace(soClient);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like this code could also run in a background task in that case the soClient namespace will not be correct (will be default),
we may want to use the agent namespace instead to populate that field

Copy link
Contributor Author

@jillguyonnet jillguyonnet Aug 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for flagging this up! I've made a change (pass space id to batch function), let me know if it looks ok.

Copy link
Contributor Author

@jillguyonnet jillguyonnet Aug 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking more about this: in the current state, it looks like spaceId might be undefined in case this runs as a background task. A solution could be to modify the action runner to take a space id.

I'm not sure about using the namespaces property of (one of the) agents: could there be a scenario where we have agents belonging to multiple spaces? e.g. agent1 in spaces A and B, agent 2 in spaces B and C. In that case, we couldn't use agents to assign a space to the action.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think as the task will come from an API we will probably have a namespace in the API request that we could pass to the task and use as a parameter for the action runner, does that make sense to you?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this would affect the case when an action runner is used, for example for bulk reassign when the number of agents exceeds the batch size:

return await new ReassignActionRunner(
esClient,
soClient,
{
...options,
batchSize,
total: res.total,
newAgentPolicyId,
},
{ pitId: await openPointInTime(esClient) }
).runActionAsyncWithRetry();

I should test this properly, but I expect that in this case the current solution will not provide a namespace. Do you mean that the API request should pass it (or already is)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think something like this should work

{ 
     ...options, 
spaceId: soClient.getCurrentNamespace(),
     batchSize, 
     total: res.total, 
     newAgentPolicyId, 
   }, 

Copy link
Contributor

@kilfoyle kilfoyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for platform-docs team! 👍

@@ -442,7 +440,7 @@ async function _filterAgents(
runtime_mappings: runtimeFields,
fields: Object.keys(runtimeFields),
sort: [{ [sortField]: { order: sortOrder } }],
query: { bool: { filter: query } },
query: addNamespaceFilteringToQuery({ bool: { filter: [query] } }, currentNameSpace),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nchaulet What do you think of this change? I realised that a lot of bulk endpoints were calling getAgentsById and then checking isAgentInNamespace on each agent. With this, getAgentsById directly adds a namespace filter in the ES query. The downside is we lose the more specific "agent not in namespace" error message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems okay to me 👍

nchaulet
nchaulet previously approved these changes Aug 9, 2024
Copy link
Member

@nchaulet nchaulet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@jillguyonnet jillguyonnet dismissed nchaulet’s stale review August 9, 2024 13:00

After discussion, the case of background job should be addressed and tested in this PR

@nchaulet nchaulet requested a review from juliaElastic August 14, 2024 15:00
@jillguyonnet
Copy link
Contributor Author

@elasticmachine merge upstream

@kibana-ci
Copy link
Collaborator

💛 Build succeeded, but was flaky

Failed CI Steps

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @jillguyonnet

@nchaulet nchaulet requested a review from juliaElastic August 19, 2024 11:55
Copy link
Member

@nchaulet nchaulet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@jillguyonnet
Copy link
Contributor Author

Merging as agreed offline since Julia is away this week.

@jillguyonnet jillguyonnet merged commit 1e64b9e into elastic:main Aug 19, 2024
23 checks passed
@jillguyonnet jillguyonnet deleted the fleet/185040-rbac-agents-write-api-3 branch August 19, 2024 14:22
@kibanamachine kibanamachine added v8.16.0 backport:skip This commit does not require backporting labels Aug 19, 2024
jillguyonnet added a commit that referenced this pull request Aug 30, 2024
## Summary

Closes #185040

Followup to:
#188507
#189519
#190069

This PR contains the last required changed for making Fleet agents write
APIs space aware:
* Implement space awareness in the following endpoints:
   * `POST /agents/{agentId}/unenroll`
   * `POST /agents/{agentId}/request_diagnostics`
   * `POST /agents/bulk_unenroll`
   * `POST /agents/bulk_request_diagnostics`
* Fix a bug in `POST /agents/bulk_update_agent_tags` where the space id
was not passed.
* Simply the setting of the `namespaces` property when creating agent
actions when the space id is known.
* Rename `currentNamespace` to `currentSpaceId` where appropriate (see
comment below).
* Add API integration tests and consolidate existing ones. ~⚠️ At the
time of writing, I would like there to be more tests covering bulk query
processing in batches, which are currently lacking. I have experienced
difficulties getting those tests to pass consistently.~ Filed [followup
issue](#191643) for those.

### A note on terminology

As pointed out in
#191083 (comment),
it seems that the terms "namespace" and "space id" are occasionally used
interchangeably in some parts of the codebase to refer to a Kibana
space. For instance, documents in Fleet indices (agents, agent policies,
agent actions...) [possess a `namespaces`
property](elastic/elasticsearch#108363) to track
the spaces they belong to. The current space id is also returned using
the Saved Object client's `getCurrentNamespace` function.

However, "namespace" is also a datastream property. In the Agent policy
settings UI, the "Spaces" property (which will be linked to the saved
object's `namespaces` property) is above the "Default namespace"
property, which relates to the integration's data streams:
<img width="1916" alt="Screenshot 2024-08-26 at 14 51 10"
src="https://github.com/user-attachments/assets/fe2a0948-3387-4a93-96dc-90fc5cf1a683">

This should not be a source of major issues, but is best clarified for
future reference. In this PR, I've replaced some occurrences of
`namespace` with `spaceId` where appropriate to try to maximise the use
of the latter.

### Testing

* This PR should be put through the Flaky Test Runner prior to merging
(I will kick the job).
* Manual testing should also be performed for the new endpoints
mentioned above.

### Checklist

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [x] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed

---------

Co-authored-by: Elastic Machine <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting release_note:skip Skip the PR/issue when compiling release notes Team:Fleet Team label for Observability Data Collection Fleet team v8.16.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants