Bulk API Keys update #1779
@michalpristas does this relate to #1581, or should it close it?
@jlind23 this is exactly it; I also linked it as a related issue.
Have you thought about how to scale test this? Are you able to measure that this is more efficient?
internal/pkg/bulk/opApiKey.go
Outdated
)

const (
	expectedAPIKeySize = 64 // 64B
Where do these numbers come from? How would we know if they need to change?
I took the API key size and added some buffer.
But knowing the API key size, I can do this dynamically. Good point.
I'm not sure how to approach scale tests.
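Sizing the buffer from the actual key lengths instead of a fixed constant could look roughly like the sketch below. This is only an illustration, not the code from this PR: `estimateBodySize`, `buildIDList`, and the per-entry overhead of 3 bytes are hypothetical, and the IDs are assumed to need no JSON escaping.

```go
package main

import (
	"bytes"
	"fmt"
)

// estimateBodySize is a hypothetical helper: it sums the real ID lengths
// plus a small per-entry overhead instead of assuming a fixed key size.
func estimateBodySize(ids []string) int {
	const perEntryOverhead = 3 // two quotes and one comma (assumption)
	size := 2                  // opening and closing brackets
	for _, id := range ids {
		size += len(id) + perEntryOverhead
	}
	return size
}

// buildIDList writes the IDs as a JSON array into a buffer that was
// pre-sized with the estimate, so the buffer does not need to regrow.
// It assumes the IDs contain no characters that require JSON escaping.
func buildIDList(ids []string) []byte {
	var buf bytes.Buffer
	buf.Grow(estimateBodySize(ids))
	buf.WriteByte('[')
	for i, id := range ids {
		if i > 0 {
			buf.WriteByte(',')
		}
		buf.WriteByte('"')
		buf.WriteString(id)
		buf.WriteByte('"')
	}
	buf.WriteByte(']')
	return buf.Bytes()
}

func main() {
	ids := []string{"key-a", "key-bb"}
	fmt.Println(estimateBodySize(ids), string(buildIDList(ids)))
}
```

The estimate is deliberately an upper bound: it counts a comma for every entry, including the last one, so the written body never exceeds the reserved size.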
This pull request is now in conflicts. Could you fix it @michalpristas? 🙏
Yes, I believe we should expect to see much lower ES CPU usage, especially during bulk policy reassignments. Unfortunately, I don't believe we have a way yet to build a custom elastic-agent image to use in a scale test before we merge (this is something I will follow up on solving). I think we should merge this and look at how it affects the scale tests once it's in the 8.5 snapshots. If there's an issue, we can revert this.
Force-pushed from 1171183 to 607ab12.
internal/pkg/api/handleAck.go
Outdated
zlog.Info().
	Err(err).
	Str("id", apiKeyID).
	Msg("Failed to read API Key roles")
[Blocker]
Please don't log an error at info level; it's confusing and misleading. If it isn't critical, use warning instead.
internal/pkg/api/handleAck.go
Outdated
if err != nil {
	zlog.Info().
		Err(err).
		RawJSON("roles", res.RoleDescriptors).
		Str("id", apiKeyID).
		Msg("Failed to cleanup roles")
[Blocker]
same as above
internal/pkg/api/handleAck.go
Outdated
		Msg("Failed to cleanup roles")
} else if removedCount > 0 {
	if err := ack.bulk.APIKeyUpdate(ctx, apiKeyID, permissionHash, clean); err != nil {
		zlog.Info().Err(err).RawJSON("roles", clean).Str("id", apiKeyID).Msg("Failed to refresh API Key")
[Blocker]
Same as above
internal/pkg/api/handleAck.go
Outdated
		Msg("Failed to cleanup roles")
} else if removedCount > 0 {
	if err := ack.bulk.APIKeyUpdate(ctx, apiKeyID, permissionHash, clean); err != nil {
		zlog.Info().Err(err).RawJSON("roles", clean).Str("id", apiKeyID).Msg("Failed to refresh API Key")
[Suggestion]
Keep it consistent: if the method is called update, use the same term in the log.
- zlog.Info().Err(err).RawJSON("roles", clean).Str("id", apiKeyID).Msg("Failed to refresh API Key")
+ zlog.Info().Err(err).RawJSON("roles", clean).Str("id", apiKeyID).Msg("Failed to update API Key")
internal/pkg/api/handleAck.go
Outdated
if err != nil {
	zlog.Info().
		Err(err).
		Str("id", apiKeyID).
Why not apiKeyID or api.key.id?
internal/pkg/policy/policy_output.go
Outdated
zlog.Warn().Msg("Failed to find a key for role assignement.")

zlog.Debug().
	RawJSON("roles", new.Raw).
	Str("candidate", k).
	Msg("Failed to find a key for role assignement.")
[Blocker]
Again, please avoid 2 different logs with the same message
// creates new key for permissions
// marks it stale so they can be removed later
if _, exists := m[candidate]; !exists {
	return fmt.Sprintf("%s-0-rdstale", candidate)
}

// 1 should be enough, 100 is just to have some space
for i := 0; i < 100; i++ {
You could keep just the for loop; there is no need for this first try. Besides, the for loop tries %s-0-rdstale again on its first iteration.
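A loop-only version of the suggestion above might look like the following sketch. The loop body is not shown in the reviewed snippet, so the lookup against the generated name is an assumption, and `staleKey` and `maxAttempts` are hypothetical names rather than identifiers from the PR.

```go
package main

import "fmt"

// maxAttempts mirrors the bound of 100 used in the snippet under review.
const maxAttempts = 100

// staleKey returns the first "<candidate>-<i>-rdstale" name that is not
// already present in existing. The loop starts at 0, so a separate first
// attempt for "-0-rdstale" is not needed. The boolean is false when all
// attempts are taken.
func staleKey(existing map[string]struct{}, candidate string) (string, bool) {
	for i := 0; i < maxAttempts; i++ {
		name := fmt.Sprintf("%s-%d-rdstale", candidate, i)
		if _, taken := existing[name]; !taken {
			return name, true
		}
	}
	return "", false
}

func main() {
	existing := map[string]struct{}{"default-0-rdstale": {}}
	name, ok := staleKey(existing, "default")
	fmt.Println(name, ok)
}
```

Returning a boolean alongside the name keeps the exhausted case explicit, instead of handing the caller an empty string that could be mistaken for a valid key.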
…e same ES" (#1879) * Revert "Revert "Fix v8.5.0 migration painless script" (#1878)" This reverts commit ef9ca2b. * Revert "Revert "Allow multiple ES outputs as long as they are the same ES (#1684)"" This reverts commit bb696ac. * avoid new API keys being marked for invalidation Co-authored-by: Michal Pristas <[email protected]> He fixed the merge conflicts after Bulk API Keys update (#1779), commit 46ac14b, got merged
* Support for Elastic Agent V2 status (#1747) * Support for Elastic Agent V2 status * Make 'make check-ci' happy * Add a check that 'components' is valid array * Rename variable to better reflect it's meaning * [v2] Switch to Elastic Agent v2 control protocol (#1751) * Switch to new client.V2 for communication with Elastic Agent. * Fix tests. * Fix integration tests. * Update go.sum. * Fix some lint issues. * Fix panic with agentInfo. * Fix panic in logger reconfigure. * Fixes for switching units. * updated version (#2014) * Update the elastic-agent-client to latest version. (#2061) * [v2] Merge main as of Nov 7 (#2062) * [Automation] Update elastic stack version to 8.5.0-6b9f92c0 for testing (#1756) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.5.0-0616acda for testing (#1760) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.5.0-dd6f2bb0 for testing (#1765) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.5.0-feb644de for testing (#1768) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.5.0-7783a03c for testing (#1776) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.5.0-17b8a62d for testing (#1780) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.5.0-9aed3b11 for testing (#1784) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.5.0-440e0896 for testing (#1788) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.5.0-fedc3e60 for testing (#1791) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.5.0-b5001a6d for testing (#1795) Co-authored-by: apmmachine <[email protected]> * ci: move to fleet-ci (#1199) * Fic path to the packaging (#1806) * Fix gcs 
credentials for packaging (#1807) * [Automation] Update elastic stack version to 8.5.0-de69302b for testing (#1822) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.5.0-1bd77fc1 for testing (#1826) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.5.0-167dfc80 for testing (#1831) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.5.0-6b7dda2d for testing (#1835) Co-authored-by: apmmachine <[email protected]> * Allow multiple ES outputs as long as they are the same ES (#1684) * add 'outputs' field to the ES agent schema to store the API key data and permission hash for each ES output * add output name to API key metadata * add v8.5 migration to migration.go * add migration docs and improve logging * group migration functions per version * [Automation] Update elastic stack version to 8.5.0-4140365c for testing (#1837) Co-authored-by: apmmachine <[email protected]> * updating upgrade_status: completed (#1833) * updating upgrade_status: completed * updated schema.json and regenerated schema.go * updated license headers * Fix v8.5.0 migration painless script (#1839) * fix v8.5.0 migration painless script * [Automation] Update elastic stack version to 8.5.0-8e906f9f for testing (#1843) Co-authored-by: apmmachine <[email protected]> * ci: rename dra staging for release dra release staging (#1840) * Remove events from agent checkin body. (#1842) Remove the events attribute from the agent checkin body. Note that removal of the attribute will not stop the server from issuing a 400 if the response body is too long. The removal is so that the checkin code on the fleet-server and agent remain comparable. 
Co-authored-by: Blake Rouse <[email protected]> * [Automation] Update elastic stack version to 8.5.0-589a4a10 for testing (#1852) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.5.0-37418cf3 for testing (#1855) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.5.0-fcf3d4c2 for testing (#1862) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.5.0-c7913db3 for testing (#1868) Co-authored-by: apmmachine <[email protected]> * Add error detail to catch-all HTTP response (#1854) * Make authc log debug and add cache hit field (#1870) * Document Go 1.18 certificate change in changelog. (#1871) * Revert "Fix v8.5.0 migration painless script" (#1878) * Revert "Fix v8.5.0 migration painless script (#1839)" This reverts commit de5d74b. * Revert "Allow multiple ES outputs as long as they are the same ES (#1684)" This reverts commit 63fdcbf. * [Automation] Update elastic stack version to 8.5.0-56d2c52d for testing (#1880) Co-authored-by: apmmachine <[email protected]> * Bulk API Keys update (#1779) Bulk API Keys update (#1779) * Fix and reintroduce "Allow multiple ES outputs as long as they are the same ES" (#1879) * Revert "Revert "Fix v8.5.0 migration painless script" (#1878)" This reverts commit ef9ca2b. * Revert "Revert "Allow multiple ES outputs as long as they are the same ES (#1684)"" This reverts commit bb696ac. * avoid new API keys being marked for invalidation Co-authored-by: Michal Pristas <[email protected]> He fixed the merge conflicts after Bulk API Keys update (#1779), commit 46ac14b, got merged * [Automation] Update elastic stack version to 8.5.0-7dc445a0 for testing (#1888) Co-authored-by: apmmachine <[email protected]> * Update pre-sets limits to avoid overlap. 
(#1891) Update file max limits and env_defaults_test.go running make defaults to generate the new one * [Release] add-backport-next (#1892) * Bump version to 8.6.0 (#1895) * Catch error in waitBulkAction. Add bulk.WithRetryOnConflict(3) in multiple places. (#1896) * Catch error in waitBulkAction. Add bulk.WithRetryOnConflict(3) in multiple places. * Add changelog entry. * Update CHANGELOG.next.asciidoc Co-authored-by: Craig MacKenzie <[email protected]> Co-authored-by: Craig MacKenzie <[email protected]> * Update apikey.cache_hit log field name to match convention (#1900) * [Automation] Update elastic stack version to 8.6.0-21651da3 for testing (#1908) Co-authored-by: apmmachine <[email protected]> * LoadLimits does not override existing values (#1912) Fleet-server will use any specified cache or server limit values over whatever is returned by the default/agent number loader. For example, if A max body size is specifically set to a value such as 5MB, and the default returned by the LoadLimits is 1MB, the 5MB value is used. * [Automation] Update elastic stack version to 8.6.0-326f84b0 for testing (#1916) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.6.0-df00693f for testing (#1925) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.6.0-a2f4f140 for testing (#1928) Co-authored-by: apmmachine <[email protected]> * Revert "updating upgrade_status: completed (#1833)" (#1920) * Revert "updating upgrade_status: completed (#1833)" This reverts commit 23be42a. * Leaving in upgrade_status field for retry functionality * Storing checkin message in last_checkin_message (#1932) * Storing checkin message in last_checkin_message * added changelog * fixed tests * Unique limiters for each API listener (#1904) * Unique limiters for each API listener Refactor the limit.Limiter so it can wrap the separate API httprouter endpoints. 
Limiter.WrapX() calls take the handler and stats incrementer for metrics/error counting. api.Run() replaced with Router.Run(), which will generate an httprouter for each listener in order to be able to associate the httprouter with a unique Limiter. * Add listener address labeled logs to limiter * Review feedback * Apply suggestions from code review Co-authored-by: Anderson Queiroz <[email protected]> * review feedback * fix import * Fix test Co-authored-by: Anderson Queiroz <[email protected]> * Cleanup cmd/fleet/main.go (#1886) * Replace cache.Config with config.Cache * Move server setup from cmd/fleet to new pkg/server * Move constants * Fix imports and integration tests * fix linter * [Automation] Update elastic stack version to 8.6.0-158a13db for testing (#1938) Co-authored-by: apmmachine <[email protected]> * [8.6](forwardport) Add extra protection against accessing null fields to 8.5 migration (#1921) (#1926) * [Automation] Update elastic stack version to 8.6.0-aea1c645 for testing (#1942) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.6.0-0fca2953 for testing (#1948) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.6.0-e4c15f15 for testing (#1954) Co-authored-by: apmmachine <[email protected]> * Conditional log level for api key read (#1946) Conditional log level for api key read (#1946) * Updated migration query to match items with deprecated field present (#1959) Co-authored-by: Anderson Queiroz <[email protected]> * Fix fleet.migration.total log key overlap (#1951) Co-authored-by: Anderson Queiroz <[email protected]> * [Automation] Update elastic stack version to 8.6.0-d939cfde for testing (#1964) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.6.0-7c9f25a9 for testing (#1969) Co-authored-by: apmmachine <[email protected]> * [Automation] Update elastic stack version to 8.6.0-c49fac70 for testing (#1976) 
Co-authored-by: apmmachine <[email protected]>
* Update to Go 1.18.7. (#1978)
* [Automation] Update elastic stack version to 8.6.0-5a8d757d for testing (#1981)
  Co-authored-by: apmmachine <[email protected]>
* [Automation] Update elastic stack version to 8.6.0-40086bc7 for testing (#1987)
  Co-authored-by: apmmachine <[email protected]>
* [Automation] Update elastic stack version to 8.6.0-233dc5d4 for testing (#1990)
  Co-authored-by: apmmachine <[email protected]>
* [Automation] Update elastic stack version to 8.6.0-54a302f0 for testing (#1995)
  Co-authored-by: apmmachine <[email protected]>
* Don't send POLICY_CHANGE actions retrieved from index to agent. (#1963)
* Don't send POLICY_CHANGE actions retrieved from index to agent.
  The fleet-server should not send any policy change actions that are written to the actions index to an agent on checkin. The server will remove these actions in the convert method and emit a warning message. The ack token that is used is not altered in this case. Policy change actions are dynamically generated by the fleet-server when it detects that the agent is not running an up to date version of the policy.
* move filtering to its own method
* Fix linter, tests, fix file name
* [Automation] Update elastic stack version to 8.6.0-cae815eb for testing (#2000)
  Co-authored-by: apmmachine <[email protected]>
* [Automation] Update elastic stack version to 8.6.0-6545f2df for testing (#2005)
  Co-authored-by: apmmachine <[email protected]>
* [Automation] Update elastic stack version to 8.6.0-055acc83 for testing (#2011)
  Co-authored-by: apmmachine <[email protected]>
* [Automation] Update elastic stack version to 8.6.0-baf193e8 for testing (#2016)
  Co-authored-by: apmmachine <[email protected]>
* [Automation] Update elastic stack version to 8.6.0-22d60ec9 for testing (#2020)
  Co-authored-by: apmmachine <[email protected]>
* Allow upgrade action to signal retry (#1887)
* Allow upgrade action to signal retry
  Allow the ack of an upgrade action to set the upgrade status to retrying.
* fix tests set failed state
* Fix broken test
* nil upgrade status by default
* Set agent to healthy in case of upgrade failure
* fix upgrade fields
* Fix tests
* [Automation] Update elastic stack version to 8.6.0-b8b35931 for testing (#2024)
  Co-authored-by: apmmachine <[email protected]>
* [Automation] Update elastic stack version to 8.6.0-a892f234 for testing (#2030)
  Co-authored-by: apmmachine <[email protected]>
* [Automation] Add GH action to add issues to ingest board
  Issues in this repo labeled with `Team:Fleet` will be added to the ingest board automatically w/ the `Fleet Server` area.
* Update add-issues-to-ingest-board.yml
* [Automation] Update elastic stack version to 8.6.0-89d224d2 for testing (#2034)
  Co-authored-by: apmmachine <[email protected]>
* [Automation] Update elastic stack version to 8.6.0-949a38d2 for testing (#2039)
  Co-authored-by: apmmachine <[email protected]>
* [Automation] Update elastic stack version to 8.6.0-26dc1164 for testing (#2045)
  Co-authored-by: apmmachine <[email protected]>
* Add active filter for enrollment key queries. (#2044)
* Add active filter for enrollment key queries.
  Add an active: true filter to enrollment key queries. This allows fleet-server to handle cases where there may be 10+ inactive keys associated with a policy.
* review feedback
* fix linter
* fix tests
* Fix test cases
* [Automation] Update elastic stack version to 8.6.0-4765d2b0 for testing (#2048)
  Co-authored-by: apmmachine <[email protected]>
* [Automation] Update elastic stack version to 8.6.0-8a615646 for testing (#2050)
  Co-authored-by: apmmachine <[email protected]>
* [Automation] Update elastic stack version to 8.6.0-3f5f98b7 for testing (#2051)
  Co-authored-by: apmmachine <[email protected]>
* [Automation] Update elastic stack version to 8.6.0-f20b7179 for testing (#2056)
  Co-authored-by: apmmachine <[email protected]>
* Run mod tidy.
* Run make notice.
* Fix intergration tests.
* Run go mod tidy and make notice.
* Fix path to fleet-server.yml in integration test.
* Fix race condition.
* Fix try 2.
* Fix race.
* Fix race try 2.
Co-authored-by: apmmachine <[email protected]>
Co-authored-by: Victor Martinez <[email protected]>
Co-authored-by: Anderson Queiroz <[email protected]>
Co-authored-by: Julia Bardi <[email protected]>
Co-authored-by: Michel Laterman <[email protected]>
Co-authored-by: Josh Dover <[email protected]>
Co-authored-by: Craig MacKenzie <[email protected]>
Co-authored-by: Michal Pristas <[email protected]>
Co-authored-by: Julien Lind <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
Co-authored-by: Kyle Pollich <[email protected]>
Co-authored-by: Aleksandr Maus <[email protected]>
What is the problem this PR solves?
Instead of creating and invalidating API keys on every policy change, this PR updates them in place.
On top of that, this PR modifies the bulker so that updates are grouped by the roles applied.
created
updated
invalidated
The bulker contains a new queue in which each request holds the ID of an agent and the proposed role descriptors.
On flush, the logic goes through the queue and composes the IDs per role set, then makes a single request to ES.
If the content size of this request is larger than a predefined value, it splits the IDs so that each request fits. The value is configurable and defaults to 100MB, matching the ES configuration.
The grouping is not as optimal as I would like, but we need to make sure that if two changes for one ID are in the queue, only the latest one is applied.
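The flush steps above can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `updateRequest`, `batch`, `groupByRoles` and `splitBySize` are hypothetical names, role descriptors are compared as serialized JSON strings, and the request size is approximated by raw byte lengths.

```go
package main

import (
	"fmt"
	"sort"
)

// updateRequest is a hypothetical queue entry: an ID plus the role
// descriptors proposed for it, serialized as JSON.
type updateRequest struct {
	ID    string
	Roles string
}

// batch is one update request: many IDs sharing one role set.
type batch struct {
	Roles string
	IDs   []string
}

// groupByRoles keeps only the latest request per ID, then groups the IDs
// by identical role descriptors. Grouping by string equality of the JSON
// is an assumption made for brevity.
func groupByRoles(queue []updateRequest) []batch {
	latest := make(map[string]string, len(queue))
	for _, req := range queue {
		latest[req.ID] = req.Roles // later entries overwrite earlier ones
	}
	byRoles := make(map[string][]string)
	for id, roles := range latest {
		byRoles[roles] = append(byRoles[roles], id)
	}
	// Sort keys and IDs so the result is deterministic.
	keys := make([]string, 0, len(byRoles))
	for k := range byRoles {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	batches := make([]batch, 0, len(keys))
	for _, k := range keys {
		ids := byRoles[k]
		sort.Strings(ids)
		batches = append(batches, batch{Roles: k, IDs: ids})
	}
	return batches
}

// splitBySize breaks a batch into smaller ones so that the role descriptors
// plus the IDs of each part stay within maxBytes where possible. A part
// always carries at least one ID.
func splitBySize(b batch, maxBytes int) []batch {
	var out []batch
	cur := batch{Roles: b.Roles}
	size := len(b.Roles)
	for _, id := range b.IDs {
		if len(cur.IDs) > 0 && size+len(id) > maxBytes {
			out = append(out, cur)
			cur = batch{Roles: b.Roles}
			size = len(b.Roles)
		}
		cur.IDs = append(cur.IDs, id)
		size += len(id)
	}
	if len(cur.IDs) > 0 {
		out = append(out, cur)
	}
	return out
}

func main() {
	queue := []updateRequest{
		{ID: "key-1", Roles: `{"a":{}}`},
		{ID: "key-2", Roles: `{"a":{}}`},
		{ID: "key-1", Roles: `{"b":{}}`}, // supersedes the first entry
	}
	for _, b := range groupByRoles(queue) {
		for _, part := range splitBySize(b, 64) {
			fmt.Printf("%s -> %v\n", part.Roles, part.IDs)
		}
	}
}
```

Deduplicating per ID before grouping is what guarantees the "only the latest is applied" property; the price is that two IDs only share a request when their final role sets are identical.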
The tricky part is cleaning up roles on ACK, as we don't have a way of persisting the desired roles.
This PR approaches the problem by marking roles as stale in the persisted API key instead of persisting them in the agent structure.
E.g. when we remove the network integration, we end up with an API key containing two sets of roles. A set of roles is a map with a GUID as the key.
e.g.
When removing mongo, we create the desired role descriptor set and add this one under a modified key by appending `-rdstale`, which also avoids naming conflicts. Sometimes the content of this changes and we end up with a reduced descriptor set (e.g. when disabling monitoring.logs but keeping monitoring.metrics, or disabling metrics for the mongo integration). Then on ACK we iterate through the keys of the API key's role map and leave out everything with the `rdstale` suffix.
To test this you need the latest ES. We are using the new Update API and a modified `GET apikey`, which also returns role descriptors. This allows us to rely on ES instead of persisting roles.
How to test this PR locally
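When checking the ACK cleanup described above, it can help to have the expected behaviour spelled out. The following Go sketch is an illustration only, not the code in this PR: `roleSet`, `markStale` and `dropStale` are hypothetical names, descriptor bodies are reduced to plain strings, and it assumes that the previously applied entries are the ones that receive the `-rdstale` suffix while the desired entries keep their own keys.

```go
package main

import (
	"fmt"
	"strings"
)

const staleSuffix = "-rdstale"

// roleSet maps a role key (a GUID in the description above) to its
// descriptor. Descriptor bodies are simplified to strings here.
type roleSet map[string]string

// markStale builds the role map stored on the API key at policy change.
// Assumption: every currently applied entry is kept under its key with the
// stale suffix appended, and the desired entries are added unchanged.
func markStale(current, desired roleSet) roleSet {
	merged := make(roleSet, len(current)+len(desired))
	for key, desc := range current {
		if !strings.HasSuffix(key, staleSuffix) {
			key += staleSuffix
		}
		merged[key] = desc
	}
	for key, desc := range desired {
		merged[key] = desc
	}
	return merged
}

// dropStale mirrors the ACK step: every entry whose key carries the stale
// suffix is left out, so only the desired roles remain.
func dropStale(roles roleSet) roleSet {
	clean := make(roleSet, len(roles))
	for key, desc := range roles {
		if strings.HasSuffix(key, staleSuffix) {
			continue
		}
		clean[key] = desc
	}
	return clean
}

func main() {
	current := roleSet{"guid-net": "network", "guid-mongo": "mongo"}
	desired := roleSet{"guid-net": "network"} // mongo integration removed

	pending := markStale(current, desired)
	fmt.Println(len(pending), "entries while the change is unacknowledged")

	acked := dropStale(pending)
	fmt.Println(len(acked), "entries after ACK")
}
```

Keeping the old entries under suffixed keys means the agent retains its previous permissions until it acknowledges the new policy, and the suffix keeps them from colliding with a desired entry that reuses the same key.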
Checklist
`CHANGELOG.next.asciidoc` or `CHANGELOG-developer.next.asciidoc`.
Related issues
Closes #1581