Fleet Server hangs on the migration and cannot fully start up and accept requests from Agents.
The migration will run in a loop when there are version conflicts on agents it tried to update, which can happen constantly at scale when there are frequent checkins from the online agents.
This usually resolves itself eventually as the migration code races the Agent checkins. In this case, however, it never completes because the migration query returns all agents, even the ones that are already on the new schema.
Notice that the migrated agent documents have an extra default key between outputs and api_key (outputs > default > api_key). When doing a "must not exists" query in ES, ES finds no indexed fields under outputs because default is not indexed. This results in all agents being returned every time.
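To make the mismatch concrete, here is a minimal Go sketch that models the behaviour described above. It is a simplified stand-in for how ES evaluates `exists`, not ES itself, and the mapped path `outputs.api_key` is an assumption inferred from the description, since the actual mapping is not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// existsUnder is a simplified model of an "exists" query on an object path:
// it matches only if some field at or below that path was indexed.
func existsUnder(indexed []string, path string) bool {
	for _, f := range indexed {
		if f == path || strings.HasPrefix(f, path+".") {
			return true
		}
	}
	return false
}

// indexedFields keeps only the document fields that appear in the mapping,
// modelling a mapping in which unmapped sub-objects are not indexed.
func indexedFields(mapped map[string]bool, docFields []string) []string {
	var out []string
	for _, f := range docFields {
		if mapped[f] {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	// Assumed mapping: api_key is mapped directly under outputs.
	mapped := map[string]bool{"outputs.api_key": true}
	// Migrated document: api_key sits one level deeper, under "default".
	doc := []string{"outputs.default.api_key"}

	indexed := indexedFields(mapped, doc)
	// A "must not exists" on outputs selects the document when nothing is indexed there.
	selected := !existsUnder(indexed, "outputs")
	fmt.Println("migrated agent still selected:", selected)
}
```

In this model the already migrated document is selected again on every pass, which matches the looping behaviour reported above.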
I see a few paths forward to solve this:
1. Change the query to look for agents that have the old fields present rather than agents that are missing the new fields. I think this would be a good idea anyway, in case we ever have a case where there is no output but there is also no old field either.
2. Fix the mappings by using a dynamic_template to fit the schema of the documents correctly.
   - This would require fewer changes than the next option, but I don't like it because it results in more mapped fields than necessary.
3. Change the shape of the documents to an array and use a nested field mapping.
   - This requires more changes to Fleet Server and we're getting close to the 8.5 release.
   - Results in fewer mapped fields, so it is the better long-term option.
   - Migrating to this later will be more overall work and risk IMO.
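As a rough illustration of the array option, the sketch below converts a map keyed by output name into an array whose entries carry the name as a regular field. The type and field names (`Output`, `NamedOutput`, `name`, `api_key`) are hypothetical and only loosely follow the description above; the real Fleet Server model may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// Output is a hypothetical per-output record; field names are illustrative.
type Output struct {
	APIKey string `json:"api_key"`
}

// NamedOutput carries the former map key as a regular field, so a mapping
// would only need a fixed set of fields regardless of the output names.
type NamedOutput struct {
	Name   string `json:"name"`
	APIKey string `json:"api_key"`
}

// toArray converts the keyed shape into an array, sorted by name so the
// result is deterministic.
func toArray(outputs map[string]Output) []NamedOutput {
	names := make([]string, 0, len(outputs))
	for name := range outputs {
		names = append(names, name)
	}
	sort.Strings(names)

	arr := make([]NamedOutput, 0, len(outputs))
	for _, name := range names {
		arr = append(arr, NamedOutput{Name: name, APIKey: outputs[name].APIKey})
	}
	return arr
}

func main() {
	arr := toArray(map[string]Output{"default": {APIKey: "example-key"}})
	fmt.Printf("%+v\n", arr)
}
```

With this shape, the output name is data rather than a field path, which is presumably why fewer mapped fields are needed.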
In summary, I think it would be best to do 1 and 3.
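For the query change, a minimal sketch of a request body that selects agents still carrying an old field could look like the following. The field name `default_api_key` is a placeholder assumption for "an old field", and the helper `oldFieldsQuery` is hypothetical, not the existing migration code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// oldFieldsQuery builds a query body matching documents in which at least
// one of the given (old-schema) fields exists.
func oldFieldsQuery(fields []string) ([]byte, error) {
	should := make([]map[string]interface{}, 0, len(fields))
	for _, f := range fields {
		should = append(should, map[string]interface{}{
			"exists": map[string]interface{}{"field": f},
		})
	}
	body := map[string]interface{}{
		"query": map[string]interface{}{
			"bool": map[string]interface{}{
				"should":               should,
				"minimum_should_match": 1,
			},
		},
	}
	return json.Marshal(body)
}

func main() {
	// "default_api_key" is a placeholder for an old-schema field name.
	q, err := oldFieldsQuery([]string{"default_api_key"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(q))
}
```

Because this matches on fields that the migration removes, an agent stops matching once it has been migrated, independent of how the new fields are mapped.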
The migration query is at fleet-server/internal/pkg/dl/migration.go, line 226 in b267e7a. It returns all agents because the mappings for this field do not match the shape of the migrated agent documents.