[Upgrade Assistant] Warn if cluster's node attributes and data tiers may not match #83800
Comments
Pinging @elastic/platform-deployment-management (Team:Deployment Management) |
@jakelandis Do you think we could add this warning to the Deprecation Info API? |
Should we be making value judgements for on-prem users based on an arbitrary node attribute, given that the attribute name could be anything (using |
@dakrone It sounds like you're concerned about providing guidance based on erroneous assumptions. Is that right? The way I read Ryan's suggestion, it sounded less about making assumptions, and more about pointing out possibilities and making suggestions. Bits that stood out to me in bold:
If we can craft a message that explains what ES has observed in the configuration and clearly explains how the user can determine whether it truly is a problem or not, then I think we can help some users and also reduce the risk of confusing others. |
@dakrone / @jakelandis thoughts on CJ's latest comment? |
It is not just a cloud convention, this is in numerous examples we've been advocating for since we introduced ILM: Yes it is absolutely possible a customer uses something other than |
Okay, I think this makes sense then, however, I would suggest a change:
+1 on it being a warning, but the validity is kind of confusing, given that it is still valid, just not recommended (attribute functionality is not going to be removed any time soon). So perhaps we can list a warning that the configuration is not recommended instead? |
OK, here's my proposed message:
@VimCommando Does this contain all of the information the user needs? Do we already have a docs page we can link the user to? @dakrone Do you know if this is pertinent to Cloud users or is this something Cloud will address automatically? |
This should be handled automatically by Cloud, which is part of my concern about it, since if a Cloud user were to see this, there is literally nothing they can do about it (short of clicking the "migrate to data tiers" button on their deployment configuration I think), which means it can be frustrating to have a warning they can't get rid of. For a regular warning, I think we should also do something like:
Which is more what I was thinking when I read "warn if cluster's node attributes and data tiers may not match". For the
But that's just my preference, so maybe there is a better way! |
Didn't y'all implement a blocklist on Cloud for settings which will be excluded from the Deprecation Info API output? I think that would take care of this case, right? Thanks for taking a pass at the copy. I think you and Ryan are probably the best folks to work out the details that need to go into the message, and then one of the writers can help with the phrasing once you're both in agreement. |
We already have CRITICAL level warnings for node.data, node.master, etc. since they are not supported in 8.0. We could provide better guidance for data tiers in the flyout, but what is there is correct. [1] We don't currently provide any warnings for custom node attributes...I agree that there is some value in a warning for custom
Also, I wonder if we should just expand the flyout documentation for the CRITICAL warning (since people will actually read that!) with data tier reference(s). (But that only impacts non-default clusters where a user explicitly configured node.data, node.master, etc.)
I think that would work to hide from Cloud. |
This sounds like a good warning to me, as long as the documentation it links to is clear. Right now we don't even mention this issue in the 8.0 migration guide. https://www.elastic.co/guide/en/elasticsearch/reference/8.0/migrating-8.0.html The biggest problem with suggesting you remove |
@dakrone / @jakelandis is it OK if I transfer this issue to Elasticsearch? If I'm understanding correctly, the proposed change will be made in the deprecation info API and there isn't any additional work needed on the UA side. |
I think that's okay, yes the change would be on the deprecation API side. |
Pinging @elastic/es-data-management (Team:Data Management) |
I am going to take this one, but I'm a little unclear on where we settled. Are we just updating the message for the existing deprecation we already have if |
We should probably update it to mention data tier node roles such as data_hot, data_content, etc., but that is a bit outside the scope of the immediate ask.
yes. More technically, if [1] {
"nodes" : {
"ckqKId2fRfy3CsZFME689w" : {
"attributes" : {
"logical_availability_zone" : "zone-0",
"server_name" : "instance-0000000001.b074bd77617f4ad48be1676cdc1e58ea",
"availability_zone" : "us-east4-a",
"xpack.installed" : "true",
"data" : "warm",
"instance_configuration" : "gcp.es.datawarm.n2.68x10x190",
"transform.node" : "false",
"region" : "unknown-region"
}
},
"ajMYjuF-TGCF0AqMNfQNWA" : {
"attributes" : {
"logical_availability_zone" : "zone-0",
"server_name" : "instance-0000000000.b074bd77617f4ad48be1676cdc1e58ea",
"availability_zone" : "us-east4-a",
"xpack.installed" : "true",
"data" : "hot",
"instance_configuration" : "gcp.es.datahot.n2.68x10x45",
"transform.node" : "true",
"region" : "unknown-region"
}
},
"5FwW_aurQ6S6lbXa7nI5Wg" : {
"attributes" : {
"logical_availability_zone" : "zone-0",
"server_name" : "instance-0000000002.b074bd77617f4ad48be1676cdc1e58ea",
"availability_zone" : "us-east4-a",
"xpack.installed" : "true",
"data" : "cold",
"instance_configuration" : "gcp.es.datacold.n2.68x10x190",
"transform.node" : "false",
"region" : "unknown-region"
}
}
}
} [2]
|
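A minimal sketch, in Python, of how a response shaped like the one quoted above could be scanned for a custom data attribute. The function name nodes_with_data_attribute, the trimmed sample, and the third node ID are hypothetical and used only for illustration; this is not how the Deprecation Info API is implemented.

```python
import json

# Trimmed sample mirroring the shape of the response quoted above.
# "node-without-data-attr" is a made-up ID added to show the negative case.
SAMPLE = json.loads("""
{
  "nodes": {
    "ckqKId2fRfy3CsZFME689w": {"attributes": {"data": "warm", "xpack.installed": "true"}},
    "ajMYjuF-TGCF0AqMNfQNWA": {"attributes": {"data": "hot", "xpack.installed": "true"}},
    "node-without-data-attr": {"attributes": {"xpack.installed": "true"}}
  }
}
""")


def nodes_with_data_attribute(response):
    """Return {node_id: value} for every node that sets a custom `data` attribute."""
    found = {}
    for node_id, info in response.get("nodes", {}).items():
        value = info.get("attributes", {}).get("data")
        if value is not None:
            found[node_id] = value
    return found


if __name__ == "__main__":
    for node_id, value in sorted(nodes_with_data_attribute(SAMPLE).items()):
        print(f"{node_id}: data={value}")
```

Nodes that carry the attribute are the candidates for the warning; whether the configuration is actually a problem still depends on how the cluster allocates shards.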
We ought to have this on the 8.x line as well right? |
yeah, that probably makes sense to include in 8.x |
This adds a warning-level deprecation if a user has set the node.attr.data setting, since it is a sign that they are trying to create a hot/warm setup in the way that is no longer supported. Closes #83800
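For readers who want the shape of such a check, here is a rough Python sketch, assuming node settings are available as a flat dictionary. The function name check_node_attr_data and the message wording are hypothetical; the actual change lives in Elasticsearch's Java code and may differ in every detail.

```python
from typing import Optional


def check_node_attr_data(settings: dict) -> Optional[dict]:
    """Return a warning-level issue if `node.attr.data` is set, otherwise None."""
    value = settings.get("node.attr.data")
    if value is None:
        return None
    return {
        "level": "warning",
        # Hypothetical wording, not the message shipped by Elasticsearch.
        "message": "Setting node.attr.data is not recommended",
        "details": (
            f"node.attr.data is set to [{value}]; consider data tier node roles "
            "such as data_hot, data_warm, or data_cold instead."
        ),
    }
```

The check keys only on the presence of the setting, matching the commit message: the attribute is treated as a hint of an attribute-based hot/warm setup, not as proof of a misconfiguration.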
shouldn't the docs perhaps be updated here: https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-advanced-node-scheduling.html#k8s-hot-warm-topologies |
Thanks for reporting @jeacott1 ! I've opened a new issue to fix the documentation here: elastic/cloud-on-k8s#6196 |
Describe the feature:
If a node has node.data: true defined and includes any node.attributes.data value, list a warning the configuration may not be valid for data tiers in 8.0.
The presence of a node.attribute.data value strongly indicates a hot/warm or tiered architecture. It is completely valid to run all data tiers on the same nodes if the cluster is not used for timeseries data.
Describe a specific use case for the feature:
Historically Elasticsearch has recommended using node.attributes.data to identify hot, warm or cold nodes.
In 7.9 we introduced data tiers: https://www.elastic.co/guide/en/elasticsearch/reference/7.16/data-tiers.html
When upgrading to 8.0 the legacy node.data is no longer allowed: #66409
In 7.10 through 7.17 any node with node.data: true is assigned all 5 data tiers: data_content, data_hot, data_warm, data_cold and data_frozen. This can conflict with the preexisting node attribute, leading to shards being assigned to unexpected nodes. This includes system indices which could end up on cold/frozen tiers and cause unexpected cluster behavior.
In 8.0 we rely on _tier_preference so it is critical these are accurate in the nodes' elasticsearch.yml files: #76147
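The conflict described above can be illustrated with a small Python sketch. It assumes settings are given as a flat dictionary with node.data as a Python boolean, and it uses the setting name node.attr.data from the linked change; the helper names implied_tiers and may_mismatch are hypothetical.

```python
ALL_TIERS = ["data_content", "data_hot", "data_warm", "data_cold", "data_frozen"]


def implied_tiers(settings):
    """Tiers a 7.10-7.17 node is assigned when it sets the legacy node.data: true."""
    return list(ALL_TIERS) if settings.get("node.data") is True else []


def may_mismatch(settings):
    """True if the node holds every tier yet also labels itself with a data attribute."""
    return bool(implied_tiers(settings)) and "node.attr.data" in settings
```

For example, may_mismatch({"node.data": True, "node.attr.data": "warm"}) is true: the node advertises itself as warm through the attribute while still being eligible for every tier, which is the situation the warning is meant to surface.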