Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store DataTier Preference directly on IndexMetadata #78668

Merged

Conversation

original-brownbear
Copy link
Member

The data tier preference is very expensive to parse out of the setting string
repeatedly for large number of indices when using it in the data tier allocation decider.

=> as done with other index settings relevant to allocation, this commit moves the data tier
preference to a field in IndexMetadata. The required moving the DataTier class itself to
server (as well as moving the setting handling from the allocator into it + moving some additional searchable snapshot
setting related logic to server). In a follow-up we can look into making the setting a list setting to remove the
duplication around turning the string value into a list in various places.

The data tier preference is very expensive to parse out of the setting string
repeatedly for large number of indices when using it in the data tier allocation decider.

=> as done with other index settings relevant to allocation, this commit moves the data tier
preference to a field in `IndexMetadata`. The required moving the `DataTier` class itself to
`server`. In a follow-up we can look into making the setting a list setting to remove the
duplication around turning the string value into a list in various places.
@original-brownbear original-brownbear added >non-issue :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) v8.0.0 v7.16.0 labels Oct 5, 2021
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Oct 5, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I left just a few small comments.

@@ -100,12 +99,12 @@ public void testIndexPrefer() {
d = decider.canAllocate(shard, node, allocation);
assertThat(node.toString(), d.type(), equalTo(Decision.Type.NO));
assertThat(node.toString(), d.getExplanation(),
containsString("index has a preference for tiers [data_warm,data_cold], " +
containsString("index has a preference for tiers [data_warm, data_cold], " +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a nit, but I think we don't accept extra whitespace in this setting (we just use String#split to parse it) so I don't think we should include it when reporting it back to users either. If we do, someone will copy-paste it as-is and get a message saying invalid tier names found in [data_hot, data_warm] allowed values are [data_hot, data_warm, ... which is going to be pretty confusing.

(we should probably fix that error too, I guess almost nobody actually sets these things by hand or else we'd have had a report about it by now)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We wouldn't get here with an invalid tier name because of the setting validation that would catch it early. The validate logic that you mention in org.elasticsearch.cluster.routing.allocation.DataTier.DataTierSettingValidator#validate(java.lang.String) will throw an exception that has the original input string in the "invalid tier names found ". That's why I figured this was ok?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh this is exactly the confusion I mean: we're saying that this index has a tier preference of data_warm, data_cold (with a space) but that's not a valid value for the tier preference setting since we just split at commas, and data_cold (with a leading space) isn't the name of a tier.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bah Github does overly clever stuff with whitespace so my previous comment makes no sense. Reformatting:

Heh this is exactly the confusion I mean: we're saying that this index has a
tier preference of `data_warm, data_cold` (with a space) but that's not a valid
value for the tier preference setting since we just split at commas, and
` data_cold` (with a leading space) isn't the name of a tier.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point :) I made it use the old style formatting by concatenating strings again (could have used the setting value as well, but I figured this was less-error-prone/more-consistent on the off-chance that we made the setting buggy at some point) and added some early breakouts for the no-debug case so we don't get slower from that.

@@ -115,6 +130,15 @@ public static boolean isFrozenNode(final Set<DiscoveryNodeRole> roles) {
return roles.contains(DiscoveryNodeRole.DATA_FROZEN_NODE_ROLE) || roles.contains(DiscoveryNodeRole.DATA_ROLE);
}

public static List<String> parseTierList(String tiers) {
if (Strings.hasText(tiers) == false) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit weird that we treat all-whitespace as empty but otherwise we're whitespace-sensitive. (Acking that this is how it was before too, no action required)

@@ -60,7 +76,6 @@ public static boolean validTierName(String tierName) {
/**
* Based on the provided target tier it will return a comma separated list of preferred tiers.
* ie. if `data_cold` is the target tier, it will return `data_cold,data_warm,data_hot`
* This is usually used in conjunction with {@link DataTierAllocationDecider#TIER_PREFERENCE_SETTING}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think we can keep this, the setting is still in scope here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ brought this back

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

u sure? I don't see it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now :) Sorry accidentally put this on the top level class.

@@ -47,7 +47,7 @@ public void testDefaultIndexAllocateToContent() {
client().admin().indices().prepareCreate(index).setWaitForActiveShards(0).get();

Settings idxSettings = client().admin().indices().prepareGetIndex().addIndices(index).get().getSettings().get(index);
assertThat(DataTierAllocationDecider.TIER_PREFERENCE_SETTING.get(idxSettings), equalTo(DataTier.DATA_CONTENT));
assertThat(DataTier.TIER_PREFERENCE_SETTING.get(idxSettings), equalTo(DataTier.DATA_CONTENT));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename this test suite to DataTierAllocationDeciderIT since DataTier isn't in this package any more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ renamed

@original-brownbear
Copy link
Member Author

Thanks David, all addressed :)

@original-brownbear
Copy link
Member Author

@DaveCTurner alright explain format changes reverted now :) should be good for another look now I think.

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Armin, this looks good overall. Left two comments on how to handle the field.

* {@link String#hashCode()} works.
*/
@Nullable // since lazy-loaded
private List<String> tierPreference;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should this still mark this volatile to avoid unsafe publishing. Neither List.of nor string constructor promises to only have finalized fields and while it sort of looks like that is the case today, it could change in JDK updates. Also, a slight modification to DataTier.parseTierPrefence could cause this without anyone noticing.

The difference to String.hashCode is that it is a primitive with no references out (plus it is in the JDK so could special handle it if needed).

@@ -574,6 +575,26 @@ public Settings getSettings() {
return this.aliases;
}

/**
* Lazy loaded cache for tier preference setting. We can't eager load this setting because
* {@link IndexMetadataVerifier#convertSharedCacheTierPreference(IndexMetadata)} might not have acted on this index yet and thus the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative to lazy loading would be to store a null here if parsing the field fails when building and then parse the field every time when getTierPreference is called (or parse it to throw the exception and then if it succeeds throw another exception). That seems slightly better to me, since then the field is final.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ I think I like this option best

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented :)

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

List<String> tierPreference;
try {
tierPreference = DataTier.parseTierList(DataTier.TIER_PREFERENCE_SETTING.get(settings));
} catch (Exception e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this has to be an IllegalArgumentException so maybe we can catch that instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically yes, but we also call some other settings. I don't know if we want to bank on no changes to that?
I opted to be careful here mainly because this is a BwC thing to begin with and we could inadvertently introduce a bug like say some out of bounds exception with some specific old version of the setting and I wanted to avoid that. (we had a few similar BwC cases over the years that are hard to catch in tests up front)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, can we then assert that it is an IllegalArgumentException instead? Would like to avoid hiding some other exception from tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ aded

// BwC hack: the setting failed validation but it will be fixed in
// #IndexMetadataVerifier#convertSharedCacheTierPreference(IndexMetadata)} later so we just store a null
// to be able to build a temporary instance
tierPreference = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a test that we can build an IndexMetadata object based on a settings object with an illegal _tier_preference?

@@ -574,6 +580,15 @@ public Settings getSettings() {
return this.aliases;
}

public List<String> getTierPreference() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a simple test that getTierPreference delivers the right output?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ added both tests

@original-brownbear
Copy link
Member Author

Jenkins run elasticsearch-ci/part-2 (unrelated + known)

@original-brownbear
Copy link
Member Author

Thanks David + Henning!

@original-brownbear original-brownbear merged commit 7cb2d05 into elastic:master Oct 8, 2021
@original-brownbear original-brownbear deleted the data-tiers-on-index branch October 8, 2021 12:59
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Oct 8, 2021
The data tier preference is very expensive to parse out of the setting string
repeatedly for large number of indices when using it in the data tier allocation decider.

=> as done with other index settings relevant to allocation, this commit moves the data tier
preference to a field in `IndexMetadata`. The required moving the `DataTier` class itself to
`server`. In a follow-up we can look into making the setting a list setting to remove the
duplication around turning the string value into a list in various places.
original-brownbear added a commit that referenced this pull request Oct 8, 2021
The data tier preference is very expensive to parse out of the setting string
repeatedly for large number of indices when using it in the data tier allocation decider.

=> as done with other index settings relevant to allocation, this commit moves the data tier
preference to a field in `IndexMetadata`. The required moving the `DataTier` class itself to
`server`. In a follow-up we can look into making the setting a list setting to remove the
duplication around turning the string value into a list in various places.
wjp719 added a commit to wjp719/elasticsearch that referenced this pull request Oct 9, 2021
* master:
  Fix DataTierTests package and add a validation test (elastic#78880)
  Fix split package org.elasticsearch.common.xcontent (elastic#78831)
  Store DataTier Preference directly on IndexMetadata (elastic#78668)
  [DOCS] Fixes typo in calendar API example (elastic#78867)
  Improve Node Shutdown Observability (elastic#78727)
  Convert encrypted snapshot license object to LicensedFeature (elastic#78731)
  Revert "Make nodePaths() singular (elastic#72514)" (elastic#78801)
  Fix incorrect generic type in PolicyStepsRegistry (elastic#78628)
  [DOCS] Fixes ML get calendars API (elastic#78808)
  Implement GET API for System Feature Upgrades (elastic#78642)
  [TEST] More MetadataStateFormat tests (elastic#78577)
  Add support for rest compatibility headers to the HLRC (elastic#78490)
  Un-ignoring tests after backporting fix (elastic#78830)
  Add node REPLACE shutdown implementation (elastic#76247)
  Wrap VersionPropertiesLoader in a BuildService to decouple build logic projects (elastic#78704)
  Adjust /_cat/templates not to request all metadata (elastic#78829)
  [DOCS] Fixes ML get scheduled events API (elastic#78809)
  Enable exit on out of memory error (elastic#71542)

# Conflicts:
#	server/src/main/java/org/elasticsearch/cluster/metadata/DataStream.java
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Oct 11, 2021
* upstream/master: (250 commits)
  [Transform] HLRC cleanups (elastic#78909)
  [ML] Make ML indices hidden when the node becomes master (elastic#77416)
  Introduce a Few Settings Singleton Instances (elastic#78897)
  Simplify TestCluster extraJar configuration (elastic#78837)
  Add @OverRide annotations to methods in EnrichPlugin class (elastic#76873)
  Add v7 restCompat for invalidating API key with the id field (elastic#78664)
  EQL: Refine repeatable queries (elastic#78895)
  Fix DataTierTests package and add a validation test (elastic#78880)
  Fix split package org.elasticsearch.common.xcontent (elastic#78831)
  Store DataTier Preference directly on IndexMetadata (elastic#78668)
  [DOCS] Fixes typo in calendar API example (elastic#78867)
  Improve Node Shutdown Observability (elastic#78727)
  Convert encrypted snapshot license object to LicensedFeature (elastic#78731)
  Revert "Make nodePaths() singular (elastic#72514)" (elastic#78801)
  Fix incorrect generic type in PolicyStepsRegistry (elastic#78628)
  [DOCS] Fixes ML get calendars API (elastic#78808)
  Implement GET API for System Feature Upgrades (elastic#78642)
  [TEST] More MetadataStateFormat tests (elastic#78577)
  Add support for rest compatibility headers to the HLRC (elastic#78490)
  Un-ignoring tests after backporting fix (elastic#78830)
  ...

# Conflicts:
#	server/src/main/java/org/elasticsearch/ingest/IngestService.java
#	server/src/test/java/org/elasticsearch/ingest/IngestServiceTests.java
@original-brownbear original-brownbear restored the data-tiers-on-index branch April 18, 2023 20:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) >non-issue Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v7.16.0 v8.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants