Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Archive unknown or invalid settings on updates #28888

Merged
Original file line number Diff line number Diff line change
Expand Up @@ -19,13 +19,20 @@

package org.elasticsearch.action.admin.cluster.settings;

import org.apache.logging.log4j.Logger;
import org.apache.logging.log4j.message.ParameterizedMessage;
import org.apache.logging.log4j.util.Supplier;
import org.elasticsearch.cluster.ClusterState;
import org.elasticsearch.cluster.block.ClusterBlocks;
import org.elasticsearch.cluster.metadata.MetaData;
import org.elasticsearch.common.collect.Tuple;
import org.elasticsearch.common.settings.ClusterSettings;
import org.elasticsearch.common.settings.Settings;

import java.util.Map;

import static org.elasticsearch.cluster.ClusterState.builder;
import static org.elasticsearch.common.settings.AbstractScopedSettings.ARCHIVED_SETTINGS_PREFIX;

/**
* Updates transient and persistent cluster state settings if there are any changes
Expand All @@ -48,15 +55,34 @@ synchronized Settings getPersistentUpdate() {
return persistentUpdates.build();
}

synchronized ClusterState updateSettings(final ClusterState currentState, Settings transientToApply, Settings persistentToApply) {
synchronized ClusterState updateSettings(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get your newlines under control my friend

final ClusterState currentState, final Settings transientToApply, final Settings persistentToApply, final Logger logger) {
boolean changed = false;
Settings.Builder transientSettings = Settings.builder();
transientSettings.put(currentState.metaData().transientSettings());
changed |= clusterSettings.updateDynamicSettings(transientToApply, transientSettings, transientUpdates, "transient");

/*
* Our cluster state could have unknown or invalid settings that are known and valid in a previous version of Elasticsearch. We can
* end up in this situation during a rolling upgrade where the previous version will infect the current version of Elasticsearch
* with settings that the current version either no longer knows about or now considers to have invalid values. When the current
* version of Elasticsearch becomes infected with a cluster state containing such settings, we need to skip validating such settings
* and instead archive them. Consequently, for the current transient and persistent settings in the cluster state we do the
* following:
* - split existing settings instance into two with the known and valid settings in one, and the unknown or invalid in another
* (note that existing archived settings are included in the known and valid settings)
* - validate the incoming settings update combined with the existing known and valid settings
* - merge in the archived unknown or invalid settings
*/
final Tuple<Settings, Settings> partitionedTransientSettings =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have 2 methods Settings getValidSettings(Settings) and Settings getInvalidSetting(Settings) that would make it simpler to see what is valid and not instead of a tuple?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a change. I kept the tuple but clarified its usage. I hope this helps.

partitionKnownAndValidSettings(currentState.metaData().transientSettings(), "transient", logger);
final Settings knownAndValidTransientSettings = partitionedTransientSettings.v1();
final Settings unknownOrInvalidTransientSettings = partitionedTransientSettings.v2();
final Settings.Builder transientSettings = Settings.builder().put(knownAndValidTransientSettings);
changed |= clusterSettings.updateDynamicSettings(transientToApply, transientSettings, transientUpdates, "transient");

Settings.Builder persistentSettings = Settings.builder();
persistentSettings.put(currentState.metaData().persistentSettings());
final Tuple<Settings, Settings> partitionedPersistentSettings =
partitionKnownAndValidSettings(currentState.metaData().persistentSettings(), "persistent", logger);
final Settings knownAndValidPersistentSettings = partitionedPersistentSettings.v1();
final Settings unknownOrInvalidPersistentSettings = partitionedPersistentSettings.v2();
final Settings.Builder persistentSettings = Settings.builder().put(knownAndValidPersistentSettings);
changed |= clusterSettings.updateDynamicSettings(persistentToApply, persistentSettings, persistentUpdates, "persistent");

final ClusterState clusterState;
Expand All @@ -69,8 +95,8 @@ synchronized ClusterState updateSettings(final ClusterState currentState, Settin
clusterSettings.validate(persistentFinalSettings, true);

MetaData.Builder metaData = MetaData.builder(currentState.metaData())
.persistentSettings(persistentFinalSettings)
.transientSettings(transientFinalSettings);
.transientSettings(Settings.builder().put(transientFinalSettings).put(unknownOrInvalidTransientSettings).build())
.persistentSettings(Settings.builder().put(persistentFinalSettings).put(unknownOrInvalidPersistentSettings).build());

ClusterBlocks.Builder blocks = ClusterBlocks.builder().blocks(currentState.blocks());
boolean updatedReadOnly = MetaData.SETTING_READ_ONLY_SETTING.get(metaData.persistentSettings())
Expand Down Expand Up @@ -102,5 +128,46 @@ synchronized ClusterState updateSettings(final ClusterState currentState, Settin
return clusterState;
}

/**
* Partitions the settings into those that are known and valid versus those that are unknown or invalid. The resulting tuple contains
* the known and valid settings in the first component and the unknown or invalid settings in the second component. Note that archived
* settings contained in the settings to partition are included in the first component.
*
* @param settings the settings to partition
* @param settingsType a string to identify the settings (for logging)
* @param logger a logger to sending warnings to
* @return the partitioned settings
*/
private Tuple<Settings, Settings> partitionKnownAndValidSettings(
final Settings settings, final String settingsType, final Logger logger) {
final Settings existingArchivedSettings = settings.filter(k -> k.startsWith(ARCHIVED_SETTINGS_PREFIX));
final Settings settingsExcludingExistingArchivedSettings =
settings.filter(k -> k.startsWith(ARCHIVED_SETTINGS_PREFIX) == false);
final Settings settingsWithUnknownOrInvalidArchived = clusterSettings.archiveUnknownOrInvalidSettings(
settingsExcludingExistingArchivedSettings,
e -> logUnknownSetting(settingsType, e, logger),
(e, ex) -> logInvalidSetting(settingsType, e, ex, logger));
return Tuple.tuple(
Settings.builder()
.put(settingsWithUnknownOrInvalidArchived.filter(k -> k.startsWith(ARCHIVED_SETTINGS_PREFIX) == false))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this distinction a bit confusing - why do we keep the "old" invalid settings but remove the new ones? In the next iteration will just be including the newly archived setting in the previous ones. I'm inclined to say - either remove all archived setting (both new and old) when validating or keep them all. WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bleskes Consider a settings update to remove archived persistent settings:

{
  "persistent": {
    "archived.*": null
  }
}

If there are existing persistent archived settings in the cluster state, these should be removed by this update. The strategy that we use to handle unknown or invalid settings is to add them at the end with the archived prefix. If we do the same for existing archived settings, the application of the above settings update will not remove them and they will remain in the cluster state. Does that help?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand you to say that you want the invalid settings not to be removed by the incoming request you gave. If so, nothing we do here will be "clean" (for example, after you called that command there may be archived settings in the cluster state, which is not what people expect). Not sure it's worth all the complexity. If you prefer it this way (I presume you do), I'm fine with keeping as is but please write an explicit test to demonstrate this behavior.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for example, after you called that command there may be archived settings in the cluster state, which is not what people expect

I disagree with this. If I list my cluster settings and I see archived.foo, archived.bar, and invalid.setting, I do not expect PUT archived.*: null (lazy) to remove invalid.setting. Hence, the choice that I made here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a test in 3298664.

.put(existingArchivedSettings)
.build(),
settingsWithUnknownOrInvalidArchived.filter(k -> k.startsWith(ARCHIVED_SETTINGS_PREFIX)));
}

private void logUnknownSetting(final String settingType, final Map.Entry<String, String> e, final Logger logger) {
logger.warn("ignoring existing unknown {} setting: [{}] with value [{}]; archiving", settingType, e.getKey(), e.getValue());
}

private void logInvalidSetting(
final String settingType, final Map.Entry<String, String> e, final IllegalArgumentException ex, final Logger logger) {
logger.warn(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

man newlines?!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about painting it yellow. 😛

(Supplier<?>)
() -> new ParameterizedMessage("ignoring existing invalid {} setting: [{}] with value [{}]; archiving",
settingType,
e.getKey(),
e.getValue()),
ex);
}

}
Original file line number Diff line number Diff line change
Expand Up @@ -180,7 +180,8 @@ public void onFailure(String source, Exception e) {

@Override
public ClusterState execute(final ClusterState currentState) {
ClusterState clusterState = updater.updateSettings(currentState, request.transientSettings(), request.persistentSettings());
ClusterState clusterState =
updater.updateSettings(currentState, request.transientSettings(), request.persistentSettings(), logger);
changed = clusterState != currentState;
return clusterState;
}
Expand Down
Loading