Remove latestSettings cache from KNNSettings #727

Merged
merged 4 commits into from
Jan 19, 2023

Conversation

jmazanec15
Member

@jmazanec15 jmazanec15 commented Jan 13, 2023

Description

Removes the latestSettings cache from the KNNSettings class. The latestSettings cache is updated by a consumer whenever the corresponding settings are updated.

KNNSettings.getSettingValue would pull from this cache and then fall back to the default if the value was not present. However, when the settings are set via the opensearch.yml config file, they are set before any consumer functions can be registered, so the config values never get put in the cache. As a result, getSettingValue always returns the default instead of the value specified in the config file.
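
The failure mode described above can be illustrated with a minimal, self-contained sketch. It is a simplified model, not the OpenSearch settings API: `SettingsStore`, `CachedReader`, the key `example.limit`, and the values are all hypothetical names chosen for illustration. The only assumption carried over from the description is that config-file values are loaded before any update consumer exists.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Simplified, hypothetical model; none of these names come from OpenSearch or k-NN.
class StaleCacheSketch {
    static final String LIMIT_KEY = "example.limit"; // illustrative key
    static final String DEFAULT_LIMIT = "50%";       // illustrative default

    /** Stand-in for node settings: initial values come from the config file. */
    static class SettingsStore {
        private final Map<String, String> committed = new HashMap<>();
        private final Map<String, Consumer<String>> consumers = new HashMap<>();

        SettingsStore(Map<String, String> fromConfigFile) {
            // Loaded at startup: no consumer is registered yet, so none fires.
            committed.putAll(fromConfigFile);
        }

        void addUpdateConsumer(String key, Consumer<String> consumer) {
            consumers.put(key, consumer);
        }

        /** A dynamic update notifies the consumer and stores the value. */
        void update(String key, String value) {
            Consumer<String> consumer = consumers.get(key);
            if (consumer != null) {
                consumer.accept(value);
            }
            committed.put(key, value);
        }

        String get(String key, String defaultValue) {
            return committed.getOrDefault(key, defaultValue);
        }
    }

    /** Old approach: a side cache that is filled only by update consumers. */
    static class CachedReader {
        private final Map<String, String> latestSettings = new HashMap<>();

        CachedReader(SettingsStore store) {
            store.addUpdateConsumer(LIMIT_KEY, v -> latestSettings.put(LIMIT_KEY, v));
        }

        String getLimit() {
            return latestSettings.getOrDefault(LIMIT_KEY, DEFAULT_LIMIT);
        }
    }

    public static void main(String[] args) {
        SettingsStore store = new SettingsStore(Map.of(LIMIT_KEY, "10%"));
        CachedReader reader = new CachedReader(store);

        System.out.println(reader.getLimit());                    // 50% (config value missed)
        System.out.println(store.get(LIMIT_KEY, DEFAULT_LIMIT));  // 10% (read from the store)

        store.update(LIMIT_KEY, "20%");
        System.out.println(reader.getLimit());                    // 20% (dynamic update seen)
    }
}
```

In this model the side cache only becomes correct after the first dynamic update, which matches the symptom reported for values set in opensearch.yml.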

To fix this, this change refactors getSettingValue to pull from the cluster settings and removes the latestSettings cache. However, because the dynamicCacheSettings have consumers that rebuild the NativeMemoryCacheManager cache when they are changed, the logic for passing the parameters to rebuild this cache had to change as well. Previously, the NativeMemoryCacheManager got its parameters from the settings. However, because we are switching "getSettingValue" to get the values from the ClusterSettings, the new cluster settings will not yet be committed to the cluster state when the settings update consumer is called (it gets called in this chain starting here). To work around this, this change refactors the cache to accept parameters as arguments.
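
The ordering problem in the paragraph above can be sketched in isolation. This is a hypothetical model, not the OpenSearch or k-NN code: `SettingsStore`, `CacheManager`, and the key `example.cache.max_weight` are invented for illustration, and the store is written so that consumers run before the new value is committed, which is the behavior the description reports.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical model of "consumer runs before the new value is committed".
class ConsumerOrderingSketch {
    static final String KEY = "example.cache.max_weight"; // illustrative key
    static final long DEFAULT_WEIGHT = 100L;              // illustrative default

    static class SettingsStore {
        private final Map<String, Long> committed = new HashMap<>();
        private final Map<String, Consumer<Long>> consumers = new HashMap<>();

        void addUpdateConsumer(String key, Consumer<Long> consumer) {
            consumers.put(key, consumer);
        }

        /** Assumed ordering: notify the consumer first, commit second. */
        void applyUpdate(String key, long newValue) {
            Consumer<Long> consumer = consumers.get(key);
            if (consumer != null) {
                consumer.accept(newValue);
            }
            committed.put(key, newValue);
        }

        long get(String key, long defaultValue) {
            return committed.getOrDefault(key, defaultValue);
        }
    }

    /** Stand-in for a cache manager that is rebuilt when a setting changes. */
    static class CacheManager {
        long maxWeight = DEFAULT_WEIGHT;

        void rebuild(long newMaxWeight) {
            this.maxWeight = newMaxWeight;
        }
    }

    /** Consumer ignores its argument and re-reads the store: sees the old value. */
    static long rebuildByReadingStore(long updated) {
        SettingsStore store = new SettingsStore();
        CacheManager cache = new CacheManager();
        store.addUpdateConsumer(KEY, ignored -> cache.rebuild(store.get(KEY, DEFAULT_WEIGHT)));
        store.applyUpdate(KEY, updated);
        return cache.maxWeight;
    }

    /** Consumer passes the new value along as an argument: sees the update. */
    static long rebuildFromArgument(long updated) {
        SettingsStore store = new SettingsStore();
        CacheManager cache = new CacheManager();
        store.addUpdateConsumer(KEY, cache::rebuild);
        store.applyUpdate(KEY, updated);
        return cache.maxWeight;
    }

    public static void main(String[] args) {
        System.out.println(rebuildByReadingStore(250L)); // 100 (stale)
        System.out.println(rebuildFromArgument(250L));   // 250
    }
}
```

Passing the value into the rebuild call removes the dependency on when the store commits, which is the reason given for having the cache accept its parameters as arguments.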

In addition, I added a couple of unit tests and did some refactoring to get the tests to pass.

Issues Resolved

#585

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@jmazanec15 jmazanec15 added the Bug Fixes label (Changes to a system or product designed to handle a programming bug/glitch) Jan 13, 2023
@jmazanec15 jmazanec15 requested a review from a team January 13, 2023 17:44
@codecov-commenter

codecov-commenter commented Jan 13, 2023

Codecov Report

Merging #727 (3a0277d) into main (519bf1b) will increase coverage by 0.16%.
The diff coverage is 97.72%.

@@             Coverage Diff              @@
##               main     #727      +/-   ##
============================================
+ Coverage     84.43%   84.60%   +0.16%     
+ Complexity     1072     1069       -3     
============================================
  Files           152      152              
  Lines          4356     4364       +8     
  Branches        389      390       +1     
============================================
+ Hits           3678     3692      +14     
+ Misses          498      492       -6     
  Partials        180      180              
Impacted Files Coverage Δ
...rch/knn/index/memory/NativeMemoryCacheManager.java 95.20% <96.00%> (+0.70%) ⬆️
...ain/java/org/opensearch/knn/index/KNNSettings.java 84.37% <100.00%> (+3.49%) ⬆️


*
* Modifications Copyright OpenSearch Contributors. See
* GitHub history for details.
*/
Member

Please fix the license header to use the 2-line format.

NativeMemoryCacheManager.getInstance().rebuildCache();
});
}
ByteSizeValue maxCacheWeight = getSettingValue(KNN_MEMORY_CIRCUIT_BREAKER_LIMIT);
Member

nit: can we change the type of maxCacheWeight to long? That way it will be consistent with the expiryTimeInMinutes setting. Right now we have a mix of a primitive and a getter on a complex object. It seems maxCacheWeight is not used anywhere else in this method, so the ByteSizeValue type is not required.

Member Author

Sure, will update.

executor.execute(() -> {
cache.invalidateAll();
initialize();
if (cache != null) {
Member

Curious: did we hit an NPE with the cache variable, or is this more about being on the safe side?

Member Author

I don't think this can be reached. The only time the cache would be null is if rebuildCache were called before the constructor, but I don't think that can happen. I think I'll remove this.

);
}

private void initialize(boolean isWeightLimited, long maxWeight, boolean isExpirationLimited, long expiryTimeInMin) {
Member

nit: I suggest we use a DTO instead of 4 arguments. It will improve maintainability for future changes if we need to add or remove settings, especially since we can reuse the DTO in multiple places, e.g. rebuildCache, tests, etc.

Member Author

Sure, will update
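
For readers following the DTO suggestion in this thread, here is a minimal sketch of what such a parameter object might look like. The class name `CacheParams`, its builder, and the `initialize` stand-in are hypothetical and are not taken from the k-NN codebase; only the four field meanings mirror the `initialize(boolean, long, boolean, long)` signature quoted above.

```java
// Hypothetical sketch: names are illustrative, not taken from the k-NN codebase.
class CacheParamsSketch {

    /** Immutable parameter object replacing the four positional arguments. */
    static final class CacheParams {
        private final boolean weightLimited;
        private final long maxWeight;
        private final boolean expirationLimited;
        private final long expiryTimeInMin;

        private CacheParams(Builder b) {
            this.weightLimited = b.weightLimited;
            this.maxWeight = b.maxWeight;
            this.expirationLimited = b.expirationLimited;
            this.expiryTimeInMin = b.expiryTimeInMin;
        }

        boolean isWeightLimited() { return weightLimited; }
        long getMaxWeight() { return maxWeight; }
        boolean isExpirationLimited() { return expirationLimited; }
        long getExpiryTimeInMin() { return expiryTimeInMin; }

        static Builder builder() { return new Builder(); }

        /** Named setters make call sites readable and tolerant of new fields. */
        static final class Builder {
            private boolean weightLimited;
            private long maxWeight;
            private boolean expirationLimited;
            private long expiryTimeInMin;

            Builder weightLimited(boolean v) { weightLimited = v; return this; }
            Builder maxWeight(long v) { maxWeight = v; return this; }
            Builder expirationLimited(boolean v) { expirationLimited = v; return this; }
            Builder expiryTimeInMin(long v) { expiryTimeInMin = v; return this; }
            CacheParams build() { return new CacheParams(this); }
        }
    }

    /** Stand-in for initialize(...): takes one object instead of four arguments. */
    static String initialize(CacheParams p) {
        StringBuilder sb = new StringBuilder("cache");
        if (p.isWeightLimited()) {
            sb.append(" maxWeight=").append(p.getMaxWeight());
        }
        if (p.isExpirationLimited()) {
            sb.append(" expiryMin=").append(p.getExpiryTimeInMin());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        CacheParams params = CacheParams.builder()
            .weightLimited(true)
            .maxWeight(1024L)
            .expirationLimited(false)
            .expiryTimeInMin(180L)
            .build();
        System.out.println(initialize(params)); // prints "cache maxWeight=1024"
    }
}
```

The same object could be built once and handed to both the initial construction and a rebuild, so adding or removing a setting touches the DTO rather than every method signature.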

public class KNNSettingsTests extends KNNTestCase {

@SneakyThrows
public void testGetSettingValueFromConfig() {
Member

I assume that would fail without the changes in rebuild cache? Do we need to include other settings as well?

Member Author

Not sure I understand what you mean by "that would fail without changes in rebuild cache". Could you elaborate?

Member

I mean: would this test, had it existed before your PR, fail without your code change? And the second question is about settings: we're only checking KNNSettings.KNN_MEMORY_CIRCUIT_BREAKER_LIMIT here, but we have other settings like KNNSettings.KNN_CACHE_ITEM_EXPIRY_ENABLED and KNNSettings.KNN_CACHE_ITEM_EXPIRY_TIME_MINUTES. Do we need a similar test for them as well?

Member Author

I mean: would this test, had it existed before your PR, fail without your code change?

Yes, this was the failure before the fix:

  2> REPRODUCE WITH: ./gradlew ':test' --tests "org.opensearch.knn.index.KNNSettingsTests.testGetSettingValueFromConfig" -Dtests.seed=2B1E0A6CFDBD324B -Dtests.security.manager=false -Dtests.locale=fr -Dtests.timezone=America/Paramaribo -Druntime.java=11
  2> java.lang.AssertionError: expected:<8126464> but was:<13>
        at __randomizedtesting.SeedInfo.seed([2B1E0A6CFDBD324B:4423BA039518D35]:0)
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:647)
        at org.junit.Assert.assertEquals(Assert.java:633)

I think my actual and expected are mixed up, but I'll go ahead and fix this.

And the second question is about settings: we're only checking KNNSettings.KNN_MEMORY_CIRCUIT_BREAKER_LIMIT here, but we have other settings like KNNSettings.KNN_CACHE_ITEM_EXPIRY_ENABLED and KNNSettings.KNN_CACHE_ITEM_EXPIRY_TIME_MINUTES. Do we need a similar test for them as well?

I think it is probably okay to test just one. We group all of the dynamic cache settings here: https://github.com/opensearch-project/k-NN/blob/main/src/main/java/org/opensearch/knn/index/KNNSettings.java#L223 and then don't handle them individually from there, so I think we are okay.

@navneet1v
Collaborator

However, in the case when the settings are set via opensearch.yml config file, the settings update consumers never get called, so the config values never get put in the cache.

A better explanation, per the deep dive, is that config values from the .yml file are set at startup, before any consumer functions can be registered. Hence the .yml config does not trigger the settings update consumers, since those fire only on updates.

Removes the latestSettings cache from the KNNSettings class.
latestSettings cache gets updated in a consumer when the particular
settings are updated.

KNNSettings.getSettingValue would pull from this cache and then fallback
to the default if it is not present. However, in the case when the
settings are set via opensearch.yml config file, the settings update
consumers never get called, so the config values never get put in the
cache. This leads to getSettingValue to always return the default
instead of the value specified in the config file.

To fix this, this change refactors getSettingValue to pull from the
cluster settings and removes the latestSetting cache. However, because
the dynamicCacheSettings have consumers that rebuild the
NativeMemoryCacheManager cache when they are changed, the logic for
passing the parameters to rebuild this cache had to change as well.

Signed-off-by: John Mazanec <[email protected]>
Addresses review comments. Switches configuration of cache manager to
use DTO.

Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
@jmazanec15
Member Author

@navneet1v true, I updated the PR description to reflect this.

@@ -30,6 +41,18 @@ public static void resetState() {
knnCounter.set(0L);
}

ClusterService clusterService = mock(ClusterService.class);
Collaborator

Can we use the @Mock annotation to mock this?

Member Author

Sure, will update.

Comment on lines 11 to 12
@Getter
@Builder
Collaborator

[Minor]
If you use @Value and @Builder, you don't need to add @Getter, and you can remove final from every field, since @Value makes the class's fields final.

Member Author

Makes sense. Will add.

@navneet1v
Collaborator

Minor comments; overall the code looks good to me.

Signed-off-by: John Mazanec <[email protected]>
@jmazanec15 jmazanec15 requested a review from navneet1v January 19, 2023 05:05
Member

@martin-gaievski martin-gaievski left a comment

looks good, thank you!

@jmazanec15 jmazanec15 merged commit 8e2ad45 into opensearch-project:main Jan 19, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jan 19, 2023
Removes the latestSettings cache from the KNNSettings class.
latestSettings cache gets updated in a consumer when the particular
settings are updated.

KNNSettings.getSettingValue would pull from this cache and then fallback
to the default if it is not present. However, in the case when the
settings are set via opensearch.yml config file, the settings update
consumers never get called, so the config values never get put in the
cache. This leads to getSettingValue to always return the default
instead of the value specified in the config file.

To fix this, this change refactors getSettingValue to pull from the
cluster settings and removes the latestSetting cache. However, because
the dynamicCacheSettings have consumers that rebuild the
NativeMemoryCacheManager cache when they are changed, the logic for
passing the parameters to rebuild this cache had to change as well.

Also, switches configuration of cache manager to use DTO.

Signed-off-by: John Mazanec <[email protected]>
(cherry picked from commit 8e2ad45)
Labels
backport 2.x, Bug Fixes (Changes to a system or product designed to handle a programming bug/glitch)
5 participants