Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Repo analysis of uncontended register behaviour #101185

Merged

Conversation

DaveCTurner
Copy link
Contributor

@DaveCTurner DaveCTurner commented Oct 21, 2023

Today repository analysis verifies that a register behaves correctly
under contention, retrying until successful, but it turns out that some
repository implementations cannot even perform uncontended register
writes correctly which may cause endless retries in the contended case.
This commit adds another repository analyser which verifies that
uncontended register writes work correctly on the first attempt.

@DaveCTurner DaveCTurner added >enhancement WIP :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs Supportability Improve our (devs, SREs, support eng, users) ability to troubleshoot/self-service product better. v8.12.0 labels Oct 21, 2023
@DaveCTurner
Copy link
Contributor Author

DaveCTurner commented Oct 21, 2023

WIP because:

@DaveCTurner DaveCTurner force-pushed the 2023/10/21/uncontended-register-analysis branch 2 times, most recently from 854c212 to 9d3cdb9 Compare October 21, 2023 11:22
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Oct 21, 2023
Today repository analysis verifies that a register behaves correctly
under contention, retrying until successful, but it turns out that some
repository implementations cannot even perform uncontended register
writes correctly which may cause endless retries in the contended case.
This commit adds another repository analyser which verifies that
uncontended register writes work correctly on the first attempt.
@DaveCTurner DaveCTurner force-pushed the 2023/10/21/uncontended-register-analysis branch from 9d3cdb9 to 17566ba Compare October 21, 2023 16:02
@DaveCTurner DaveCTurner requested a review from ywangd October 22, 2023 20:51
@DaveCTurner DaveCTurner marked this pull request as ready for review October 22, 2023 20:51
Copy link
Member

@ywangd ywangd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines +105 to +108
for (final var multipartUpload : uploads.values()) {
if (multipartUpload.getPath().startsWith(prefix)) {
multipartUpload.appendXml(uploadsList);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. This was essentially a bug in the fixture.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, or at least an unimplemented feature :) This fixture takes quite a lot of shortcuts and only really emulates the bits of S3's API that matter to us.

@@ -144,7 +144,7 @@ static TransportVersion def(int id) {
public static final TransportVersion PIPELINES_IN_BULK_RESPONSE_ADDED = def(8_519_00_0);
public static final TransportVersion PLUGIN_DESCRIPTOR_STRING_VERSION = def(8_520_00_0);
public static final TransportVersion TOO_MANY_SCROLL_CONTEXTS_EXCEPTION_ADDED = def(8_521_00_0);

public static final TransportVersion UNCONTENDED_REGISTER_ANALYSIS_ADDED = def(8_522_00_0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to use transportVersion for now. But strictly speaking, this feels more belong to the in-development Feature interface.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO adding a new remote action is a change to the transport protocol, although I do see that we could reasonably avoid calling the new action based on whether the cluster supports the feature or not. I don't expect we will be able to migrate assertions like these over to features tho (but maybe that is the eventual plan?):

https://github.com/elastic/elasticsearch/pull/101185/files#diff-73e6714a684296ea1da4ecb49e8b2e37485b08e60b44657ff7be05c7472b5181R154

private final String registerName;
private final List<DiscoveryNode> nodes;
private final AtomicBoolean otherAnalysisComplete;
private int currentValue; // actions run in strict sequence so no need for synchronization
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sending request and handling response can be performed by different transport worker threads. So I think they can potentially see different values even when the action is in strict order?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this isn't the case, but it's a good observation. If that were the case then we potentially would need synchronization here indeed. Fortunately for remote requests the request and response go over the same TCP channel which means they use the same transport worker thread (docs) and therefore the request handling happens-before the response handling in program order on that thread so it's ok. Local requests bypass the transport worker threads of course, but they all happen within the same JVM so we have a proper happens-before relationship there too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. You are right. In previous convesations, I heard that we could in theory send request via one channel but receive the response via another since transport does not have the ordered processing constraint of HTTP 1.1. It is a theoretical possbility, not what we have today. Sorry that I mis-remembered.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically speaking I think there may be no happens-before relationship between sending a request A and receiving a different request B which was caused by the remote node's handling of request A, because those things will use different TCP channels for sure and therefore may land on different transport threads. We do use nested requests in various places, e.g. recovery. I'm not sure if this is something that can happen in practice, but definitely something to watch out for.

Comment on lines +128 to +129
// Registers are not supported on all repository types, and that's ok.
listener.onResponse(null);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my understanding: I don't think we indicate in the response that this operation is unsupported? Are we not interested in it? I am aware that the existing "Contented" version does the same. So it is likely ok.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct, specifically we do not support register operations for the (somewhat-unloved) HDFS repository implementation, and we have no plan to address this in future so we just skip all these checks for HDFS repositories.

Comment on lines 451 to 454
} else if (key.startsWith(RepositoryAnalyzeAction.UNCONTENDED_REGISTER_NAME_PREFIX) || randomBoolean()) {
listener.onResponse(OptionalBytesReference.of(registers.computeIfAbsent(key, ignored -> new BytesRegister()).get()));
} else {
final var bogus = randomFrom(BytesArray.EMPTY, new BytesArray(new byte[] { randomByte() }));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic here is a bit to follow. IIUC, we don't want to return anything wrong for uncontended analysis. But the code here seems to suggest that we could be returning bogus result for it. But not really because the call to compareAndExchangeRegister always return a new BytesRegister() which always return 0 regardless of the bogus value. I think it would be better if we could make this more explicit for the uncontended operations. Maybe have it as the top level switch, e.g.:

if (key.startsWith(RepositoryAnalyzeAction.UNCONTENDED_REGISTER_NAME_PREFIX)) {
    ...
} else {
    // everything else, essentially the existing code
}

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ yes this is all a little unsatisfactory indeed. This area is going to need a little rework once #101184 is merged, I'll try and bring the checks on the key prefix to the top level in these methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok now merged and cleaned up, see 2dd4250.

@DaveCTurner DaveCTurner requested a review from fcofdez October 23, 2023 09:03
@DaveCTurner DaveCTurner removed the WIP label Oct 23, 2023
@elasticsearchmachine elasticsearchmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Oct 23, 2023
@elasticsearchmachine
Copy link
Collaborator

Hi @DaveCTurner, I've created a changelog YAML for you.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

Copy link
Contributor

@fcofdez fcofdez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@DaveCTurner DaveCTurner merged commit 4bbf760 into elastic:main Oct 23, 2023
@DaveCTurner DaveCTurner deleted the 2023/10/21/uncontended-register-analysis branch October 23, 2023 10:46
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Nov 15, 2023
Verification of uncontended operations on linearizable registers was
introduced in elastic#101185 so does not apply in versions before 8.12.

Fixes up the backport of elastic#102050.
DaveCTurner added a commit that referenced this pull request Nov 15, 2023
Verification of uncontended operations on linearizable registers was
introduced in #101185 so does not apply in versions before 8.12.

Fixes up the backport of #102050.
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Nov 15, 2023
Verification of uncontended operations on linearizable registers was
introduced in elastic#101185 so does not apply in versions before 8.12.

Fixes up the backport of elastic#102050.
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Nov 15, 2023
Verification of uncontended operations on linearizable registers was
introduced in elastic#101185 so does not apply in versions before 8.12.

Fixes up the backport of elastic#102050.
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Nov 15, 2023
Verification of uncontended operations on linearizable registers was
introduced in elastic#101185 so does not apply in versions before 8.12.

Fixes up the backport of elastic#102050.
elasticsearchmachine pushed a commit that referenced this pull request Nov 15, 2023
Verification of uncontended operations on linearizable registers was
introduced in #101185 so does not apply in versions before 8.12.

Fixes up the backport of #102050.
elasticsearchmachine pushed a commit that referenced this pull request Nov 15, 2023
Verification of uncontended operations on linearizable registers was
introduced in #101185 so does not apply in versions before 8.12.

Fixes up the backport of #102050.
elasticsearchmachine pushed a commit that referenced this pull request Nov 15, 2023
Verification of uncontended operations on linearizable registers was
introduced in #101185 so does not apply in versions before 8.12.

Fixes up the backport of #102050.
@DaveCTurner DaveCTurner restored the 2023/10/21/uncontended-register-analysis branch June 17, 2024 06:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >enhancement Supportability Improve our (devs, SREs, support eng, users) ability to troubleshoot/self-service product better. Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v8.12.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants