Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce TransportGetSnapshotsAction#GetSnapshotsOperation #105609

Conversation

DaveCTurner
Copy link
Contributor

This commit introduces a class to encapsulate each invocation of
TransportGetSnapshotsAction, avoiding the need to pass around a huge
list of parameters between all the methods that make up the
implementation.

This commit introduces a class to encapsulate each invocation of
`TransportGetSnapshotsAction`, avoiding the need to pass around a huge
list of parameters between all the methods that make up the
implementation.
@DaveCTurner DaveCTurner added :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >refactoring v8.14.0 labels Feb 19, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Feb 19, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@DaveCTurner
Copy link
Contributor Author

For context, I was looking at this in relation to #104607 and #95345. They should be fairly simple to fix, except that this class is currently too convoluted to do so. There's loads of other things to clean up here too, incorrect/outdated comments, inconsistent and misleading naming, excessive copying of lists into other lists, but I'm isolating this one change first for ease of review.

Copy link
Contributor

@DiannaHohensee DiannaHohensee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the high level structural changes, and I left a number of questions.

I'm not sure where you're headed with further refactoring after this, but my default inclinations are typically to separate network handling – parsing, verifying, validating, response construction – from implementation code that accesses internal server state. Obviously that's outside the scope of this change, but I'd like to better understand what the ideal state is, to ensure that's where we're moving.

final Map<String, ElasticsearchException> failures = ConcurrentCollections.newConcurrentMap();
for (String missingRepo : repositoriesResult.missing()) {
failures.put(missingRepo, new RepositoryMissingException(missingRepo));
private class GetSnapshotsOperation {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we move this into its own file? It seems too big/complex to be reasonable as an inner class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but then basically nothing would be left in TransportGetSnapshotsAction and we'd still have all that size and complexity in one place. And indeed one could argue it's kinda more complex as a top-level class because that means the reader has to worry about who else in the package might be calling it, whereas as a private inner class we can see all the callers right in front of us.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After talking offline, I think this is more of a lack of strong API boundaries problem. Ideally there's an API that says a method does X, and callers can use it as needed. But the GetSnapshotsOperation namespace isn't really a logical piece, rather a namespace we're dumping everything into until we refactor more meaningful boundaries. For the time being.

Multiple callers means deduplicating code, which sounds like a good problem.

@DaveCTurner
Copy link
Contributor Author

I will admit I don't have a precise end-goal in mind. Clearly this area needs improvement, and experience shows that introducing a method object is usually a good first step in cases like this. I have played around with some follow-up changes, consolidating some logic and breaking up other bits, and this change definitely makes those things less noisy and therefore hopefully easier to review.

@DaveCTurner
Copy link
Contributor Author

my default inclinations are typically to separate network handling – parsing, verifying, validating, response construction – from implementation code that accesses internal server state.

Sorry if I'm missing something but I don't think we're doing any parsing/verifying/validating of network stuff here. Constructing a response (by calling methods on Repository) is the whole point of this API, I'm not sure we could get rid of that.

Copy link
Contributor

@DiannaHohensee DiannaHohensee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Spoke offline with David. Here's a summary:

David has a rough idea to ultimately refactor the logic to make more sense, maybe booting out some logic to a separate file (better unit testing of some inner mechanics, perhaps), or at least un-blobbing the logic from inside a meaningless class namespace into meaningful logical separation. He's pushing most of the code into an inner class to get rid of the method parameter noise and friction to further refactoring, and then eventually following up with method refactors to rearrange the code into more sensical pieces.

Ultimately trying to make some performance improvements (meaningful result pagination, and not fetching all the snapshot info when only a subset is returned) some day and needs the code to be manageable first.

@DaveCTurner DaveCTurner added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Feb 23, 2024
@elasticsearchmachine elasticsearchmachine merged commit b95cb8c into elastic:main Feb 23, 2024
14 checks passed
@DaveCTurner DaveCTurner deleted the 2024/02/19/TransportGetSnapshotsAction-reduce-parameters branch February 23, 2024 06:52
@DaveCTurner DaveCTurner restored the 2024/02/19/TransportGetSnapshotsAction-reduce-parameters branch June 17, 2024 06:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >refactoring Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v8.14.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants