-
Notifications
You must be signed in to change notification settings - Fork 25.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pagination and Sorting for Get Snapshots API #73952
Changes from 22 commits
ee277a3
d6db405
9d8042b
e85c74b
b06c720
1bb21ad
6698e00
ed601ac
9fd4150
5caefe1
6b8a221
d3d3f60
dc5a264
87a8542
952fa63
92d1ba0
a602d4f
4298bef
6372f7e
499fb80
fabdf36
1ed1c37
f8d448d
f5efc44
37c2a80
3a09b85
b2c862a
eb17bf8
8aa7408
401c586
c36fc23
0a2b715
c7b9acf
6a05dec
d93cf14
2853a23
a4b5976
af97c8e
d214c74
05f529c
86f0f4a
de5cb0a
48e92a5
c33e37d
a8e83f6
c1172c8
d903760
74e1923
9a244df
53268d5
93dd6ba
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -98,6 +98,36 @@ comprising the number of shards in the index, the total size of the index in | |
bytes, and the maximum number of segments per shard in the index. Defaults to | ||
`false`, meaning that this information is omitted. | ||
|
||
`sort`:: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The main question left from my end here is on what to do about My thinking here would be to maybe just not allow pagination and sorting with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think I prefer your first option. Making it dependent on data makes testing difficult. This would only apply if you specify There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yes I think just not allowing any of the three for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Went with option 1 in this PR now => no pagination allowed with |
||
(Optional, string) | ||
Allows setting a sort order for the result. Defaults to `start_time`, i.e. sorting by snapshot start time stamp. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does this sort them ascending or descending? I'd expect ascending from these docs but in fact if using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
++ I'd vote yes here, seems useful for the UI in any case. Any objections to adding that @henningandersen ? :) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agree to adding this. Reading this documentation line again, I wonder if we should leave the default sorting as "unspecified"? Would we not otherwise risk provoking an unnecessary read of snapshot info for repos that have not been upgraded when using verbose=false? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I figured this was covered by the fact that the parameter can only be used with verbose listing. Maybe just explicitly call out that verbose=false sort order is unspecified? |
||
+ | ||
.Valid values for `sort` | ||
[%collapsible%open] | ||
==== | ||
`start_time`:: | ||
Sort snapshots by their start time stamp and break ties by snapshot name. | ||
|
||
`duration`:: | ||
Sort snapshots by their duration and break ties by snapshot name. | ||
|
||
`name`:: | ||
Sort snapshots by their name. | ||
|
||
`indices`:: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We discussed removing this requirement in a meeting, I wonder if we should leave it out initially? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This one is effectively free, the necessary information is available via the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ++ renamed |
||
Sort snapshots by the number of indices they contain and break ties by snapshot name. | ||
==== | ||
|
||
`size`:: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also, we talked about this before, we want an upper bound here probably in some implicit + explicit form. I think from a response size perspective I'd go with something like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could we leave this untouched if none of sort, search, size are specified at least in 7.x? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I wonder if we can/should :) Obviously we don't want to break the API here, but we are moving towards creating situations where users will have a lot more snapshots that before. We did have problems with the memory consumption of the unpaginated API before and although it's better now, I think we are creating situations where a call getting all snapshot verbose could OOM/destabilize a master easily (thinking about shard failures in particular here). If we do keep the unlimited option untouched in 7.x (which makes sense don't get me wrong :)), should we then maybe invest some effort into some sort of circuit breaking during collecting the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As I see it, we are not adding such a new API, the existing API has those properties. I think limiting explicit That said, adding some kind of a memory breaker does make sense (but I would prefer to do that after having done all the pagination/scalability improvement work). |
||
(Optional, integer) | ||
Maximum number of snapshots to return. Defaults to `0` which means return all that match the request without limit. | ||
|
||
`after`:: | ||
(Optional, string) | ||
Offset from which to start the snapshot listing defined by sort value and snapshot name as a string of the form | ||
`<sort_value>,<snapshot_name>`. Defaults to not set which means that listing will start from the first snapshot that matches the | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wonder if we should introduce the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ++ that sounds like a neat idea. Follow-up for that is probably fine, this one is already pretty large :) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My understanding was that we would return a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah ok. I would rather that the parameter were more opaque then - if we document how to construct it then folk will do just that and then we'll be stuck supporting it even if, say, we want to support multiple sort keys or some other enhancement later on. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fine with me, shall we just drop the documentation for the parameter in this PR then and document There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure, we could even drop this whole parameter until the followup: IMO |
||
request. | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we also need the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd prefer this goes into a separate PR due to this size of this one as it adds yet another code path to the get snapshots action I think. That should be a quick one though fortunately :) |
||
[role="child_attributes"] | ||
[[get-snapshot-api-response-body]] | ||
==== {api-response-body-title} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,222 @@ | ||
/* | ||
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one | ||
* or more contributor license agreements. Licensed under the Elastic License | ||
* 2.0 and the Server Side Public License, v 1; you may not use this file except | ||
* in compliance with, at your election, the Elastic License 2.0 or the Server | ||
* Side Public License, v 1. | ||
*/ | ||
|
||
package org.elasticsearch.http; | ||
|
||
import org.apache.http.client.methods.HttpGet; | ||
import org.elasticsearch.action.ActionFuture; | ||
import org.elasticsearch.action.admin.cluster.snapshots.create.CreateSnapshotResponse; | ||
import org.elasticsearch.action.admin.cluster.snapshots.get.GetSnapshotsAction; | ||
import org.elasticsearch.action.admin.cluster.snapshots.get.GetSnapshotsRequest; | ||
import org.elasticsearch.action.admin.cluster.snapshots.get.GetSnapshotsResponse; | ||
import org.elasticsearch.client.Request; | ||
import org.elasticsearch.client.Response; | ||
import org.elasticsearch.common.settings.Settings; | ||
import org.elasticsearch.common.util.CollectionUtils; | ||
import org.elasticsearch.common.xcontent.DeprecationHandler; | ||
import org.elasticsearch.common.xcontent.NamedXContentRegistry; | ||
import org.elasticsearch.common.xcontent.XContentParser; | ||
import org.elasticsearch.common.xcontent.json.JsonXContent; | ||
import org.elasticsearch.plugins.Plugin; | ||
import org.elasticsearch.snapshots.AbstractSnapshotIntegTestCase; | ||
import org.elasticsearch.snapshots.SnapshotInfo; | ||
import org.elasticsearch.snapshots.mockstore.MockRepository; | ||
import org.elasticsearch.test.ESIntegTestCase; | ||
import org.elasticsearch.threadpool.ThreadPool; | ||
|
||
import java.io.IOException; | ||
import java.io.InputStream; | ||
import java.util.ArrayList; | ||
import java.util.Collection; | ||
import java.util.HashSet; | ||
import java.util.List; | ||
|
||
import static org.elasticsearch.snapshots.AbstractSnapshotIntegTestCase.assertSnapshotListSorted; | ||
import static org.hamcrest.Matchers.hasSize; | ||
import static org.hamcrest.Matchers.in; | ||
import static org.hamcrest.Matchers.is; | ||
|
||
// TODO: dry up duplication across this suite and org.elasticsearch.snapshots.GetSnapshotsIT more | ||
@ESIntegTestCase.ClusterScope(scope = ESIntegTestCase.Scope.TEST, numDataNodes = 0, numClientNodes = 0) | ||
public class RestGetSnapshotsIT extends HttpSmokeTestCase { | ||
|
||
@Override | ||
protected Collection<Class<? extends Plugin>> nodePlugins() { | ||
return CollectionUtils.appendToCopy(super.nodePlugins(), MockRepository.Plugin.class); | ||
} | ||
|
||
@Override | ||
protected Settings nodeSettings(int nodeOrdinal, Settings otherSettings) { | ||
return Settings.builder().put(super.nodeSettings(nodeOrdinal, otherSettings)) | ||
.put(ThreadPool.ESTIMATED_TIME_INTERVAL_SETTING.getKey(), 0) // We have tests that check by-timestamp order | ||
.build(); | ||
} | ||
|
||
public void testSortOrder() throws Exception { | ||
final String repoName = "test-repo"; | ||
AbstractSnapshotIntegTestCase.createRepository(logger, repoName, "fs"); | ||
final List<String> snapshotNamesWithoutIndex = | ||
AbstractSnapshotIntegTestCase.createNSnapshots(logger, repoName, randomIntBetween(3, 20)); | ||
|
||
createIndexWithContent("test-index"); | ||
|
||
final List<String> snapshotNamesWithIndex = | ||
AbstractSnapshotIntegTestCase.createNSnapshots(logger, repoName, randomIntBetween(3, 20)); | ||
|
||
final Collection<String> allSnapshotNames = new HashSet<>(snapshotNamesWithIndex); | ||
allSnapshotNames.addAll(snapshotNamesWithoutIndex); | ||
|
||
final List<SnapshotInfo> defaultSorting = clusterAdmin().prepareGetSnapshots(repoName).get().getSnapshots(repoName); | ||
assertSnapshotListSorted(defaultSorting, null); | ||
assertSnapshotListSorted( | ||
allSnapshotsSorted(allSnapshotNames, repoName, GetSnapshotsAction.SortBy.NAME), GetSnapshotsAction.SortBy.NAME); | ||
assertSnapshotListSorted( | ||
allSnapshotsSorted(allSnapshotNames, repoName, GetSnapshotsAction.SortBy.DURATION), GetSnapshotsAction.SortBy.DURATION | ||
); | ||
assertSnapshotListSorted( | ||
allSnapshotsSorted(allSnapshotNames, repoName, GetSnapshotsAction.SortBy.INDICES), GetSnapshotsAction.SortBy.INDICES); | ||
assertSnapshotListSorted( | ||
allSnapshotsSorted(allSnapshotNames, repoName, GetSnapshotsAction.SortBy.START_TIME), GetSnapshotsAction.SortBy.START_TIME | ||
); | ||
} | ||
|
||
public void testResponseSizeLimit() throws Exception { | ||
final String repoName = "test-repo"; | ||
AbstractSnapshotIntegTestCase.createRepository(logger, repoName, "fs"); | ||
final List<String> names = AbstractSnapshotIntegTestCase.createNSnapshots(logger, repoName, randomIntBetween(6, 20)); | ||
for (GetSnapshotsAction.SortBy sort : GetSnapshotsAction.SortBy.values()) { | ||
logger.info("--> testing pagination for [{}]", sort); | ||
final List<SnapshotInfo> allSnapshotsSorted = allSnapshotsSorted(names, repoName, sort); | ||
final List<SnapshotInfo> batch1 = sortedWithLimit(repoName, sort, null, 2); | ||
assertEquals(batch1, allSnapshotsSorted.subList(0, 2)); | ||
final List<SnapshotInfo> batch2 = sortedWithLimit(repoName, sort, batch1.get(1), 2); | ||
assertEquals(batch2, allSnapshotsSorted.subList(2, 4)); | ||
final int lastBatch = names.size() - batch1.size() - batch2.size(); | ||
final List<SnapshotInfo> batch3 = sortedWithLimit(repoName, sort, batch2.get(1), lastBatch); | ||
assertEquals(batch3, allSnapshotsSorted.subList(batch1.size() + batch2.size(), names.size())); | ||
final List<SnapshotInfo> batch3NoLimit = sortedWithLimit(repoName, sort, batch2.get(1), 0); | ||
assertEquals(batch3, batch3NoLimit); | ||
final List<SnapshotInfo> batch3LargeLimit = | ||
sortedWithLimit(repoName, sort, batch2.get(1), lastBatch + randomIntBetween(1, 100)); | ||
assertEquals(batch3, batch3LargeLimit); | ||
} | ||
} | ||
|
||
public void testSortAndPaginateWithInProgress() throws Exception { | ||
final String repoName = "test-repo"; | ||
AbstractSnapshotIntegTestCase.createRepository(logger, repoName, "mock"); | ||
final Collection<String> allSnapshotNames = | ||
new HashSet<>(AbstractSnapshotIntegTestCase.createNSnapshots(logger, repoName, randomIntBetween(3, 20))); | ||
createIndexWithContent("test-index-1"); | ||
allSnapshotNames.addAll(AbstractSnapshotIntegTestCase.createNSnapshots(logger, repoName, randomIntBetween(3, 20))); | ||
createIndexWithContent("test-index-2"); | ||
|
||
final int inProgressCount = randomIntBetween(6, 20); | ||
final List<ActionFuture<CreateSnapshotResponse>> inProgressSnapshots = new ArrayList<>(inProgressCount); | ||
AbstractSnapshotIntegTestCase.blockAllDataNodes(repoName); | ||
for (int i = 0; i < inProgressCount; i++) { | ||
final String snapshotName = "snap-" + i; | ||
allSnapshotNames.add(snapshotName); | ||
inProgressSnapshots.add(AbstractSnapshotIntegTestCase.startFullSnapshot(logger, repoName, snapshotName, false)); | ||
} | ||
AbstractSnapshotIntegTestCase.awaitNumberOfSnapshotsInProgress(logger, inProgressCount); | ||
|
||
assertStablePagination(repoName, allSnapshotNames, GetSnapshotsAction.SortBy.START_TIME); | ||
assertStablePagination(repoName, allSnapshotNames, GetSnapshotsAction.SortBy.NAME); | ||
assertStablePagination(repoName, allSnapshotNames, GetSnapshotsAction.SortBy.INDICES); | ||
|
||
AbstractSnapshotIntegTestCase.unblockAllDataNodes(repoName); | ||
for (ActionFuture<CreateSnapshotResponse> inProgressSnapshot : inProgressSnapshots) { | ||
AbstractSnapshotIntegTestCase.assertSuccessful(logger, inProgressSnapshot); | ||
} | ||
|
||
assertStablePagination(repoName, allSnapshotNames, GetSnapshotsAction.SortBy.START_TIME); | ||
assertStablePagination(repoName, allSnapshotNames, GetSnapshotsAction.SortBy.NAME); | ||
assertStablePagination(repoName, allSnapshotNames, GetSnapshotsAction.SortBy.INDICES); | ||
} | ||
|
||
private void createIndexWithContent(String indexName) { | ||
logger.info("--> creating index [{}]", indexName); | ||
createIndex(indexName, AbstractSnapshotIntegTestCase.SINGLE_SHARD_NO_REPLICA); | ||
ensureGreen(indexName); | ||
indexDoc(indexName, "some_id", "foo", "bar"); | ||
} | ||
|
||
private static void assertStablePagination(String repoName, | ||
Collection<String> allSnapshotNames, | ||
GetSnapshotsAction.SortBy sort) throws IOException { | ||
final List<SnapshotInfo> allSorted = allSnapshotsSorted(allSnapshotNames, repoName, sort); | ||
|
||
for (int i = 1; i <= allSnapshotNames.size(); i++) { | ||
final List<SnapshotInfo> subsetSorted = sortedWithLimit(repoName, sort, i); | ||
assertEquals(subsetSorted, allSorted.subList(0, i)); | ||
} | ||
|
||
for (int j = 0; j < allSnapshotNames.size(); j++) { | ||
final SnapshotInfo after = allSorted.get(j); | ||
for (int i = 1; i < allSnapshotNames.size() - j; i++) { | ||
final List<SnapshotInfo> subsetSorted = sortedWithLimit(repoName, sort, after, i); | ||
assertEquals(subsetSorted, allSorted.subList(j + 1, j + i + 1)); | ||
} | ||
} | ||
} | ||
|
||
private static List<SnapshotInfo> allSnapshotsSorted(Collection<String> allSnapshotNames, | ||
String repoName, | ||
GetSnapshotsAction.SortBy sortBy) throws IOException { | ||
final Request request = baseGetSnapshotsRequest(repoName); | ||
request.addParameter("sort", sortBy.toString()); | ||
final Response response = getRestClient().performRequest(request); | ||
final List<SnapshotInfo> snapshotInfos = readSnapshotInfos(repoName, response); | ||
assertEquals(snapshotInfos.size(), allSnapshotNames.size()); | ||
for (SnapshotInfo snapshotInfo : snapshotInfos) { | ||
assertThat(snapshotInfo.snapshotId().getName(), is(in(allSnapshotNames))); | ||
} | ||
return snapshotInfos; | ||
} | ||
|
||
private static Request baseGetSnapshotsRequest(String repoName) { | ||
return new Request(HttpGet.METHOD_NAME, "/_snapshot/" + repoName + "/*"); | ||
} | ||
|
||
private static List<SnapshotInfo> sortedWithLimit(String repoName, GetSnapshotsAction.SortBy sortBy, int size) throws IOException { | ||
final Request request = baseGetSnapshotsRequest(repoName); | ||
request.addParameter("sort", sortBy.toString()); | ||
request.addParameter("size", String.valueOf(size)); | ||
final Response response = getRestClient().performRequest(request); | ||
final List<SnapshotInfo> snapshotInfos = readSnapshotInfos(repoName, response); | ||
assertThat(snapshotInfos, hasSize(size)); | ||
return snapshotInfos; | ||
} | ||
|
||
private static List<SnapshotInfo> readSnapshotInfos(String repoName, Response response) throws IOException { | ||
final List<SnapshotInfo> snapshotInfos; | ||
try (InputStream input = response.getEntity().getContent(); | ||
XContentParser parser = JsonXContent.jsonXContent.createParser( | ||
NamedXContentRegistry.EMPTY, DeprecationHandler.THROW_UNSUPPORTED_OPERATION, input)) { | ||
snapshotInfos = GetSnapshotsResponse.fromXContent(parser).getSnapshots(repoName); | ||
} | ||
return snapshotInfos; | ||
} | ||
|
||
private static List<SnapshotInfo> sortedWithLimit(String repoName, | ||
GetSnapshotsAction.SortBy sortBy, | ||
SnapshotInfo after, | ||
int size) throws IOException { | ||
final Request request = baseGetSnapshotsRequest(repoName); | ||
request.addParameter("sort", sortBy.toString()); | ||
if (size != 0 || randomBoolean()) { | ||
request.addParameter("size", String.valueOf(size)); | ||
} | ||
if (after != null) { | ||
request.addParameter("after", GetSnapshotsRequest.After.from(after, sortBy).value() + "," + after.snapshotId().getName()); | ||
} | ||
final Response response = getRestClient().performRequest(request); | ||
return readSnapshotInfos(repoName, response); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new arguments are indicated as specified in the body, but I think they have to be specified as a query param?
In fact, looking at some of the other params here, it looks as though those also need to be specified as query param, at least
ignore_unavailable
? I feel like I am missing something that our REST framework handles automatically?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, this seems to be a doc bug, sorry for not noticing. The docs are simply wrong here IMO, all of these parameters only work as parts of the URL. We never parse the request body.
I'll adjust the docs accordingly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One second look, we have the same bug in at least the snapshot status API as well where we document all query parameters as body elements but don't actually parse the body. Shall we fix this in a separate PR and audit other docs for the same mistake and leave it as is here maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, we would just move the new parameters introduced in this PR to the query params section and then fix it in another PR.
But I am also good with your suggestion here if you prefer.