Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Total data set size in stats #70625

Merged

Conversation

henningandersen
Copy link
Contributor

@henningandersen henningandersen commented Mar 22, 2021

With shared cache searchable snapshots we have shards that have a size
in S3 that differs from the locally occupied disk space. This commit
introduces store.total_data_set_size to node and indices stats, allowing to
differ between the two.

Relates #69820

With shared cache searchable snapshots we have shards that have a size
in S3 that differs from the locally occupied disk space. This commit
introduces `store.local_size` to node and indices stats, allowing to
differ between the two.

Relates elastic#69820
@henningandersen henningandersen added >enhancement :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs :Data Management/Stats Statistics tracking and retrieval APIs v8.0.0 v7.13.0 labels Mar 22, 2021
@elasticmachine elasticmachine added Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. Team:Data Management Meta label for data/management team labels Mar 22, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-features (Team:Core/Features)

@@ -422,6 +422,7 @@ public void testToXContent() throws IOException {
+ " },"
+ " \"store\": {"
+ " \"size_in_bytes\": 0,"
+ " \"local_size_in_bytes\": 0,"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is mapped as a dynamic=false object, I chose the easy path here, but it is slightly inconsistent with index and indices stats monitoring docs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow this - what is the mapping you mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mapping is in monitoring-es.json here:

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, with one comment on the tests.

Should we change the InternalClusterInfoService to use the local size of shards in this change too?

final long expectedSize = snapshotShards.get(shardStats.getShardRouting().getId()).getStats().getTotalSize();
assertThat(shardStats.getShardRouting().toString(), store.getSize().getBytes(), greaterThanOrEqualTo(expectedSize));
// we expect the new segments file to make it significantly less than 2x.
assertThat(shardStats.getShardRouting().toString(), store.getSize().getBytes(), lessThan(expectedSize * 2));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it might be flaky - why *2? The index could in theory have one tiny doc per segment, are we sure that each segment will always be larger than the per-segment entry in the segments_N file?

If you're sure this is ok, can we explain the reasoning in a more detailed comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 2x is there because there is one segments_N file already and the bootstrap/translog association adds another segments_N file with only changes being the two UUIDs that are now different. AFAICS, that will always lead to a new segments_N file of identical size. Thus a worst case is 2x. I added a comment to clarify this in the code too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good point I had forgotten about the older segments_N file. Yes this looks solid now.

@henningandersen henningandersen changed the title Local size in stats Total data set size in stats Mar 24, 2021
@henningandersen
Copy link
Contributor Author

After receiving feedback on this outside this PR, I have left size as the "local, on-disk size" metric and added a new total_data_set_size metric instead, representing the full size of shared cache searchable snapshots (as well as the size of all regular shards).

This is ready for another review round.

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@henningandersen
Copy link
Contributor Author

Build failure reported in #66495.

@elasticmachine run elasticsearch-ci/2

@henningandersen henningandersen merged commit 0f28e97 into elastic:master Mar 30, 2021
henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Mar 30, 2021
With shared cache searchable snapshots we have shards that have a size
in S3 that differs from the locally occupied disk space. This commit
introduces `store.total_data_set_size` to node and indices stats, allowing to
differ between the two.

Relates elastic#69820
henningandersen added a commit that referenced this pull request Mar 31, 2021
With shared cache searchable snapshots we have shards that have a size
in S3 that differs from the locally occupied disk space. This commit
introduces `store.total_data_set_size` to node and indices stats, allowing to
differ between the two.

Relates #69820
henningandersen added a commit that referenced this pull request Mar 31, 2021
StoreStats serialization was changed in #70625.
henningandersen added a commit that referenced this pull request Mar 31, 2021
Reenable bwc and serialize new field `totalDataSetSizeInBytes` to 7.13+ now that the backport of #70625 is done.
tlrx added a commit that referenced this pull request May 27, 2021
…shotsIntegTests (#73243)

In #70625 we added the total data set size of shards 
to the Indices Stats API and we enhanced the test 
testCreateAndRestorePartialSearchableSnapshot to 
also verify the correctness of this data set size.

Because restoring a searchable snapshot shard 
creates a new in-memory segment size, the 
verification of the data set size was implemented 
in an approximative fashion: between the 
expected size and twice the expected size. This 
approximation sometimes fails for shards that 
have no documents indexed (see #73194).

This commit changes the test so that it now 
verifies the exact data set size returned by the 
Indices Stats API, which should be the sum of 
the original expected size of the snapshotted 
size + the length of the extra segment file in 
memory.

Closes #73194
tlrx added a commit to tlrx/elasticsearch that referenced this pull request May 27, 2021
…shotsIntegTests (elastic#73243)

In elastic#70625 we added the total data set size of shards 
to the Indices Stats API and we enhanced the test 
testCreateAndRestorePartialSearchableSnapshot to 
also verify the correctness of this data set size.

Because restoring a searchable snapshot shard 
creates a new in-memory segment size, the 
verification of the data set size was implemented 
in an approximative fashion: between the 
expected size and twice the expected size. This 
approximation sometimes fails for shards that 
have no documents indexed (see elastic#73194).

This commit changes the test so that it now 
verifies the exact data set size returned by the 
Indices Stats API, which should be the sum of 
the original expected size of the snapshotted 
size + the length of the extra segment file in 
memory.

Closes elastic#73194
tlrx added a commit to tlrx/elasticsearch that referenced this pull request May 27, 2021
…shotsIntegTests (elastic#73243)

In elastic#70625 we added the total data set size of shards 
to the Indices Stats API and we enhanced the test 
testCreateAndRestorePartialSearchableSnapshot to 
also verify the correctness of this data set size.

Because restoring a searchable snapshot shard 
creates a new in-memory segment size, the 
verification of the data set size was implemented 
in an approximative fashion: between the 
expected size and twice the expected size. This 
approximation sometimes fails for shards that 
have no documents indexed (see elastic#73194).

This commit changes the test so that it now 
verifies the exact data set size returned by the 
Indices Stats API, which should be the sum of 
the original expected size of the snapshotted 
size + the length of the extra segment file in 
memory.

Closes elastic#73194
tlrx added a commit that referenced this pull request May 27, 2021
…shotsIntegTests (#73243) (#73455)

In #70625 we added the total data set size of shards 
to the Indices Stats API and we enhanced the test 
testCreateAndRestorePartialSearchableSnapshot to 
also verify the correctness of this data set size.

Because restoring a searchable snapshot shard 
creates a new in-memory segment size, the 
verification of the data set size was implemented 
in an approximative fashion: between the 
expected size and twice the expected size. This 
approximation sometimes fails for shards that 
have no documents indexed (see #73194).

This commit changes the test so that it now 
verifies the exact data set size returned by the 
Indices Stats API, which should be the sum of 
the original expected size of the snapshotted 
size + the length of the extra segment file in 
memory.

Closes #73194
tlrx added a commit that referenced this pull request May 27, 2021
…shotsIntegTests (#73243) (#73454)

In #70625 we added the total data set size of shards 
to the Indices Stats API and we enhanced the test 
testCreateAndRestorePartialSearchableSnapshot to 
also verify the correctness of this data set size.

Because restoring a searchable snapshot shard 
creates a new in-memory segment size, the 
verification of the data set size was implemented 
in an approximative fashion: between the 
expected size and twice the expected size. This 
approximation sometimes fails for shards that 
have no documents indexed (see #73194).

This commit changes the test so that it now 
verifies the exact data set size returned by the 
Indices Stats API, which should be the sum of 
the original expected size of the snapshotted 
size + the length of the extra segment file in 
memory.

Closes #73194
dakrone added a commit to dakrone/elasticsearch that referenced this pull request May 9, 2022
The telemetry for data tiers was using the size in bytes, however, for the frozen tier using
searchable snapshots, this was the disk usage rather than the size of the actual data.

This commit changes the telemetry to use `total_data_set_size` as introduced in elastic#70625 so that the
telemetry is correct.

Resolves elastic#86055
elasticsearchmachine pushed a commit that referenced this pull request May 12, 2022
The telemetry for data tiers was using the size in bytes, however, for
the frozen tier using searchable snapshots, this was the disk usage
rather than the size of the actual data.

This commit changes the telemetry to use `total_data_set_size` as
introduced in #70625 so that the telemetry is correct.

Resolves #86055
dakrone added a commit to dakrone/elasticsearch that referenced this pull request May 12, 2022
…c#86580)

The telemetry for data tiers was using the size in bytes, however, for
the frozen tier using searchable snapshots, this was the disk usage
rather than the size of the actual data.

This commit changes the telemetry to use `total_data_set_size` as
introduced in elastic#70625 so that the telemetry is correct.

Resolves elastic#86055
dakrone added a commit to dakrone/elasticsearch that referenced this pull request May 12, 2022
…c#86580)

The telemetry for data tiers was using the size in bytes, however, for
the frozen tier using searchable snapshots, this was the disk usage
rather than the size of the actual data.

This commit changes the telemetry to use `total_data_set_size` as
introduced in elastic#70625 so that the telemetry is correct.

Resolves elastic#86055
elasticsearchmachine pushed a commit that referenced this pull request May 12, 2022
#86749)

The telemetry for data tiers was using the size in bytes, however, for
the frozen tier using searchable snapshots, this was the disk usage
rather than the size of the actual data.

This commit changes the telemetry to use `total_data_set_size` as
introduced in #70625 so that the telemetry is correct.

Resolves #86055
elasticsearchmachine pushed a commit that referenced this pull request May 12, 2022
#86748)

The telemetry for data tiers was using the size in bytes, however, for
the frozen tier using searchable snapshots, this was the disk usage
rather than the size of the actual data.

This commit changes the telemetry to use `total_data_set_size` as
introduced in #70625 so that the telemetry is correct.

Resolves #86055
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Stats Statistics tracking and retrieval APIs :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >enhancement Team:Data Management Meta label for data/management team Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v7.13.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants