Improve series iteration for TSI index #20543

Closed
lesam opened this issue Jan 19, 2021 · 0 comments

lesam commented Jan 19, 2021

Proposal:
Queries like 'select count(_seriesKey) from /.*/' should be satisfied from the TSI index alone, without touching the TSM engine. We should iterate through the _seriesKey values without storing them all in memory.

Current behavior:
Currently, a query like the one above requires reading all the series keys into RAM.

Desired behavior:
The query should iterate over the series without swapping or OOM-ing, even when there are too many series to fit in memory.
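
To illustrate the desired behavior, here is a minimal sketch of counting series through a pull-style iterator instead of a materialized slice. The `SeriesKeyIterator` interface, `sliceIterator`, and `CountSeries` are hypothetical names made up for this example; they are not InfluxDB's actual types, and the slice-backed iterator only stands in for an index-backed one.

```go
package main

import "fmt"

// SeriesKeyIterator is a hypothetical pull-style iterator over series keys.
// It is an assumption for illustration, not InfluxDB's real interface.
type SeriesKeyIterator interface {
	// Next returns the next series key, or ok == false when exhausted.
	Next() (key string, ok bool)
}

// sliceIterator stands in for an iterator backed by the index.
// A real implementation would read keys lazily from the index structures.
type sliceIterator struct {
	keys []string
	pos  int
}

func (it *sliceIterator) Next() (string, bool) {
	if it.pos >= len(it.keys) {
		return "", false
	}
	k := it.keys[it.pos]
	it.pos++
	return k, true
}

// CountSeries consumes the iterator one key at a time, so the counter is
// the only state it keeps regardless of how many series exist.
func CountSeries(it SeriesKeyIterator) int64 {
	var n int64
	for {
		if _, ok := it.Next(); !ok {
			return n
		}
		n++
	}
}

func main() {
	it := &sliceIterator{keys: []string{"cpu,host=a", "cpu,host=b", "mem,host=a"}}
	fmt.Println(CountSeries(it)) // prints 3
}
```

The point of the sketch is that the consumer never holds more than one key at a time; whether memory stays bounded in practice depends on the iterator implementation behind the interface.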

Alternatives considered:
'show series cardinality' returns the total series cardinality blazingly fast, but does not break it down by measurement. 'show series cardinality from my measurement' is pretty slow for large numbers of series. This is a first step toward improving that type of query.

Use case:
Finding out cardinality by measurement to investigate series cardinality at a more granular level.
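
As a rough sketch of this use case, per-measurement cardinality can be computed in a single streaming pass that keeps one counter per measurement rather than every series key. The helper names below are hypothetical, and the sketch assumes series keys of the form `measurement,tag=value,...` and ignores escaped commas in measurement names.

```go
package main

import (
	"fmt"
	"strings"
)

// measurementOf returns the part of a series key before the first comma.
// Assumption: keys look like "measurement,tag=value"; escaped commas in
// measurement names are not handled in this simplified version.
func measurementOf(seriesKey string) string {
	if i := strings.IndexByte(seriesKey, ','); i >= 0 {
		return seriesKey[:i]
	}
	return seriesKey
}

// CardinalityByMeasurement walks the keys supplied by each and keeps only
// one counter per measurement, so memory grows with the number of
// measurements rather than the number of series.
func CardinalityByMeasurement(each func(visit func(seriesKey string))) map[string]int64 {
	counts := make(map[string]int64)
	each(func(k string) { counts[measurementOf(k)]++ })
	return counts
}

func main() {
	keys := []string{"cpu,host=a", "cpu,host=b", "mem,host=a"}
	counts := CardinalityByMeasurement(func(visit func(string)) {
		for _, k := range keys {
			visit(k)
		}
	})
	fmt.Println(counts["cpu"], counts["mem"]) // prints 2 1
}
```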

lesam added a commit to lesam/influxdb that referenced this issue Jan 19, 2021
When using queries like 'select count(_seriesKey) from bigmeasurement', we
should iterate over the tsi structures to serve the query instead of loading
all the series into memory up front.

Closes influxdata#20543
lesam added a commit to lesam/influxdb that referenced this issue Jan 25, 2021
When using queries like 'select count(_seriesKey) from bigmeasurement', we
should iterate over the tsi structures to serve the query instead of loading
all the series into memory up front.

Closes influxdata#20543
lesam self-assigned this Jan 26, 2021
lesam added a commit to lesam/influxdb that referenced this issue Feb 10, 2021
Closes: influxdata#20614

Also fix nil pointer for seriesKey iterator

Fix for bug in: influxdata#20543

Also add a test for ingress metrics
lesam added a commit that referenced this issue Feb 10, 2021
Closes: #20614

Also fix nil pointer for seriesKey iterator

Fix for bug in: #20543

Also add a test for ingress metrics
lesam closed this as completed Mar 3, 2021