Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve consistency on SQL Activity #108989

Closed
maryliag opened this issue Aug 18, 2023 · 3 comments · Fixed by #109164
Closed

improve consistency on SQL Activity #108989

maryliag opened this issue Aug 18, 2023 · 3 comments · Fixed by #109164
Assignees
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

Comments

@maryliag
Copy link
Contributor

maryliag commented Aug 18, 2023

Scenario: a fingerprint gets executed 1k times per day.
On statement_activity, we are able to keep the data for that fingerprint for 2 days.
On statement_statistics we had data for several days, but the limit of the table was reached, so older data had to be deleted and now only data for the past day is available.

If a user selects see past day: it will return the results from statement_activity, since the selected period existed there, and we will show 2k.
If a user selects past week: the statement_activity doesn't have this entire period, so it will return the results from statement_statistics, and that only had the last day, so it will return 1k.

So for the user, it shows past day 2k and past week 1k, which is confusing.
We need to improve this experience.

Possible solution:

  • Delete the data based on TTL and have the same value for all tables
  • Some union when the main table is selected
  • Show warning/info message.

Jira issue: CRDB-30734

@maryliag maryliag added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-cluster-observability labels Aug 18, 2023
@andy-kimball
Copy link
Contributor

IMO, combining the data is not useful if there's still data missing. I need confidence that I have all the data, or I can reach incorrect conclusions about my workload. We should always show a warning if there may be data missing. Furthermore, the warning should tell the user how to increase their data retention limit.

@andy-kimball
Copy link
Contributor

One thing I don't understand, how come we don't combine rows, i.e. consolidate hourly into daily, daily into weekly, etc. The granularity would go down, but at least we wouldn't completely lose data. There are usually going to be a fairly small set of fingerprints, so it won't use much storage if we keep longer term rows for each unique fingerprint.

@maryliag
Copy link
Contributor Author

how come we don't combine rows

that will be accomplish when we migrate to use the observability service, so improvements on retention and granularity are part of the roadmap.

maryliag added a commit to maryliag/cockroach that referenced this issue Aug 18, 2023
Previously, it was hard to know which tables were
used to populate the SQL Activity, making debug for it
complicated.
This commit adds extra information to the
`combinedstmts` to help:
- olderAggregatedTsReturned
- stmtsSourceTable
- txnsSourceTable

Returning a value for `olderAggregatedTsReturned` will
also allow us to show proper warning when the older
timestamp doesn't match the start date of the selected
time period on the Search Criteria.

Part Of cockroachdb#108989

Release note: None
maryliag added a commit to maryliag/cockroach that referenced this issue Aug 19, 2023
Previously, it was hard to know which tables were
used to populate the SQL Activity, making debug for it
complicated.
This commit adds extra information to the
`combinedstmts` to help:
- olderAggregatedTsReturned
- stmtsSourceTable
- txnsSourceTable

Returning a value for `olderAggregatedTsReturned` will
also allow us to show proper warning when the older
timestamp doesn't match the start date of the selected
time period on the Search Criteria.

Part Of cockroachdb#108989

Release note: None
maryliag added a commit to maryliag/cockroach that referenced this issue Aug 20, 2023
Previously, it was hard to know which tables were
used to populate the SQL Activity, making debug for it
complicated.
This commit adds extra information to the
`combinedstmts` to help:
- olderAggregatedTsReturned
- stmtsSourceTable
- txnsSourceTable

Returning a value for `olderAggregatedTsReturned` will
also allow us to show proper warning when the older
timestamp doesn't match the start date of the selected
time period on the Search Criteria.

Part Of cockroachdb#108989

Release note: None
maryliag added a commit to maryliag/cockroach that referenced this issue Aug 20, 2023
Previously, it was hard to know which tables were
used to populate the SQL Activity, making debug for it
complicated.
This commit adds extra information to the
`combinedstmts` to help:
- olderAggregatedTsReturned
- stmtsSourceTable
- txnsSourceTable

Returning a value for `olderAggregatedTsReturned` will
also allow us to show proper warning when the older
timestamp doesn't match the start date of the selected
time period on the Search Criteria.

Part Of cockroachdb#108989

Release note: None
craig bot pushed a commit that referenced this issue Aug 21, 2023
108483: roachtest: wait smaller duration before timining out lease prefs r=erikgrinaker a=kvoli

Previously, the `lease-preferences` roachtests would fail on the test
tiemout, despite the test actually failing much earlier. 5 minutes
stopping a node, the node is considered dead, and up-replication will
begin. If the cluster doesn't satisfy lease preferences within that 5
minute window, the test will reliably fail -- making the remainder of
the test timeout (30 or 60 minutes), wasted time.

Fail earlier, by adhering to a post event timeout. This timeout set to
10 minutes.

Informs: #108425
Epic: none
Release note: None

108775: kvserver: fix double span Finish on reproposals r=erikgrinaker a=pavelkalinnikov

Since #106750, `ProposalData` is being copied on superseding reproposals. The caller only knows about the original proposal, and only detaches its context / tracing span when abandoning the request (e.g. on timeout). By the time of abandoning the request, there might have been a few superseding reproposals, and the context / tracing span is being used by multiple `ProposalData` objects.

In addition to rewriting the original `proposal.ctx`, we should do the same for all the `ProposalData` objects corresponding to the same request. This commit unbinds the context for older proposals when reproposing them, and unbinds the latest reproposal context when cleaning up / abandoning the request.

Fixes #107521
Fixes #108534
Fixes #108241
Fixes #108663
Fixes #108696
Fixes #108837

Epic: CRDB-25287
Release note: none

108942: backupccl: deflake memory monitor restore test r=rhu713 a=rhu713

Deflake the TestRestoreMemoryMonitoring tests by only asserting the lower bound in the number of files produced in the backup to account for elastic CPU limiter.

Fixes: #108239

Release note: None

109040: server: add more info to the combined statement endpoint r=maryliag a=maryliag

Previously, it was hard to know which tables were
used to populate the SQL Activity, making debug for it complicated.
This commit adds extra information to the
`combinedstmts` to help:
- oldestAggregatedTsReturned
- stmtsSourceTable
- txnsSourceTable

Returning a value for `olderAggregatedTsReturned` will also allow us to show proper warning when the older timestamp doesn't match the start date of the selected time period on the Search Criteria.

Part Of #108989

Release note: None

Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: Pavel Kalinnikov <[email protected]>
Co-authored-by: Rui Hu <[email protected]>
Co-authored-by: maryliag <[email protected]>
blathers-crl bot pushed a commit that referenced this issue Aug 21, 2023
Previously, it was hard to know which tables were
used to populate the SQL Activity, making debug for it
complicated.
This commit adds extra information to the
`combinedstmts` to help:
- olderAggregatedTsReturned
- stmtsSourceTable
- txnsSourceTable

Returning a value for `olderAggregatedTsReturned` will
also allow us to show proper warning when the older
timestamp doesn't match the start date of the selected
time period on the Search Criteria.

Part Of #108989

Release note: None
@maryliag maryliag self-assigned this Aug 21, 2023
maryliag added a commit to maryliag/cockroach that referenced this issue Aug 21, 2023
If a user selects a time period that we don't have
the data for it (because it was already cleaned),
we show a warning on SQL Activity (Statement and Transactions)
to indicate the data might not be complete.

Fix cockroachdb#108989

Release note (ui change): Show warning when time period selected
on SQL Activity is older than the oldest data available.
craig bot pushed a commit that referenced this issue Aug 24, 2023
109164: ui: show warning when data is not old enough r=maryliag a=maryliag

If a user selects a time period that we don't have the data for it (because it was already cleaned),
we show a warning to indicate the data might not be complete.

Fix #108989

<img width="334" alt="Screenshot 2023-08-23 at 5 30 18 PM" src="https://github.com/cockroachdb/cockroach/assets/1017486/ed9d9768-c4b4-456f-b5ce-0156bbfd9d4f">



https://www.loom.com/share/a7e0b0738d834bb0b6f38a9da43fb6a4


Release note (ui change): Show warning when time period selected on SQL Activity is older than the oldest data available.

109290: multitenantccl: add metric for network consumption r=JeffSwenson a=JeffSwenson

Previously, the cross region network portion of the RU consumption did not get its own line item in the TenantConsumption proto or the tenant consumption metrics. Now, the network component is removed from the kvRU field. Network cost is recorded in the TenantConsumption proto and exported as its own consumption metric.

The cross region network cost table was added by #105592 and is not yet deployed to production.

Release Note: None

Fixes: #108376

109351: changefeedccl: enable new webhook and pubsub sinks by default r=samiskin a=samiskin

Resolves #108806

This PR enables the new webhook and pubsub sink implementations by default. Leaving the settings in existence for near future in case something does go wrong, but I did not set them to public since there's no normal-operation reason to disable them.

Release note (performance improvement): changefeeds to webhook or pubsub endpoints now support much higher throughput

Co-authored-by: maryliag <[email protected]>
Co-authored-by: Jeff <[email protected]>
Co-authored-by: Shiranka Miskin <[email protected]>
@craig craig bot closed this as completed in 0d123cf Aug 24, 2023
maryliag added a commit to maryliag/cockroach that referenced this issue Aug 24, 2023
If a user selects a time period that we don't have
the data for it (because it was already cleaned),
we show a warning on SQL Activity (Statement and Transactions)
to indicate the data might not be complete.

Fix cockroachdb#108989

Release note (ui change): Show warning when time period selected
on SQL Activity is older than the oldest data available.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants