-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server: add more info to the combined statement endpoint #109040
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 4 files at r1, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @maryliag)
pkg/server/combined_statement_stats.go
line 342 at r1 (raw file):
fmt.Sprintf(` SELECT COALESCE(min(aggregated_ts), now()) FROM %s %s`, table, whereClause), args...)
I have concern about this statement, I ran EXPLAIN ANALYZE
on this query and it used full scan on it.
EXPLAIN ANALYZE SELECT COALESCE(min(aggregated_ts), now()) FROM crdb_internal.statement_statistics WHERE true AND aggregated_ts >= now() AND aggregated_ts <= now();
result:
planning time: 9ms
execution time: 9ms
distribution: local
vectorized: true
• render
│
└── • group (scalar)
│ nodes: n1
...
│
└── • scan
....
table: statement_statistics@fingerprint_stats_idx ---------------------- WARNING: the row count estimate is inaccurate, consider running 'ANALYZE statement_statistics'
spans: FULL SCAN (SOFT LIMIT)
Most of the indexes on system.statement\_statistics
table defined with condition WHERE app\_name NOT LIKE '$ internal%'
so I suggest to include this clause in WHERE condition of the statement:
EXPLAIN ANALYZE
SELECT COALESCE(min(aggregated_ts), now()) FROM crdb_internal.statement_statistics
WHERE true
AND aggregated_ts >= now()
AND aggregated_ts <= now()
AND app_name NOT LIKE '$ internal%' -- <- add this
which gives us:
└── • scan
nodes: n1
...
table: statement_statistics@execution_count_idx (partial index)
spans: [/'2023-08-18 17:31:07.875656+00' - /'2023-08-19 17:40:35.478088+00']
pkg/server/serverpb/status.proto
line 1644 at r1 (raw file):
// OlderAggregatedTsReturned is the timestamp of the oldest entry returned, // or NOW() if there is no data returned. google.protobuf.Timestamp older_aggregated_ts_returned = 8 [(gogoproto.nullable) = false, (gogoproto.stdtime) = true];
The naming of field is a bit confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @maryliag)
pkg/server/combined_statement_stats.go
line 345 at r1 (raw file):
if err != nil { return timeutil.Now(), err
What is the motivation to return now()
instead of NULL in case of error or empty results?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @koorosh)
pkg/server/combined_statement_stats.go
line 342 at r1 (raw file):
Previously, koorosh (Andrii Vorobiov) wrote…
I have concern about this statement, I ran
EXPLAIN ANALYZE
on this query and it used full scan on it.EXPLAIN ANALYZE SELECT COALESCE(min(aggregated_ts), now()) FROM crdb_internal.statement_statistics WHERE true AND aggregated_ts >= now() AND aggregated_ts <= now();result:
planning time: 9ms execution time: 9ms distribution: local vectorized: true • render │ └── • group (scalar) │ nodes: n1 ... │ └── • scan .... table: statement_statistics@fingerprint_stats_idx ---------------------- WARNING: the row count estimate is inaccurate, consider running 'ANALYZE statement_statistics' spans: FULL SCAN (SOFT LIMIT)
Most of the indexes on
system.statement\_statistics
table defined with conditionWHERE app\_name NOT LIKE '$ internal%'
so I suggest to include this clause in WHERE condition of the statement:EXPLAIN ANALYZE SELECT COALESCE(min(aggregated_ts), now()) FROM crdb_internal.statement_statistics WHERE true AND aggregated_ts >= now() AND aggregated_ts <= now() AND app_name NOT LIKE '$ internal%' -- <- add thiswhich gives us:
└── • scan nodes: n1 ... table: statement_statistics@execution_count_idx (partial index) spans: [/'2023-08-18 17:31:07.875656+00' - /'2023-08-19 17:40:35.478088+00']
We actually have a cluster setting to decide if we're returning internal statements on this endpoint or not, so I added the changes to take that into consideration. The default value is to filter out internal, so now is doing that correctly! Nice callout!
pkg/server/combined_statement_stats.go
line 345 at r1 (raw file):
Previously, koorosh (Andrii Vorobiov) wrote…
What is the motivation to return
now()
instead of NULL in case of error or empty results?
I get compile error if I try to just return null, since the function expects a Time. I changed to just the empty Time component, which ends up also returning NOW, but at east shows that we don't really care the value itself
pkg/server/serverpb/status.proto
line 1644 at r1 (raw file):
Previously, koorosh (Andrii Vorobiov) wrote…
The naming of field is a bit confusing.
Made a slight change, let me know if is more clear now. We do select a time period, but we might not have data for that full period, so let's say we request from august 10 to august 19, but we only have data starting on aug 15, so that is the value returned here. A following PR will use this value to show a warning in the UI.
Let me know if the name is better now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r2, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @maryliag)
pkg/server/combined_statement_stats.go
line 342 at r1 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
We actually have a cluster setting to decide if we're returning internal statements on this endpoint or not, so I added the changes to take that into consideration. The default value is to filter out internal, so now is doing that correctly! Nice callout!
Great!
pkg/server/combined_statement_stats.go
line 345 at r1 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
I get compile error if I try to just return null, since the function expects a Time. I changed to just the empty Time component, which ends up also returning NOW, but at east shows that we don't really care the value itself
It requires following changes to make it work with null
:
- Remove
(gogoproto.nullable) = false
attribute fromoldest_aggregated_ts_returned
field in ptoto message
google.protobuf.Timestamp oldest\_aggregated\_ts\_returned = 8 [(gogoproto.stdtime) = true];
- Return pointer to time.Date in
getOldestDate
and change returned values to eithernil
or address of value
getOldestDate := func(table string) (\*time.Time, error) {
My concern about returning current time is to avoid confusion when users reset stats so there's no records in tables but oldest aggregated ts
would return current time which in turn would be problematic.
pkg/server/combined_statement_stats.go
line 375 at r2 (raw file):
}() return tree.MustBeDTimestampTZ(row[0]).Time, nil
tree.MustBeDTimestampTZ
panics if it cannot retrieve the value. Should we be more tolerant and use AsDTimestampTZ
that indicates where the value parsed or not?
pkg/server/serverpb/status.proto
line 1644 at r1 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
Made a slight change, let me know if is more clear now. We do select a time period, but we might not have data for that full period, so let's say we request from august 10 to august 19, but we only have data starting on aug 15, so that is the value returned here. A following PR will use this value to show a warning in the UI.
Let me know if the name is better now.
Got it! Thanks for detailed explanation it is clear now! Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @koorosh)
pkg/server/combined_statement_stats.go
line 345 at r1 (raw file):
Previously, koorosh (Andrii Vorobiov) wrote…
It requires following changes to make it work with
null
:
- Remove
(gogoproto.nullable) = false
attribute fromoldest_aggregated_ts_returned
field in ptoto messagegoogle.protobuf.Timestamp oldest\_aggregated\_ts\_returned = 8 [(gogoproto.stdtime) = true];
- Return pointer to time.Date in
getOldestDate
and change returned values to eithernil
or address of valuegetOldestDate := func(table string) (\*time.Time, error) {My concern about returning current time is to avoid confusion when users reset stats so there's no records in tables but
oldest aggregated ts
would return current time which in turn would be problematic.
Nice! I made the changes and now returns null
when no data is returned :)
thank you for the detailed steps
pkg/server/combined_statement_stats.go
line 375 at r2 (raw file):
Previously, koorosh (Andrii Vorobiov) wrote…
tree.MustBeDTimestampTZ
panics if it cannot retrieve the value. Should we be more tolerant and useAsDTimestampTZ
that indicates where the value parsed or not?
the value should ne null or timestamp, so I added a check for DNull first, then make the call the parse the timestamp, now it will probably return null or the timestamp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, great!
Also want to clarify if we need to do the same check for transaction statistics?
Reviewed 3 of 3 files at r3, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @maryliag)
pkg/server/combined_statement_stats.go
line 375 at r2 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
the value should ne null or timestamp, so I added a check for DNull first, then make the call the parse the timestamp, now it will probably return null or the timestamp.
Nice! 🎉
Previously, it was hard to know which tables were used to populate the SQL Activity, making debug for it complicated. This commit adds extra information to the `combinedstmts` to help: - olderAggregatedTsReturned - stmtsSourceTable - txnsSourceTable Returning a value for `olderAggregatedTsReturned` will also allow us to show proper warning when the older timestamp doesn't match the start date of the selected time period on the Search Criteria. Part Of cockroachdb#108989 Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On transaction statistics we also return statement statistics, so it will go over the same parts of the code, I just had to made an adjustement.
Examples of responses when:
Statement page with results:
"stmtsTotalRuntimeSecs": 89.46363,
"txnsTotalRuntimeSecs": 0,
"oldestAggregatedTsReturned": "2023-08-20T14:00:00Z",
"stmtsSourceTable": "crdb\_internal.statement\_statistics\_persisted",
"txnsSourceTable": ""
Statement page with no results:
"stmtsTotalRuntimeSecs": 0,
"txnsTotalRuntimeSecs": 0,
"oldestAggregatedTsReturned": null,
"stmtsSourceTable": "crdb\_internal.statement\_statistics",
"txnsSourceTable": ""
Transactions page with results:
"stmtsTotalRuntimeSecs": 90.2161,
"txnsTotalRuntimeSecs": 47.55528,
"oldestAggregatedTsReturned": "2023-08-20T14:00:00Z",
"stmtsSourceTable": "crdb\_internal.statement\_statistics\_persisted",
"txnsSourceTable": "crdb\_internal.transaction\_statistics\_persisted"
Transactions page with no results:
"stmtsTotalRuntimeSecs": 0,
"txnsTotalRuntimeSecs": 0,
"oldestAggregatedTsReturned": null,
"stmtsSourceTable": "crdb\_internal.statement\_statistics",
"txnsSourceTable": "crdb\_internal.transaction\_statistics"
```
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @koorosh)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if my question was misleading.
getOldestDate
function is only called to get oldest timestamp for statement_activity
tables but not transaction_activity
tables. I'm not sure if it is necessary or not.
Reviewed all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @maryliag)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was on purpose, because the statements data would reach the limit and be deleted first and that is used by both statements and transactions, so having just one of them is enough.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @koorosh)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @maryliag)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @maryliag)
TFTR! bors r+ |
Build succeeded: |
blathers backport 23.1 |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error setting reviewers, but backport branch blathers/backport-release-23.1-109040 is ready: POST https://api.github.com/repos/cockroachdb/cockroach/pulls/109132/requested_reviewers: 422 Reviews may only be requested from collaborators. One or more of the teams you specified is not a collaborator of the cockroachdb/cockroach repository. [] Backport to branch 23.1 failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
Previously, it was hard to know which tables were
used to populate the SQL Activity, making debug for it complicated.
This commit adds extra information to the
combinedstmts
to help:Returning a value for
oldestAggregatedTsReturned
will also allow us to show proper warning when the older timestamp doesn't match the start date of the selected time period on the Search Criteria.Part Of #108989
Release note: None