-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add index for range scans on views #95770
Comments
What's the level of urgency of this? |
Possibly related: #74806 |
The feature mentioned on the description (listing fingerprints used by index) will have to be hidden to non-admins until this task is completed, so the sooner the better. Feel free to close this one in favour of the task you tagged on your message. |
Another possible approach could be for us to:
Then we could have the function do the nice query parameterized as we want it and then grant access to the function somehow via a role? cc @mgartner and @chengxiong-ruan |
Can we not already do the exact same thing as @ajwerner describes but with views? Or is it impossible to create a role that can access a view but not the tables it references? |
What is the privilege model here? I assumed it was complex, but if it's simple, views should do the trick. |
The get the list of fingerprints used by an index, we use the `system.statement_statistics` directly, but non-admins don't have access to system table. Until cockroachdb#95756 or cockroachdb#95770 are done, we need to hide this feature for non-admins. Part Of cockroachdb#93087 Release note (ui change): Hide list of used fingerprint per index on Index Details page, for non-admin users.
94153: sql: Remove type annotations from `column_default` in `information_schema.columns` r=ZhouXing19 a=e-mbrown Informs: #87774 The type annotation on `column_default` caused confusion in postgres compatible apps. This change formats the `column_default` without the type annotation. It does not address the incorrect type annotation. Release note (bug fix): The content of `column_default` in `information_schema.columns` no longer have type annotations. 95855: backupccl: run restore/tpce/32tb/aws/nodes=15/cpus=16 once a week r=benbardin a=msbutler This patch adds the restore/tpce/32TB/aws/nodes=15/cpus=16 test to our weekly test suite. The backup fixture was generated by initing a tpce cluster with 2 Million customers, and running this workload while 48 incremental backups were taken at 15 minute increments. The roachtest restores from a system time captured in the 24th backup. The cluster used to restore the backup will run on aws with 15 nodes each with 16 vcpus. To verify this test exists, run: `roachtest list tag:weekly` Release note: None Epic: none 95861: kv: remove lastToReplica and lastFromReplica from raftMu r=nvanbenschoten a=nvanbenschoten Extracted from #94165 without modification. ---- In 410ef29, we moved these fields from under the Replica.mu to under the Replica.raftMu. This was done to avoid lock contention. In this commit, we move these fields under their own mutex so that they can be accessed without holding the raftMu. This allows us to send Raft messages from other goroutines. The commit also switches from calling RawNode.ReportUnreachable directly in sendRaftMessage to using the more flexible unreachablesMu set, which defers the call to ReportUnreachable until the next Raft tick. As a result, the commit closes #84246. Release note: None Epic: None 95876: sql: remove round-trips from common DISCARD ALL r=ajwerner a=ajwerner #### sql: avoid checking if a role exists when setting to the current role This is an uncached round-trip. It makes `DISCARD` expensive. In #86485 we implemented `SET SESSION AUTHORIZATION DEFAULT`, this added an extra sql query to `DISCARD ALL` to check if the role we'd become exists. This is almost certainly unintentional in the case where the role we'd become is the current session role. I think this just happened because of code consolidation. We now no longer check if the current session role exists. This removes the last round-trip from DISCARD ALL. #### sql: use in-memory session data to decide what to do in DISCARD In #86246 we introduced logic to discard schemas when running DISCARD ALL and DISCARD TEMP. This logic did expensive kv operations unconditionally; if the session knew it had never created a temporary schema, we'd still fetch all databases and proceed to search all databases for a temp schema. This is very expensive. Fixes: #95864 Release note (performance improvement): In 22.2, logic was added to make `SET SESSION AUTHORIZATION DEFAULT` not a no-op. This implementation used more general code for setting the role for a session which made sure that the role exists. The check for whether a role exists is currently uncached. We don't need to check if the role we already are exists. This improves the performance of `DISCARD ALL` in addition to `SET SESSION AUTHORIZATION DEFAULT`. Release note (performance improvement): In 22.2, we introduced support for `DISCARD TEMP` and made `DISCARD ALL` actually discard temp tables. This implementation ran expensive logic to discover temp schemas rather than consulting in-memory data structures. As a result, `DISCARD ALL`, which is issued regularly by connection pools, became an expensive operation when it should be cheap. This problem is now resolved. 95997: ui: hide list of statement for non-admins r=maryliag a=maryliag To get the list of fingerprints used by an index, we use the `system.statement_statistics` directly, but non-admins don't have access to system table. Until #95756 or #95770 are done, we need to hide this feature for non-admins. Part Of #93087 Release note (ui change): Hide list of used fingerprint per index on Index Details page, for non-admin users. 95998: roachtest: update activerecord blocklist r=ZhouXing19 a=andyyang890 This patch removes a few tests from the blocklist that do not exist in the test suite for rails 7.0.3, which were being run prior to the fix for version pinning. Informs #94211 Release note: None 96003: kv: add MVCC local timestamp details to ReadWithinUncertaintyIntervalError r=nvanbenschoten a=nvanbenschoten This PR adds details about MVCC local timestamps to ReadWithinUncertaintyIntervalErrors. This provides more information about the cause of uncertainty errors. The PR also includes a series of minor refactors to clean up this code. Release note: None Epic: None Co-authored-by: e-mbrown <[email protected]> Co-authored-by: Michael Butler <[email protected]> Co-authored-by: Nathan VanBenschoten <[email protected]> Co-authored-by: Andrew Werner <[email protected]> Co-authored-by: maryliag <[email protected]> Co-authored-by: Andy Yang <[email protected]>
[sql queries triage meeting] @maryliag would using views work, as mentioned above? |
I don't understand how that works, for example, what I would need to do to make a select as |
Okay, so the summary of the problem as I understand it is:
The proposed solution is:
The advantage of this over the system table is that you can grant privileges to these virtual tables now. An example of such a a wrapping is this: --- a/pkg/sql/crdb_internal.go
+++ b/pkg/sql/crdb_internal.go
@@ -182,6 +182,7 @@ var crdbInternal = virtualSchema{
catconstants.CrdbInternalTablesTableID: crdbInternalTablesTable,
catconstants.CrdbInternalClusterTxnStatsTableID: crdbInternalClusterTxnStatsTable,
catconstants.CrdbInternalTxnStatsTableID: crdbInternalTxnStatsView,
+ catconstants.CrdbInternalStatementStatisticsTableID: crdbInternalStatementStatisticsView,
catconstants.CrdbInternalTransactionStatsTableID: crdbInternalTransactionStatisticsTable,
catconstants.CrdbInternalZonesTableID: crdbInternalZonesTable,
catconstants.CrdbInternalInvalidDescriptorsTableID: crdbInternalInvalidDescriptorsTable,
@@ -6297,6 +6298,34 @@ CREATE TABLE crdb_internal.cluster_transaction_statistics (
},
}
+var crdbInternalStatementStatisticsView = virtualSchemaView{
+ schema: `CREATE VIEW crdb_internal.statement_statistics2 AS SELECT
+ aggregated_ts,
+ fingerprint_id,
+ transaction_fingerprint_id,
+ plan_hash,
+ app_name,
+ node_id,
+ agg_interval,
+ metadata,
+ statistics,
+ plan,
+ index_recommendations FROM system.statement_statistics`,
+ resultColumns: colinfo.ResultColumns{
+ {Name: "aggregated_ts", Typ: types.TimestampTZ},
+ {Name: "fingerprint_id", Typ: types.Bytes},
+ {Name: "transaction_fingerprint_id", Typ: types.Bytes},
+ {Name: "plan_hash", Typ: types.Bytes},
+ {Name: "app_name", Typ: types.String},
+ {Name: "node_id", Typ: types.Int},
+ {Name: "agg_interval", Typ: types.Interval},
+ {Name: "metadata", Typ: types.Jsonb},
+ {Name: "statistics", Typ: types.Jsonb},
+ {Name: "plan", Typ: types.Jsonb},
+ {Name: "index_recommendations", Typ: types.StringArray},
+ },
+}
+ |
To the point 4: I'm only using the system table because there is no way to do that on the combined view. My goal is to do that select on the combined view, because I do care with the latest data (that only exists in-memory) The point number 5 can be done as a workaround, until I can make the call to the combined view. |
Great. This issue does seem fitting as written for point 4. We need to have more advanced predicate push down into virtual tables and virtual indexes to realize the vision. |
In a recent escalation, we had a system.statement_statistics table with 2.9TiB of data, and queries that power the SQL Activity page were performing a full scan of that table. Fixing this issue would have (probably) helped there and perhaps made the escalation be resolved faster. |
@maryliag When we spoke IRL I believe you mentioned that you are not longer combining the in-memory data with persisted data because fanning out to each node to retrieve the in-memory data was too slow. I'm going to close this issue, because I don't think there's anything to be done here, but let me know if I'm mistaken. |
Add the ability to add/use index for range scans on virtual tables.
We have a feature where we display the list of fingerprints used per index, and to be possible we do a select such as
SELECT * FROM system.statement_statistics WHERE "1@1"::jsonb <@ indexes_usage
but we have to use the
system
table, since this is not possible on thecrdb_internal.statement_statistics
virtual table.We don't let non-admin users to do selects on system tables, so using the crdb_internal table without the index is very inefficient.
Jira issue: CRDB-23726
The text was updated successfully, but these errors were encountered: