kv: timetravel queries do not error on ClearRange'd data #31563
Comments
cc: @vivekmenezes
31326: sql: use ClearRange for index truncation r=eriktrinh a=eriktrinh

This change makes the deletion of index data use the ClearRange batch request. DROP INDEX schema changes in the same transaction as the one that created the table still use the slower DelRange, because ClearRange cannot be run inside a transaction and would remove the write intents of index keys that have not yet been resolved in that transaction. The deletion of index data happens once the GC TTL period has passed and it is safe to remove all the data; see PR #30566, which delays this data deletion. Note: see #31563 for a limitation where queries at old timestamps return inconsistent results for data that has been deleted with ClearRange.

Fixes #20696.

Release note (performance improvement): Deletion of index data is faster.

Co-authored-by: Erik Trinh <[email protected]>
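To illustrate the two paths described in that PR, here is a minimal SQL sketch (table and index names are invented for the example; the ClearRange/DelRange choice happens internally and is not visible in the SQL itself):

```sql
-- Case 1: DROP INDEX in the same transaction that created the table.
-- The index keys still carry this transaction's unresolved write intents,
-- so the slower, transactional DelRange path is used for cleanup.
BEGIN;
CREATE TABLE t1 (id INT PRIMARY KEY, v INT, INDEX v_idx (v));
DROP INDEX t1@v_idx;
COMMIT;

-- Case 2: DROP INDEX on a table committed earlier. The index data is
-- removed with ClearRange once the GC TTL has passed.
CREATE TABLE t2 (id INT PRIMARY KEY, v INT, INDEX v_idx (v));
DROP INDEX t2@v_idx;
```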
@vivekmenezes is there a plan here? This seems like a biggie to me; I don't think …
@andreimatei there is no plan for this. I think there is value here in considering returning an error from sql-land to handle this case.
Looking at #31504, a PR that attempted to address the issue, it seems that we got hung up on the semantics of moving the GCThreshold forward. The PR made setting a GCThreshold part of ClearRange, which would solve the issue observed here. However, since the GC threshold is per range, this raised questions about how it might affect unrelated colocated data not targeted by the ClearRange.

Also, has this been fixed? On … it seems as though we're smart enough to prevent use of the … Maybe @ajwerner can point to a specific change SQL has made here.
This is funny. I have a PR open right now to add back the ability to read data from a time before a descriptor was deleted. I “broke” it during this release. It turns out to be a somewhat important property for correctness in some cases, namely if you have a read-only transaction which knows (via pg_catalog or something like that) that some table exists but then cannot interact with it. It was one of the things breaking the randomized schema change tests. This does seem problematic re: ClearRange. What I wish we'd do is set the GC threshold in the same batch.
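A rough sketch of the inconsistency described above, assuming a table `t` whose data has already been removed by ClearRange (the timestamp and names are illustrative):

```sql
-- A historical, read-only transaction: the catalog says the table exists,
-- but its rows are already gone, and no error is raised.
BEGIN AS OF SYSTEM TIME '2018-10-17 00:00:00';
SELECT count(*) FROM pg_catalog.pg_class WHERE relname = 't'; -- reports 1: the table "exists"
SELECT * FROM t;                                              -- returns 0 rows instead of erroring
COMMIT;
```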
The real motivation for the above PR isn't about table data but rather about types and schemas, which have their descriptors removed immediately without a TTL. However, the same logic is what enables the code to discover old table descriptors after the TTL has expired and the data has been removed.
I've stewed on this a bit and I now don't think that the GC threshold is necessarily the right thing. Unfortunately this space is sort of complex. Ideally we'd find a way to expose a consistent view of the catalog to time travel queries. Right now we just blindly move the read timestamp of the kv txn and things almost perfectly just work out. However, this has its own problems, namely that we use historical privileges (https://groups.google.com/a/cockroachlabs.com/g/sql-schema-team/c/L4oUTiceGY8/m/J030sIylAQAJ, #51861). Ideally what we'd do is know the status of these dropped descriptors. However, it's even worse than all of this; we support setting a GC TTL for an index. We would need to track the status of that index somewhere, and currently that somewhere is not on the descriptor.

The point of all of this is that our time travel queries, when it comes to schema, are somewhat under-considered. Fixing them up will require a reasonably big investment. In the short term we are working slowly on building better abstractions, namely on pulling the catalog functionality which drives virtual table usage into the …

As far as the above PR goes, there is a more ideal thing we could do that still wouldn't be perfect. My issue is that I really want tables which have not been GC'd to still be accessible -- but I'm fine with preventing access after the descriptor has been dropped. The problem is that right now, when an AS OF SYSTEM TIME query wants to find the relevant descriptor, the way we do it is by having the current leased descriptor in memory and then walking the chain backwards using the …

The fix here is to instead take the timestamp we want and do an …

tl;dr
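As a loose illustration of the "take the timestamp we want" idea (this is not how the lookup works today, and the descriptor ID below is hypothetical), one could read the descriptor directly at the query's timestamp instead of walking back from the current leased version:

```sql
-- Illustrative only: read the table's descriptor as of the historical
-- timestamp rather than chaining backwards from the current version.
SELECT id, descriptor
FROM system.descriptor AS OF SYSTEM TIME '2018-10-17 00:00:00'
WHERE id = 53; -- hypothetical descriptor ID of the dropped table
```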
cc @vy-ton as a problem area
The solution here should probably have something to do with #62585.
This should go away magically after #70429.
This is fixed for 23.1, closing. |
Describe the problem
After a table is dropped and the underlying data has been deleted by the `ClearRange` command, queries with a transaction timestamp from before the table was dropped serve empty, invalid results. They should error, because by the time `ClearRange` has been issued, the GC TTL has already passed.

This also affects `drop index` when the index truncation uses `ClearRange`: if a transaction uses an older version of a table descriptor that still contains the index, incorrect results are returned when that index is used.

To Reproduce
1. `create table t(i int primary key);`
2. Insert some rows into `t`.
3. `select now();`
4. `alter table t configure zone with gc.ttlseconds={ttlseconds}`
5. `drop table t`
6. Wait until the `drop` job has completed.
7. `select * from t as of system time '{timestamp from 3}'`
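For illustration, a sketch of what the last step does today versus what this issue argues it should do (the exact error text is an assumption; it mirrors the GC-threshold error CockroachDB reports for reads below a range's GC threshold in other situations):

```sql
-- Actual: the historical read silently returns an empty result.
SELECT * FROM t AS OF SYSTEM TIME '{timestamp from 3}';
--   i
-- -----
-- (0 rows)

-- Expected: an error along the lines of the usual GC-threshold error, e.g.
--   ERROR: batch timestamp 1539734400.000000000,0 must be after replica GC threshold 1539738000.000000000,0
-- so that the query fails instead of serving data that no longer exists.
```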
Jira issue: CRDB-4786