Allow Delta flush_metadata_cache after table creation #17174

Member

@findepi findepi commented Apr 21, 2023

Support the following scenario:

  • the table was checked via Trino, so the Delta connector knows the table
    doesn't exist and has cached that (CachingMetastore also caches misses)
  • the table is created externally
  • flush_metadata_cache is used so that the table becomes accessible via
    Trino
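
The scenario above can be illustrated with a minimal, self-contained sketch of a cache that also stores misses. All names here (NegativeCacheSketch, getTableLocation, flushMetadataCache, the location string) are hypothetical stand-ins for illustration and are not Trino's actual classes or signatures.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; not Trino's CachingHiveMetastore.
class NegativeCacheSketch
{
    // Stand-in for the real metastore: table name -> table location
    final Map<String, String> metastore = new HashMap<>();
    // Cache that stores hits and misses alike; Optional.empty() is a cached miss
    final Map<String, Optional<String>> cache = new HashMap<>();

    Optional<String> getTableLocation(String tableName)
    {
        // The lookup result is cached even when the table does not exist
        return cache.computeIfAbsent(tableName, name -> Optional.ofNullable(metastore.get(name)));
    }

    void flushMetadataCache(String tableName)
    {
        // Drop whatever is cached for this table, including a cached miss
        cache.remove(tableName);
    }

    public static void main(String[] args)
    {
        NegativeCacheSketch sketch = new NegativeCacheSketch();
        System.out.println(sketch.getTableLocation("t").isPresent()); // the miss gets cached
        sketch.metastore.put("t", "s3://bucket/t"); // table created externally
        System.out.println(sketch.getTableLocation("t").isPresent()); // still the cached miss
        sketch.flushMetadataCache("t");
        System.out.println(sketch.getTableLocation("t").isPresent()); // visible after the flush
    }
}
```

Without the flush, the second lookup keeps returning the cached miss until the entry expires or is evicted, which is the situation this change addresses.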

findepi added 2 commits April 21, 2023 17:51
@cla-bot cla-bot bot added the cla-signed label Apr 21, 2023
@findepi findepi added the no-release-notes This pull request does not require release notes entry label Apr 21, 2023
@findepi findepi force-pushed the findepi/allow-delta-flush-metadata-cache-after-table-creation-7221ed branch from 805d359 to 12e5845 Compare April 21, 2023 16:02
extendedStatisticsAccess.invalidateCache(tableLocation);
// This may insert into a cache, but this will get invalidated below. TODO fix Delta so that flush_metadata_cache doesn't have to read from metastore
Optional<Table> tableBeforeFlush = metastore.getTable(schemaName.get(), tableName.get());
cachingHiveMetastore.ifPresent(caching -> caching.invalidateTable(schemaName.get(), tableName.get()));
Member

Does invalidateTable throw something if you call it with a schema/table that isn't cached? Seems like we should be able to just call it blindly without calling getTable first.

Member Author

> Does invalidateTable throw something if you call it with a schema/table that isn't cached?

It does not.

> Seems like we should be able to just call it blindly without calling getTable first.

We call getTable to learn the table location (this is pre-existing behavior).

We invalidate after that call, so that the cache is empty in the end state.
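
The ordering described in this reply can be sketched roughly as follows. This is a simplified illustration under assumed names (FlushOrderSketch, metastoreCache, transactionLogCache are hypothetical); it is not the connector's actual flush procedure.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the call ordering; not the actual Trino procedure.
class FlushOrderSketch
{
    // Stand-in for the real metastore: table name -> table location
    final Map<String, String> metastore = new HashMap<>();
    // Metastore cache keyed by table name; it caches misses as Optional.empty()
    final Map<String, Optional<String>> metastoreCache = new HashMap<>();
    // Stand-in for a cache keyed by table location rather than by name
    final Map<String, String> transactionLogCache = new HashMap<>();

    Optional<String> getTableLocation(String table)
    {
        return metastoreCache.computeIfAbsent(table, name -> Optional.ofNullable(metastore.get(name)));
    }

    void flush(String table)
    {
        // 1. Look up the location; this may insert an entry into metastoreCache
        Optional<String> location = getTableLocation(table);
        // 2. Invalidate afterwards, so the metastore cache holds nothing for this table at the end
        metastoreCache.remove(table);
        // 3. Location-keyed caches can only be invalidated when the location is known
        location.ifPresent(transactionLogCache::remove);
    }
}
```

Swapping steps 1 and 2 would leave the entry inserted by the lookup in the cache after the flush completes.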

Comment on lines +115 to +116
tableLocation.ifPresent(transactionLogAccess::invalidateCaches);
tableLocation.ifPresent(extendedStatisticsAccess::invalidateCache);
Member

@alexjo2144 alexjo2144 Apr 21, 2023

Maybe the cache key for these should be a (tableName, tableLocation) tuple? That way you could invalidate them by name, without looking up the location first.
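
One way to picture this suggestion is a composite key that carries both the name and the location, so entries can be dropped by name alone. The sketch below assumes hypothetical names (NameAndLocationKeySketch, CacheKey, invalidateByName) and a plain map in place of the connector's real caches.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a (tableName, tableLocation) cache key.
class NameAndLocationKeySketch
{
    // Records provide equals/hashCode, so they work directly as map keys
    record CacheKey(String tableName, String tableLocation) {}

    // Stand-in for a cached value such as a table snapshot
    final Map<CacheKey, String> snapshots = new ConcurrentHashMap<>();

    void invalidateByName(String tableName)
    {
        // Remove every entry for this table, whatever its location
        snapshots.keySet().removeIf(key -> key.tableName().equals(tableName));
    }
}
```

The trade-off in this sketch is a linear scan over the keys on invalidation, in exchange for not having to read the location from the metastore.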

Member Author

Yes, I think so.

Member Author

For now, I added "TODO fix Delta so that flush_metadata_cache doesn't have to read from metastore".
I would prefer to revisit this after #17092.

Member Author

-> #17214

@github-actions github-actions bot added the delta-lake Delta Lake connector label Apr 21, 2023
.skippingTypesCheck() // Delta has no parametric varchar
.matches("TABLE tpch.tiny.region");

assertUpdate("DROP TABLE flush_metadata_after_table_created");
Member

nit:

Suggested change
- assertUpdate("DROP TABLE flush_metadata_after_table_created");
+ assertUpdate("DROP TABLE " + tableName);

Member Author

Missed this one; will resolve in #17214.

Member Author

findepi commented Apr 24, 2023

CI #17210 (fix: #17211)
and #17203

@findepi findepi merged commit 593b4aa into trinodb:master Apr 24, 2023
@findepi findepi deleted the findepi/allow-delta-flush-metadata-cache-after-table-creation-7221ed branch April 24, 2023 12:19
@github-actions github-actions bot added this to the 415 milestone Apr 24, 2023
Labels
cla-signed delta-lake Delta Lake connector no-release-notes This pull request does not require release notes entry
4 participants