Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve test time for testReadMetadataWithRelationsConcurrentModifications in Iceberg #13642

Conversation

findepi
Copy link
Member

@findepi findepi commented Aug 12, 2022

Fixes #13556

alternative to #13641

…tions in Iceberg

Profiling of `testReadMetadataWithRelationsConcurrentModifications` with
`TestIcebergParquetConnectorTest` shows that most time is spent listing
metadata, and of which, most time is spending listing tables in the
TESTING_FILE_METASTORE. This commit improves this by adding simple
listing cache on top of the file system.
@alexjo2144
Copy link
Member

alexjo2144 commented Aug 12, 2022

This feels sketchy to me. If you want caching, we have CachingHiveMetastore, do we really need caching at both levels?

@findepi
Copy link
Member Author

findepi commented Aug 12, 2022

This feels sketchy to me. If you want caching, we have CachingHiveMetastore, do we really need caching at both levels?

Apparently, yes. This gave me big speed up in the test.
Maybe because we don't have #13115
But in any case, I like the simplicity of this "10x" improvement. We can remove it once and when it becomes obsolete.

Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add TODO to remove these linked to #13115

I prefer this over the alternative PR but to be honest I'm not a fan of either - the other alternative being to disable this test.

LGTM however.

@ebyhr
Copy link
Member

ebyhr commented Aug 18, 2022

Merged as #13716 with TODO comment.

@ebyhr ebyhr closed this Aug 18, 2022
@findepi findepi deleted the findepi/improve-test-time-for-testreadmetadatawithrelationsconcurrentmodifications-in-iceberg-d3c301 branch August 22, 2022 08:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed no-release-notes This pull request does not require release notes entry test
Development

Successfully merging this pull request may close these issues.

Flaky TestIcebergParquetConnectorTest.testReadMetadataWithRelationsConcurrentModifications
4 participants