[Feature] Reduce the access of filesystem when reading paimon table #2978

Closed
2 tasks done
wg1026688210 opened this issue Mar 8, 2024 · 6 comments
Labels
enhancement New feature or request

Comments

@wg1026688210
Contributor

wg1026688210 commented Mar 8, 2024

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

Currently, the computation engine frequently reads the schema file and the sizes of the data files from the filesystem when reading a Paimon table. This leads to additional filesystem access. The purpose of this issue is to reduce filesystem access by caching the schema, the Paimon table, and so on.

Solution

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@wg1026688210 wg1026688210 added the enhancement New feature or request label Mar 8, 2024
@zhangjun0x01
Contributor

I submitted a PR to cache tables in the catalog, #2971.
Could you help me review it?

@zhangjun0x01
Contributor

We can also cache manifest files to improve query speed; after all, plan() is also a very heavy operation.

https://docs.cloudera.com/cdw-runtime/cloud/iceberg-how-to/topics/iceberg-manifest-caching.html

@JingsongLi
Contributor

> I submitted a PR to cache tables in the catalog, #2971. Could you help me review it?

@zhangjun0x01 This is a separate issue. In theory, we can cache all the files.

@JingsongLi
Contributor

@wg1026688210 I think we have finished this issue, thanks!

@zhangjun0x01
Contributor

> > I submitted a PR to cache tables in the catalog, #2971. Could you help me review it?
>
> @zhangjun0x01 This is a separate issue. In theory, we can cache all the files.

Yes, we can open another PR to cache manifests, like apache/iceberg#4518.

@Kyofin

Kyofin commented Nov 20, 2024

> > > I submitted a PR to cache tables in the catalog, #2971. Could you help me review it?
> >
> > @zhangjun0x01 This is a separate issue. In theory, we can cache all the files.
>
> Yes, we can open another PR to cache manifests, like apache/iceberg#4518.

Is there any progress? This is a very important requirement.
