[META] Composite Directory Implementation for Writable Warm #13149

Open
11 tasks
rayshrey opened this issue Apr 10, 2024 · 0 comments
Assignees
Labels
Meta Meta issue, not directly linked to a PR Roadmap:Cost/Performance/Scale Project-wide roadmap label Storage:Remote

Comments

@rayshrey
Contributor

rayshrey commented Apr 10, 2024

Please describe the end goal of this project

Implement Composite Directory, a new directory implementation in which data is backed by a remote store and not all of it needs to be stored locally. The directory behaves like a local directory when complete files are present on disk, and falls back to on-demand fetch from the remote store when data is not present locally. The on-demand fetch can be extended to block-level or non-block-level fetch.
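The read path described above can be sketched roughly as follows. This is a minimal illustration, not the OpenSearch implementation: `CompositeDirectorySketch`, its in-memory `local` and `remote` dictionaries, and the `remote_fetches` counter are all hypothetical stand-ins for the local directory, the remote store, and whatever metrics the real code keeps.

```python
from typing import Dict


class CompositeDirectorySketch:
    """Hypothetical stand-in; not the OpenSearch CompositeDirectory class."""

    def __init__(self, local: Dict[str, bytes], remote: Dict[str, bytes]) -> None:
        self.local = local        # files fully present on local disk (assumed in-memory here)
        self.remote = remote      # copy held in the remote store (assumed in-memory here)
        self.remote_fetches = 0   # counts how often the fallback path was taken

    def read(self, name: str, offset: int, length: int) -> bytes:
        # Behave like a plain local directory when the complete file is on disk.
        data = self.local.get(name)
        if data is not None:
            return data[offset:offset + length]
        # Otherwise fall back to an on-demand fetch of the requested range.
        if name not in self.remote:
            raise FileNotFoundError(name)
        self.remote_fetches += 1
        return self.remote[name][offset:offset + length]
```

In this sketch a read of a file that exists only remotely increments `remote_fetches`, while a read of a locally present file never touches the remote store.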

We will also need to enhance the existing FileCache implementation, since Composite Directory depends heavily on it.

Supporting References

Composite Directory RFC - #12781

Issues

Basic Components:

Follow-up action items from the above PRs:

  • Optimize readInput for Block files of RemoteDirectory to avoid loading the whole 8MB chunk into memory
  • Explore and verify the impact of temp files in the directory, as they are not tracked in FileCache
  • Add a new role for Writable Warm for the FileCache initialization logic (instead of depending on a Feature Flag)
  • Rename OnDemandSnapshotIndexInput and FileInfo to more generic names, since they are now also used by Composite Directory, which is not snapshot-related
  • Optimize listAll() of RemoteSegmentStoreDirectory and make it more robust (fix the NPE on empty metadata and improve its complexity)
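As an illustration of the first item, one common way to avoid holding a whole chunk in memory is to copy it through a small, reusable buffer. The issue does not say how readInput will be optimized, so the sketch below is only an assumption: `copy_in_chunks` and the 64 KiB buffer size are hypothetical, and `io.BytesIO` stands in for the remote stream and the local block file.

```python
import io
from typing import BinaryIO


def copy_in_chunks(source: BinaryIO, sink: BinaryIO, buffer_size: int = 64 * 1024) -> int:
    """Copy source to sink using one fixed-size buffer; return bytes copied."""
    buffer = bytearray(buffer_size)
    view = memoryview(buffer)
    copied = 0
    while True:
        n = source.readinto(view)   # fills at most buffer_size bytes
        if not n:                   # 0 (or None) means nothing more to read
            break
        sink.write(view[:n])
        copied += n
    return copied


# Usage: peak extra memory is the buffer, not the size of the block being copied.
block = bytes(range(256)) * 1000
out = io.BytesIO()
total = copy_in_chunks(io.BytesIO(block), out)
```

With this approach the memory cost per read is bounded by `buffer_size` regardless of the block size.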

Backlogs:

  • Use String instead of Path as the FileCache key, so that the local directory is not required to be an FSDirectory
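To show what a string-keyed cache might look like, here is a small size-bounded LRU sketch. It is not the OpenSearch FileCache: `StringKeyedCacheSketch`, its byte-based capacity, and the eviction policy are assumptions made for illustration. The only point is that the key is an opaque string, so nothing in the cache requires a filesystem path.

```python
from collections import OrderedDict
from typing import Optional


class StringKeyedCacheSketch:
    """Hypothetical size-bounded LRU cache keyed by plain strings."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._entries: "OrderedDict[str, int]" = OrderedDict()  # key -> entry size

    def put(self, key: str, size: int) -> None:
        # Replace any previous entry for the same key.
        if key in self._entries:
            self.used_bytes -= self._entries.pop(key)
        self._entries[key] = size
        self.used_bytes += size
        # Evict least recently used entries until usage fits the capacity,
        # always keeping the entry that was just inserted.
        while self.used_bytes > self.capacity_bytes and len(self._entries) > 1:
            _, evicted_size = self._entries.popitem(last=False)
            self.used_bytes -= evicted_size

    def get(self, key: str) -> Optional[int]:
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]
```

Because the key is just a string, a caller could use any stable identifier for a file or block, whether or not the local directory is backed by the filesystem.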

Related component

Storage:Remote

Projects
Status: New
Status: Release v2.16 (7/23/24)