New data store preload API #1097

Merged
merged 9 commits into from
Dec 27, 2024

Conversation

@forman (Member) commented Dec 18, 2024

Added a new preload API to xcube data stores:

  • Enhanced the xcube.core.store.DataStore class to optionally support
    preloading of datasets via an API represented by the
    new xcube.core.store.DataPreloader interface.
  • Added handy default implementations NullPreloadHandle and ExecutorPreloadHandle
    to be returned by implementations of the prepare_data() method of a
    given data store.
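
A minimal sketch of how an executor-backed preload handle might work, using only the Python standard library. `SketchPreloadHandle`, its `get_state()` and `close()` methods, and the state strings are hypothetical names chosen for illustration; this is not the code of `ExecutorPreloadHandle` from this PR. The only detail taken from the PR is the shape of the callback, which receives the handle and a data ID.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


class SketchPreloadHandle:
    """Hypothetical stand-in, not xcube's ExecutorPreloadHandle."""

    def __init__(
        self,
        data_ids: tuple[str, ...],
        preload_data: Callable[["SketchPreloadHandle", str], None],
    ):
        self._preload_data = preload_data
        # One state entry per data ID, set before any job can start.
        self._states = {data_id: "pending" for data_id in data_ids}
        self._executor = ThreadPoolExecutor(max_workers=4)
        # Submit one background job per data ID.
        self._futures = {
            data_id: self._executor.submit(self._run, data_id)
            for data_id in data_ids
        }

    def _run(self, data_id: str) -> None:
        # Runs in a worker thread and records the outcome.
        self._states[data_id] = "running"
        try:
            self._preload_data(self, data_id)
        except Exception:
            self._states[data_id] = "failed"
            raise
        self._states[data_id] = "done"

    def get_state(self, data_id: str) -> str:
        return self._states[data_id]

    def close(self) -> None:
        # Wait for outstanding jobs, then release the worker threads.
        self._executor.shutdown(wait=True)


loaded = []
handle = SketchPreloadHandle(
    ("a.zip", "b.zip"),
    lambda h, data_id: loaded.append(data_id),
)
handle.close()
```

After `close()` returns, both jobs have finished, so every state can be inspected safely from the calling thread.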

Closes #1093

Checklist:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/source/*
  • Changes documented in CHANGES.md
  • GitHub CI passes
  • AppVeyor CI passes
  • Test coverage remains or increases (target 100%)

@konstntokas (Contributor) left a comment

I really like that we already have the multi-threading in there. I think this makes it fairly easy to do the implementation in the data store plugins.

I tested the notebook locally, and it works fine!

xcube/core/store/preload.py
    for data_id, future in self._futures.items():
        if f is future:
            break
    if data_id is None:
Contributor

Why is this needed?

Member Author

I search for the data_id that belongs to the passed future f.

I'll add a comment.
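
A small sketch of the lookup described here, assuming a dictionary that maps data IDs to futures. The function name `find_data_id` and the sample keys are hypothetical and do not come from the PR.

```python
from concurrent.futures import Future
from typing import Optional


def find_data_id(futures: dict[str, Future], f: Future) -> Optional[str]:
    """Return the data ID whose future is `f`, or None if there is none."""
    for data_id, future in futures.items():
        # Identity comparison: we want this exact future object.
        if f is future:
            return data_id
    return None


first, second = Future(), Future()
futures = {"a.zip": first, "b.zip": second}
found = find_data_id(futures, second)
missing = find_data_id(futures, Future())
```

Returning from inside the loop means the not-found case yields `None` explicitly, rather than leaving the loop variable bound to the last key.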

Member Author

Done.

@forman forman requested a review from konstntokas December 18, 2024 16:22
@konstntokas (Contributor) left a comment

I just saw that the failed tests are not related to the preload API, so I approve :)

@b-yogesh (Contributor) left a comment

It looks good overall, thanks. I have added some comments below, please have a look.

@@ -0,0 +1,368 @@
{
Contributor

We can also add the context manager example in the notebook.
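
A rough sketch of what a context manager example could look like, assuming the handle supports the `with` statement and cleans up on exit. `ClosingPreloadHandle` and its attributes are hypothetical and built only on the standard library; the real handle classes in this PR may behave differently.

```python
from concurrent.futures import ThreadPoolExecutor


class ClosingPreloadHandle:
    """Hypothetical handle that cleans up when its `with` block exits."""

    def __init__(self, data_ids):
        self.preloaded = []
        self.closed = False
        self._executor = ThreadPoolExecutor(max_workers=2)
        # "Preloading" here only records the data ID in a list.
        self._futures = [
            self._executor.submit(self.preloaded.append, data_id)
            for data_id in data_ids
        ]

    def close(self):
        # Wait for outstanding jobs, then release the worker threads.
        self._executor.shutdown(wait=True)
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions from the block


with ClosingPreloadHandle(["a.zip", "b.zip"]) as handle:
    pass  # open and use the preloaded datasets here
```

The benefit of this form is that `close()` runs even if the body of the `with` block raises.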

    def __init__(
        self,
        data_ids: tuple[str, ...],
        preload_data: Callable[[PreloadHandle, str], None] | None = None,
Contributor

It might be useful to allow preload_params in the preload_data function.

For example, I had a case with multiple TIFF files inside a ZIP file that should be merged if the user requests it via a boolean keyword.

Member Author

I don't understand the use case. Why don't you just add a dedicated opener parameter then?
You could also use functools.partial().
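
A short sketch of the `functools.partial()` approach, assuming a preload function that takes a handle and a data ID as in the signature above. The function `preload_zip`, the keyword `merge_tiffs`, and the list standing in for a handle are all hypothetical.

```python
from functools import partial


def preload_zip(handle, data_id, merge_tiffs=False):
    """Hypothetical preload function with one extra keyword option."""
    suffix = "merged" if merge_tiffs else "separate"
    handle.append(f"{data_id}:{suffix}")


# Bind the keyword up front; the result still accepts (handle, data_id).
preload_data = partial(preload_zip, merge_tiffs=True)

log = []  # stands in for a preload handle in this sketch
preload_data(log, "scenes.zip")
```

The bound callable keeps the two-argument shape expected by the handle, so the extra option travels with the function rather than through the handle's constructor.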

@forman forman requested a review from b-yogesh December 27, 2024 07:04
@b-yogesh (Contributor) left a comment

Approved! Added a tiny change suggestion. Please have a look before merging.
Btw, the tests are failing.

@forman forman marked this pull request as ready for review December 27, 2024 14:40
@forman forman merged commit 3473d04 into main Dec 27, 2024
0 of 2 checks passed
@forman forman deleted the forman-1093-preload_api branch December 27, 2024 14:41
Development

Successfully merging this pull request may close these issues.

Allow data stores to prepare accessing inert resources
3 participants