Run RawKNNClassifier._predict_fc as parallel to avoid memory issues #9

Open
wants to merge 5 commits into
base: main

grovduck
Member

Closes #8 by creating a joblib.Parallel job to retrieve predictions for a feature collection. Options are provided for specifying the size of the batch (chunk_size) and the number of threads to use (num_threads). Once all neighbors are retrieved, the result is stitched back together into an ee.FeatureCollection.
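
The chunk-and-stitch pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it swaps in the standard library's `ThreadPoolExecutor` for `joblib.Parallel` so that it runs without extra dependencies, and it uses a plain list and a hypothetical `fake_predict` function in place of an `ee.FeatureCollection` and the real neighbor lookup. Only the parameter names `chunk_size` and `num_threads` come from the description; the default values shown are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def split_into_chunks(items: Sequence, chunk_size: int) -> List[Sequence]:
    """Split items into consecutive batches of at most chunk_size elements."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def predict_in_chunks(
    items: Sequence,
    predict_chunk: Callable[[Sequence], List],
    chunk_size: int = 1000,  # assumed default, not taken from the PR
    num_threads: int = 4,    # assumed default, not taken from the PR
) -> List:
    """Run predict_chunk on each batch in a thread pool and stitch the results."""
    chunks = split_into_chunks(items, chunk_size)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Executor.map returns results in input order, so the stitched
        # output lines up with the original item order.
        per_chunk = list(pool.map(predict_chunk, chunks))
    return [row for chunk_result in per_chunk for row in chunk_result]


def fake_predict(chunk: Sequence[int]) -> List[dict]:
    """Hypothetical stand-in for retrieving neighbors for one batch."""
    return [{"id": i, "neighbors": [i + 1, i + 2]} for i in chunk]


if __name__ == "__main__":
    rows = predict_in_chunks(list(range(10)), fake_predict, chunk_size=3, num_threads=2)
    print(len(rows), rows[0])
```

Threads rather than processes are a reasonable fit here under the assumption that each batch spends most of its time waiting on a remote request, so the work is I/O-bound. Smaller batches keep each individual request's memory footprint down at the cost of more round trips.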

Note that this approach keeps the prediction server-side, which may not be the best workflow for this use case. Typically, feature collection mode is used either to cross-validate on the plots used to fit the model or to predict a new set of targets. We are investigating the possibility of 1) converting the feature collection client-side and 2) using sknnr to run the prediction locally.
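
As a rough illustration of the client-side conversion step, the sketch below turns a GeoJSON-style dictionary (the shape returned by calling `getInfo()` on an `ee.FeatureCollection`) into a list of IDs and a numeric feature matrix. The helper name, the property names (`PLOT_ID`, `elev`, `precip`), and the sample values are all hypothetical. The resulting matrix could then be passed to a locally fitted estimator, assuming sknnr exposes a scikit-learn-style `predict` or `kneighbors` method; that step is not shown.

```python
from typing import Any, Dict, List, Sequence, Tuple


def feature_collection_to_rows(
    fc: Dict[str, Any],
    columns: Sequence[str],
    id_property: str,
) -> Tuple[List[Any], List[List[float]]]:
    """Extract IDs and a feature matrix from a GeoJSON-style feature collection."""
    ids: List[Any] = []
    rows: List[List[float]] = []
    for feature in fc.get("features", []):
        props = feature.get("properties", {})
        ids.append(props[id_property])
        # Keep column order fixed so each row matches the fitted model's inputs.
        rows.append([float(props[name]) for name in columns])
    return ids, rows


# Hypothetical sample mimicking the dictionary structure of a feature collection.
SAMPLE_FC = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None,
         "properties": {"PLOT_ID": 1, "elev": 100, "precip": 5.5}},
        {"type": "Feature", "geometry": None,
         "properties": {"PLOT_ID": 2, "elev": 250, "precip": 3.0}},
    ],
}

if __name__ == "__main__":
    ids, X = feature_collection_to_rows(SAMPLE_FC, ["elev", "precip"], "PLOT_ID")
    print(ids, X)
```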

We will keep this PR open as we decide on the best path forward.

Successfully merging this pull request may close these issues.

User memory issues on immediate retrieval of predict for feature collections