Handle large datasets efficiently #582
@vgro, we will need an example simulation with at least one BIG file and some indication of where it is used, so we can explore how best to handle that memory-wise.
@vgro Do you happen to have a big file like this lying around? No pressure -- I've got lots to be getting on with elsewhere -- but I won't be able to start on this until there's some data for me to work with, so if you do have a chance to look at it over the next few weeks, that'd be great.
@alexdewar I'm terribly sorry that I haven't replied to this. I never received an email about the issue, and we haven't checked the issues systematically in a while. I have a few urgent tasks this week; I'll try to get something to you by the end of next week or so.
No worries @vgro. If it had been really urgent I'd have sent an email... Whenever you can send it through is fine.
If you want to run a simulation, you would probably need all the input data to have the same dimensions? For example, you would need the climate data to have the same spatial extent and time steps. Or would it be enough to provide one variable, say precipitation?
Ideally it would have the same dimensions. I haven't looked into it enough to know exactly what I'd need, but big files with somewhat realistic input data should do the trick. Don't spend too long on this -- feel free to just send it through once you've got something and I can let you know if I need anything different. |
dask is well-suited to this, as it handles lazy loading of chunked data.
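As a minimal sketch of the lazy-loading idea (assuming dask is installed; the array here is synthetic placeholder data, not real climate input):

```python
import dask.array as da

# Build a chunked array lazily: no chunk is materialised in memory yet.
# A real workflow might instead open a NetCDF file with
# xarray.open_dataset(..., chunks={...}) to get dask-backed variables.
arr = da.ones((10_000, 10_000), chunks=(1_000, 1_000))  # ~800 MB if fully loaded

# Operations only build a task graph; .compute() evaluates it, and
# reductions like sum() are processed chunk by chunk, so peak memory
# stays close to a few chunks rather than the whole array.
total = arr.sum().compute()
print(total)
```

The same pattern should apply to big simulation inputs: keep variables as chunked dask arrays and let reductions or per-timestep slices pull in only the chunks they need.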