Updates to SustainBenchCropYield dataset #1756
Merged
Fixes #1754. The whole dataset is only 291 MB, so we can safely load all of it in the constructor and then just access it in `__getitem__` (loading takes a few seconds at most).

This PR also updates the dataset to download the "soybeans_updated.zip" file from https://drive.google.com/drive/folders/1hsp2PlXAgcQ0pbx_vvPKHZcj_Am3rWx4, which appears to be the version of the dataset that the benchmark results reported at https://sustainlab-group.github.io/sustainbench/leaderboard/#crop-yield-prediction are based on.
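For readers unfamiliar with the pattern, here is a minimal sketch of loading everything in the constructor and serving items from memory. It is not the code in this PR: the class name `EagerDataset`, the `load_fn` parameter, and the file names are hypothetical, and a stand-in loader replaces real file I/O so the access pattern is visible.

```python
from typing import Any, Callable, List, Sequence


class EagerDataset:
    """Hypothetical dataset that reads every sample once, in the constructor."""

    def __init__(self, paths: Sequence[str], load_fn: Callable[[str], Any]) -> None:
        # Pay the full I/O cost once, up front. This is only reasonable
        # when the whole dataset fits comfortably in memory.
        self.samples: List[Any] = [load_fn(path) for path in paths]

    def __len__(self) -> int:
        return len(self.samples)

    def __getitem__(self, index: int) -> Any:
        # No disk access here: just index into the cached list.
        return self.samples[index]


# Stand-in loader (an assumption for illustration) that records each call.
load_calls: List[str] = []


def fake_load(path: str) -> dict:
    load_calls.append(path)
    return {"path": path}


ds = EagerDataset(["a.npz", "b.npz", "c.npz"], fake_load)
first = ds[0]
again = ds[0]  # repeated access does not trigger another load
```

The trade-off is a slower constructor and higher resident memory in exchange for `__getitem__` calls that never touch the disk, which suits a dataset of this size.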
@chrisyeh96 in case you are interested!