Sharding the dataset #85
Will there be support for a sharded version of the data loader (e.g. for iWildCam) instead of reading from individual JPG images? I find that reading the data is sometimes very slow on network drives. Any suggestions? Thanks!
Replies: 1 comment
We currently don't have any plans to add sharded data loaders, sorry, though you're definitely welcome to write your own using the underlying `WILDSDataset` classes, and we'd be happy to look it over! For our own experiments, we found it helpful to first copy (a compressed version of) the data from the network drive to the local disk before running the script. Would that help?

Other things that you might already be doing include increasing the number of CPUs available for the job and increasing `num_workers` for the data loader.
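
In case it's useful, here is a minimal sketch of the "copy a compressed version to local disk first" step. It is only an illustration: the function name `stage_to_local_disk`, the `.tar.gz` format, and the `data` subfolder are assumptions, not part of WILDS, and the archive is assumed to contain the dataset folder in the layout WILDS expects.

```python
import shutil
import tarfile
from pathlib import Path


def stage_to_local_disk(archive_on_network: Path, local_root: Path) -> Path:
    """Copy a .tar.gz archive to local disk and extract it there.

    Hypothetical helper (not a WILDS function). Returns the directory
    the archive was extracted into.
    """
    local_root.mkdir(parents=True, exist_ok=True)
    local_archive = local_root / archive_on_network.name

    # One large sequential read from the network drive, instead of
    # many small per-image reads during training.
    shutil.copy2(archive_on_network, local_archive)

    extract_dir = local_root / "data"  # assumed folder name
    with tarfile.open(local_archive, "r:gz") as tar:
        tar.extractall(extract_dir)

    # The compressed copy is no longer needed once extracted.
    local_archive.unlink()
    return extract_dir


# Assumed follow-up (requires the wilds package, so left as comments):
#   dataset = get_dataset(dataset="iwildcam", root_dir=str(extract_dir))
#   loader = get_train_loader("standard", train_data,
#                             batch_size=16, num_workers=8)
```

The returned directory would then be passed as `root_dir` when constructing the dataset, and `num_workers` is forwarded to the underlying PyTorch `DataLoader`. The specific values (`batch_size=16`, `num_workers=8`) are placeholders to tune for your machine.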