Write out chunks #203

Open
umertens opened this issue Oct 29, 2024 · 1 comment

umertens commented Oct 29, 2024

Hi,

This is not an issue but rather a basic question; I hope you can still help me.

I would like to store my data locally in chunks because I do not know the final size beforehand. Eventually, I want to store n training examples, each a 2D tensor of shape (m, l). Since the first dimension n is not known up front, I would like to write chunks of, say, 512 training examples at a time and resize the first dimension accordingly.

Once the data is stored, I want to load a random batch for further use in Torch.

Thanks in advance for your help! :)

@BrianMichell

It sounds like you're looking for the resize method. You'd initialize the store to some arbitrarily large dimensions and then resize it with resize_tied_bounds once you know the final extent of your store.

Here's a snippet of our C++ code that handles just that. We use the implicit dims as the lower bound because we expect everything to have an origin at zero.
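
For reference, here is a minimal sketch of what that pattern can look like with TensorStore's C++ API. This is not the snippet referenced above; the zarr driver, store path, dtype, and sizes are placeholder choices for illustration.

```cpp
#include <iostream>

#include "tensorstore/array.h"
#include "tensorstore/index.h"
#include "tensorstore/index_space/dim_expression.h"
#include "tensorstore/open.h"
#include "tensorstore/resize_options.h"
#include "tensorstore/tensorstore.h"

int main() {
  using tensorstore::Index;
  using tensorstore::kImplicit;

  // Hypothetical sizes: each example is an (m, l) tensor, and the first
  // dimension is reserved arbitrarily large because n is unknown up front.
  constexpr Index kMaxExamples = 1'000'000;
  constexpr Index m = 64;
  constexpr Index l = 128;

  // Create a local zarr store whose chunking matches the 512-example batches.
  auto store =
      tensorstore::Open(
          {{"driver", "zarr"},
           {"kvstore", {{"driver", "file"}, {"path", "training_data.zarr"}}},
           {"metadata",
            {{"dtype", "<f4"},
             {"shape", {kMaxExamples, m, l}},
             {"chunks", {512, m, l}}}}},
          tensorstore::OpenMode::create, tensorstore::ReadWriteMode::read_write)
          .result()
          .value();

  // Write one 512-example chunk into [0, 512) along the first dimension.
  auto batch = tensorstore::AllocateArray<float>({512, m, l});
  // ... fill `batch` with training examples ...
  auto region = (store | tensorstore::Dims(0).HalfOpenInterval(0, 512)).value();
  tensorstore::Write(batch, region).commit_future.Wait();

  // Once the final number of examples is known, shrink the first dimension.
  // The lower bounds are left implicit (kImplicit), matching a zero origin,
  // and resize_tied_bounds is the mode mentioned above.
  const Index n_final = 512;  // pretend this many examples were written
  const Index new_inclusive_min[] = {kImplicit, kImplicit, kImplicit};
  const Index new_exclusive_max[] = {n_final, kImplicit, kImplicit};
  auto resized =
      tensorstore::Resize(store, new_inclusive_min, new_exclusive_max,
                          tensorstore::resize_tied_bounds)
          .result()
          .value();

  std::cout << "final domain: " << resized.domain() << std::endl;
}
```

Leaving every bound other than the first dimension's upper bound as kImplicit means only that extent changes, which matches the zero-origin assumption described above.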
