
Make gpu(::DataLoader) work or error loudly if it doesn't #1974

Closed

ToucheSir opened this issue May 27, 2022 · 6 comments · Fixed by #1976

Comments

@ToucheSir
Member

This has come up a couple of times, most recently in https://discourse.julialang.org/t/training-with-flux-jl-on-the-gpu-causes-argumenterror-cannot-take-the-cpu-address-of-a-cuarray/81751/2. RFC on what the desired approach should be.

@DhairyaLGandhi
Member

Is DataLoader(gpu(..)) sufficiently succinct syntax?
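
For concreteness, that pattern would look roughly like this (a sketch with placeholder data, assuming the whole dataset fits in GPU memory):

using Flux, CUDA

xtrain, ytrain = rand(Float32, 10, 100), rand(Float32, 1, 100)  # placeholder data
# move the whole dataset to the GPU once, then batch over device arrays
train_loader = Flux.DataLoader((gpu(xtrain), gpu(ytrain)); batchsize = 16, shuffle = true)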

@darsnack
Member

I don't think gpu(DataLoader(...)) should be made to work, because we don't know whether the entire dataset fits in GPU memory or not. Depending on your dataset, you might want gpu.(...) or mapobs(gpu, ...). A loud error suggesting either would be good.
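
For example, the two options might look like this (a sketch; mapobs comes from MLUtils, and the data names are placeholders):

using Flux, MLUtils, CUDA

data = (rand(Float32, 10, 100), rand(Float32, 1, 100))  # placeholder dataset
# eager: the dataset fits in GPU memory, so move it all at once
eager_loader = Flux.DataLoader(gpu.(data); batchsize = 16)
# lazy: move observations to the GPU only when they are fetched
lazy_loader = Flux.DataLoader(mapobs(gpu, data); batchsize = 16)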

@DhairyaLGandhi
Member

Right: suggesting that the data be moved prior to constructing the DataLoader lets users express their intent, namely whether they want to move the whole dataset or not. Beyond that, allowing DataLoader to accept gpu as a preprocessing step would allow expressing "load minibatches onto the GPU". CuIterator is another option; see the sketch below.
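
For reference, the CuIterator route would look roughly like this (a sketch; CuIterator comes from CUDA.jl and eagerly frees each batch's device memory once the next batch is fetched):

using Flux, CUDA

xtrain, ytrain = rand(Float32, 10, 256), rand(Float32, 1, 256)  # placeholder data
loader = Flux.DataLoader((xtrain, ytrain); batchsize = 64, shuffle = true)
for (x, y) in CuIterator(loader)
    # x and y are CuArrays that are only valid for this iteration
    @assert x isa CuArray && y isa CuArray
end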

@lfenzo
Contributor

lfenzo commented May 28, 2022

When I created this post I knew that my data would fit in GPU memory (it was not a very big dataset), but I didn't really understand why the data was not being loaded onto the GPU. I thought that if the whole DataLoader object was fed to the GPU (using dataloader |> gpu), then so would be the data inside it. I also thought that transferring the whole training dataset to the GPU, since there was enough memory for it, would be better than iteratively transferring the training batches, but my tests showed no significant difference for that particular dataset and training configuration.

IMO loading the whole object with its corresponding data onto the GPU with gpu(::DataLoader) is fine. If the GPU runs out of memory during allocation, an error will be thrown, and the stacktrace will make it possible to identify its source; alternatively, the gpu methods could point out which object caused the allocation that triggered the error.

The real issue, though, was that I couldn't find any good examples of GPU usage in Flux until I decided to dig into the model zoo for more practical usage scenarios. It would have been easier to solve this issue if the "GPU Support" page in the docs presented some workflows, similar to the ones in the model zoo, showing how to transfer artifacts between CPU and GPU. Common operations such as transferring data and models, exemplified and explained, would be extremely helpful!

Some of the examples could include, for instance:

  • Loading data to the GPU to perform training
train_loader = Flux.DataLoader((xtrain, ytrain), batchsize = 64, shuffle = true)
# ... model, optimizer and loss definitions
for epoch in 1:nepochs
    for (xtrain_batch, ytrain_batch) in train_loader
        x = gpu(xtrain_batch)
        y = gpu(ytrain_batch)
        gradients = gradient(() -> loss(x, y), parameters)
        Flux.Optimise.update!(optimizer, parameters, gradients)
    end
end
  • Or returning the model back to the CPU in order to save it
let model = cpu(model)
    BSON.@save modelpath model epoch
end

@ToucheSir
Member Author

I think a big argument for not exposing gpu(::DataLoader) is that you can't tell whether or not it worked. gpu is necessarily a no-op for any types it doesn't understand, and the internals of DataLoader are completely opaque to the user. Therefore, it's much easier to ask that data be moved before it is passed to or after it is pulled from the data loader. If I had to guess, this is also part of why PyTorch exposes .to(device) on Datasets but not DataLoaders.
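
To illustrate the silent failure mode (a sketch of the pre-#1976 behaviour as I understand it, with placeholder data):

using Flux, CUDA

loader = Flux.DataLoader(rand(Float32, 10, 100); batchsize = 10)
moved = gpu(loader)       # no error is raised...
first(moved) isa CuArray  # ...but this can still be false, because gpu
                          # falls back to the identity on the opaque DataLoader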

But as you pointed out, the first-order issue is a lack of good docs. If you have a rough idea of how you'd like them to look, please do drop a PR and we'll make sure it gets in. One reason I think the status quo has persisted for so long is that the most efficient method we have for GPU data movement is https://cuda.juliagpu.org/stable/usage/memory/#Batching-iterator, but it's not quite general enough for all datasets people want to use. However, that can be remedied by just having slightly longer documentation which presents multiple alternatives.

@DhairyaLGandhi
Member

DhairyaLGandhi commented May 29, 2022 via email
