How to modify baseline architectures #197
Hi @jainspoornima. Can you elaborate a bit on your question?
Hi @georgeyiasemis. I want to modify the LPDNet architecture by replacing its convolution blocks with a different block for my research experiments, but I could not find the model definition in the repo. Is the model implementation code open-sourced, so that we can experiment with it? Thank you.
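For readers with the same question, here is a minimal PyTorch sketch of the general pattern of swapping a convolution block in an unrolled network. It is not DIRECT's LPDNet code: `ConvBlock`, `ResidualConvBlock`, and `TinyUnrolledNet` are hypothetical names made up for illustration, and the real model's constructor arguments may differ. The idea is only that a replacement block works as a drop-in as long as it keeps the same input and output channel interface.

```python
import torch
from torch import nn


class ConvBlock(nn.Module):
    """Stand-in for a baseline convolution block (hypothetical, not DIRECT's class)."""

    def __init__(self, in_channels: int, out_channels: int, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, hidden, kernel_size=3, padding=1),
            nn.PReLU(),
            nn.Conv2d(hidden, out_channels, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class ResidualConvBlock(nn.Module):
    """Example replacement block: same interface, plus a 1x1 skip connection."""

    def __init__(self, in_channels: int, out_channels: int, hidden: int = 32):
        super().__init__()
        self.body = ConvBlock(in_channels, out_channels, hidden)
        self.skip = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x) + self.skip(x)


class TinyUnrolledNet(nn.Module):
    """Toy unrolled network; the block class used per iteration is injected."""

    def __init__(self, block_cls, num_iter: int = 3, channels: int = 2):
        super().__init__()
        self.blocks = nn.ModuleList(
            [block_cls(channels, channels) for _ in range(num_iter)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Each iteration applies a learned update on top of the current estimate.
        for block in self.blocks:
            x = x + block(x)
        return x


if __name__ == "__main__":
    x = torch.randn(1, 2, 16, 16)  # (batch, real/imag channels, height, width)
    baseline = TinyUnrolledNet(ConvBlock)
    modified = TinyUnrolledNet(ResidualConvBlock)
    print(baseline(x).shape, modified(x).shape)
```

Passing the block class into the constructor keeps the baseline and the modified variant side by side, which makes it easier to compare them under the same training setup.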
You can modify anything in the code. For the models specifically, please refer to If you want to modify any code it might be best to install I hope these help.
Hi @georgeyiasemis, I executed the following command inside the direct/direct folder to train LPDNet on the Calgary-Campinas dataset:
But it runs for only a few seconds, saves no logs in the LPD_Net_Real directory, and apparently does no training. (I executed the command
Hi @jainspoornima. DIRECT is meant to run on GPU nodes and was not designed for or tested in Colab. I am not sure whether Colab is compatible with the torch.distributed module. I will need more context to be able to help you. Is there some output you can show? Is it possible that Colab runs out of memory?
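When a run exits silently, a short environment report is one way to provide the kind of context asked for here. The sketch below is an assumption about what might be useful to share, not part of DIRECT; `describe_environment` is a hypothetical helper name. It uses only standard PyTorch queries and still runs if PyTorch is not installed.

```python
import importlib.util
import platform


def describe_environment() -> dict:
    """Collect basic facts about the Python and PyTorch setup."""
    info = {
        "python": platform.python_version(),
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }
    if info["torch_installed"]:
        import torch
        import torch.distributed as dist

        info["torch"] = torch.__version__
        info["cuda_available"] = torch.cuda.is_available()
        info["gpu_count"] = torch.cuda.device_count()
        # True only if this PyTorch build includes the distributed package.
        info["distributed_available"] = dist.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Pasting this output together with the full console log of the training command gives maintainers something concrete to work from.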
Hi, I executed the following commands in Colab:
These ran fine. I then executed this command (I have saved the 12-channel data of the Calgary-Campinas dataset in the '/content/drive/MyDrive/Calgary_PDNet_Experiments/Data/' folder):
which gave the following error:
So I executed these commands:
This command completes in 3-4 seconds with no error message, and RAM usage never comes close to the limit. I should add that I am using Colab Pro, which offers one 16 GB GPU, so I hoped the code might run on it. Maybe not, but I have not seen a resource-exhausted error yet: there is just no training, no error, and no logs saved in the LPD_Net_Real directory.
Hi @georgeyiasemis, could you please tell me whether I can continue training on Colab? I have easy access to Colab, but I would need to request permission to use a machine with a single physical 24/32 GB GPU, so it would help to know. Thanks.
@jainspoornima Unfortunately I cannot provide support without any error output. I will let you know if I have any more insight about Colab.
Hi @jainspoornima. Here are directions for setting up DIRECT on Colab:
This is needed to install Python 3.8. (Somehow only older versions are available in Colab.)
@jainspoornima If the above worked for you, I will go ahead and close the issue. Let me know.
Hi @georgeyiasemis, that worked in Colab, thanks a lot. I trained LPDNet on the Calgary-Campinas multicoil dataset. It gave a CUDA out-of-memory error with a batch size of 3, but fit with a batch size of 1. I am not sure if this is the right place to ask, but the loss did not seem to decrease monotonically during training.
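A note for readers on the loss curve: with a batch size of 1, the per-iteration loss is usually noisy, so a non-monotonic raw curve is not by itself a sign of a problem. One common way to judge the trend is to smooth the logged values with an exponential moving average. The sketch below assumes the losses are available as a plain list of floats; `smooth_losses` and `beta` are illustrative names, not part of DIRECT.

```python
def smooth_losses(losses, beta=0.9):
    """Return a bias-corrected exponential moving average of the losses."""
    smoothed = []
    running = 0.0
    for step, loss in enumerate(losses, start=1):
        running = beta * running + (1.0 - beta) * loss
        # Divide by (1 - beta^step) so early values are not biased toward zero.
        smoothed.append(running / (1.0 - beta ** step))
    return smoothed


if __name__ == "__main__":
    raw = [1.0, 0.6, 0.9, 0.5, 0.7, 0.4, 0.6, 0.3]  # made-up illustrative values
    for r, s in zip(raw, smooth_losses(raw)):
        print(f"raw={r:.2f}  smoothed={s:.3f}")
```

A larger `beta` smooths more strongly but reacts more slowly to real changes in the loss.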
@jainspoornima Glad to hear it worked.
I would like to ask how we can modify the baseline architectures to include our own changes before training them. Thanks.