Hi, I was going through the gesticulator codebase and using the GRU for speech feature encoding. I noticed that before the curr_speech input is passed to the GRU, the first dimension is the batch size and the second dimension is the temporal size. So in my opinion the batch_first=True flag should be set when initializing the GRU layer. Please let me know if this is the case. Thank you for sharing your awesome work :)
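To illustrate the shape issue, here is a minimal PyTorch sketch with made-up dimensions (the actual feature sizes in gesticulator will differ). By default, `nn.GRU` expects input shaped `(seq_len, batch, features)`; with `batch_first=True` it interprets dimension 0 as the batch, matching a `(batch, time, features)` tensor:

```python
import torch
import torch.nn as nn

# Hypothetical dimensions for illustration only (not taken from gesticulator).
batch_size, seq_len, feat_dim, hidden_dim = 4, 10, 26, 64

# A batch-first speech tensor: (batch, time, features)
curr_speech = torch.randn(batch_size, seq_len, feat_dim)

# batch_first=True makes the GRU read dim 0 as the batch axis.
gru = nn.GRU(input_size=feat_dim, hidden_size=hidden_dim, batch_first=True)
output, hidden = gru(curr_speech)

# output: (batch, seq_len, hidden_dim); hidden: (num_layers, batch, hidden_dim)
print(output.shape)  # torch.Size([4, 10, 64])
print(hidden.shape)  # torch.Size([1, 4, 64])
```

Note that without `batch_first=True` the same call would not raise an error: the GRU would silently treat the batch axis as time, which is why training can run but converge poorly.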
Hi @ra1995. Thank you for raising your concern. It has been more than 3 years since I developed this model, so I don't remember exactly how I was doing things. But after a brief look at the code, I agree with you: it does seem that batch_size was the first dimension, which seems common to me. Since the code did not break, I assume this was probably the default behavior in the PyTorch version used ... but I am not sure.
Does this cause you an issue?
Yes, the model was not converging correctly on my custom dataset without the batch_first argument. After making the necessary changes, it's performing much better.
Oh, that's very interesting! @ra1995, can you please make a PR with these changes? ( I could do it myself, but if you make a pull request - you will have the credit for finding this)