Thank you for your work; this is an excellent piece of code. I've recently been trying to apply different pretraining strategies to trajectory prediction networks. In your code, I noticed that pretraining is set to 20 epochs, fine-tuning is set to 40 epochs, and the learning rate is the same for both (ignoring the differences due to cosine decay). However, in other papers I've found that the pretraining epoch count and learning rate are sometimes much larger than for fine-tuning (e.g., SEPT, Traj-MAE), while in other cases both stages use the same hyperparameters (e.g., Forecast-MAE). I'm currently planning to modify your code. Could you advise on how to set the pretraining and fine-tuning epochs and learning rates more reasonably? Is 20 epochs for pretraining and 40 epochs for fine-tuning a reasonable choice?
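For anyone experimenting along these lines, here is a minimal sketch of how the two stages could be given separate schedules. The `build_optimization` helper and the epoch/learning-rate values are illustrative assumptions for discussion, not the repository's actual settings:

```python
# Sketch only: decoupling pretraining and fine-tuning optimization settings.
# All hyperparameter values below are placeholders, not the repo's defaults.
import torch.nn as nn
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR


def build_optimization(model: nn.Module, stage: str):
    """Return (optimizer, scheduler, epochs) for the given training stage."""
    if stage == "pretrain":
        # Pretraining often runs longer and with a larger peak LR,
        # as in SEPT / Traj-MAE style setups.
        epochs, lr, weight_decay = 100, 1e-3, 1e-4
    elif stage == "finetune":
        # Fine-tuning typically uses fewer epochs and a smaller LR so the
        # pretrained representation is not overwritten too aggressively.
        epochs, lr, weight_decay = 40, 2e-4, 1e-4
    else:
        raise ValueError(f"unknown stage: {stage}")

    optimizer = AdamW(model.parameters(), lr=lr, weight_decay=weight_decay)
    # Cosine decay over the whole stage, annealing down to ~1% of the peak LR.
    scheduler = CosineAnnealingLR(optimizer, T_max=epochs, eta_min=lr * 0.01)
    return optimizer, scheduler, epochs


if __name__ == "__main__":
    model = nn.Linear(16, 2)  # stand-in for the trajectory prediction network
    for stage in ("pretrain", "finetune"):
        opt, sched, epochs = build_optimization(model, stage)
        print(stage, epochs, opt.param_groups[0]["lr"])
```

Whether the pretraining stage should actually be longer (and hotter) than fine-tuning seems to depend on the pretext task and dataset size, which is exactly the question raised above.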