Thank you for your work; this is an excellent piece of code. I've recently been trying to apply different pretraining strategies to trajectory prediction networks. In your code, I noticed that pretraining is set to 20 epochs, fine-tuning is set to 40 epochs, and the learning rate is the same for both (ignoring the differences due to cosine decay). However, in other papers I've found that the pretraining epoch count and learning rate are sometimes much larger than for fine-tuning (e.g., SEPT, Traj-MAE), while in other cases both stages use the same hyperparameters (e.g., Forecast-MAE). I'm currently planning to modify your code. Could you advise on how to set the pretraining and fine-tuning epochs and learning rates more reasonably? Is 20 epochs for pretraining and 40 epochs for fine-tuning a reasonable choice?
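For anyone experimenting along these lines, here is a minimal sketch of how the two stages could be given separate schedules. The `build_optimization` helper and the epoch/learning-rate values are illustrative assumptions for discussion, not the repository's actual settings:

```python
# Sketch only: decoupling pretraining and fine-tuning optimization settings.
# All hyperparameter values below are placeholders, not the repo's defaults.
import torch.nn as nn
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR


def build_optimization(model: nn.Module, stage: str):
    """Return (optimizer, scheduler, epochs) for the given training stage."""
    if stage == "pretrain":
        # Pretraining often runs longer and with a larger peak LR,
        # as in SEPT / Traj-MAE style setups.
        epochs, lr, weight_decay = 100, 1e-3, 1e-4
    elif stage == "finetune":
        # Fine-tuning typically uses fewer epochs and a smaller LR so the
        # pretrained representation is not overwritten too aggressively.
        epochs, lr, weight_decay = 40, 2e-4, 1e-4
    else:
        raise ValueError(f"unknown stage: {stage}")

    optimizer = AdamW(model.parameters(), lr=lr, weight_decay=weight_decay)
    # Cosine decay over the whole stage, annealing down to ~1% of the peak LR.
    scheduler = CosineAnnealingLR(optimizer, T_max=epochs, eta_min=lr * 0.01)
    return optimizer, scheduler, epochs


if __name__ == "__main__":
    model = nn.Linear(16, 2)  # stand-in for the trajectory prediction network
    for stage in ("pretrain", "finetune"):
        opt, sched, epochs = build_optimization(model, stage)
        print(stage, epochs, opt.param_groups[0]["lr"])
```

Whether the pretraining stage should actually be longer (and hotter) than fine-tuning seems to depend on the pretext task and dataset size, which is exactly the question raised above.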