Optimize the tokenization #143
That's a good suggestion. Indeed, tokenization may require heavy CPU usage.
Hi, after upgrading AllenNLP, training can be accelerated: we can simply set the parameter use_amp to true in the GradientDescentTrainer and thus use Automatic Mixed Precision to train gector (and, of course, we need to make some extra adaptations to support gector-specific features such as cold_steps).
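A minimal sketch of what this might look like with a recent AllenNLP version; the helper function and the assumption that the model, optimizer, and data loader are built elsewhere are mine, not part of the gector codebase:

```python
# Hypothetical sketch: enabling Automatic Mixed Precision via the `use_amp`
# flag of AllenNLP's GradientDescentTrainer. The model, optimizer, and data
# loader are assumed to be constructed elsewhere in the training script.
import torch
from allennlp.models import Model
from allennlp.data import DataLoader
from allennlp.training import GradientDescentTrainer


def build_amp_trainer(model: Model,
                      optimizer: torch.optim.Optimizer,
                      train_loader: DataLoader,
                      num_epochs: int = 10) -> GradientDescentTrainer:
    # use_amp=True makes the trainer run forward/backward passes under
    # torch.cuda.amp autocast with gradient scaling.
    return GradientDescentTrainer(
        model=model,
        optimizer=optimizer,
        data_loader=train_loader,
        num_epochs=num_epochs,
        cuda_device=0,
        use_amp=True,
    )
```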
Hey! I'm wondering how to modify the code to support cold_steps with AMP. I tried, but if I freeze the encoder for the first few epochs, it's not possible to unfreeze it afterwards with "params.requires_grad = True": the loss and accuracy will not decrease. Have you figured out a possible solution? I need some help.
One simple solution is to save the model parameters to disk after finishing the cold steps. Then you can start a new training procedure, reload the model parameters, and unfreeze the BERT encoder.
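A hypothetical sketch of that two-stage workaround; the `encoder` attribute name and the helper functions are illustrative assumptions, not taken from the actual gector code:

```python
# Stage 1: after the cold steps (encoder frozen), persist the weights.
# Stage 2: in a fresh training run (new optimizer / GradScaler), reload the
# weights and unfreeze the encoder before continuing with AMP enabled.
import torch


def save_after_cold_steps(model: torch.nn.Module, path: str) -> None:
    # Persist the weights learned while the encoder was frozen.
    torch.save(model.state_dict(), path)


def reload_and_unfreeze(model: torch.nn.Module, path: str) -> None:
    # Start of the second run: restore the stage-1 weights and let the
    # pretrained encoder receive gradients again.
    model.load_state_dict(torch.load(path))
    for param in model.encoder.parameters():
        param.requires_grad = True
```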
Okay, I found that if the requires_grad option is set inside the forward method, it works. Thank you, by the way!
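A minimal, hypothetical sketch of that trick: flipping requires_grad inside forward() so the encoder stays frozen only for the first cold steps. The wrapper module and step counter are illustrative, not the actual gector model:

```python
# Toggle encoder gradients from inside forward(), so unfreezing takes effect
# even when the surrounding training loop (e.g. with AMP) is not modified.
import torch


class ColdStartWrapper(torch.nn.Module):
    def __init__(self, encoder: torch.nn.Module, cold_steps: int):
        super().__init__()
        self.encoder = encoder
        self.cold_steps = cold_steps
        self.steps_seen = 0

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        # Keep the encoder frozen for the first `cold_steps` training passes,
        # then unfreeze it here rather than once outside the training loop.
        frozen = self.steps_seen < self.cold_steps
        for param in self.encoder.parameters():
            param.requires_grad = not frozen
        if self.training:
            self.steps_seen += 1
        return self.encoder(inputs)
```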
Hi @HillZhang1999, could you please suggest what changes you made to use the latest AllenNLP? Thanks!
Maybe you can refer to this repo: https://github.com/HillZhang1999/MuCGEC/tree/main/models/seq2edit-based-CGEC
Thanks for replying so quickly!
And if you would like to train a seq2edit GEC model without the AllenNLP bundle but with faster speed, I made a DeepSpeed + PyTorch + Transformers implementation; you can refer to this repo:
First, thanks for your excellent work. Here is my question: