Trainer only uses TextClassifier.load_from_file #474
Hello @fbenites, thank you for pointing this out. This in fact raises a few issues with the current implementation of the trainer. Specifically, I think some more work needs to be done so that all explicit references to `TextClassifier` are removed. Another question: your TextClassifier modification sounds very interesting. Would you consider contributing it to flair?
Hi @alanakbik, thank you for the great framework!
Hi @fbenites and @alanakbik, I did some work to isolate the model-specific parts of the trainer. It allows users to add their own architectures to train by subclassing a common base model class. I would be interested in contributing this idea if it is in line with where you are taking the framework! And of course, I am open to feedback.
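The subclassing idea described above could look roughly like this. This is a hypothetical sketch, not flair's actual API: `BaseModel`, `forward_loss`, and `MyClassifier` are illustrative names, and `load_from_file` is a stub.

```python
from abc import ABC, abstractmethod

class BaseModel(ABC):
    """Hypothetical base class: the trainer would rely only on this interface."""

    @abstractmethod
    def forward_loss(self, batch):
        """Compute the training loss for one batch."""

    @classmethod
    def load_from_file(cls, path):
        # A real implementation would deserialize weights from `path`;
        # here we just construct whichever class the method was called on.
        return cls()

class MyClassifier(BaseModel):
    """User-defined architecture plugged into the same training loop."""

    def forward_loss(self, batch):
        return 0.0  # placeholder loss

# Because load_from_file is a classmethod, calling it on the subclass
# returns an instance of the subclass, not of the base.
model = MyClassifier.load_from_file("best-model.pt")
print(type(model).__name__)  # MyClassifier
```

The key design point is that the trainer would never name a concrete model class, so any architecture implementing the interface can be trained and reloaded.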
@dancsalo sounds very interesting, and help is always appreciated :) After the 0.4.1 release I'll take a look and get back to you!
Hello, following your discussion: I also had to inherit from TextClassifier and overload basic methods for the same reasons, and I encountered another issue with `load_from_file`, which uses `torch.load()` in the `TextClassifier._load_state()` method. When I want to run a model trained on a GPU on a CPU-only device for inference, I cannot access the `map_location` parameter of `torch.load`, which forces me to overload all the related loading methods. Could you add parameter injection to the `load_from_file` method so that this torch feature can be used?
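For reference, `torch.load` accepts a `map_location` argument that remaps GPU-saved storages onto the CPU; the request above is essentially to expose it through `load_from_file`. A minimal demonstration (the tiny model and temp path are made up for illustration):

```python
import os
import tempfile

import torch

# Save a small state dict, then reload it with storages forced onto the CPU.
# On a GPU machine the model would have been .cuda() before saving;
# map_location makes the load work on CPU-only machines either way.
model = torch.nn.Linear(4, 2)
path = os.path.join(tempfile.mkdtemp(), "model.pt")
torch.save(model.state_dict(), path)

state = torch.load(path, map_location=torch.device("cpu"))
print(state["weight"].device)  # cpu
```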
Hello @FlorentPajot, sure, we can add this. Could you help me better understand why you are currently overloading these methods? We train models on GPUs a lot and use them in CPU-only environments, and this has worked so far.
Thanks for your answer. You're right: I currently use `model.cpu().save()` to save my model and then use it on a CPU instance, but I encountered the issue with models saved differently, which required me to access `torch.load()`. If you save the model the right way, it isn't an issue.
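The workaround mentioned here (moving the model to CPU before saving) can be sketched in plain PyTorch; `.cpu()` returns the module itself, so it chains directly into the save call. This is a generic sketch using a stand-in module, not flair's own `save` method:

```python
import os
import tempfile

import torch

model = torch.nn.Linear(4, 2)  # stand-in for a trained model

# Moving the parameters to CPU before serializing means the checkpoint
# can be loaded on CPU-only machines without passing map_location later.
path = os.path.join(tempfile.mkdtemp(), "model.pt")
torch.save(model.cpu().state_dict(), path)

reloaded = torch.load(path)
print(reloaded["weight"].device)  # cpu
```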
@alanakbik, in reference to your earlier comment: I trained a SequenceTagger model with ELMo embeddings on Google Colab's GPU, but I wish to evaluate it on my local machine. I tried loading it there, but it doesn't work and I get an error. Could you please tell me how I can train a model on a GPU and use it in a CPU environment?
Hm, that is strange and should normally work. Could you try calling it again and sharing the error you get?
I tried that, but it didn't work. I got the following error: https://pastebin.com/sEB6z3kB
You're right, it looks like this has something to do with the ELMo embeddings.
No worries. But if I were to use FlairEmbeddings instead of ELMo and trained on a GPU, would I be able to load that model in a CPU environment with the above lines added?
Yes, if you use FlairEmbeddings this should work.
Perfect. Thank you so much!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I am using 0.4, but master seems to have this behavior as well:

I inherited from TextClassifier and overloaded basic methods (I changed the decoder to be a full-blown neural network). I get very good results in the training phase. The problem is in the test phase, when the trainer checks for the type TextClassifier and uses the TextClassifier `load_from_file` rather than my overloaded method:

https://github.com/zalandoresearch/flair/blob/master/flair/trainers/trainer.py#L271

Wouldn't it be better to get the class of the instance? My first guess would be:

`self.model = self.model.__class__.load_from_file(base_path / 'best-model.pt')`

But that also seems wrong. Maybe `load_from_file` should not be a class method?
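The proposed fix (dispatching on the instance's class rather than the hard-coded `TextClassifier`) can be demonstrated with a small self-contained sketch; `load_from_file` here is a stub, not flair's real implementation:

```python
class TextClassifier:
    @classmethod
    def load_from_file(cls, path):
        # Stub: a real implementation would deserialize weights from `path`.
        return cls()

class CustomClassifier(TextClassifier):
    """Subclass with an overloaded decoder, as described in the issue."""

# Hard-coding the base class always yields a TextClassifier,
# silently discarding the subclass behaviour:
wrong = TextClassifier.load_from_file("best-model.pt")

# Dispatching on the instance's class preserves the subclass:
model = CustomClassifier()
right = model.__class__.load_from_file("best-model.pt")

print(type(wrong).__name__, type(right).__name__)
# TextClassifier CustomClassifier
```

Because `load_from_file` is a classmethod, `cls` is whatever class it is invoked on, so `self.model.__class__.load_from_file(...)` would round-trip the user's subclass correctly.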