
Trainer only uses TextClassifier.load_from_file #474

Closed
fbenites opened this issue Feb 8, 2019 · 15 comments
Labels
question Further information is requested

Comments

@fbenites

fbenites commented Feb 8, 2019

I am using 0.4, but master seems to have the same behavior:
I inherited from TextClassifier and overloaded basic methods (I changed the decoder to be a full-blown NN). I get very good results in the training phase. The problem is in the test phase, when the trainer checks for the type TextClassifier and uses TextClassifier's load_from_file instead of my overloaded method:
https://github.com/zalandoresearch/flair/blob/master/flair/trainers/trainer.py#L271
Wouldn't it be better to get the class of the instance? My first guess would be:
self.model = self.model.__class__.load_from_file(base_path / 'best-model.pt')
But that also seems wrong. Maybe load_from_file should not be a class method?
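A minimal sketch of that idea, dispatching on the class of the model instance the trainer already holds (the surrounding trainer code is simplified, not the actual implementation):

# sketch: type(self.model) resolves to the user's subclass, so an overloaded
# load_from_file classmethod is called instead of TextClassifier's
self.model = type(self.model).load_from_file(base_path / 'best-model.pt')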

fbenites added the question label Feb 8, 2019
@alanakbik
Collaborator

Hello @fbenites thank you for pointing this out - this in fact raises a few issues with the current implementation of ModelTrainer that we need to address.

Specifically, I think some more work needs to be done so that all explicit references to SequenceTagger and TextClassifier are removed from the ModelTrainer. Instead, the flair.nn.Model interface needs to be extended so that it contains everything required by the model trainer, including load_from_file(), save_checkpoint(), etc. This would allow users to simply implement the flair.nn.Model interface and directly get access to all ModelTrainer functionality. I am not sure we can do this in time for the 0.4.1 release, which has already been pushed back too far, but we could get started on this right after the release.
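A rough sketch of what such an extended interface could look like (the exact method set and signatures beyond load_from_file() and save_checkpoint() are assumptions, not a committed design):

from abc import abstractmethod
from pathlib import Path

import torch.nn


class Model(torch.nn.Module):
    # Sketch of an extended flair.nn.Model: everything the ModelTrainer needs
    # lives on the interface, so the trainer never references concrete classes.

    @classmethod
    @abstractmethod
    def load_from_file(cls, model_file: Path) -> 'Model':
        ...

    @abstractmethod
    def save(self, model_file: Path):
        ...

    @abstractmethod
    def save_checkpoint(self, model_file: Path, optimizer_state: dict,
                        scheduler_state: dict, epoch: int, loss: float):
        ...

    @abstractmethod
    def evaluate(self, sentences, eval_mini_batch_size: int = 32):
        ...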

Another question: Your TextClassifier modification sounds very interesting. Would you consider contributing it to Flair?

@fbenites
Author

fbenites commented Feb 9, 2019

Hi @alanakbik, thank you for the great framework!
I am still figuring out whether my model is any good. At the moment I am trying it on simple-questions relation prediction; if I get decent results (currently 78% accuracy, where 88% is state of the art, but I am masking the entities completely) I will post something more specific. In principle, I am using a capsule implementation from a Kaggle kernel (https://www.kaggle.com/sfzero/fork-of-gamma-bianli-feature-1-1-i-16) instead of the default decoder, i.e. self.decoder = NeuralNet() (see the sketch below). Another problem is that the document LSTM gives only a single vector (the sentence embedding?), which is probably useless for the GRU+LSTM of the capsule network, so I will try to figure that out first. I will also try some hierarchical classification with BERT and Flair and let you know how that works.
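A minimal sketch of what such a subclass can look like (NeuralNet stands in for the capsule network from the Kaggle kernel and is assumed to be defined elsewhere; the TextClassifier internals shown here are simplified, not the actual implementation):

import torch

from flair.models import TextClassifier


class CapsuleTextClassifier(TextClassifier):
    # Sketch: reuse TextClassifier's embedding and loss machinery, but swap the
    # default linear decoder for a custom network.

    def __init__(self, document_embeddings, label_dictionary, multi_label=False):
        super().__init__(document_embeddings, label_dictionary, multi_label)
        # replace the single linear layer with a full-blown decoder network
        self.decoder = NeuralNet(
            input_size=document_embeddings.embedding_length,
            output_size=len(label_dictionary),
        )

    def forward(self, sentences):
        # embed the sentences, stack the document vectors, run the custom decoder
        self.document_embeddings.embed(sentences)
        text_embeddings = torch.stack(
            [sentence.get_embedding() for sentence in sentences]
        )
        return self.decoder(text_embeddings)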

@dancsalo

Hi @fbenites and @alanakbik, I did some work to isolate the parts of ModelTrainer that are currently dependent on other classes, and I've pushed my work here: https://github.com/dancsalo/classyflair.

It allows users to add their own architectures to train by subclassing TextClassifier and passing that subclass in as the model to ModelTrainer. Only one function in ModelTrainer needs to be modified for a simple training / evaluation run (final_test()).
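Roughly, the usage pattern looks like this (class, corpus and path names are placeholders, assuming an API close to Flair's ModelTrainer):

from flair.trainers import ModelTrainer

# MyTextClassifier is a user-defined TextClassifier subclass;
# document_embeddings, label_dictionary and my_corpus are prepared elsewhere
classifier = MyTextClassifier(document_embeddings, label_dictionary)
trainer = ModelTrainer(classifier, my_corpus)
trainer.train('resources/classifiers/my-model', max_epochs=10)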

I would be interested in contributing this idea if it is in line with where you are taking the framework! And of course, open to feedback.

@alanakbik
Collaborator

@dancsalo sounds very interesting and help is always appreciated :)

After the 0.4.1 release I'll take a look and get back to you!

alanakbik mentioned this issue Feb 24, 2019
@FlorentPajot

FlorentPajot commented Feb 27, 2019

Hello, following your discussion: I also had to inherit from TextClassifier and overload basic methods for the same reasons, and I ran into another issue with load_from_file, which uses torch.load() in the TextClassifier._load_state() method. When I want to run a model trained on GPU on a CPU-only device for inference, I cannot reach the map_location parameter of torch.load, which forces me to overload all the related loading methods as well. Could you add parameter injection to load_from_file so that this torch feature can be used?
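A sketch of the kind of parameter injection being asked for; the helper below is illustrative, not the actual TextClassifier._load_state() implementation:

import torch

# forward an optional map_location straight to torch.load, so CUDA tensors in a
# GPU-trained checkpoint are remapped to CPU on a CPU-only machine
def load_state(model_file, map_location=None):
    with open(model_file, 'rb') as f:
        return torch.load(f, map_location=map_location)


state = load_state('best-model.pt', map_location=torch.device('cpu'))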

@alanakbik
Collaborator

Hello @FlorentPajot, sure, we can add this. Could you help me better understand why you are currently overloading these methods? We train models on GPU a lot and use them in CPU-only environments, and so far this seems to work.

@FlorentPajot

FlorentPajot commented Feb 28, 2019

Thanks for your answer. You're right: I currently use model.cpu().save() to save my model and then use it on a CPU instance. I only hit the issue with models that were saved differently, which required access to torch.load(). If the model is saved the right way, it's not an issue.
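For reference, the workaround mentioned above looks roughly like this (the path is a placeholder):

# move the trained model to CPU before saving, so the checkpoint contains no
# CUDA tensors and loads cleanly on CPU-only machines
classifier.cpu().save('resources/classifiers/best-model-cpu.pt')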

alanakbik pushed a commit that referenced this issue Mar 14, 2019
alanakbik pushed a commit that referenced this issue Apr 20, 2019
alanakbik pushed a commit that referenced this issue Apr 23, 2019
alanakbik pushed a commit that referenced this issue Apr 26, 2019
stefan-it pushed a commit that referenced this issue Apr 26, 2019
@aayushsanghavi

@alanakbik in reference to your earlier comment -

We train models on GPU a lot and use them in CPU only environments

I trained a SequenceTagger model with ELMo embeddings on Google Colab's GPU but I wish to evaluate it on my local machine. I tried doing the following -

from os.path import join

import torch
import flair
from flair.models import SequenceTagger

flair.device = torch.device('cpu')
model = SequenceTagger.load_from_file(join(args.model_path, 'best-model.pt'))

but it doesn't work. I get a RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at the statement model.predict(sentence).

Could you please tell me how I could train a model on a GPU and use it on a CPU environment?

@alanakbik
Collaborator

Hm that is strange and should normally work. Could you try calling model.to(flair.device) before doing the predict? Also, which version of Flair are you using?
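In other words, something along these lines (a sketch; sentence is a flair.data.Sentence prepared elsewhere):

import torch
import flair
from flair.models import SequenceTagger

flair.device = torch.device('cpu')
tagger = SequenceTagger.load_from_file('best-model.pt')
tagger.to(flair.device)  # explicitly move all registered parameters/buffers to CPU
tagger.predict(sentence)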

@aayushsanghavi

I tried that, but it didn't work. I got the following error after that: https://pastebin.com/sEB6z3kB
I guess it isn't able to load the ELMo embeddings on CPU. I'm using Flair 0.4.1.

@alanakbik
Collaborator

You're right, it looks like this has something to do with ELMo embeddings. The ELMoEmbedder class we use from AllenNLP does not inherit from torch.nn.Module and so is not handled by the PyTorch .to(device) logic. So unfortunately this currently does not work. Perhaps someone who knows AllenNLP well could take a look?

@aayushsanghavi

No worries. But if I were to use FlairEmbeddings instead of ELMo and trained on a GPU, would I be able to load that model in a CPU environment with the lines above added?

@alanakbik
Collaborator

Yes, if you use FlairEmbeddings you don't have to do anything. It will automatically detect what type of machine it is on and handle everything. For instance, all our pre-trained models were trained on GPU, but you can just load them on CPU and use them.
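For example, something along these lines works on a CPU-only machine (using one of the standard pre-trained taggers):

from flair.data import Sentence
from flair.models import SequenceTagger

# this tagger was trained on GPU, but loads and runs fine on CPU
tagger = SequenceTagger.load('ner')
sentence = Sentence('George Washington went to Washington .')
tagger.predict(sentence)
print(sentence.to_tagged_string())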

@aayushsanghavi

Perfect. Thank you so much!

@stale

stale bot commented Apr 30, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label Apr 30, 2020
alanakbik removed the wontfix label Apr 30, 2020