
Does GPU really improve the speed? #3

Open
kakaroto2 opened this issue May 2, 2017 · 9 comments
Comments

@kakaroto2

I installed tensorflow-gpu on Ubuntu, and it can run on both the GPU and the CPU at the same time.
But the time per epoch doesn't decrease. What can I do to improve the efficiency?

Thx!

@Franck-Dernoncourt
Owner

Franck-Dernoncourt commented May 3, 2017

That's correct: using the GPU doesn't help much. I believe this is due to using feed_dict and a batch size of 1.

Someone emailed me regarding the speed on the GPU and had the same guesses. He did try switching from feed_dict to the TFRecord mechanisms; however, it didn't appear to improve GPU utilization.

Increasing the batch size remains to be investigated. TensorFlow's CRF layer supports batching, so it should be reasonably easy to allow mini-batches during training, but so far I haven't needed to speed up training significantly on the GPU.
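For context, a rough illustration (not from the repo, shapes made up) of why feeding one example at a time leaves a GPU idle: each operation launch has fixed overhead, so 64 tiny calls cost far more than one batched call doing the same math. A minimal numpy sketch:

```python
import numpy as np

# Hypothetical shapes: 64 sentences, 50 tokens each, 100 features,
# projected to 25 tags (a stand-in for one layer of the network).
rng = np.random.default_rng(0)
x = rng.standard_normal((64, 50, 100))
w = rng.standard_normal((100, 25))

# Batch size 1: one call per sentence (what feed_dict with batch_size=1 amounts to).
per_sentence = np.stack([x[i] @ w for i in range(64)])

# Mini-batch: a single call over all 64 sentences.
batched = x @ w

# Same result either way; the batched form just gives the device
# one large kernel instead of 64 small ones.
assert np.allclose(per_sentence, batched)
```

The math is identical; only the number of launches changes, which is exactly the overhead a larger batch size would amortize.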

@Diego999

Diego999 commented May 6, 2017

What are your training times on CPU and GPU (and, by the way, how many epochs)? Just to get a rough idea of how different they are and see whether it's worth increasing the batch size.

Thank you !

@kakaroto2
Author

@Diego999 About 50 epochs; each epoch takes about 300 seconds.

@HaniehP

HaniehP commented May 26, 2017

In comparison with https://github.com/glample/tagger, which is implemented in Theano, do you know which one is faster on the CPU?

@carolmanderson

I'm getting better performance on a CPU than on GPUs -- I assume this is due to the batch size of 1. Was there a reason for choosing a batch size of 1 in the first place?

@JohnGiorgi
Contributor

Also getting (much) better performance on CPU than GPU (for the record).

@carolmanderson

Following up on this: I'm guessing the reason a batch size of 1 was used is that each training unit is a sentence, and sentences all have different lengths, so to form a batch they would have to be padded to a common length.
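To make that padding step concrete, here is a minimal sketch (the variable names and the padding ID 0 are made up, not taken from NeuroNER): sentences are padded to the length of the longest one, and a mask tracks which positions are real tokens.

```python
import numpy as np

# Hypothetical token-ID sequences for three sentences of different lengths.
sentences = [[4, 8, 15], [16, 23], [42, 7, 1, 99]]

# Pad every sentence to the length of the longest one (0 = padding ID).
max_len = max(len(s) for s in sentences)
batch = np.zeros((len(sentences), max_len), dtype=np.int64)
mask = np.zeros((len(sentences), max_len), dtype=bool)
for i, s in enumerate(sentences):
    batch[i, :len(s)] = s
    mask[i, :len(s)] = True  # marks real tokens vs. padding

# True sequence lengths, which a CRF layer needs alongside the padded batch
# so it ignores the padded positions.
lengths = mask.sum(axis=1)
print(batch.shape)   # (3, 4)
print(lengths)       # [3 2 4]
```

The cost is the wasted computation on padding positions, which is the downside the NeuroATE-style random batching runs into.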

I came across the following repo that implemented batch training:
https://github.com/atgiannako/NeuroATE

The author also made a number of other changes and rearrangements, so I had to spend a while adding back the features I needed (for example, the ability to make predictions on a deploy set). It did speed up training about two-fold.

@sa-j

sa-j commented Apr 25, 2018

Hi carolmanderson, have you uploaded the version where you "added back the features"? If not, could you please do so? I'm using NeuroNER; however, when using a GPU there is no speedup, as it is barely utilized. Simply increasing batch_size in the current version leads to errors.

Thanks!

@carolmanderson

Hi sa, I haven't, and I'd have to check with my employer about whether I can share the code I wrote, as this was a work project. But since my last post, I've stopped using NeuroATE in favor of a different implementation that is much faster: https://github.com/UKPLab/emnlp2017-bilstm-cnn-crf

It can be used with either a CNN or an LSTM for the character-level embeddings. With the LSTM, the architecture is essentially the same as NeuroNER. (I actually use the CNN, though, as it's faster and gives the same performance on my task.)

It supports minibatching, and its minibatching method is much more efficient than NeuroATE's. In NeuroATE, sentences were randomly grouped into minibatches, and then all sentences had to be padded to match the length of the longest one. In the UKPLab implementation, only sentences of the same length are grouped together, so no padding is necessary. As a caveat, this also means that if your data set is very small or very heterogeneous in length, most minibatches won't be full.
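The same-length grouping can be sketched in a few lines of plain Python (a simplified illustration, not the UKPLab code; names are made up): bucket sentences by length, then cut each bucket into minibatches.

```python
from collections import defaultdict

# Hypothetical sentences as token lists.
sentences = [["a", "b"], ["c", "d", "e"], ["f", "g"], ["h", "i", "j"], ["k", "l"]]
batch_size = 2

# Group sentences by length, then slice each group into minibatches.
buckets = defaultdict(list)
for s in sentences:
    buckets[len(s)].append(s)

minibatches = []
for length, group in sorted(buckets.items()):
    for i in range(0, len(group), batch_size):
        minibatches.append(group[i:i + batch_size])

for mb in minibatches:
    # Every sentence in a minibatch has the same length -> no padding needed.
    assert len({len(s) for s in mb}) == 1

print([len(mb) for mb in minibatches])  # → [2, 1, 2]
```

The partial batch in the middle shows the caveat above: when a length bucket doesn't divide evenly by the batch size, some minibatches come out underfull.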
