Running Over Whole Sets/Computing Epochs Instead of Iterations #1094
The test batch size needs to be a divisor of the size of the test set. You could pick 672 / 7 = 96, with 7 test iterations. The solver will run every test iteration, each with a full batch of the given test batch size, so that when it hits the end of the test set it merely loops around (which double-counts inputs unless the batch size divides the set evenly).
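The divisor arithmetic above can be sanity-checked in plain Python (no Caffe involved; the numbers 672 and 96 come from this thread):

```python
# test_iter * test_batch_size should cover each test image exactly once,
# so the batch size must divide the test set size evenly.
test_set_size = 672  # from this thread

def divisors(n):
    """All batch sizes that divide the test set evenly."""
    return [d for d in range(1, n + 1) if n % d == 0]

# The only divisor of 672 near 100 is 96: 7 test iterations of 96 images.
print([d for d in divisors(test_set_size) if 90 <= d <= 110])  # [96]

batch_size = 96
test_iter = test_set_size // batch_size
print(test_iter)               # 7
print(test_iter * batch_size)  # 672 -- exactly one pass, no wrap-around
```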
This seems like a bug: suppose my test set size is a large prime number; then I have no good options.
Fair enough -- perhaps the fix is to add a solver proto field for the total size of the test set, and then rewrite the solver's test net routine to run a final one-off batch, with whatever mini-batch size is needed, to complete the set. This could be a nice ease-of-use PR. Thanks for raising the issue.
No problem. Is it a problem that the issue is marked as closed?
Right, I've re-opened it for now to keep it on the radar, but this issue will be replaced by the PR once it is opened.
So there is no 'epoch' in Caffe? In pylearn2, a set of mini-batches is chosen, sequentially or randomly, from the dataset in each epoch, and the solver loops over epochs up to max_iter times. Does Caffe simply generate batches sequentially from the dataset?
It depends on the type of data layer and its configuration, but that's essentially right: Caffe is configured in mini-batches, not epochs. If there are particular features in pylearn2 for handling data that you find helpful, please post an issue with a clear description and even a development plan, or better yet start a PR ☕
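Since Caffe counts iterations rather than epochs, converting between the two is simple arithmetic over the dataset size and batch size. A sketch with illustrative numbers (60,000 training images is an assumption for the example, not from this thread):

```python
import math

def iters_per_epoch(dataset_size, batch_size):
    # One epoch = enough mini-batches to see every example once.
    return math.ceil(dataset_size / batch_size)

def max_iter_for_epochs(dataset_size, batch_size, epochs):
    # The solver max_iter needed to train for a given number of epochs.
    return epochs * iters_per_epoch(dataset_size, batch_size)

print(iters_per_epoch(60000, 64))          # 938 mini-batches per epoch
print(max_iter_for_epochs(60000, 64, 10))  # 9380 iterations ~= 10 epochs
```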
One could just use a batch size of 1 for the test set and iterate through all of the images, right? Would that be significantly slower? I noticed doing this frees up memory, so I'm able to get away with a larger training batch size. Using gradient accumulation (#1977) is probably a better way to get a larger effective training batch size, though.
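With a test batch size of 1, test_iter can simply equal the number of test images and each image is scored exactly once. A plain-Python check of that coverage, assuming a sequential data layer (this does not model the speed cost, which usually makes many tiny forward passes slower than a few large ones):

```python
test_set_size = 672
batch_size = 1
test_iter = test_set_size  # one test iteration per image

# Simulate a sequential data layer: each iteration consumes one image.
seen = [(it * batch_size) % test_set_size for it in range(test_iter)]
assert sorted(seen) == list(range(test_set_size))  # each image exactly once
```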
@shelhamer what happens if someone, e.g. with 672 images, chooses test_iter = 7 with a batch size of 100? What would go wrong?
It loops around, so some inputs will be double-counted.
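A small simulation of that wrap-around (plain Python, modeling a hypothetical sequential data layer): with 672 images, batch size 100, and test_iter = 7, the solver reads 700 inputs, so the first 28 images are counted twice, which skews any averaged test metric:

```python
from collections import Counter

test_set_size, batch_size, test_iter = 672, 100, 7

# Sequential reads with wrap-around, as the data layer loops the set.
reads = [i % test_set_size for i in range(test_iter * batch_size)]
counts = Counter(reads)

double_counted = sorted(i for i, c in counts.items() if c > 1)
print(len(reads))           # 700 total reads
print(len(double_counted))  # 28 images (indices 0..27) seen twice
```

This is also why different test_iter choices give different results, as noted below: each choice double-counts a different slice of the set.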
I have an awkward number of test images (specifically, 672). If I want a batch size of 100, how many test iterations should I choose? If I pick 6, we only iterate through 600 of the 672 test images, but if I pick 7 (iterating through 700 images) we go off the end of the database (though I still get a result, not a segfault). For the record, picking 7 vs. 20 iterations gives different results, so it seems the solver does not simply stop once it reaches the end of the test set. Any help / advice?