Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactored solvers with fluent pattern #1759

Closed
wants to merge 6 commits into from
Closed

Refactored solvers with fluent pattern #1759

wants to merge 6 commits into from

Conversation

nakosung
Copy link

I refactored solvers with fluent pattern to add a solver like rmsprop, which isn't included in this PR. After merging this PR, rmsprop will be proposed.

A cool thing introduced by this PR is keeping code DRY(Do not Repeat Yourself). You don't have to write similar code paths for cpu/gpu.

Modified ADAGRAD solver is like below:

op
      // compute square of gradient in update
      .sqr(param, update)
      // update history
      .add(update,history,history)
      // prepare update
      .sqrt(history,update)
      .add_scalar(delta,update)
      .div(param,update,update)      
      // scale and copy
      .axpby(local_rate,update,Dtype(0),param)

@nakosung
Copy link
Author

Some tests are failed due to 'different trained weights'. It may not be an 'failure' because I've changed pow into sqr/sqrt, which can drive weights different. If maintainers decide to merge this PR, I'll update test cases.

@jeffdonahue
Copy link
Contributor

This is a cool concept, but I'm not sure we want to adopt a new pattern to unify the math interface specifically for the solvers, as opposed to a more general unification at the level of the math interface itself, as in the device abstraction effort (#610, which should be restored eventually...). Maybe there are some interesting ideas here that could complement that effort though?

@jeffdonahue jeffdonahue reopened this Jan 21, 2015
@jeffdonahue
Copy link
Contributor

(Sorry, didn't mean to immediately close...)

@shelhamer
Copy link
Member

Agreed that this is a neat concept, but it's perhaps too local when done only for the solver.

Re: RMSprop, it is a solver that is on our TODO list so we'd welcome a PR for it.

@sguada
Copy link
Contributor

sguada commented Jan 22, 2015

It would be better if it were part of Blob so it could be used in the
layers.

Sergio

2015-01-21 19:07 GMT-08:00 Evan Shelhamer [email protected]:

Agreed that this is a neat concept, but it's perhaps too local when done
only for the solver.

Re: RMSprop, it is a solver that is on our TODO list so we'd welcome a PR
for it.


Reply to this email directly or view it on GitHub
#1759 (comment).

@nakosung
Copy link
Author

In fact I had implemented RMSprop in this pattern(fluent), PR for RMSprop should be re-written in the orginal caffe style.

@nakosung
Copy link
Author

Applying this pattern to whole program will not be a hard thing.

@shelhamer
Copy link
Member

@nakosung please PR RMSprop in the current Caffe style first. Once that is in, we can figure out a plan for the fluent style in Caffe. Thanks for both the consideration of how to make the Caffe code more readable and concise.

@nakosung
Copy link
Author

nakosung commented Feb 1, 2015

@shelhamer Okay, I'll try to PR RMSprop in a separate branch, but for now it doesn't have a priority.

@nakosung nakosung closed this Feb 1, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants