@karpathy In my CPU timings, the update step is a non-negligible part of the training time. Should it not be included? For example, my Llm.cs port gives timings like these (on an AMD 5950X):
0: loss 5.269892 exp. 5.270007 OK ( 3700 ms = Forward 1043 ms ZeroGrad 0 ms Backward 2091 ms Update 566 ms)
1: loss 4.059389 exp. 4.059707 OK ( 3494 ms = Forward 919 ms ZeroGrad 28 ms Backward 2039 ms Update 508 ms)
Note that the update step takes ~500 ms out of ~3500 ms total, roughly 15% of each step (and this is after a few optimizations). For the C version, I'd guess the update step would be fairly easy to optimize with OpenMP.