Not bug, just curiosity #8

Open
ax-anoop opened this issue Oct 2, 2022 · 3 comments
Comments


ax-anoop commented Oct 2, 2022

I've been doing some research into the method for finding an optimal learning rate.

I built the models both from scratch, as in the videos, and in a more torch-friendly way, i.e. using torch modules, dataloaders, an optimizer, etc.

However, something weird happens when I run the following, which "should" be the same as the code from the video. The lr vs. loss graph is shown in im1 below.

im2 uses code very similar to the videos, i.e. it updates the weights manually. Why are the results not the same? Is the optimizer doing something different in the backend? Overall the training is about the same; both converge at roughly the same rate.

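```python
import torch
import matplotlib.pyplot as plt

# calcLoss is a loss helper that is not shown in this post.
# test_dataloader is accepted but unused in this function.
def findlr(model, data, test_dataloader):
    # Sweep 1000 learning rates, log-spaced from 1e-3 to 1, one step per rate.
    lrs = 10**torch.linspace(-3, 0, 1000)
    lri = []
    lss = []

    optim = torch.optim.SGD(model.parameters(), lr=lrs[0].item())
    for i in range(len(lrs)):
        # Set this step's learning rate on every parameter group.
        for g in optim.param_groups:
            g['lr'] = lrs[i].item()

        # Note: iter(data) builds a fresh iterator on every step.
        x, y = next(iter(data))
        l = calcLoss(model(x), y)
        model.zero_grad()
        l.backward()
        optim.step()

        lri.append(lrs[i].item())
        lss.append(l.item())

        print(lrs[i], l.item())
    plt.plot(lri, lss)
    plt.show()
```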
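One detail in the loop above that may be worth checking is `next(iter(data))`. Calling `iter()` on a `DataLoader` creates a fresh iterator each time, so with `shuffle=False` every step sees the same first batch, and with `shuffle=True` every step sees the first batch of a new random ordering. The sketch below illustrates this on a toy dataset. The dataset, the batch size, and the `cycle` helper are made up for illustration and are not from the original code. Whether this accounts for the difference between im1 and im2 depends on how the manual version samples its batches, which is not shown in this post.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset (hypothetical): 12 samples whose single feature is the sample index.
xs = torch.arange(12, dtype=torch.float32).unsqueeze(1)
ys = torch.zeros(12)
loader = DataLoader(TensorDataset(xs, ys), batch_size=4, shuffle=False)

# Calling iter() on every step restarts the loader at its first batch.
first_a, _ = next(iter(loader))
first_b, _ = next(iter(loader))
same_batch_each_time = torch.equal(first_a, first_b)


def cycle(dl):
    """Yield batches forever, starting a new pass when the loader is exhausted."""
    while True:
        for batch in dl:
            yield batch


# A single long-lived iterator walks through the batches instead.
batches = cycle(loader)
seen = [next(batches)[0][0].item() for _ in range(4)]

print(same_batch_each_time)  # True
print(seen)                  # [0.0, 4.0, 8.0, 0.0]
```

In a learning-rate sweep, `x, y = next(batches)` would then take the place of `x, y = next(iter(data))`, so that consecutive learning rates are evaluated on different batches.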
    
im1:

<img width="597" alt="image" src="https://user-images.githubusercontent.com/95486801/193437520-55d14507-867c-411f-9c9e-e11db3b9e67c.png">


im2: 
<img width="597" alt="image" src="https://user-images.githubusercontent.com/95486801/193437537-481b38c6-447a-4bf7-86ee-bffb291b737a.png">

ax-anoop commented Oct 2, 2022

im1: [image: Screen Shot 2022-10-02 at 12 13 56 AM]

im2: [image: Screen Shot 2022-10-02 at 12 14 30 AM]

@JonathanSum

I'm not really sure, but going by the video, I'd say this is related to the weights. Maybe the weights were too large in im2.


ghost commented May 3, 2023

It could also be the gain that the weights were multiplied by; that also affects the results (the graphs).
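On the original question of whether the optimizer does something different in the backend: with its default arguments (no momentum, no weight decay), `torch.optim.SGD` applies the same `p -= lr * p.grad` update as a manual loop. A quick sanity check along these lines can rule the optimizer in or out. The toy linear model, the random data, and the learning rate below are illustrative assumptions and are not taken from the thread.

```python
import copy

import torch

torch.manual_seed(0)

# Two identical copies of a toy model (hypothetical stand-in for the real one).
model_a = torch.nn.Linear(3, 1)
model_b = copy.deepcopy(model_a)

x = torch.randn(8, 3)
y = torch.randn(8, 1)
lr = 0.1

# Copy A: one step with torch.optim.SGD (default momentum=0, weight_decay=0).
optim = torch.optim.SGD(model_a.parameters(), lr=lr)
loss_a = torch.nn.functional.mse_loss(model_a(x), y)
optim.zero_grad()
loss_a.backward()
optim.step()

# Copy B: one step with a manual update of every parameter.
loss_b = torch.nn.functional.mse_loss(model_b(x), y)
loss_b.backward()
with torch.no_grad():
    for p in model_b.parameters():
        p -= lr * p.grad

# Both copies should end up with the same parameters.
updates_match = all(
    torch.allclose(pa, pb)
    for pa, pb in zip(model_a.parameters(), model_b.parameters())
)
print(updates_match)
```

If a check like this passes for the real model, the gap between the two graphs is more likely to come from the initialization, the batches each version sees, or the gain applied to the weights than from the optimizer. Setting `momentum`, `weight_decay`, or `nesterov` on the optimizer would change its update, and the comparison would no longer hold.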
