
When I replace the sigmoid activation function with ReLU, the project cannot generate a result #5

Open
13227018679 opened this issue Feb 1, 2021 · 8 comments

Comments

@13227018679

No description provided.

@vaibhav0195

I think it's because the paper itself says it works with sigmoid activations. I am also looking for a solution. If you find one, please keep me updated too?
Thanks :)

@yechanp

yechanp commented Feb 14, 2021

I also tried this. But, in my opinion, it is not a matter of differentiability. I also tried the SiLU activation (x * sigmoid(x)), which is smooth, but I still failed to generate the data.

@PatrickZH
Owner

Thanks for your interest in our work! Our experiments show the feasibility of extracting the ground-truth label and image in a simple setting. I think the problem with ReLU is caused by the difficult optimization. A better matching loss and optimizer (including the learning rate, etc.) are needed to realize deep leakage with various activation functions.
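
For readers landing here, a minimal, self-contained sketch of the gradient-matching loop being discussed. This is an illustrative toy, not the repo's actual code: the two-layer net, its shapes, the learning rate, and the iteration count are placeholder assumptions, and `act` is where the sigmoid/ReLU swap from this issue happens.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy two-layer net with a configurable activation; swap in F.relu or F.silu
# here to reproduce the behaviour discussed in this issue.
act = torch.sigmoid
W1 = (torch.randn(16, 8) * 0.1).requires_grad_()
W2 = (torch.randn(10, 16) * 0.1).requires_grad_()

def forward(x):
    return act(x @ W1.t()) @ W2.t()

# "Victim" sample and the gradients it would share.
x_true = torch.randn(1, 8)
y_true = torch.tensor([3])
true_grads = torch.autograd.grad(F.cross_entropy(forward(x_true), y_true), (W1, W2))

# Dummy input and soft dummy label, optimized so their gradients match the shared ones.
x_dummy = torch.randn(1, 8, requires_grad=True)
y_dummy = torch.randn(1, 10, requires_grad=True)
opt = torch.optim.LBFGS([x_dummy, y_dummy])

def closure():
    opt.zero_grad()
    pred = forward(x_dummy)
    dummy_loss = torch.mean(
        torch.sum(-F.softmax(y_dummy, dim=-1) * F.log_softmax(pred, dim=-1), dim=-1)
    )
    dummy_grads = torch.autograd.grad(dummy_loss, (W1, W2), create_graph=True)
    # Gradient-matching loss: squared distance between dummy and shared gradients.
    grad_diff = sum(((dg - tg) ** 2).sum() for dg, tg in zip(dummy_grads, true_grads))
    grad_diff.backward()
    return grad_diff

for _ in range(30):
    opt.step(closure)
print("final gradient-matching loss:", closure().item())
```

The matching loss and the optimizer mentioned above are exactly the two pieces to experiment with here: a different distance on the gradients, or a different optimizer/learning rate for `x_dummy` and `y_dummy`.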

@yechanp

yechanp commented Feb 15, 2021

After a few experiments yesterday, I found that the initial values of the model parameters are important. If the activation function changes, the std of the initializer should be adjusted. In my setting, std=0.1 works for the SiLU activation with all other settings unchanged. (The lr of the L-BFGS optimizer could also be changed.)
By the way, I have a great interest in this work. It is very intriguing.
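
A hedged sketch of the re-initialization described above: the std=0.1 normal init and the SiLU choice come from this comment, but the layer sizes and the `make_net` helper itself are illustrative assumptions, not code from this repository.

```python
import torch.nn as nn

def make_net(act: nn.Module = nn.SiLU()) -> nn.Sequential:
    # Layer sizes are placeholders; the point is only the initializer.
    net = nn.Sequential(
        nn.Linear(8, 16), act,
        nn.Linear(16, 10),
    )
    for m in net.modules():
        if isinstance(m, nn.Linear):
            # Narrower init (std=0.1) reported to work for SiLU in this thread;
            # re-tune the std whenever the activation function changes.
            nn.init.normal_(m.weight, mean=0.0, std=0.1)
            nn.init.zeros_(m.bias)
    return net

net = make_net(nn.SiLU())   # or make_net(nn.Sigmoid()), make_net(nn.ReLU()), ...
```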

@vaibhav0195

I am running experiments to get results on MobileNet with both the ReLU and the sigmoid activation functions.
I also tried the Adam optimizer and various learning rates, but it doesn't seem to converge.
I think a deeper network needs to run for more iterations, and the loss decreases more slowly than for a shallow network.
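
A sketch of the optimizer swap described above, reusing `x_dummy`, `y_dummy`, and `closure` from the earlier gradient-matching sketch (closure is assumed to zero the gradients and call backward itself); the learning rate and iteration budget are guesses, not tuned values.

```python
import torch

# Replace L-BFGS with Adam and allow many more iterations for deeper models.
opt = torch.optim.Adam([x_dummy, y_dummy], lr=0.01)

for it in range(10000):
    loss = closure()        # recomputes the gradient-matching loss and its grads
    opt.step()
    if it % 1000 == 0:
        print(f"iter {it:5d}  grad-matching loss {loss.item():.6f}")
```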

@yechanp

yechanp commented Feb 22, 2021

Have you succeeded with deeper networks? In my experiments, the loss decreases, but it plateaus at some point and does not converge.

@najeebjebreel

Hi,

It didn't work with deep CNNs like VGG16. Also, when I changed the activation function to one other than sigmoid, it didn't converge.

Has anyone found a solution for that?

Br,

@yechanp

yechanp commented Jun 2, 2021

It worked when I changed the activation function. It is hard for ReLU, but it works for other invertible activations.
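
For anyone who wants to try this systematically, a short, hypothetical sweep list: only sigmoid and SiLU (with the std=0.1 init above) are reported to work in this thread; the other entries are merely plausible smooth or invertible candidates to test, and plain ReLU is the reported hard case.

```python
import torch.nn as nn

# Activations one might sweep over in place of sigmoid; see the caveats above.
candidates = {
    "sigmoid":  nn.Sigmoid(),
    "silu":     nn.SiLU(),
    "tanh":     nn.Tanh(),
    "softplus": nn.Softplus(),
    "relu":     nn.ReLU(),
}
# e.g. net = make_net(candidates["tanh"])  # make_net from the earlier sketch
```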
