The optimal hyperparameters for text classification tasks #4

Open
MatrixBlake opened this issue Mar 25, 2021 · 2 comments

MatrixBlake commented Mar 25, 2021

Hi,

I'm trying to run your code on text classification tasks, but I'm not sure about the hyperparameter settings.

Currently I'm using alpha = 0.05, normalizing the final features as SGC does in its paper, and then training with the Adam optimizer at learning rate 0.02, no weight decay, and 200 epochs with early-stopping patience 10. However, I noticed that the model only triggers early stopping when I raise the number of epochs to 5000. Even with that setting, on the R8 dataset I can only reach 95.5 instead of the reported 97.4. Can you give me some suggestions on how to set up training for the text classification task?
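
For concreteness, here is a minimal sketch of the training setup I describe above (Adam, learning rate 0.02, no weight decay, early stopping with patience 10); the function and argument names are placeholders, not code from this repo:

import copy
import torch
import torch.nn.functional as F

def train_with_early_stopping(model, feats, labels, idx_train, idx_val,
                              epochs=200, lr=0.02, patience=10):
    # Sketch only: Adam with no weight decay, early stopping on validation loss.
    optimizer = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=0.0)
    best_val, bad_epochs = float("inf"), 0
    best_state = copy.deepcopy(model.state_dict())
    for epoch in range(epochs):
        model.train()
        optimizer.zero_grad()
        loss = F.cross_entropy(model(feats[idx_train]), labels[idx_train])
        loss.backward()
        optimizer.step()

        model.eval()
        with torch.no_grad():
            val_loss = F.cross_entropy(model(feats[idx_val]), labels[idx_val]).item()
        if val_loss < best_val:
            best_val, bad_epochs = val_loss, 0
            best_state = copy.deepcopy(model.state_dict())
        else:
            bad_epochs += 1
            if bad_epochs >= patience:  # stop once validation loss stalls
                break
    model.load_state_dict(best_state)
    return model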

Also, I changed

for i in range(degree):
    features = (1 - alpha) * torch.spmm(adj, features)  # (1 - alpha) compounds across hops
    emb += features

to

for i in range(degree):
    features = torch.spmm(adj, features)  # propagate one hop
    emb += (1 - alpha) * features         # (1 - alpha) applied once per term
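
For context, here is a minimal sketch of how that modified loop could fit into the full precomputation; the initialization of emb and the final averaging are my assumptions and may not match the repo exactly:

import torch

def ssgc_precompute(features, adj, degree, alpha):
    # Sketch only; adj is assumed to be a normalized sparse adjacency matrix.
    emb = alpha * features                    # assumed initialization of emb
    for _ in range(degree):
        features = torch.spmm(adj, features)  # one more propagation hop
        emb = emb + (1 - alpha) * features    # (1 - alpha) applied once per hop
    return emb / degree                       # assumed averaging over the K hops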

Regards,
Kunze

@allenhaozhu (Owner) commented


I guess you are using the TextGCN code rather than the SGC code, because I experienced similar performance in that setting. The SGC code includes some preprocessing steps; since SGC and SSGC have no linear transformation, the output dimension is very high (which is both an advantage and a disadvantage).
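
For reference, the feature preprocessing in SGC-style code is typically a row normalization along these lines (a sketch of the common pattern, not necessarily the exact code in either repo):

import numpy as np
import scipy.sparse as sp

def row_normalize(mx):
    # Row-normalize a sparse feature matrix so each row sums to 1.
    rowsum = np.array(mx.sum(1), dtype=np.float64)  # per-row sums, shape (n, 1)
    r_inv = np.power(rowsum, -1).flatten()          # 1 / rowsum
    r_inv[np.isinf(r_inv)] = 0.0                    # guard rows that sum to zero
    return sp.diags(r_inv).dot(mx)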

By the way, SSGC is not very strong at text classification. We have follow-up work proposing a new aggregation method for GCN that outperforms SSGC. We are waiting for the IJCAI reviews; if it is not accepted, I will release it on arXiv. I hope you can be patient until then.

@MatrixBlake (Author) commented


Thank you! Looking forward to it!
