end-to-end issue #4
Hi, thanks for your attention.

Most deep graph clustering methods, and even most graph representation learning methods, directly use the OGB-supplied features instead of the raw text; we consider this the default setting. Recently, benefiting from the strong general knowledge understanding capability of LLMs, a few methods [1, 2] process the raw text with LLMs, which is a promising direction.

The purpose of the pre-training stage is to obtain the initialized cluster center embeddings; this is a widely used technique in graph learning, CV, and NLP. The related competitor S3GC [3] first performs graph representation learning and then directly runs k-means on the learned node embeddings, a process we consider to separate representation learning from clustering optimization. Therefore, we first pre-train the encoders, and then, at the fine-tuning stage, we unify representation learning and clustering optimization in an end-to-end framework. Without pre-training, i.e., training the whole network from scratch, it is hard to achieve promising performance, especially on a purely unsupervised clustering task. Some methods [4, 5] are free from pre-training, but, like S3GC, they first perform representation learning and then run k-means.

If you have any questions or suggestions, feel free to contact me on WeChat: ly13081857311. Issues and pull requests are also welcome.

[1] He X., Bresson X., Laurent T., et al. Explanations as Features: LLM-Based Features for Text-Attributed Graphs. arXiv preprint arXiv:2305.19523, 2023.
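To make the distinction concrete, here is a minimal sketch of the two-stage pipeline described above (pre-train the encoder, initialize the cluster centers with k-means, then fine-tune everything jointly). This is illustrative only, not the repository's actual code: `GraphEncoder`, the reconstruction proxy used for pre-training, and the DEC-style clustering loss are all assumptions standing in for the paper's objective.

```python
# Minimal sketch, NOT the repository's actual code. GraphEncoder, the
# reconstruction proxy, and the DEC-style loss are illustrative assumptions.
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans


class GraphEncoder(torch.nn.Module):
    """Toy one-layer GCN-style encoder: Z = ReLU(A X W)."""

    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.lin = torch.nn.Linear(in_dim, hid_dim)

    def forward(self, adj, x):
        return F.relu(adj @ self.lin(x))


def pretrain(encoder, adj, x, epochs=100, lr=1e-3):
    """Stage 1: self-supervised pre-training (feature reconstruction here,
    purely as a stand-in) so the embeddings are meaningful before clustering."""
    decoder = torch.nn.Linear(encoder.lin.out_features, x.size(1))
    opt = torch.optim.Adam(
        list(encoder.parameters()) + list(decoder.parameters()), lr=lr
    )
    for _ in range(epochs):
        loss = F.mse_loss(decoder(encoder(adj, x)), x)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return encoder


def finetune_end_to_end(encoder, adj, x, k, epochs=100, lr=1e-4):
    """Stage 2: k-means only *initializes* the cluster centers; after that the
    centers and the encoder are optimized jointly under one clustering loss,
    unlike the encode-then-k-means pipeline of S3GC."""
    with torch.no_grad():
        z = encoder(adj, x)
    km = KMeans(n_clusters=k, n_init=10).fit(z.numpy())
    centers = torch.nn.Parameter(
        torch.tensor(km.cluster_centers_, dtype=z.dtype)
    )
    opt = torch.optim.Adam(list(encoder.parameters()) + [centers], lr=lr)
    for _ in range(epochs):
        z = encoder(adj, x)
        # Student's-t soft assignments and sharpened targets (DEC-style).
        q = 1.0 / (1.0 + torch.cdist(z, centers) ** 2)
        q = q / q.sum(dim=1, keepdim=True)
        p = q ** 2 / q.sum(dim=0)
        p = p / p.sum(dim=1, keepdim=True)
        loss = F.kl_div(q.log(), p.detach(), reduction="batchmean")
        opt.zero_grad()
        loss.backward()
        opt.step()
    return encoder, centers
```

Training from scratch would correspond to skipping `pretrain` and starting `finetune_end_to_end` from random weights, which is exactly the setting the reply above says struggles in the purely unsupervised regime.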
Thanks for your detailed reply; pre-training to obtain the initial cluster embeddings is indeed a good idea. :)
Hello. In your paper, you mention that this work "was unified into an end-to-end framework".
However, in your published code: 1) you directly use the OGB-supplied features instead of the text attributes; 2) the pipeline includes an unavoidable pre-training stage.
Do you still consider this work "end-to-end", and why? Looking forward to your reply.