You are right that the original paper uses two encoders. However, doing so consumes much more GPU memory, which is why I used a single shared encoder. (I suspect that separate encoders could further boost performance.) Anyway, let me add this to the README.
If you really want two encoders, feel free to experiment with it; it should be fairly easy to modify the code, roughly as sketched below. (And of course you are welcome to share the results. :) )
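A minimal sketch of what such a modification might look like, assuming the model currently wraps a single self.bert; the class and attribute names (DualBertEncoder, context_bert, candidate_bert) are hypothetical and not taken from this repo:

```python
import copy

import torch.nn as nn
from transformers import BertModel


class DualBertEncoder(nn.Module):
    """Hypothetical variant with separate context/candidate encoders."""

    def __init__(self, bert_name="bert-base-uncased"):
        super().__init__()
        # Two independent copies: each side gets its own weights and
        # gradients, at roughly twice the encoder memory cost.
        self.context_bert = BertModel.from_pretrained(bert_name)
        self.candidate_bert = copy.deepcopy(self.context_bert)

    def forward(self, ctx_ids, ctx_mask, cand_ids, cand_mask):
        # [0] is the last hidden state of shape (batch, seq_len, hidden).
        ctx_vecs = self.context_bert(input_ids=ctx_ids, attention_mask=ctx_mask)[0]
        cand_vecs = self.candidate_bert(input_ids=cand_ids, attention_mask=cand_mask)[0]
        return ctx_vecs, cand_vecs
```

Deep-copying one pretrained model just keeps the two initializations identical; the two encoders still diverge freely during training.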
(To be honest, I'm not used to "deep learning coding" with PyTorch, Hugging Face, etc., so this might be a silly question. Please keep in mind that I'm a beginner.)
The original paper says that the context encoder and the candidate encoder are trained separately.
However, in your code both encoders are invoked through the same module, self.bert():
https://github.com/chijames/Poly-Encoder/blob/master/encoder.py#L20-L27
Is this intentional? Since both calls go through the same instance, the two encoders share weights, so they cannot end up with different weights after training.
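Just to make my doubt concrete: because both forward passes go through the same self.bert instance, a parameter-identity check like the one below (an illustrative helper I wrote for this issue, not code from the repo) should report that there is really only one set of weights:

```python
import torch.nn as nn


def encoders_share_weights(enc_a: nn.Module, enc_b: nn.Module) -> bool:
    """Return True if every parameter tensor is literally the same object in both modules."""
    params_a = list(enc_a.parameters())
    params_b = list(enc_b.parameters())
    if len(params_a) != len(params_b):
        return False
    return all(pa is pb for pa, pb in zip(params_a, params_b))


# With a single shared encoder this is trivially True, e.g. (hypothetical model object):
# encoders_share_weights(model.bert, model.bert)  # -> True
```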
FYI: the official implementation of the BLINK paper (https://arxiv.org/pdf/1911.03814.pdf) uses two separate encoder modules: https://github.com/facebookresearch/BLINK/blob/master/blink/biencoder/biencoder.py#L37-L48