About the implementation of Poly Encoder #13

Open
Hannibal046 opened this issue Dec 28, 2023 · 0 comments

Comments


Hannibal046 commented Dec 28, 2023

Hi @chijames, thanks so much for this wonderful project!
After digging into the code, I have two questions:

  • Is there any special reason why masking is not implemented in this function? (A sketch of one way a mask could be added follows this list.)

    Poly-Encoder/encoder.py

    Lines 72 to 78 in e5299e3

    def dot_attention(self, q, k, v):
        # q: [bs, poly_m, dim] or [bs, res_cnt, dim]
        # k=v: [bs, length, dim] or [bs, poly_m, dim]
        attn_weights = torch.matmul(q, k.transpose(2, 1)) # [bs, poly_m, length]
        attn_weights = F.softmax(attn_weights, -1)
        output = torch.matmul(attn_weights, v) # [bs, poly_m, dim]
        return output

  • Could we speed up the construction of poly_code_embeddings by using nn.Parameter? That way, we wouldn't need to create poly_ids and move it to the GPU in every batch. (A sketch of this idea also follows below.)
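For context on the first question, here is a minimal sketch of how a padding mask could be applied before the softmax. It assumes a boolean attn_mask of shape [bs, length] (True for real tokens, False for padding); the mask argument and its shape are my own assumption, not something taken from this repository.

    import torch
    import torch.nn.functional as F

    def dot_attention_masked(q, k, v, attn_mask=None):
        # q: [bs, poly_m, dim] or [bs, res_cnt, dim]
        # k=v: [bs, length, dim] or [bs, poly_m, dim]
        # attn_mask: [bs, length] bool, True for real tokens, False for padding (assumed shape)
        attn_weights = torch.matmul(q, k.transpose(2, 1))  # [bs, poly_m, length]
        if attn_mask is not None:
            # broadcast over the query dimension and push padded keys to -inf before softmax
            attn_weights = attn_weights.masked_fill(~attn_mask.unsqueeze(1), float("-inf"))
        attn_weights = F.softmax(attn_weights, -1)
        output = torch.matmul(attn_weights, v)  # [bs, poly_m, dim]
        return output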
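For the second point, a rough sketch of what I have in mind: store the poly codes directly as an nn.Parameter of shape [poly_m, dim] and expand it per batch, instead of indexing an nn.Embedding with a poly_ids tensor. The module name and the initialization below are placeholders for illustration, not a patch against the actual code.

    import torch
    import torch.nn as nn

    class PolyCodes(nn.Module):  # hypothetical module, just to illustrate the idea
        def __init__(self, poly_m, dim):
            super().__init__()
            # one learnable vector per poly code, kept as a plain parameter,
            # so no poly_ids tensor needs to be created and moved to the GPU each batch
            self.poly_code_embeddings = nn.Parameter(torch.empty(poly_m, dim))
            nn.init.normal_(self.poly_code_embeddings, std=dim ** -0.5)  # init scheme is arbitrary here

        def forward(self, batch_size):
            # [poly_m, dim] -> [bs, poly_m, dim] as a broadcasted view, no copy
            return self.poly_code_embeddings.unsqueeze(0).expand(batch_size, -1, -1)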

Thanks for your reply!
