[Question] The RoPE implementation is inconsistent with the paper #136

zehmaaa · 2023-10-04T08:47:48Z

Required prerequisites

I have read the documentation https://github.com/baichuan-inc/baichuan-7B/blob/HEAD/README.md.
I have searched the Issue Tracker and Discussions and confirmed that this hasn't already been reported. (+1 or comment there if it has.)
Consider asking first in a Discussion.

Questions

Why is the implementation here different from the one in the paper?

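import torch

def rotate_half(x):
    """Rotates half the hidden dims of the input."""
    # Split the last dimension into its first and second halves.
    x1 = x[..., : x.shape[-1] // 2]
    x2 = x[..., x.shape[-1] // 2:]
    # Pair dimension i with dimension i + d/2: (x1, x2) -> (-x2, x1).
    return torch.cat((-x2, x1), dim=-1)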

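The computation in the paper is

With this implementation, the final result would instead be

I see that Hugging Face does the same thing. I'm curious why this implementation was chosen.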
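
For reference, here is a sketch of the two forms being compared. It assumes that "the paper" is RoFormer (Su et al.), that $m$ is the token position, that $\theta_i = 10000^{-2(i-1)/d}$, and that the implementation builds its cos/sin vectors by concatenating the frequency vector with itself, as the Hugging Face code does. The symbols are the paper's conventions, not notation taken from this repository.

```latex
% Paper form: adjacent dimensions (x_{2i-1}, x_{2i}) are rotated together.
R^{d}_{\Theta,m}\,x =
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ \vdots \\ x_{d-1} \\ x_d \end{pmatrix}
\otimes
\begin{pmatrix} \cos m\theta_1 \\ \cos m\theta_1 \\ \cos m\theta_2 \\ \cos m\theta_2 \\ \vdots \\ \cos m\theta_{d/2} \\ \cos m\theta_{d/2} \end{pmatrix}
+
\begin{pmatrix} -x_2 \\ x_1 \\ -x_4 \\ x_3 \\ \vdots \\ -x_d \\ x_{d-1} \end{pmatrix}
\otimes
\begin{pmatrix} \sin m\theta_1 \\ \sin m\theta_1 \\ \sin m\theta_2 \\ \sin m\theta_2 \\ \vdots \\ \sin m\theta_{d/2} \\ \sin m\theta_{d/2} \end{pmatrix}

% rotate_half form: dimension i is rotated together with dimension i + d/2.
x \otimes \cos + \mathrm{rotate\_half}(x) \otimes \sin =
\begin{pmatrix} x_1 \\ \vdots \\ x_{d/2} \\ x_{d/2+1} \\ \vdots \\ x_d \end{pmatrix}
\otimes
\begin{pmatrix} \cos m\theta_1 \\ \vdots \\ \cos m\theta_{d/2} \\ \cos m\theta_1 \\ \vdots \\ \cos m\theta_{d/2} \end{pmatrix}
+
\begin{pmatrix} -x_{d/2+1} \\ \vdots \\ -x_d \\ x_1 \\ \vdots \\ x_{d/2} \end{pmatrix}
\otimes
\begin{pmatrix} \sin m\theta_1 \\ \vdots \\ \sin m\theta_{d/2} \\ \sin m\theta_1 \\ \vdots \\ \sin m\theta_{d/2} \end{pmatrix}
```

The two expressions use the same set of angles and differ only in which two dimensions share each angle.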

Checklist

I have provided all relevant and necessary information above.
I have chosen a suitable title for this issue.

xinge333 · 2024-07-03T06:27:19Z

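The neurons (dimensions) of an embedding have no inherent order, so it is fine to pick any half of them as the rotation partners, as long as the same pairing is used for both queries and keys.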
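
A minimal sketch of why the pairing does not matter, written in plain Python so it runs without torch. The names `rope_interleaved`, `rope_half`, `to_half_layout`, and `dot` are hypothetical helpers for illustration only, not functions from this repository or from Hugging Face, and the base of 10000 is assumed from the paper. The sketch checks that the two layouts differ only by a fixed permutation of dimensions, so query-key dot products come out the same.

```python
import math

def rope_interleaved(x, pos, base=10000.0):
    """Paper-style RoPE: rotate adjacent pairs (x[0], x[1]), (x[2], x[3]), ..."""
    d = len(x)
    out = [0.0] * d
    for i in range(d // 2):
        theta = pos * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = a * c - b * s
        out[2 * i + 1] = a * s + b * c
    return out

def rope_half(x, pos, base=10000.0):
    """rotate_half-style RoPE: rotate pairs (x[i], x[i + d/2])."""
    d = len(x)
    half = d // 2
    out = [0.0] * d
    for i in range(half):
        theta = pos * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[i], x[i + half]
        # Same as x * cos + rotate_half(x) * sin, written per element.
        out[i] = a * c - b * s
        out[i + half] = a * s + b * c
    return out

def to_half_layout(x):
    """Fixed permutation: even-indexed dims first, then odd-indexed dims."""
    return x[0::2] + x[1::2]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Arbitrary example query/key vectors at positions 5 and 2.
q = [0.3, -1.2, 0.7, 0.1, -0.5, 0.9, 1.1, -0.4]
k = [-0.8, 0.2, 0.6, -1.0, 0.4, 0.5, -0.3, 0.7]

# Attention score with the paper's pairing.
score_paper = dot(rope_interleaved(q, 5), rope_interleaved(k, 2))
# Attention score with the half-split pairing on permuted vectors.
score_half = dot(rope_half(to_half_layout(q), 5), rope_half(to_half_layout(k), 2))

print(abs(score_paper - score_half) < 1e-9)
```

Since the permutation can be absorbed into the rows of the query and key projection matrices, a model trained from scratch learns equivalent weights under either layout. A checkpoint trained with one layout would, however, need those rows permuted before being used with the other.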

zehmaaa added the "question" label (Further information is requested) on Oct 4, 2023
