🌀 RoPE (Rotary Position Embedding): Explanation and Worked Example

1. Mathematical Foundations of RoPE

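Given an embedding dimension $d$ (which must be even) and a token position $m$, define the frequency parameters:

$$\theta_i = \frac{1}{10000^{2i/d}}$$

where $i = 0, 1, \dots, d/2 - 1$ indexes the two-dimensional pairs that the vector is split into.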
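The frequency formula can be sketched in a few lines of Python. The function name `rope_frequencies` and the `base` parameter are illustrative choices, not part of the original text; the sketch assumes one frequency per pair of components, so it returns $d/2$ values.

```python
def rope_frequencies(d, base=10000.0):
    """Return [theta_0, ..., theta_{d/2 - 1}] with theta_i = base ** (-2 * i / d)."""
    if d % 2 != 0:
        raise ValueError("d must be even so that components can be paired")
    return [base ** (-2 * i / d) for i in range(d // 2)]


# For d = 4 there are two frequencies, approximately 1.0 and 0.01.
freqs = rope_frequencies(4)
print(freqs)
```

The frequencies decrease geometrically with $i$, so early pairs rotate quickly with position while later pairs rotate slowly.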

For each pair index $i$, build the corresponding 2×2 rotation matrix:

$$R(i) = \begin{bmatrix} \cos(m \theta_i) & -\sin(m \theta_i) \\ \sin(m \theta_i) & \cos(m \theta_i) \end{bmatrix}$$

For the query components $q_m[2i], q_m[2i+1]$ that form the $i$-th pair, the rotated (position-encoded) values are:

$$\begin{aligned} q'_m[2i] &= q_m[2i] \cdot \cos(m \theta_i) - q_m[2i+1] \cdot \sin(m \theta_i) \\ q'_m[2i+1] &= q_m[2i] \cdot \sin(m \theta_i) + q_m[2i+1] \cdot \cos(m \theta_i) \end{aligned}$$
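The rotation of a single pair of components can be written directly from these two equations. This is a minimal sketch using only the standard library; `rotate_pair` is a hypothetical helper name, and `angle` stands for the product $m \theta_i$.

```python
import math


def rotate_pair(x, y, angle):
    """Rotate the 2D point (x, y) counter-clockwise by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    # Same arithmetic as multiplying [[c, -s], [s, c]] by the column vector [x, y].
    return x * c - y * s, x * s + y * c


# Rotating (1, 0) by a quarter turn gives (0, 1) up to floating-point error.
print(rotate_pair(1.0, 0.0, math.pi / 2))
```

Because this is a pure rotation, the length of the pair is unchanged; only its direction encodes the position.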

As a worked example, take $q_m = [1, 2, 3, 4]$, so $d = 4$ and there are $d/2 = 2$ pairs with $i = 0, 1$:

$$\theta_0 = \frac{1}{10000^{0/4}} = 1,\quad \theta_1 = \frac{1}{10000^{2/4}} = 0.01$$

With position $m = 2$, the rotation angles are:

$$\begin{aligned} m \theta_0 &= 2 \times 1 = 2 \\ m \theta_1 &= 2 \times 0.01 = 0.02 \end{aligned}$$

$$R(0) = \begin{bmatrix} \cos(2) & -\sin(2) \\ \sin(2) & \cos(2) \end{bmatrix} \approx \begin{bmatrix} -0.4161 & -0.9093 \\ 0.9093 & -0.4161 \end{bmatrix}$$

Applying this rotation to the first pair $[1, 2]$:

$$\begin{aligned} q'_m[0] &= 1 \cdot \cos(2) - 2 \cdot \sin(2) \approx -2.2347 \\ q'_m[1] &= 1 \cdot \sin(2) + 2 \cdot \cos(2) \approx 0.0770 \end{aligned}$$

$$R(1) = \begin{bmatrix} \cos(0.02) & -\sin(0.02) \\ \sin(0.02) & \cos(0.02) \end{bmatrix} \approx \begin{bmatrix} 0.9998 & -0.0200 \\ 0.0200 & 0.9998 \end{bmatrix}$$

Applying this rotation to the second pair $[3, 4]$:

$$\begin{aligned} q'_m[2] &= 3 \cdot \cos(0.02) - 4 \cdot \sin(0.02) \approx 2.9194 \\ q'_m[3] &= 3 \cdot \sin(0.02) + 4 \cdot \cos(0.02) \approx 4.0592 \end{aligned}$$

The rotated vector is:

$$q'_m \approx [-2.2347,\ 0.0770,\ 2.9194,\ 4.0592]$$
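The whole calculation can be recomputed numerically with a short script. This is a sketch under stated assumptions rather than a reference implementation: `apply_rope` is a hypothetical name, the vector length plays the role of $d$, and consecutive components $(q[2i], q[2i+1])$ are paired and rotated by $m \theta_i$ with $\theta_i = 10000^{-2i/d}$. Some libraries instead pair component $i$ with component $i + d/2$; that layout is not covered here.

```python
import math


def apply_rope(q, m, base=10000.0):
    """Rotate each consecutive pair (q[2i], q[2i+1]) by the angle m * theta_i."""
    d = len(q)
    if d % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(d // 2):
        theta = base ** (-2 * i / d)  # theta_i = 1 / base^(2i/d)
        c, s = math.cos(m * theta), math.sin(m * theta)
        x, y = q[2 * i], q[2 * i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


rotated = apply_rope([1.0, 2.0, 3.0, 4.0], m=2)
print([f"{v:.4f}" for v in rotated])
# ['-2.2347', '0.0770', '2.9194', '4.0592']
```

At position $m = 0$ every angle is zero, so `apply_rope(q, 0)` returns the input unchanged. For any other position the norm of each pair, and therefore of the whole vector, is preserved.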