知方号

Derivations of Common Matrix Derivative Formulas (matrix calculation formulas)

$$
\frac{\partial \mathbf{Ax}}{\partial \mathbf{x}} = \mathbf{A}^T
$$

Derivation. Here

$$
\mathbf A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix}
\quad \text{and} \quad
\mathbf x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
$$

so that

$$
\mathbf{Ax} = \begin{bmatrix} a_{11} x_{1} + a_{12} x_{2} + \dots + a_{1n} x_{n} \\ a_{21} x_{1} + a_{22} x_{2} + \dots + a_{2n} x_{n} \\ \vdots \\ a_{m1} x_{1} + a_{m2} x_{2} + \dots + a_{mn} x_{n} \end{bmatrix}_{m \times 1}
$$

We can see that $\mathbf{Ax}$ is a vector. Applying the denominator-layout expansion for the case where both the denominator and the numerator are vectors (row $i$ differentiates with respect to $x_i$, column $j$ holds the $j$-th component of $\mathbf{Ax}$), we get:

$$
\frac{\partial \mathbf{Ax}}{\partial \mathbf{x}} =
\begin{bmatrix}
\frac{\partial (a_{11} x_{1} + a_{12} x_{2} + \dots + a_{1n} x_{n})}{\partial x_1} & \frac{\partial (a_{21} x_{1} + a_{22} x_{2} + \dots + a_{2n} x_{n})}{\partial x_1} & \dots & \frac{\partial (a_{m1} x_{1} + a_{m2} x_{2} + \dots + a_{mn} x_{n})}{\partial x_1} \\
\frac{\partial (a_{11} x_{1} + a_{12} x_{2} + \dots + a_{1n} x_{n})}{\partial x_2} & \frac{\partial (a_{21} x_{1} + a_{22} x_{2} + \dots + a_{2n} x_{n})}{\partial x_2} & \dots & \frac{\partial (a_{m1} x_{1} + a_{m2} x_{2} + \dots + a_{mn} x_{n})}{\partial x_2} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial (a_{11} x_{1} + a_{12} x_{2} + \dots + a_{1n} x_{n})}{\partial x_n} & \frac{\partial (a_{21} x_{1} + a_{22} x_{2} + \dots + a_{2n} x_{n})}{\partial x_n} & \dots & \frac{\partial (a_{m1} x_{1} + a_{m2} x_{2} + \dots + a_{mn} x_{n})}{\partial x_n}
\end{bmatrix}
$$

$$
= \begin{bmatrix} a_{11} & a_{21} & \dots & a_{m1} \\ a_{12} & a_{22} & \dots & a_{m2} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1n} & a_{2n} & \dots & a_{mn} \end{bmatrix}_{n \times m} = \mathbf{A}^T
$$
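As a quick sanity check on the result above, the following is a minimal pure-Python sketch that approximates the denominator-layout derivative of $\mathbf{Ax}$ by central differences and compares it with $\mathbf{A}^T$. The helper names `matvec` and `denominator_layout_jacobian`, the step size `h`, and the example values are assumptions made for illustration; they are not part of the original derivation.

```python
def matvec(A, x):
    """Return the product A x for a list-of-rows matrix A and a list x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def denominator_layout_jacobian(f, x, h=1e-6):
    """Approximate D[i][j] = d f_j / d x_i by central differences."""
    D = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        f_plus, f_minus = f(x_plus), f(x_minus)
        D.append([(p - q) / (2 * h) for p, q in zip(f_plus, f_minus)])
    return D


# Arbitrary example values with m = 2 and n = 3.
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
x = [0.5, -1.0, 2.0]

D = denominator_layout_jacobian(lambda v: matvec(A, v), x)
A_T = [list(col) for col in zip(*A)]  # transpose of A, shape n x m

# Largest entrywise gap between the numerical derivative and A^T.
max_err = max(abs(D[i][j] - A_T[i][j])
              for i in range(len(A_T)) for j in range(len(A_T[0])))
print(max_err < 1e-6)
```

Because $\mathbf{Ax}$ is linear in $\mathbf{x}$, the central difference is exact up to floating-point rounding, so the gap should be tiny for any choice of `x`.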

$$
\frac{\partial \mathbf{x}^T \mathbf{A} \mathbf{x}}{\partial \mathbf{x}} = (\mathbf{A} + \mathbf{A}^T)\mathbf{x}
$$

Derivation. As before, here

$$
\mathbf A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{bmatrix}
\quad \text{and} \quad
\mathbf x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
$$

Evaluate $\mathbf{x}^T\mathbf{A}\mathbf{x}$ from right to left:

$$
\mathbf{Ax} = \begin{bmatrix} a_{11} x_{1} + a_{12} x_{2} + \dots + a_{1n} x_{n} \\ a_{21} x_{1} + a_{22} x_{2} + \dots + a_{2n} x_{n} \\ \vdots \\ a_{n1} x_{1} + a_{n2} x_{2} + \dots + a_{nn} x_{n} \end{bmatrix}_{n \times 1}
$$

$$
\mathbf{x}^T\mathbf{A}\mathbf{x} = \begin{bmatrix} x_1 & x_2 & \dots & x_n \end{bmatrix}_{1 \times n}
\begin{bmatrix} a_{11} x_{1} + a_{12} x_{2} + \dots + a_{1n} x_{n} \\ a_{21} x_{1} + a_{22} x_{2} + \dots + a_{2n} x_{n} \\ \vdots \\ a_{n1} x_{1} + a_{n2} x_{2} + \dots + a_{nn} x_{n} \end{bmatrix}_{n \times 1}
$$

$$
= \begin{bmatrix} (a_{11} x_{1}^2 + a_{12} x_1 x_{2} + \dots + a_{1n} x_1 x_{n}) + (a_{21} x_{1} x_2 + a_{22} x_{2}^2 + \dots + a_{2n} x_2 x_{n}) + \dots + (a_{n1} x_{1} x_n + a_{n2} x_{2} x_n + \dots + a_{nn} x_{n}^2) \end{bmatrix}_{1 \times 1}
$$

We can see that $\mathbf{x}^T\mathbf{A}\mathbf{x}$ is a scalar. Applying the denominator-layout expansion for the case where the denominator is a vector and the numerator is a scalar, we get:

$$
\frac{\partial \mathbf{x}^T\mathbf{A}\mathbf{x}}{\partial \mathbf{x}} =
\begin{bmatrix}
\frac{\partial [(a_{11} x_{1}^2 + a_{12} x_1 x_{2} + \dots + a_{1n} x_1 x_{n}) + (a_{21} x_{1} x_2 + a_{22} x_{2}^2 + \dots + a_{2n} x_2 x_{n}) + \dots + (a_{n1} x_{1} x_n + a_{n2} x_{2} x_n + \dots + a_{nn} x_{n}^2)]}{\partial x_1} \\
\frac{\partial [(a_{11} x_{1}^2 + a_{12} x_1 x_{2} + \dots + a_{1n} x_1 x_{n}) + (a_{21} x_{1} x_2 + a_{22} x_{2}^2 + \dots + a_{2n} x_2 x_{n}) + \dots + (a_{n1} x_{1} x_n + a_{n2} x_{2} x_n + \dots + a_{nn} x_{n}^2)]}{\partial x_2} \\
\vdots \\
\frac{\partial [(a_{11} x_{1}^2 + a_{12} x_1 x_{2} + \dots + a_{1n} x_1 x_{n}) + (a_{21} x_{1} x_2 + a_{22} x_{2}^2 + \dots + a_{2n} x_2 x_{n}) + \dots + (a_{n1} x_{1} x_n + a_{n2} x_{2} x_n + \dots + a_{nn} x_{n}^2)]}{\partial x_n}
\end{bmatrix}
$$

In the $k$-th entry, $x_k$ appears in the $k$-th group through the terms $a_{kj} x_k x_j$ and in every group $i$ through the term $a_{ik} x_i x_k$, so

$$
\frac{\partial \mathbf{x}^T\mathbf{A}\mathbf{x}}{\partial x_k} = \sum_{j=1}^{n} a_{kj} x_j + \sum_{i=1}^{n} a_{ik} x_i
$$

The first sum is the $k$-th entry of $\mathbf{Ax}$ and the second is the $k$-th entry of $\mathbf{A}^T\mathbf{x}$. Stacking the $n$ entries gives:

$$
\frac{\partial \mathbf{x}^T\mathbf{A}\mathbf{x}}{\partial \mathbf{x}} =
\begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{bmatrix} \mathbf{x}
+ \begin{bmatrix} a_{11} & a_{21} & \dots & a_{n1} \\ a_{12} & a_{22} & \dots & a_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1n} & a_{2n} & \dots & a_{nn} \end{bmatrix} \mathbf{x}
= (\mathbf{A} + \mathbf{A}^T)\mathbf{x}
$$
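The quadratic-form result can be checked numerically in the same spirit. The sketch below, again in pure Python, compares a central-difference gradient of $\mathbf{x}^T\mathbf{A}\mathbf{x}$ with the closed form $(\mathbf{A} + \mathbf{A}^T)\mathbf{x}$. The function names `quadratic_form` and `numerical_gradient`, the step size, and the example matrix are assumptions for illustration only. The matrix is chosen to be non-symmetric on purpose, so that $(\mathbf{A} + \mathbf{A}^T)\mathbf{x}$ differs from $2\mathbf{Ax}$.

```python
def quadratic_form(A, x):
    """Return the scalar x^T A x."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def numerical_gradient(f, x, h=1e-5):
    """Approximate the column of partials d f / d x_k by central differences."""
    grad = []
    for k in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[k] += h
        x_minus[k] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


# Arbitrary, deliberately non-symmetric example matrix.
A = [[1.0, 2.0, 0.0],
     [-1.0, 3.0, 4.0],
     [5.0, 0.5, -2.0]]
x = [0.5, -1.0, 2.0]
n = len(x)

# Closed form from the derivation: k-th entry is sum_j (a_kj + a_jk) x_j.
closed_form = [sum((A[k][j] + A[j][k]) * x[j] for j in range(n))
               for k in range(n)]
numeric = numerical_gradient(lambda v: quadratic_form(A, v), x)

grad_err = max(abs(c - g) for c, g in zip(closed_form, numeric))
print(grad_err < 1e-6)
```

When $\mathbf{A}$ is symmetric, the same closed form reduces to $2\mathbf{Ax}$, which is the version often quoted for least-squares problems.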

$$
\frac{\partial \mathbf{x}^T \mathbf{x}}{\partial \mathbf{x}} = 2\mathbf{x}
$$

Derivation. As before, here

$$
\mathbf x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
$$

Then:

$$
\mathbf{x}^T\mathbf{x} = \begin{bmatrix} x_1 & x_2 & \dots & x_n \end{bmatrix}_{1 \times n} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}_{n \times 1} = \begin{bmatrix} x_1 x_1 + x_2 x_2 + \dots + x_n x_n \end{bmatrix}_{1 \times 1}
$$

We can see that $\mathbf{x}^T\mathbf{x}$ is a scalar. Applying the denominator-layout expansion for the case where the denominator is a vector and the numerator is a scalar, we get:

$$
\frac{\partial \mathbf{x}^T\mathbf{x}}{\partial \mathbf{x}} =
\begin{bmatrix}
\frac{\partial (x_1 x_1 + x_2 x_2 + \dots + x_n x_n)}{\partial x_1} \\
\frac{\partial (x_1 x_1 + x_2 x_2 + \dots + x_n x_n)}{\partial x_2} \\
\vdots \\
\frac{\partial (x_1 x_1 + x_2 x_2 + \dots + x_n x_n)}{\partial x_n}
\end{bmatrix}
= \begin{bmatrix} 2x_1 \\ 2x_2 \\ \vdots \\ 2x_n \end{bmatrix} = 2\mathbf{x}
$$
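For completeness, here is a small pure-Python sketch that checks $\partial \mathbf{x}^T\mathbf{x} / \partial \mathbf{x} = 2\mathbf{x}$ by central differences. The names `squared_norm` and `central_difference_gradient`, the step size, and the test vector are assumptions for illustration.

```python
def squared_norm(x):
    """Return the scalar x^T x."""
    return sum(x_i * x_i for x_i in x)


def central_difference_gradient(f, x, h=1e-5):
    """Approximate the column of partials d f / d x_k by central differences."""
    grad = []
    for k in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[k] += h
        x_minus[k] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


x = [0.5, -1.0, 2.0, 3.5]  # arbitrary example vector

two_x = [2 * x_i for x_i in x]  # closed form from the derivation
approx = central_difference_gradient(squared_norm, x)

norm_err = max(abs(a - b) for a, b in zip(two_x, approx))
print(norm_err < 1e-6)
```

This is the special case $\mathbf{A} = \mathbf{I}$ of the quadratic-form formula, since $(\mathbf{I} + \mathbf{I}^T)\mathbf{x} = 2\mathbf{x}$.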
