
Mathematics for Machine Learning: Study Notes

2020-11-10 07:50:25 Other

Notes and summaries written after completing the Mathematics for Machine Learning Specialization on Coursera.

Contents

  • Part 1: Linear Algebra
    • 1. Vector operations
      • 1.1 Dot or inner product
      • 1.2 Scalar and vector projection
    • 2. Basis
    • 3. Matrices
      • Multiplying a matrix by a vector
    • 4. Change of basis
    • 5. Gram-Schmidt process for constructing an orthonormal basis
    • 6. Transformation in a plane or other object
    • 7. Eigenstuff (eigendecomposition)
    • 8. PageRank
  • Part 2: Multivariate Calculus
    • 1. Definition of a derivative
    • 2. Time-saving rules
    • 3. Derivatives of named functions
    • 4. Derivative structures
    • 5. Taylor series
      • 5.1 Maclaurin series
      • 5.2 Taylor series
    • 6. Optimization and vector calculus
  • Part 3: PCA (Principal Component Analysis)
    • 1. 1-D datasets
    • 2. Definite symmetric matrix
    • 3. Higher-dimensional datasets
    • 4. Effect of linear transformations on the mean and variance
    • 5. Dot product
    • 6. Inner product
      • Extending the dot product to other kinds of data
    • 7. Projection
      • 7.1 Projection onto 1D subspaces
      • 7.2 Projection onto $k$-dimensional subspaces
    • 8. PCA derivation
      • 8.1 Setting up ($X_n=\sum_{i=1}^D\beta_{in}b_i$, $\tilde{X}_n=\sum_{i=1}^M\beta_{in}b_i$, $J=\frac{1}{N}\sum_{n=1}^N\|X_n-\tilde{X}_n\|^2$, $S=\frac{1}{N}\sum_{n=1}^N X_nX_n^T$)
      • 8.2 Getting the coordinate/code $\beta_{in}$ ($\beta_{in}=X_n^Tb_i$)
      • 8.3 Rewriting the formula ($X_n-\tilde{X}_n=\sum_{i=M+1}^D (b_i^T X_n) b_i$)
      • 8.4 Redefining $J = B'B'^TS$
      • 8.5 Solving for $b_i$
    • 9. Key steps of the PCA algorithm
      • 9.1 *zscore* transformation
      • 9.2 Projection matrix computation
      • 9.3 Projection
    • 10. PCA in high dimensions

Part 1: Linear Algebra

1. Vector operations

  • Commutative: $r + s = s + r$
  • $2r = r + r$
  • $\|r\|^2 = \sum_{i} r_i^2$

1.1 Dot or inner product

The dot product is a special case of an inner product.
$$r \cdot s = \sum_{i} r_i s_i$$

  • Commutative: $r \cdot s = s \cdot r$
  • Distributive: $r \cdot (s + t) = r \cdot s + r \cdot t$
  • Associative over scalar multiplication: $r \cdot (a s) = a(r \cdot s)$
  • $r \cdot r = \|r\|^2$
  • $r \cdot s = \|r\| \|s\| \cos \theta$

1.2 Scalar and vector projection

  • Scalar projection
    Example: the scalar projection of $s$ onto $r$ is $\frac{r \cdot s}{\|r\|}$
  • Vector projection
    Example: the vector projection of $s$ onto $r$ is $\frac{r \cdot s}{r \cdot r}\, r$
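As a minimal sketch of the two projection formulas above, the following plain-Python snippet computes the scalar and vector projection of $s$ onto $r$. The helper names (`dot`, `norm`, `scalar_projection`, `vector_projection`) and the sample vectors are illustrative assumptions, not part of the course material.

```python
import math

def dot(r, s):
    """Dot product: sum of element-wise products."""
    return sum(ri * si for ri, si in zip(r, s))

def norm(r):
    """Length of r, from r . r = |r|^2."""
    return math.sqrt(dot(r, r))

def scalar_projection(s, r):
    """Signed length of the shadow of s along r: (r . s) / |r|."""
    return dot(r, s) / norm(r)

def vector_projection(s, r):
    """Component of s lying along r: ((r . s) / (r . r)) r."""
    scale = dot(r, s) / dot(r, r)
    return [scale * ri for ri in r]

r = [3.0, 4.0]
s = [10.0, 5.0]
print(scalar_projection(s, r))  # 10.0
print(vector_projection(s, r))  # [6.0, 8.0]
```

The vector projection points along $r$, and its length equals the absolute value of the scalar projection.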

2. Basis

A basis is a set of $n$ vectors that:

  • are not linear combinations of each other
  • span the space

The space is then $n$-dimensional.

In a vector space $V$, suppose there are $n$ elements $a_1,a_2,\dots,a_n$ such that:

  • $a_1,a_2,\dots,a_n$ are linearly independent;
  • every element $a$ of $V$ can be written as a linear combination of $a_1,a_2,\dots,a_n$.

Then $a_1,a_2,\dots,a_n$ is called a basis of $V$, and $n$ is called the dimension of $V$. A vector space that contains only the zero element has no basis, and its dimension is defined to be 0.
A vector space of dimension $n$ is called an $n$-dimensional vector space, written $V_n$.
(Tongji University, Linear Algebra, 5th edition, Chapter 6, Section 2)

3. Matrices

An array of $m \times n$ numbers $a_{ij}$ ($i=1,2,\cdots,m$; $j=1,2,\dots,n$) arranged in $m$ rows and $n$ columns is called a matrix with $m$ rows and $n$ columns, or an $m \times n$ matrix for short, written
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$
(Tongji University, Linear Algebra, 5th edition, Chapter 2, Section 1)

Multiplying a matrix by a vector

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} e \\ f \end{bmatrix} = \begin{bmatrix} ae + bf \\ ce + df \end{bmatrix}$$

  • Multiplying a vector by a matrix can be read as the matrix $A$ transforming the vector $r$ into $r'$: $Ar = r'$
  • $A(nr) = n(Ar) = nr'$
  • Distributive: $A(r + s) = Ar + As$
  • Identity matrix
    $I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
  • Clockwise rotation by $\theta$
    $\begin{bmatrix} \cos \theta & \sin \theta \\ -\sin \theta & \cos \theta \end{bmatrix}$
  • Determinant of a 2×2 matrix
    $|A| = \det A = \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc$
  • Inverse of a 2×2 matrix
    $\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$
  • Summation convention for multiplying matrices $A$ and $B$
    $AB = C$
    $c_{ik} = (ab)_{ik} = \sum_j a_{ij}b_{jk}$
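The 2×2 determinant, the 2×2 inverse, and the summation convention can be checked with a short plain-Python sketch. The function names `det2`, `inv2`, and `matmul` and the sample matrix are assumptions made for illustration.

```python
def det2(m):
    """Determinant of [[a, b], [c, d]]: ad - bc."""
    (a, b), (c, d) = m
    return a * d - b * c

def inv2(m):
    """Inverse of a 2x2 matrix: (1 / (ad - bc)) * [[d, -b], [-c, a]]."""
    (a, b), (c, d) = m
    det = det2(m)
    if det == 0:
        raise ValueError("matrix is singular, no inverse exists")
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(A, B):
    """Summation convention: c_ik = sum_j a_ij * b_jk."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))]
            for i in range(len(A))]

A = [[2.0, 1.0], [4.0, 3.0]]
print(det2(A))             # 2.0
print(matmul(A, inv2(A)))  # [[1.0, 0.0], [0.0, 1.0]]
```

Multiplying a matrix by its inverse returns the identity matrix, which is a quick sanity check on both formulas.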

4. Change of basis

Change from an original basis to a new, primed basis. The columns of the transformation matrix $B$ are the new basis vectors in the original coordinate system. So
$$Br' = r$$
where $r'$ is the vector in the $B$-basis, and $r$ is the vector in the original basis. Or:
$$r' = B^{-1}r$$
If a matrix $A$ is orthonormal (all the columns are of unit size and orthogonal to each other) then
$$A^T = A^{-1}$$
Equivalently, $A$ is an orthogonal matrix if and only if its column vectors are all unit vectors and pairwise orthogonal. Writing $A = (a_1,a_2,\dots,a_n)$ in terms of its columns, and $E$ for the identity matrix, the condition $A^T = A^{-1}$ reads

$$A^TA=E$$

$$\begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_n^T \end{bmatrix} (a_1,a_2,\dots,a_n)=E$$
that is,
$$(a_i^Ta_j) = (\delta_{ij})$$
which amounts to $n^2$ relations
$$a_i^Ta_j = \delta_{ij} = \begin{cases} 1 & \quad \text{when } i = j\\ 0 & \quad \text{when } i \neq j \end{cases}$$
Since $A^T=A^{-1}$, we also have $AA^T=E$, so the same conclusion holds for the row vectors of $A$.
(Further reading: Tongji University, Linear Algebra, 5th edition, Chapter 6, Section 3)

5. Gram-Schmidt process for constructing an orthonormal basis

Start with $n$ linearly independent basis vectors $v = \{ v_1,v_2,\dots,v_n \}$. Then
$$e_1 = \frac{v_1}{\|v_1\|}$$
$$u_2 = v_2 - (v_2 \cdot e_1)e_1 \quad \text{so} \quad e_2 = \frac{u_2}{\|u_2\|}$$
and so on, with $u_3$ being the remnant part of $v_3$ not composed of the preceding $e$-vectors, etc.
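The procedure can be sketched in a few lines of plain Python. This is an illustrative version, not the course's reference code: the name `gram_schmidt`, the tolerance `tol`, and the sample vectors are assumptions. It subtracts each projection from the running remainder (the "modified" variant), which gives the same result as the formulas above in exact arithmetic.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    """Turn linearly independent vectors into an orthonormal basis."""
    basis = []
    for v in vectors:
        u = list(v)
        # Remove the parts of v that lie along the e-vectors found so far.
        for e in basis:
            coeff = dot(u, e)
            u = [ui - coeff * ei for ui, ei in zip(u, e)]
        length = math.sqrt(dot(u, u))
        if length < tol:
            raise ValueError("input vectors are linearly dependent")
        # Normalise the remnant to unit length.
        basis.append([ui / length for ui in u])
    return basis

vs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
es = gram_schmidt(vs)
for e in es:
    print(e)
```

Every pair of output vectors should have a dot product close to 0, and every vector a length close to 1.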

6. Transformation in a plane or other object

First transform into the basis referred to the reflection plane, or whichever object is involved: $E^{-1}$.
Then do the reflection or other transformation in the plane of the object: $T_E$.
Then transform back into the original basis: $E$.
So our transformed vector is $r' = ET_EE^{-1}r$

7. Eigenstuff (eigendecomposition)

To investigate the characteristics of the $n$ by $n$ matrix $A$, you are looking for solutions to the equation
$$Ax=\lambda x$$
where $\lambda$ is a scalar eigenvalue. Eigenvalues will satisfy the following condition
$$(A-\lambda I)x = 0$$
where $I$ is an $n$ by $n$ identity matrix.
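For a 2×2 matrix the condition can be solved by hand: a nonzero $x$ exists only when $\det(A-\lambda I)=0$, which expands to $\lambda^2 - (a+d)\lambda + (ad-bc) = 0$. The sketch below solves that quadratic. The function name `eigenvalues_2x2` and the sample matrix are assumptions, and the sketch only handles real eigenvalues.

```python
import math

def eigenvalues_2x2(m):
    """Real eigenvalues of [[a, b], [c, d]] from det(A - lambda I) = 0."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4.0 * det
    if disc < 0:
        raise ValueError("eigenvalues are complex")
    root = math.sqrt(disc)
    return (trace + root) / 2.0, (trace - root) / 2.0

A = [[2.0, 1.0], [1.0, 2.0]]
print(eigenvalues_2x2(A))  # (3.0, 1.0)

# Check A x = lambda x for the eigenvector x = (1, 1) with lambda = 3.
x = [1.0, 1.0]
Ax = [sum(a * xi for a, xi in zip(row, x)) for row in A]
print(Ax)  # [3.0, 3.0]
```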

8. PageRank

To find the dominant eigenvector of the link matrix $L$, the Power Method can be applied iteratively, starting from a uniform initial vector $r$.
$$r^{i+1} = Lr^i$$
A damping factor, $d$, can be implemented to stabilize this method as follows.
$$r^{i+1} = dLr^i + \frac{1-d}{n}$$
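A minimal sketch of the damped power method, in plain Python. The function name `pagerank`, the stopping tolerance, the default `d=0.85`, and the three-page link matrix are assumptions chosen for illustration. Here `L[i][j]` is taken to be the probability of moving from page `j` to page `i`, so each column sums to 1.

```python
def pagerank(L, d=0.85, tol=1e-10, max_iter=1000):
    """Iterate r <- d L r + (1 - d) / n from a uniform start."""
    n = len(L)
    r = [1.0 / n] * n
    for _ in range(max_iter):
        r_next = [d * sum(L[i][j] * r[j] for j in range(n)) + (1.0 - d) / n
                  for i in range(n)]
        # Stop once the rank vector has effectively stopped changing.
        if max(abs(a - b) for a, b in zip(r_next, r)) < tol:
            return r_next
        r = r_next
    return r

# Hypothetical web: A links to B and C, B links to C, C links to A.
L = [
    [0.0, 0.0, 1.0],  # links into A
    [0.5, 0.0, 0.0],  # links into B
    [0.5, 1.0, 0.0],  # links into C
]
ranks = pagerank(L)
print(ranks)
print(sum(ranks))
```

Because every column of `L` sums to 1, the ranks keep summing to 1 at every iteration. Page B, which only receives half of A's outgoing links, ends up with the lowest rank.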

Part 2: Multivariate Calculus

1. Definition of a derivative

$$f'(x) = \frac{\mathrm{d}f(x)}{\mathrm{d}x} = \lim\limits_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$
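The limit can be approximated numerically by picking a small but finite $\Delta x$. The sketch below is an illustration only; the names `forward_difference` and `cube` are assumptions. It evaluates the difference quotient of $x^3$ at $x = 2$, where the exact derivative is $3x^2 = 12$.

```python
def forward_difference(f, x, dx):
    """Difference quotient (f(x + dx) - f(x)) / dx from the definition."""
    return (f(x + dx) - f(x)) / dx

def cube(x):
    return x ** 3

# The estimate approaches the exact derivative 12 as dx shrinks.
for dx in (1e-1, 1e-3, 1e-5):
    print(dx, forward_difference(cube, 2.0, dx))
```

In floating-point arithmetic, making `dx` extremely small eventually makes the estimate worse again because of rounding in `f(x + dx) - f(x)`.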

2. Time-saving rules

  • Sum Rule:
    $\frac{\mathrm{d}}{\mathrm{d}x}(f(x)+g(x)) = \frac{\mathrm{d}}{\mathrm{d}x}(f(x)) + \frac{\mathrm{d}}{\mathrm{d}x}(g(x))$
  • Power Rule:
    $f(x) = ax^b$
    $f'(x) = abx^{b-1}$
  • Product Rule:
    $A(x) = f(x)g(x)$
    $A'(x) = f'(x)g(x) + f(x)g'(x)$
  • Chain Rule:
    If $h = h(p)$ and $p = p(m)$
    then $\frac{\mathrm{d}h}{\mathrm{d}m} = \frac{\mathrm{d}h}{\mathrm{d}p} \times \frac{\mathrm{d}p}{\mathrm{d}m}$
  • Total derivative:
    For the function $f(x, y, z, \dots)$, where each variable is a function of parameter $t$, the total derivative is
    $\frac{\mathrm{d}f}{\mathrm{d}t} = \frac{\partial f}{\partial x}\frac{\mathrm{d}x}{\mathrm{d}t} + \frac{\partial f}{\partial y}\frac{\mathrm{d}y}{\mathrm{d}t} + \frac{\partial f}{\partial z}\frac{\mathrm{d}z}{\mathrm{d}t} + \dots$

3. Derivatives of named functions

  • $\frac{\mathrm{d}}{\mathrm{d}x}\frac{1}{x} = -\frac{1}{x^2}$
  • $\frac{\mathrm{d}}{\mathrm{d}x}\sin x = \cos x$
  • $\frac{\mathrm{d}}{\mathrm{d}x}\cos x = -\sin x$
  • $\frac{\mathrm{d}}{\mathrm{d}x}\exp x = \exp x$

4. Derivative structures

$f = f(x,y,z)$

  • Jacobian:
    $\mathbf J_f = \begin{bmatrix} \frac{\partial f}{\partial x}, & \frac{\partial f}{\partial y}, & \frac{\partial f}{\partial z} \end{bmatrix}$
  • Hessian:
    $\mathbf H_f = \begin{bmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} & \frac{\partial^2 f}{\partial x \partial z} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} & \frac{\partial^2 f}{\partial y \partial z} \\ \frac{\partial^2 f}{\partial z \partial x} & \frac{\partial^2 f}{\partial z \partial y} & \frac{\partial^2 f}{\partial z^2} \end{bmatrix}$

5. Taylor series

5.1 Maclaurin series

$$f(x) = f(0) + f'(0)x + \frac{1}{2}f''(0)x^2 + \dots = \sum_{n=0}^{\infty}\frac{f^{(n)}(0)}{n!}x^n$$

5.2 Taylor series

  • Univariate:
    $f(x) = f(c) + f'(c)(x-c) + \frac{1}{2}f''(c)(x-c)^2 + \dots = \sum_{n=0}^{\infty}\frac{f^{(n)}(c)}{n!}(x-c)^n$

    $f(x+\Delta x) = f(x) + f'(x)\Delta x + \frac{1}{2}f''(x)\Delta x^2 + \dots = \sum_{n=0}^{\infty}\frac{f^{(n)}(x)}{n!}\Delta x^{n}$

  • Multivariate:
    $f(x) = f(c) + \mathbf J_f(c)(x-c) + \frac{1}{2}(x-c)^T\mathbf H_f(c)(x-c) + \dots$
    Here $x$ and $c$ are vectors: $x$ is the variable and $c$ is the constant expansion point.
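As a small worked example of the Maclaurin series, take $f(x) = e^x$. Every derivative of $e^x$ at 0 equals 1, so the series becomes $\sum_n x^n / n!$. The sketch below truncates that sum; the function name `maclaurin_exp` and the chosen term counts are assumptions for illustration.

```python
import math

def maclaurin_exp(x, terms):
    """Partial sum of the Maclaurin series of exp: sum of x^n / n!."""
    return sum(x ** n / math.factorial(n) for n in range(terms))

# More terms give a better approximation of e = exp(1).
for terms in (2, 4, 8, 15):
    approx = maclaurin_exp(1.0, terms)
    print(terms, approx, abs(approx - math.e))
```

With 2 terms the series is just the linear approximation $1 + x$; each further term reduces the error near the expansion point.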

6. Optimization and vector calculus

  • Newton-Raphson:
    $x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}$
  • Grad:
    $\nabla f = \begin{bmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \\ \frac{\partial f}{\partial z} \end{bmatrix}$
  • Directional Gradient:
    $\nabla f\cdot\hat{r}$
  • Gradient Descent:
    $s_{n+1} = s_n - \gamma \nabla f$
  • Lagrange Multipliers $\lambda$:
    $\nabla f = \lambda \nabla g$
    $\begin{bmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{bmatrix} = \lambda \begin{bmatrix} \frac{\partial g}{\partial x} \\ \frac{\partial g}{\partial y} \end{bmatrix}$
    $\nabla \mathcal{L} (x,y,\lambda) = \begin{bmatrix} \frac{\partial f}{\partial x} - \lambda \frac{\partial g}{\partial x} \\ \frac{\partial f}{\partial y} - \lambda \frac{\partial g}{\partial y} \\ -g(x,y) \end{bmatrix}$
  • Least Squares, $\chi^2$ minimization:
    $\chi^2 = \sum_{i=1}^n \frac{(y_i-y(x_i;a_k))^2}{\sigma_i}$
    Criterion: $\nabla \chi^2 = 0$
    $a_{next} = a_{cur} - \gamma \nabla \chi^2 = a_{cur} + \gamma \sum_{i=1}^n \frac{(y_i-y(x_i;a_k))}{\sigma_i} \frac{\partial y}{\partial a_k}$
    (the constant factor 2 from differentiating the square is absorbed into the step size $\gamma$)
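The Newton-Raphson and gradient-descent updates above translate almost directly into code. The sketch below is illustrative: the function names, the stopping rule, the step size `gamma=0.1`, and both test functions are assumptions, not taken from the course.

```python
def newton_raphson(f, df, x0, tol=1e-12, max_iter=50):
    """Root finding: x_{i+1} = x_i - f(x_i) / f'(x_i)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            break
    return x

def gradient_descent(grad, s0, gamma=0.1, steps=500):
    """Minimisation: s_{n+1} = s_n - gamma * grad f(s_n)."""
    s = list(s0)
    for _ in range(steps):
        g = grad(s)
        s = [si - gamma * gi for si, gi in zip(s, g)]
    return s

# Root of x^2 - 2, i.e. the square root of 2.
root = newton_raphson(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
print(root)

# Minimum of f(x, y) = (x - 1)^2 + 2 (y + 2)^2, located at (1, -2).
minimum = gradient_descent(
    lambda s: [2.0 * (s[0] - 1.0), 4.0 * (s[1] + 2.0)],
    s0=[0.0, 0.0],
)
print(minimum)
```

Newton-Raphson converges in a handful of iterations here, while gradient descent needs many small steps; a `gamma` that is too large would make the descent diverge.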

Part 3: PCA (Principal Component Analysis)

1. 1-D datasets

Given a data set $D = \{x_1,\cdots,x_N\}$, $x_n \in R$,

  • Mean Value
    $E[D]=\frac{1}{N}\sum_{n=1}^{N}x_n$
  • Variance
    $V[D]=E[(x_n-\mu)^2]=\frac{1}{N}\sum_{n=1}^{N}(x_n-\mu)^2$

2. Definite symmetric matrix

Given a symmetric real matrix $M \in R^{n \times n}$, consider all nonzero vectors $z \in R^{n \times 1}$. If $z^TMz > 0$ for every such $z$, then $M$ is a positive-definite matrix. If $z^TMz \geq 0$ for every such $z$, then $M$ is a positive semi-definite matrix.

3. Higher-dimensional datasets

Given a data set $X = \{x_1,\cdots,x_N\}$, $x_n \in R^{D \times 1}$, $X \in R^{D \times N}$
$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,N} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,N} \\ \vdots & \vdots & \ddots & \vdots \\ x_{D,1} & x_{D,2} & \cdots & x_{D,N} \end{bmatrix}$$

  • Mean Value
    $\mu=E[X]=\frac{1}{N}\sum_{n=1}^{N}x_n = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_D \end{bmatrix} \in R^{D \times 1}$
  • Variance
    $\begin{aligned}V[X]&=\frac{1}{N}\sum_{n=1}^N(x_n - \mu)(x_n - \mu)^T \\ &= \frac{1}{N}\sum_{n=1}^N \begin{bmatrix} (x_{1,n} - \mu_1)^2 & (x_{1,n} - \mu_1)(x_{2,n} - \mu_2) & \cdots & (x_{1,n} - \mu_1)(x_{D,n} - \mu_D) \\ (x_{2,n} - \mu_2)(x_{1,n} - \mu_1) & (x_{2,n} - \mu_2)^2 & \cdots & (x_{2,n} - \mu_2)(x_{D,n} - \mu_D) \\ \vdots & \vdots & \ddots & \vdots \\ (x_{D,n} - \mu_D)(x_{1,n} - \mu_1) & (x_{D,n} - \mu_D)(x_{2,n} - \mu_2) & \cdots & (x_{D,n} - \mu_D)^2 \end{bmatrix} \in R^{D \times D} \end{aligned}$
    The variance of $D$-dimensional data is a $D \times D$ matrix. Each diagonal entry $a_{ii}$ is the variance of the $i$-th dimension, and each off-diagonal entry $a_{ij}$ is the covariance between the $i$-th and $j$-th dimensions.
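The mean vector and the covariance matrix can be computed directly from these definitions. In the sketch below the data set is stored as a list of $N$ points (one point per row, unlike the column layout of $X$ above); that layout, the function names, and the sample data are assumptions for illustration.

```python
def mean_vector(X):
    """Component-wise mean of N data points, each a list of D numbers."""
    N, D = len(X), len(X[0])
    return [sum(x[d] for x in X) / N for d in range(D)]

def covariance_matrix(X):
    """(1/N) * sum_n (x_n - mu)(x_n - mu)^T, built entry by entry."""
    N, D = len(X), len(X[0])
    mu = mean_vector(X)
    return [[sum((x[i] - mu[i]) * (x[j] - mu[j]) for x in X) / N
             for j in range(D)]
            for i in range(D)]

X = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
print(mean_vector(X))  # [3.0, 6.0]
for row in covariance_matrix(X):
    print(row)
```

The result is symmetric: entry $(i, j)$ and entry $(j, i)$ are the same covariance. Note that this divides by $N$, as in the notes, not by $N - 1$.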

4. Effect of linear transformations on the mean and variance

Given a data set $D = \{x_1,\cdots,x_N\}$, $x_n \in R^{D \times 1}$, $D \in R^{D \times N}$, with
$$E[D] = \mu$$
$$V[D] = Q$$
apply the linear transformation
$$x'_i = Ax_i + b$$
then
$$E[D'] = A\mu + b$$
$$V[D'] = AQA^T$$
where $D' = \{x'_1,x'_2,\cdots,x'_N\}$
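Both identities can be verified numerically on a small data set. The sketch below is a check under stated assumptions: the helper names, the matrix `A`, the offset `b`, and the three sample points are all made up for illustration, and data points are stored one per row.

```python
def mean(X):
    N = len(X)
    return [sum(x[d] for x in X) / N for d in range(len(X[0]))]

def cov(X):
    N, D, mu = len(X), len(X[0]), mean(X)
    return [[sum((x[i] - mu[i]) * (x[j] - mu[j]) for x in X) / N
             for j in range(D)] for i in range(D)]

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def matmul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

X = [[1.0, 2.0], [3.0, 0.0], [5.0, 4.0]]
A = [[2.0, 1.0], [0.0, 3.0]]
b = [1.0, -1.0]

# Transform every data point: x' = A x + b.
X_new = [[y + bi for y, bi in zip(matvec(A, x), b)] for x in X]

mean_direct = mean(X_new)
mean_formula = [m + bi for m, bi in zip(matvec(A, mean(X)), b)]
cov_direct = cov(X_new)
cov_formula = matmul(matmul(A, cov(X)), transpose(A))

print(mean_direct, mean_formula)
print(cov_direct, cov_formula)
```

The offset `b` shifts the mean but has no effect on the covariance, which is why it does not appear in $AQA^T$.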

5. Dot product

  • Dot product
    $x^Ty=\sum_{d=1}^Dx_dy_d, \quad x,y \in R^D$
  • Length
    $\|x\|=\sqrt{x^Tx}$
  • Angle $\omega$ between vectors $x$, $y$
    $\cos\omega=\frac{x^Ty}{\|x\|\|y\|}$

6. Inner product

Consider a vector space $V$. A positive definite, symmetric bilinear mapping $\langle\cdot,\cdot\rangle: V \times V \to R$ is called an inner product on $V$.

  • Symmetric: $\forall x, y \in V \quad \langle x,y \rangle = \langle y,x\rangle$
  • Positive definite: $\forall x \in V\backslash\{0\} \quad \langle x, x \rangle > 0, \quad \langle 0,0 \rangle=0$
  • Bilinear: $\forall x,y,z \in V, \lambda \in R$
    $\langle \lambda x+y,z\rangle = \lambda\langle x,z\rangle+\langle y,z\rangle$
    $\langle x,\lambda y+z\rangle = \lambda\langle x,y\rangle+\langle x,z\rangle$
  • Length of a vector $x \in V$
    $\|x\|=\sqrt{\langle x,x\rangle}$
  • Distance between two vectors $x,y \in V$
    $d(x,y)=\|x-y\|=\sqrt{\langle x-y,x-y\rangle}$
  • Angle $\omega$ between two vectors $x,y\in V$
    $\cos\omega=\frac{\langle x,y\rangle}{\|x\|\|y\|}$
    where $\|x\|$ is defined via the inner product as $\sqrt{\langle x,x\rangle}$

Extending the dot product to other kinds of data

  • Inner product for continuous data

$$\langle f,g \rangle =\int\limits_a^b f(x)g(x)\,\mathrm{d}x$$

  • Inner product for random variables

$$\langle X,Y \rangle=\mathrm{Cov}(X,Y)$$

7. Projection

7.1 Projection onto 1D subspaces

Consider a vector space $V$ and a subspace $U$ of $V$. With a basis vector $b$ of $U$, we obtain the orthogonal projection of any $x \in V$ onto $U$ via
$$\pi_u(x) = \lambda b, \quad \lambda=\frac{b^Tx}{b^Tb}=\frac{b^Tx}{\|b\|^2}$$
where $\lambda$ is the coordinate of $\pi_u(x)$ with respect to $b$.
The projection matrix $P$ is
$$P=\frac{bb^T}{b^Tb}=\frac{bb^T}{\|b\|^2}$$
such that
$$\pi_u(x)=Px$$
for all $x\in V$
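The 1D projection matrix is easy to build and test by hand. The following plain-Python sketch is illustrative; the names `projection_matrix` and `matvec` and the vectors `b` and `x` are assumptions. For this choice $b^Tx = b^Tb = 9$, so $\lambda = 1$ and the projection should equal $b$.

```python
def projection_matrix(b):
    """P = b b^T / (b^T b) for a single basis vector b."""
    bb = sum(bi * bi for bi in b)
    return [[bi * bj / bb for bj in b] for bi in b]

def matvec(P, x):
    return [sum(p * xi for p, xi in zip(row, x)) for row in P]

b = [1.0, 2.0, 2.0]
x = [3.0, 0.0, 3.0]

P = projection_matrix(b)
proj = matvec(P, x)
residual = [xi - pi for xi, pi in zip(x, proj)]

print(proj)                                         # close to b
print(sum(ri * bi for ri, bi in zip(residual, b)))  # close to 0
```

The residual $x - Px$ is orthogonal to $b$, which is what makes the projection orthogonal. Applying $P$ a second time leaves the projected point where it is.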

7.2 Projection onto $k$-dimensional subspaces

Consider a vector space $V$ and a subspace $U$ of $V$. With basis vectors $b_1,\cdots,b_k$ of $U$, we obtain the orthogonal projection of any $x \in V$ onto $U$ via
$$\pi_u(x) = B\lambda,\quad \lambda=(B^TB)^{-1}B^Tx$$
$$B=(b_1|\cdots|b_k)\in R^{n\times k}$$
where $\lambda$ holds the coordinates of $\pi_u(x)$ with respect to the basis $b_1,\cdots,b_k$ of $U$.
The projection matrix $P$ is
$$P=B(B^TB)^{-1}B^T$$
such that
$$\pi_u(x)=Px$$
for all $x\in V$
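For $k = 2$ the matrix $B^TB$ is only 2×2, so its inverse has a closed form and the projection can be written without a linear algebra library. This is a sketch restricted to that special case; the function name `project_onto_plane` and the sample vectors are assumptions.

```python
def project_onto_plane(b1, b2, x):
    """Orthogonal projection of x onto span{b1, b2}: B (B^T B)^{-1} B^T x."""
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    # Entries of G = B^T B for B = (b1 | b2), and the vector B^T x.
    g11, g12, g22 = dot(b1, b1), dot(b1, b2), dot(b2, b2)
    rhs1, rhs2 = dot(b1, x), dot(b2, x)

    det = g11 * g22 - g12 * g12
    if det == 0:
        raise ValueError("b1 and b2 are linearly dependent")

    # lambda = G^{-1} B^T x using the closed-form 2x2 inverse.
    lam1 = (g22 * rhs1 - g12 * rhs2) / det
    lam2 = (g11 * rhs2 - g12 * rhs1) / det

    projection = [lam1 * u + lam2 * v for u, v in zip(b1, b2)]
    return [lam1, lam2], projection

b1 = [1.0, 0.0, 0.0]
b2 = [1.0, 1.0, 0.0]   # not orthogonal to b1, so (B^T B)^{-1} matters
x = [2.0, 3.0, 4.0]

coords, proj = project_onto_plane(b1, b2, x)
print(coords)  # [-1.0, 3.0]
print(proj)    # [2.0, 3.0, 0.0]
```

The two basis vectors span the $xy$-plane, so the projection simply drops the third component of $x$, even though the basis itself is not orthogonal.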

8. PCA derivation

8.1 Setting up ($X_n=\sum_{i=1}^D\beta_{in}b_i$, $\tilde{X}_n=\sum_{i=1}^M\beta_{in}b_i$, $J=\frac{1}{N}\sum_{n=1}^N\|X_n-\tilde{X}_n\|^2$, $S=\frac{1}{N}\sum_{n=1}^N X_nX_n^T$)

Given a data set $X=\{x_1,\cdots,x_N\}$, $x_n\in R^D$, with $E[X_n]=0$ and original basis $A=(a_1,\cdots,a_D)$, project onto a new orthonormal basis $B=(b_1,\cdots,b_D)$, $b_i \in R^D$. Here $X_n$ denotes the $n$-th data point $x_n$.

The data covariance matrix is
$$\begin{aligned}S &= \frac{1}{N}\sum_{n=1}^N (x_n - \mu)(x_n - \mu)^T \\ &= \frac{1}{N}\sum_{n=1}^N X_{n} X_{n}^T \quad (E[X_n]=0) \end{aligned}$$
$X_n$ represented in the new basis is
$$X_n=\sum_{i=1}^D\beta_{in}b_i$$
Our goal is to represent $X_n$, which lives in a $D$-dimensional space, in a lower $M$-dimensional space
$$\tilde{X}_n = \sum_{i=1}^M\beta_{in}b_i$$
with minimum difference between $X_n$ and $\tilde{X}_n$. The cost function is
$$J =\frac{1}{N}\sum_{n=1}^N\|X_n-\tilde{X}_n\|^2$$

8.2 Getting the coordinate/code $\beta_{in}$ ($\beta_{in}=X_n^Tb_i$)

For $i \leq M$:
$$\begin{aligned} \frac{\partial J}{\partial\beta_{in}} &= \frac{\partial J}{\partial\tilde{X}_n}\frac{\partial\tilde{X}_n}{\partial\beta_{in}} \\ &= -\frac{2}{N}(X_n-\tilde{X}_n)^Tb_i \\ &= -\frac{2}{N}\Big(X_n-\sum_{j=1}^M\beta_{jn}b_j\Big)^Tb_i \quad \Big(\tilde{X}_n = \sum_{j=1}^M\beta_{jn}b_j\Big) \\ &= -\frac{2}{N}\Big(X_n^Tb_i-\sum_{j=1}^M\beta_{jn}b_j^Tb_i\Big) \quad (\beta_{jn}\text{ is a scalar})\\ &= -\frac{2}{N}(X_n^Tb_i-\beta_{in}b_i^Tb_i) \quad (\text{ONB: } b_j^Tb_i = 0 \text{ for } j \neq i) \\ &= -\frac{2}{N}(X_n^Tb_i-\beta_{in}) \quad (\text{ONB: } b_i^Tb_i = 1) \end{aligned}$$
Setting this derivative to zero gives $\beta_{in}=X_n^Tb_i$.
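The result $\beta_{in}=X_n^Tb_i$ can be sanity-checked numerically in the simplest setting: one data point and one unit-length basis vector. The sketch below compares the squared reconstruction error at the optimal coordinate with the error at slightly perturbed coordinates. The names `reconstruction_error` and `beta_opt` and the sample vectors are assumptions for illustration.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def reconstruction_error(x, b, beta):
    """Squared distance between x and its one-dimensional code beta * b."""
    return sum((xi - beta * bi) ** 2 for xi, bi in zip(x, b))

b = [0.6, 0.8]   # unit-length basis vector (0.36 + 0.64 = 1)
x = [2.0, 1.0]

beta_opt = dot(x, b)   # the coordinate predicted by the derivation

# The error is smallest at beta_opt and grows on either side of it.
for beta in (beta_opt - 0.1, beta_opt, beta_opt + 0.1):
    print(beta, reconstruction_error(x, b, beta))
```

Moving the coordinate away from $X_n^Tb_i$ by $\delta$ increases the squared error by $\delta^2$ when $b_i$ has unit length, which matches the quadratic shape of $J$ in $\beta_{in}$.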

Please credit the source when reposting. Link to this article: https://www.uj5u.com/qita/208946.html
