Machine Learning
A mind map of machine learning fundamentals, summarizing key points in linear algebra, analytic geometry, matrix decomposition, probability and distributions, and vector calculus.
Linear Algebra
Matrix Addition and Multiplication
Inverse and Transpose
Gauss-Jordan: row-reduce the augmented matrix (A|E) to (E|A^-1)
(AB)^-1 = B^-1 A^-1
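A minimal numpy sketch of the (A|E) → (E|A^-1) reduction above; the 2×2 matrix is an arbitrary invertible example.

```python
import numpy as np

def inverse_gauss_jordan(A):
    """Invert A by row-reducing the augmented matrix (A|E) to (E|A^-1)."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])        # build (A|E)
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))  # partial pivoting
        M[[col, pivot]] = M[[pivot, col]]              # move pivot row up
        M[col] /= M[col, col]                          # scale pivot to 1
        for row in range(n):
            if row != col:
                M[row] -= M[row, col] * M[col]         # clear the column
    return M[:, n:]                                    # right half is A^-1

A = np.array([[2.0, 1.0], [1.0, 3.0]])
assert np.allclose(inverse_gauss_jordan(A), np.linalg.inv(A))
```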
particular and general solution
Elementary Transformations
row echelon form
reduced row echelon form
vector space
groups
Closure: for all x, y in G, x ⊕ y is in G
Associativity: for all x, y, z in G, (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z)
Identity: there exists e such that for all x in G, e ⊕ x = x and x ⊕ e = x
Inverse: for all x in G, there exists y in G such that x ⊕ y = e and y ⊕ x = e
Abelian Groups
Commutativity: for all x, y in G, x ⊕ y = y ⊕ x
vector space V = (V, +, ·)
(V, +) is an Abelian group
a(x+y) = ax+ay
(a+b)x = ax+bx
a(bx) = (ab)x
1x = x
vector subspace U
For all a in R and all x in U, ax ∈ U
For all x, y in U, x + y ∈ U
linear independence
Gaussian elimination
row echelon form
linearly dependent
linearly independent
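A quick numpy check of linear (in)dependence via rank, a sketch with arbitrary example matrices: the columns are independent iff the rank equals the column count.

```python
import numpy as np

def columns_independent(A):
    """Columns of A are linearly independent iff rank(A) equals the column count."""
    return np.linalg.matrix_rank(A) == A.shape[1]

A = np.array([[1.0, 2.0], [0.0, 1.0], [1.0, 3.0]])     # independent columns
B = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])     # second column = 2 * first
print(columns_independent(A), columns_independent(B))  # True False
```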
Basis (maximal linearly independent set) and Rank
Basis
row echelon form
dim(A)+dim(B) = dim(A+B)+dim(A∩B)
Rank
Ax = b is solvable ⇔ rk(A|b) = rk(A); if additionally rk(A) = n, the solution is unique
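A small numpy sketch of this solvability test (the rank criterion above); the example system is arbitrary.

```python
import numpy as np

def solvability(A, b):
    """Classify Ax = b by comparing rk(A) with rk(A|b)."""
    rank_A = np.linalg.matrix_rank(A)
    rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
    if rank_A != rank_Ab:
        return "no solution"
    return "unique solution" if rank_A == A.shape[1] else "infinitely many solutions"

A = np.array([[1.0, 2.0], [3.0, 4.0]])
print(solvability(A, np.array([1.0, 1.0])))  # unique solution
```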
Basis change
Image and kernel
Linear mapping Φ: R^n → R^m with transformation matrix A
Im(Φ) ⊆ R^m
Im(Φ) is the span of the columns of A
ker(Φ) ⊆ R^n
ker(Φ) = {x : Ax = 0}
Rank-nullity theorem: for vector spaces V, W and a linear mapping Φ: V → W, dim(ker(Φ)) + dim(Im(Φ)) = dim(V)
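A tiny numpy illustration of the rank-nullity theorem on an arbitrary rank-1 example: dim(Im) is the rank of A, and dim(ker) = n − rank.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])          # rank 1, mapping R^3 -> R^2
rank = np.linalg.matrix_rank(A)          # dim(Im) = 1
dim_ker = A.shape[1] - rank              # dim(ker) = n - rank = 2
print(rank + dim_ker == A.shape[1])      # True: dim(ker) + dim(Im) = dim(V)
```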
Affine Space
Analytic Geometry
norm
Manhattan norm (L1 norm) ||x||_1
||x||_1 = Σ_i |x_i|
for matrices, the induced 1-norm is the maximum absolute column sum
Euclidean norm (L2 norm) ||x||_2
||x||_2 = √(xᵀx)
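A short numpy sketch of these norms; the vector and matrix are arbitrary examples.

```python
import numpy as np

x = np.array([3.0, -4.0])
print(np.linalg.norm(x, ord=1))   # Manhattan norm: |3| + |-4| = 7
print(np.linalg.norm(x, ord=2))   # Euclidean norm: sqrt(3^2 + 4^2) = 5
A = np.array([[1.0, -2.0], [3.0, 4.0]])
print(np.linalg.norm(A, ord=1))   # induced 1-norm: max column abs sum = 6
```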
Inner Products
For all x, y, z in V and all a, b in R:
Bilinearity: Ω(ax + by, z) = aΩ(x, z) + bΩ(y, z)
Bilinearity: Ω(x, ay + bz) = aΩ(x, y) + bΩ(x, z)
Symmetry: for all x, y in V, Ω(x, y) = Ω(y, x)
Positive definiteness: for all x in V \ {0}, Ω(x, x) > 0, and Ω(0, 0) = 0
Lengths and Distances
||x|| = √⟨x, x⟩
d(x, y) = √⟨x − y, x − y⟩
Angles and Orthogonality
cos ω = ⟨x, y⟩ / (||x|| ||y||)
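A minimal numpy sketch of the angle formula above; the helper name `angle` and the example vectors are illustrative.

```python
import numpy as np

def angle(x, y):
    """Angle between x and y from cos ω = <x, y> / (||x|| ||y||)."""
    cos_w = x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
    return np.arccos(np.clip(cos_w, -1.0, 1.0))  # clip guards rounding error

x, y = np.array([1.0, 0.0]), np.array([1.0, 1.0])
print(np.degrees(angle(x, y)))                               # 45.0
print(np.array([1.0, 2.0]) @ np.array([2.0, -1.0]) == 0.0)   # orthogonal pair
```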
Orthonormal Basis
each basis vector has norm 1
basis vectors are mutually orthogonal (pairwise inner products are 0)
Orthogonal Projection
One-Dimensional Subspace
π_U(x): the projection of x onto U = span(b)
⟨x − π_U(x), b⟩ = ⟨x − λb, b⟩ = ⟨x, b⟩ − λ⟨b, b⟩ = 0 ⟹ λ = ⟨x, b⟩ / ⟨b, b⟩
π_U(x) = λb = (bbᵀ / ||b||²) x
projection matrix P_π = bbᵀ / ||b||²
General Subspaces
Bᵀ(x − Bλ) = 0 ⟹ BᵀBλ = Bᵀx ⟹ λ = (BᵀB)⁻¹Bᵀx
π_U(x) = B(BᵀB)⁻¹Bᵀx
projection matrix P_π = B(BᵀB)⁻¹Bᵀ
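A numpy sketch of both projection formulas above; `projection_matrix` is an illustrative helper, and the example b and B are arbitrary.

```python
import numpy as np

def projection_matrix(B):
    """P = B (B^T B)^-1 B^T projects onto the column space of B."""
    return B @ np.linalg.inv(B.T @ B) @ B.T

b = np.array([[1.0], [1.0]])                  # one-dimensional subspace span(b)
P1 = projection_matrix(b)
assert np.allclose(P1, b @ b.T / (b.T @ b))   # matches b b^T / ||b||^2

B = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
P = projection_matrix(B)
x = np.array([6.0, 0.0, 0.0])
assert np.allclose(P @ (P @ x), P @ x)        # projecting twice changes nothing
```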
Gram-Schmidt Orthogonalization
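A minimal numpy sketch of Gram-Schmidt orthonormalization; it assumes linearly independent columns and does no rank checking.

```python
import numpy as np

def gram_schmidt(V):
    """Orthonormalize the columns of V by subtracting projections onto earlier vectors."""
    U = []
    for v in V.T:
        for u in U:
            v = v - (u @ v) * u          # remove the component along u
        U.append(v / np.linalg.norm(v))  # normalize what remains
    return np.column_stack(U)

Q = gram_schmidt(np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]))
assert np.allclose(Q.T @ Q, np.eye(2))   # columns are orthonormal
```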
Rotation
Matrix Decomposition
Determinant and Trace
det(A): the determinant of A
|det(r, g, b)| is the volume of the parallelepiped spanned by the vectors r, g, b
tr(A): the sum of the diagonal entries of A
Eigenvalue and eigenvector
Ax = λx
x: eigenvector
λ: eigenvalue
SVD A = UΣVᵀ: the eigenvectors of AᵀA give V, the eigenvectors of AAᵀ give U
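A small numpy check of the eigenvalue equation Ax = λx on an arbitrary 2×2 example.

```python
import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])
eigvals, eigvecs = np.linalg.eig(A)
for lam, x in zip(eigvals, eigvecs.T):   # columns of eigvecs are eigenvectors
    assert np.allclose(A @ x, lam * x)   # each pair satisfies Ax = λx
print(eigvals)                           # [5. 2.]
```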
to compute a determinant: reduce to an upper-triangular matrix, then multiply the diagonal entries
to solve a homogeneous linear system: reduce to reduced row echelon form
Vector Calculus
Differentiation of Univariate Functions
Taylor Series
Partial Differentiation and Gradients
Gradient: collect the partial derivatives with respect to each variable
Chain rule
Gradients of vector-valued functions
Jacobian matrix
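A finite-difference sketch of the Jacobian (a numerical approximation, not an exact derivative); the function f and the step size eps are illustrative.

```python
import numpy as np

def jacobian(f, x, eps=1e-6):
    """Finite-difference Jacobian: J[i, j] ≈ ∂f_i/∂x_j."""
    fx = f(x)
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        dx = np.zeros_like(x)
        dx[j] = eps
        J[:, j] = (f(x + dx) - fx) / eps
    return J

f = lambda x: np.array([x[0] * x[1], x[0] ** 2])   # f: R^2 -> R^2
print(jacobian(f, np.array([2.0, 3.0])))           # ≈ [[3, 2], [4, 0]]
```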
Gradients of matrices
Probability and Distributions
Discrete probabilities
1. If A ∩ B = ∅, then P(A ∪ B) = P(A) + P(B)
2. For all A, B: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
3. Law of total probability: P(A) = Σ_i P(B_i) P(A | B_i) over a partition B_1, …, B_n
4. If A and B are independent, P(A ∩ B) = P(A) P(B)
Continuous Probabilities
Probability Density Function
for all x, f(x) ≥ 0
∫ f(x) dx = 1
Cumulative Distribution Function
F_X(x) = P(X ≤ x), e.g. F_X(0.5) = P(X ≤ 0.5)
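A numerical sketch of the PDF/CDF relationship F_X(x) = ∫ f(t) dt up to x, using scipy's standard normal as the example distribution.

```python
import numpy as np
from scipy.stats import norm

# The CDF is the integral of the PDF up to x:
xs = np.linspace(-10, 0.5, 100001)
integral = np.trapz(norm.pdf(xs), xs)                   # integrate the PDF
print(np.isclose(integral, norm.cdf(0.5), atol=1e-6))   # True
```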
Bayes' theorem
P(x | y) = P(y | x) P(x) / P(y)
P(x, y) = P(x | y) P(y)
P(x, y) = P(y | x) P(x)
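A worked Bayes'-theorem example; the diagnostic-test numbers below are made up for illustration.

```python
# Posterior P(x | y) from prior, likelihood, and evidence:
p_disease = 0.01                  # prior P(x)
p_pos_given_disease = 0.95        # likelihood P(y | x)
p_pos_given_healthy = 0.05
# Evidence P(y) via the law of total probability:
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # 0.161
```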
Summary Statistics and Independence
Means and Covariances
Expected Value: E_X[g(x)] = ∫ g(x) p(x) dx (continuous); E_X[g(x)] = Σ_x g(x) p(x) (discrete)
Mean: the expected value with g(x) = x
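Small numeric sketches of both expectation formulas: a discrete sum for a fair die and a numerical integral for a uniform density (the examples are illustrative).

```python
import numpy as np

# Discrete: E[g(X)] = Σ g(x) p(x) for a fair die with g(x) = x^2
xs = np.arange(1, 7)
print(np.sum(xs**2) / 6)           # 15.166...
# Continuous: E[X] = ∫ x p(x) dx for uniform(0, 1), where p(x) = 1 on [0, 1]
grid = np.linspace(0.0, 1.0, 100001)
print(np.trapz(grid * 1.0, grid))  # ≈ 0.5
```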
Continuous Optimization
Gradient Descent
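A minimal gradient-descent sketch on a one-dimensional quadratic; the learning rate and step count are arbitrary illustrative choices.

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Iterate x <- x - lr * grad(x)."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# Minimize f(x) = (x - 3)^2 + 1, whose gradient is 2(x - 3):
x_min = gradient_descent(lambda x: 2 * (x - 3.0), x0=0.0)
print(round(x_min, 6))  # ≈ 3.0
```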
Constrained Optimization and Lagrange Multipliers
Convex Optimization
Quadratic Programming