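CFA Level II mind map on multiple linear regression, covering the assumptions, violations of the assumptions, model specification, model misspecification, and the analysis of qualitative variables.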
Multiple Linear Regression
Introduction
t-test
ANOVA
two types of uncertainty:
SEE(standard error of estimate):uncertainty in the regression model itself
b0, b1: uncertainty about the estimates of the regression coefficients.
As the number of independent variables Xi increases, R^2 rises and becomes a less reliable measure of fit, so check it against adjusted R^2.
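The comparison between R^2 and adjusted R^2 can be sketched as follows. This is a minimal illustration using the standard adjustment formula, adjusted R^2 = 1 - [(n - 1) / (n - k - 1)] * (1 - R^2), with n observations and k independent variables. The function name and the sample numbers are hypothetical and are not taken from the mind map.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for n observations and k independent variables."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than k + 1")
    return 1.0 - (n - 1) / (n - k - 1) * (1.0 - r2)


# Hypothetical example: adding four variables lifts R^2 from 0.60 to 0.61,
# yet adjusted R^2 falls because the extra variables add little.
small_model = adjusted_r2(0.60, n=30, k=2)
large_model = adjusted_r2(0.61, n=30, k=6)
print(small_model, large_model)
```

A falling adjusted R^2 alongside a rising R^2 suggests the added variables are not earning their place in the model.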
Dummy Variables
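Dummy variables in a regression model can help analysts determine whether a particular qualitative independent variable explains the model's dependent variable.
Values: 0 or 1
Distinguishing among n categories requires n-1 dummy variables
The intercept is the mean of Y for the omitted category; each slope is the incremental effect of its category on Y relative to the omitted category
Similar to linear regression with one variable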
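The interpretation of the intercept and slopes can be checked with a small sketch. It relies on one fact: when the only regressors are the n-1 dummies, the OLS intercept equals the mean of Y in the omitted category and each slope equals that category's mean minus the omitted category's mean. The data, the category labels, and the function name below are hypothetical.

```python
from statistics import mean

# Hypothetical observations: (category, Y). Four categories, so 3 dummies.
data = [("Q1", 2.0), ("Q2", 1.0), ("Q3", 3.0), ("Q4", 5.0),
        ("Q1", 4.0), ("Q2", 2.0), ("Q3", 3.0), ("Q4", 7.0)]


def dummy_coefficients(rows, omitted):
    """Intercept and slopes of a dummy-only regression, via group means."""
    groups = {}
    for category, y in rows:
        groups.setdefault(category, []).append(y)
    intercept = mean(groups[omitted])  # mean of Y for the omitted category
    slopes = {category: mean(ys) - intercept  # incremental effect
              for category, ys in groups.items() if category != omitted}
    return intercept, slopes


intercept, slopes = dummy_coefficients(data, omitted="Q4")
print(intercept)  # mean of Y in Q4
print(slopes)     # one slope per remaining category
```

Choosing a different omitted category changes the intercept and slopes but not the fitted values.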
Assumptions and Violations
Assumptions
A linear relation exists between the Xj and Y.
Xj are not random; no exact linear relation exists between Xj and Xk
E(ε)=0
Var(ε) is the same for all observations (homoskedasticity)
ε is uncorrelated across observations.
ε is normally distributed
Violations
heteroskedasticity
unconditional heteroskedasticity
conditional heteroskedasticity
Breusch–Pagan test
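A minimal sketch of the Breusch–Pagan test, assuming a single independent variable: regress the squared residuals on X and compare n * R^2 from that auxiliary regression with a chi-square critical value (k degrees of freedom, here k = 1, so 3.841 at the 5% level). All function names and the data are hypothetical; the data are constructed so that the error size grows with X.

```python
from statistics import mean


def simple_ols(x, y):
    """Return (intercept, slope) of a one-variable OLS fit."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(x, y):
    """R^2 of the one-variable OLS fit of y on x."""
    b0, b1 = simple_ols(x, y)
    my = mean(y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - my) ** 2 for yi in y)
    return 1.0 - sse / sst


def breusch_pagan(x, y):
    """n * R^2 from regressing squared residuals on x."""
    b0, b1 = simple_ols(x, y)
    resid_sq = [(yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)]
    return len(x) * r_squared(x, resid_sq)


# Hypothetical data: the error term alternates in sign and scales with x.
x = [float(i) for i in range(1, 11)]
y = [2.0 * xi + (-1) ** int(xi) * 0.3 * xi for xi in x]

bp = breusch_pagan(x, y)
CHI2_5PCT_1DF = 3.841  # 5% critical value, chi-square with 1 df
print(bp, bp > CHI2_5PCT_1DF)
```

A statistic above the critical value rejects the null hypothesis of no conditional heteroskedasticity.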
serial correlation
Positive serial correlation
Standard errors are underestimated
t-statistics: inflated
F-statistic: inflated
Durbin–Watson statistic (DW)
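DW ≈ 2*(1-r), where r is the sample correlation between consecutive residuals
DW ranges from 0 to 4
Reference value: DW = 2 (no serial correlation)
A DW far from 2 indicates a serial correlation problem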
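The DW statistic itself is the sum of squared changes in consecutive residuals divided by the sum of squared residuals. A small sketch, with hypothetical residual series chosen to show both ends of the range:

```python
def durbin_watson(residuals):
    """Sum of squared first differences over the sum of squared residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals that keep their sign for several periods
# (positive serial correlation): DW falls below 2.
persistent = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]

# Hypothetical residuals that flip sign every period
# (negative serial correlation): DW rises above 2.
alternating = [1.0, -1.0, 1.0, -1.0]

print(durbin_watson(persistent))
print(durbin_watson(alternating))
```

In short samples like these the approximation DW ≈ 2*(1-r) is loose; it tightens as the number of observations grows.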
multicollinearity
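Two or more independent variables X are highly correlated with each other
The correlation is high but not perfect
t-statistics: not significant (small t-values)
F-statistic: significant (large F-value)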
The variance of individual slope coefficients increases, while the regression as a whole still fits well
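How correlation between regressors inflates the variance of a slope coefficient can be illustrated with the variance inflation factor. This diagnostic is not named in the mind map and is brought in here only as an illustration. With exactly two independent variables it reduces to 1 / (1 - r^2), where r is the correlation between them. The data and function names are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(a, b):
    """Sample correlation between two equal-length series."""
    ma, mb = mean(a), mean(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)


def variance_inflation(r):
    """Two-regressor case: factor by which a slope's variance is inflated."""
    return 1.0 / (1.0 - r ** 2)


# Hypothetical regressors that move almost in lockstep.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]

r = pearson_r(x1, x2)
print(r, variance_inflation(r))
```

With more than two independent variables, pairwise correlations can miss multicollinearity, so the low-t, high-F pattern noted above remains the main symptom to look for.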
Model Specification and Misspecification
Model Specification
cogent economic reasoning
The model should be grounded in cogent economic reasoning
functional form (e.g., LN, log transformation)
The functional form chosen for the variables in the regression should be appropriate given the nature of the variables (e.g., LN, log transformation).
parsimonious
The model should be parsimonious.
Few X, large Y: each included variable should explain a lot.
assumptions violations
The model should be examined for violations of regression assumptions before being accepted.
useful out of sample
The model should be tested and be found useful out of sample before being accepted.
Misspecification
functional form
variables could be omitted
variables may need to be transformed
pools data from different samples
X correlated with the error term
causes the estimated regression coefficients to be biased and inconsistent
time-series misspecification
lagged dependent variables as independent
including a function of dependent variable as an independent variable
independent variables that are measured with error
qualitative dependent variable
Probit models
based on the normal distribution
logit models
based on the logistic distribution
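The difference between the two models comes down to which cumulative distribution function maps the linear combination b0 + b1*X into a probability. A minimal sketch, with hypothetical coefficient values and function names:

```python
import math


def logit_prob(z):
    """Logistic CDF: probability implied by a logit model at index z."""
    return 1.0 / (1.0 + math.exp(-z))


def probit_prob(z):
    """Standard normal CDF: probability implied by a probit model at index z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical coefficients and a single observation (not from the source).
b0, b1, x = -1.0, 0.8, 2.0
z = b0 + b1 * x

print(logit_prob(z), probit_prob(z))
```

Both functions return values strictly between 0 and 1, which is why they suit a qualitative (0/1) dependent variable better than an ordinary linear regression.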