Intermediate Econometrics Mind Map (Detailed Version)
A detailed mind map of econometrics in matrix notation, covering: Lecture 1: Algebra and Geometry of OLS; Lecture 2: Properties of Residuals, Regression Fit, and Partitioned Regression; Lecture 3: Unbiasedness and Efficiency of OLS Estimator; Variance Estimation.
Edited 2023-01-22 10:49:08, Yunnan
E300 Econometric Methods
Lecture 1: Algebra and Geometry of OLS
Multiple regression in matrix notation
Scalar, vector, and matrix notation

OLS problem in different notation
OLS estimator in matrix notations
Geometric interpretation of OLS
Projection matrices. Projection on the space of X. Residual making matrix

Orthogonal projections
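A minimal numerical sketch of the objects in this lecture, assuming simulated data (the sample size, coefficients and seed are hypothetical choices for illustration only). It builds the OLS estimator, the projection matrix P and the residual-maker M, so that the geometric properties can be checked directly.

```python
import numpy as np

rng = np.random.default_rng(0)            # hypothetical simulated data
n, k = 50, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # n x (k+1), with constant
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # (X'X)^(-1) X'y
P = X @ np.linalg.solve(X.T @ X, X.T)          # projection onto the column space of X
M = np.eye(n) - P                              # residual-making (annihilator) matrix

y_fit = P @ y      # fitted values: projection of y on col(X)
resid = M @ y      # residuals: orthogonal to every column of X
```

Here y splits into two orthogonal pieces, y_fit and resid, which is the geometric interpretation of OLS.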

Lecture 2: Properties of Residuals, Regression Fit, and Partitioned Regression
Properties of projection matrix

Properties of annihilator matrix

Properties of residuals

Mean of residuals

Regression fit

Frisch-Waugh-Lovell theorem (partitioned regression)
Theorem

Interpretation

proof

Another way to obtain ^β
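A short numerical check of the Frisch-Waugh-Lovell theorem, assuming simulated data (all numbers and names below are hypothetical). The coefficient on X2 from the full regression should equal the coefficient from regressing M1 y on M1 X2, where M1 is the annihilator of X1.

```python
import numpy as np

rng = np.random.default_rng(1)            # hypothetical simulated data
n = 200
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
X2 = (0.5 * X1[:, 1] + rng.normal(size=n)).reshape(-1, 1)
y = X1 @ np.array([1.0, -1.0]) + 2.0 * X2[:, 0] + rng.normal(size=n)

def ols(X, y):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

beta_full = ols(np.column_stack([X1, X2]), y)          # full regression on [X1, X2]
M1 = np.eye(n) - X1 @ np.linalg.solve(X1.T @ X1, X1.T)  # annihilator of X1
beta2_fwl = ols(M1 @ X2, M1 @ y)                        # partialled-out regression
```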

Lecture 3: Unbiasedness and Efficiency of OLS Estimator; Variance Estimation
Finite Sample: Gauss-Markov assumptions
Variance Estimation
G-M assumptions
(A1) No perfect multicollinearity

(A2) Strict exogeneity

unbiasedness of OLS
conditionally unbiased

unconditional unbiasedness (key formula: L.I.E.)

(A3) Homoskedasticity

(A4) No serial correlation

variance of OLS
cond. cov. matrix of ɛ

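Key formula (pulling a constant matrix out of a variance): Var(Aɛ|X)=A Var(ɛ|X) A'

Derivation: cond. variance of OLS

Variance of a single ^β_j
Derivation: using the FWL method

Alternative expression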

Factors in the variance

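σ²: unobserved determinants of Y (possibly OVB) inflate the variance
TSS_j: a larger total variation of x_ij (e.g. a large enough sample) reduces the variance
R²_j: multicollinearity inflates the variance
(A2-A4) can be summarized as spherical errors (note: normality is not assumed)

G-M result: OLS is BLUE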

Linear estimators

The best estimator

Consequences of violating G-M
(A1) violated (perfect multicollinearity)

problem: X'X is not invertible and ^β is not unique

Solution: drop some of the highly correlated variables

(A2) violated (endogeneity)

problem: biased

Solution: IV to extract the exog part
(A3) violated (heteroskedasticity)

problem: inefficient but unbiased

Solution: HAC or GLS
(A4) violated (serial correlation)

problem: inefficient but unbiased

Solution: HAC or GLS
Estimation of σ²
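Formula: conditionally and unconditionally unbiased ^σ²

Derivation: start from RSS and find the form of E[RSS|X] (parts of this proof may be examined)
step1: RSS=ɛ'Mxɛ

step2: since E[RSS|X] is a scalar, it can be written as a trace tr(.)

step3: use tr(AB)=tr(BA) to obtain σ²tr(Mx)

step4: use tr(AB)=tr(BA) again to expand tr(Mx)=n-(k+1)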
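The trace argument in steps 1 to 4 can be checked numerically. This is a sketch under assumed simulated data (dimensions, error scale and seed are hypothetical): tr(Mx) should equal n-(k+1), and RSS should equal ɛ'Mxɛ.

```python
import numpy as np

rng = np.random.default_rng(2)            # hypothetical simulated data
n, k = 40, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # k regressors + constant
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)            # Mx

trace_M = np.trace(M)                     # should be n - (k + 1)

eps = rng.normal(scale=2.0, size=n)       # true sigma^2 = 4 in this simulation
y = X @ np.ones(k + 1) + eps
resid = M @ y
rss = resid @ resid                       # RSS = eps' Mx eps because Mx X = 0
sigma2_hat = rss / (n - k - 1)            # unbiased estimator of sigma^2
```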

Lecture 4: Normal Regression
Finite sample: normality assumptions
normality assumptions
(A1) No perfect multicollinearity

(A2) Strict exogeneity

(A3-A4) Homoske + No serial corr

(A5) Conditional normality of errors

Results under normality
Distribution of OLS in the normal regression (conditional distribution of ^β|X)

^β is independent of ^σ² conditional on X

(n-k-1)^σ²/σ²~chisquare(n-k-1)

Lecture 5: Testing Joint Hypotheses, The F-test
Linear hypotheses in matrix form

F-test
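F-statistic formula: memorize (the sum of squares of the restrictions, scaled)

Definitions of the t- and F-statistics and their relationship

Derivation of the distribution of the F-stat under H0
step1: from the distribution of ^β|X, find the distribution of the restrictions R^β-r

step2: under H0, standardize to Z~N(0,Im)

step3: under H0, Z'Z~chisquare(m)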

step4: (n-k-1)^σ²/σ²~chisquare(n-k-1)

Another expression for the F-stat: constrained regression

Optimization problem subject to the constraint Rβ=r

Constrained/unconstrained RSS

t-test
Distribution of the t-stat under H0 (the single-restriction case of the F-stat)

t- vs F-test: the advantage of the t-test is that it can be one-sided

Joint test ≠ sequence of individual tests (the joint test rejects more easily)
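A sketch comparing the two expressions for the F-statistic in this lecture: the quadratic form in (R^β-r) and the constrained/unconstrained RSS form. The data, the restriction H0: β1=β2=0 and the seed are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)            # hypothetical simulated data
n, k = 120, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.0, 0.0, 0.7]) + rng.normal(size=n)

def rss(X, y):
    """Residual sum of squares from an OLS fit."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    return e @ e

# H0: R beta = r with m = 2 restrictions (beta_1 = beta_2 = 0)
R = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
r = np.zeros(2)
m = R.shape[0]

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
rss_u = rss(X, y)
sigma2_hat = rss_u / (n - k - 1)

# Quadratic-form version of the F-statistic
d = R @ beta_hat - r
F_quadratic = d @ np.linalg.solve(R @ XtX_inv @ R.T, d) / (m * sigma2_hat)

# Constrained vs unconstrained RSS version (restricted model drops x1 and x2)
rss_r = rss(X[:, [0, 3]], y)
F_rss = ((rss_r - rss_u) / m) / (rss_u / (n - k - 1))
```

Both forms give the same number, which is why the constrained-regression expression is often the more convenient one to compute.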

Lecture 6: OLS Asymptotics
Assumptions, results, and tests for large-sample OLS
Probability theory reminder
Convergence in prob (gives the definition of consistency)

Convergence in distr

Slutsky's theorem

A Law of Large Numbers (LLN)

A Central Limit Theorem (CLT)

Large sample OLS assumptions
(OLS0) (yi , xi ) is an i.i.d. sample
(OLS1) E[xix'i]<∞ and is non-singular

(OLS2) Strict Exogeneity

(OLS3) Var(ɛi|xi)=σ² (note: not in matrix form, because no serial correlation already follows from OLS0)

(OLS4) Large outliers are unlikely

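Results for large-sample OLS
Property 1: OLS consistency and its proof
step0: Convenient expressions for consistency

step1: by OLS0, OLS2, OLS4, use LLN+Slutsky to eliminate the last term of the expression, so ^β→β

Property 2: OLS asymptotic normality and its proof
step0: Convenient expressions for asy. norm.

step1: by OLS0, OLS2-4, the CLT gives the distribution of the second factor of the expression; its mean and variance follow from the L.I.E.

step2: by OLS1+Slutsky, obtain the distribution of the whole expression and rearrange

Tests for large-sample OLS: Asymptotic F- and t-test
Asymptotic F: obtained from the standard F as its denominator degenerates (converges to a constant)

Critical value: in large samples F→chisquare(m)/m, which F(m, n-k-1) approximates, so the usual F table can still be used

Asymptotic t→N(0,1); the usual t(n-k-1) critical values can still be used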

Lecture 7: Endogeneity and Interpretation of Regressions
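Relaxing OLS2/A2 (exogeneity)
Endogeneity [definition]: a violation of exogeneity (uncorrelatedness)
The exogeneity assumption implies uncorrelatedness
Moment-condition form of exogeneity

When the vector xi contains a constant, cov(x_ij, ɛ_i)=0
This again shows the role of the constant term: it ensures E[ɛ_i]=0
cov=0 implies corr=0
Definition of endogeneity: note that not all variables need be endogenous; some may be and others not

Endogeneity [consequence]: all ^β are inconsistent, not only those on the endogenous variables

[Testing] Endogeneity cannot be tested by purely statistical means; it depends on the interpretation of the regression
Causal interpretation: focuses on the RHS, i.e. the causal effect of x on y

Best linear predictor: focuses on the LHS, i.e. how to best fit y (causality is not the concern)

Statistician's loss, BLP, and its error: how to find the best fit? By minimizing the Error Sum of Squares (rather than the RSS)

BLP's error and coefficients: under the BLP interpretation there is no endogeneity (exogeneity comes for free!), so the exogeneity moment condition can be used

OLS as the method of moments (MM) estimator of BLP

Example: rain (y) and umbrellas (x)

[Solution] Under a causal interpretation, exogeneity needs to be justified

① use economic theory

Example: electricity company

② use experimental or quasi-experimental data

Example: student-teacher ratio

③ instrumental variables

see Lectures 22-24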
Lecture 8: Heteroskedasticity, Het-robust Standard Errors
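Relaxing OLS3/A3 (homoskedasticity)
Heteroskedasticity (Het.) [definition]

Heteroskedasticity [consequences]: het. is essentially an assumption about the variance, so it mainly affects variances
Estimation of ^β: unbiased + consistent but not efficient

The form of the cond. variance of ^β changes, and its usual estimator is biased

The form of the asymptotic variance of ^β changes, and its usual estimator is inconsistent

The usual t- and F-tests become invalid

Heteroskedasticity [testing]
H0: homoskedasticity

H1: het. => the conditional variance varies with x

Form 1: linearity (B-P test), using an F-test (note: not all variables need be used; a subset is fine)

Form 2: general functional form (White test), using an F-test

Heteroskedasticity [solutions]
① OLS + het. robust s.e., together with het. robust t- and F-tests
Properties
[Estimator] OLS estimator is still unbiased and consistent
[Standard errors] het. robust s.e. are consistent for the true ones (but biased in small samples)
[Tests] het. robust t- and F-tests are asymptotically valid
[Drawback] OLS is no longer best → estimates might be noisy
OLS asymptotics under het.
step1: OLS3 is violated, so the form of Var(xiɛi) changes and the CLT limit changes

step2: OLS1 + Slutsky, substituted into the convenient expression for asy. norm.

het. robust s.e.
Estimating each component of the Avar (i.e. finding the form of hat_Avar)

Use White's consistent hat_V to estimate V=E[ɛ²xx']

Use (1/n)X'X to estimate E[xx']

From the het. robust Avar to the s.e. (explains where the 1/n comes from)

The form of hat_Avar determines the standard error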

② Generalized Least Squares (GLS)
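The het. robust standard errors of solution ① can be sketched as follows. This is the basic White sandwich form without any small-sample correction, on simulated data; the variance function exp(x) and all numbers are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)            # hypothetical simulated data
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
eps = rng.normal(size=n) * np.exp(0.5 * x)     # Var(eps|x) = exp(x): heteroskedastic
y = X @ np.array([1.0, 2.0]) + eps

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_hat                            # OLS residuals

Q_inv = np.linalg.inv(X.T @ X / n)              # estimates E[xx']^(-1)
V_hat = (X * e[:, None] ** 2).T @ X / n         # White: (1/n) sum e_i^2 x_i x_i'
avar_hat = Q_inv @ V_hat @ Q_inv                # estimated Avar of sqrt(n)(b - beta)
se_robust = np.sqrt(np.diag(avar_hat) / n)      # the 1/n turns Avar into Var(b)

# Usual homoskedastic s.e. for comparison (not valid under het.)
se_usual = np.sqrt(np.diag(e @ e / (n - 2) * np.linalg.inv(X.T @ X)))
```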
Lecture 9: Generalized Least Squares
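A remedy for relaxing OLS3/A3 (homoskedasticity)
Generalized Least Squares
Motivation: if the exact form of var(ɛ|X)=Ω is known, het. can be dealt with (note: Ω is not required to be diagonal)

GLS procedure
step1: square-root decomposition Ω=PP=PP', with P symmetric and unique (note: Ω and P are both positive definite, so P^(-1) exists)

step2: GLS transformation (premultiply by P^(-1)); at this point het. has been eliminated

GLS estimator

GLS ^β is BLUE

When Ω is diagonal, GLS is WLS

Weighted LS

Feasible GLS
Motivation: Ω is unknown; use residuals to obtain ^Ω

In the feasible GLS estimator the form of ^Ω is still unknown; under het. one usually specifies var(ɛi|X)=f(x_i)

[Feasible GLS procedure] (multiplicative het. function)

step1: how to estimate the parameters: starting from the distribution of ɛ|X, obtain the relation ɛi²=ei f(xi) and take logs

step2: the predicted ^f(xi) estimates exp{E[log(ei)]} f(xi), where exp{E[log(ei)]}≈0.28 because ei|X~chisquare(1)

step3: obtain ^Ω=(1/0.28) diagonal{^f(x1),...,^f(xn)}

[FGLS estimator & cov matrix]
FGLS ^β: the factor (1/0.28) does not matter because it cancels

biased in small samples but consistent (this consistency is not affected by misspecification)
^var(^β|X): the factor (1/0.28) matters

[FGLS as WLS]
Weight each observation by 1/√(^f(xi)), i.e. divide by √(^f(xi)), to obtain a transformed equation

The FGLS ^var(^β|X) is the OLS ^var(^β|X) of the transformed equation

Note: the cov matrix of the new error is (1/0.28)In, not In

[Potential problem with FGLS] biased in small samples

Reason: ^Ω depends not only on X but also on Y (and ɛ), so it cannot be pulled out of the inner expectation in the L.I.E., hence E[^β-β]≠0 (biased)
Note that ^Ω (built from ^f(xi)) is estimated from the relation ɛi²=ei*f(xi), so ^Ω itself depends on ɛ
Under misspecification the bias is amplified
Although there is a bias, it is small and FGLS is still better than OLS, so it is commonly used in time series
To handle het., use FGLS or robust s.e.?
FGLS s.e. are smaller than robust s.e.

FGLS is more efficient when the specific form of het. is known (a specific remedy); het-robust s.e. are more generally applicable (universally applicable)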
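The constant 0.28 in the FGLS procedure can be reproduced with a short calculation. For e~chisquare(1), E[log e]=-γ-log 2 (γ is the Euler-Mascheroni constant), so exp{E[log e]}=exp(-γ)/2. The simulation below is only a sanity check; the sample size and seed are hypothetical.

```python
import numpy as np

# Closed form: exp(E[log e]) for e ~ chi-square(1)
c_exact = np.exp(-np.euler_gamma) / 2

# Monte Carlo check with simulated chi-square(1) draws
rng = np.random.default_rng(5)
e = rng.chisquare(df=1, size=1_000_000)
c_sim = np.exp(np.mean(np.log(e)))
```

Both values are close to 0.28, the factor that appears in ^Ω=(1/0.28) diagonal{^f(x1),...,^f(xn)}.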
Lecture 10(a): Serial Correlation
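Relaxing OLS0/A4 (no serial correlation)
Serial correlation [definition]: the data in OLS0 are no longer i.i.d., so cov(ɛt, ɛs)≠0

Serial correlation [consequences]: serial corr. is essentially an assumption about the variance, so it mainly affects variances
Estimation of ^β: unbiased + consistent but not efficient

The form of the cond. variance of ^β changes, and its usual estimator is biased

The form of the asymptotic variance of ^β changes, and its usual estimator is inconsistent

The usual t- and F-tests become invalid

Serial correlation [testing]
see Lecture 12
① Test for AR structure in ɛ: F-test
② Use the LM statistic: Breusch-Godfrey test
Serial correlation [solutions]
see Lecture 13
① Run OLS and t-/F-tests as usual, but with HAC s.e.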
② FGLS approach
Prais-Winsten estimation
Cochrane-Orcutt procedure
③ Find a better model
Lecture 10(b): Time Series Modelling, White Noise, and AR Models
Focus on the properties of time series under stationarity
Modeling time series
Properties of TS
serially correlated

Mean function

Autocovariance function

Joint normal distribution

further restrictions
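Since only one realization is observed in practice, restrictions are needed
Stationarity (about the distribution)
Weakly (covariance) stationary process (first two moments only)

Strictly stationary (the entire joint distribution, hence all moments)

Weak dependence (about independence)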

Autocorrelations
Autocorr function: measures the degree of dependence of a univariate stationary process

Necessary condition for weak dependence: ρ(τ)→0 as τ→∞

Correlogram: theoretical correlogram

Sample mean, sample correlogram

White noise
Strict white noise: i.i.d. W.N.

Weak white noise: constant mean and variance with cov=0 (i.e. weak in that the distributions may differ across t)

Martingale difference: implies Weak W.N.

Autoregressive process AR(p)

AR(p) as adaptive expectations

AR(p) processes via lag polynomial

Stability of AR(p) processes: stable when all roots z of the lag polynomial satisfy |z|>1
Stability of AR(p)

Stability of AR(1): 要求|φ|<1

AR(1) process
AR(1) processes via lag polynomial

AR(1) in the infinite past
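Wold representation of AR(1) without intercept

Stable AR(1) loses memory: more distant shocks have ever smaller effects

Properties of AR(1) without intercept, using the Wold representation

Expectation

Variance

AR(1) is weakly stationary: stability implies stationarity, but not vice versa

Autocovariance of AR(1) without intercept
Lemma: finite-lag representation of Yt (as opposed to the infinite Wold form), i.e. the starting point plus the weighted sum of all shocks since

With intercept (2020-2021 final)

Formula: autocov function for stable AR(1)

This verifies that a stable AR(1) process is weakly stationary

autocorr function: Correlogram of AR(1)

AR(1) in the finite past
Finite-lag representation of Yt back to Y0 (set τ=t in the finite-lag formula)

Yt will be weakly stationary

General AR(1) with intercept
Gives the expression for μ, obtained by taking expectations on both sides

When the intercept is constant, the autocov (γ) is unchanged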
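A sketch of the autocovariance function of a stable AR(1) without intercept, γ(τ)=σ²φ^|τ|/(1-φ²), cross-checked against the (truncated) MA(∞)/Wold representation with weights φ^j. The parameter values and the truncation length are hypothetical.

```python
import numpy as np

phi, sigma2 = 0.7, 1.5                    # hypothetical AR(1) parameters, |phi| < 1

def ar1_autocov(tau, phi, sigma2):
    """Closed-form autocovariance of a stable AR(1) without intercept."""
    return sigma2 * phi ** abs(tau) / (1 - phi ** 2)

# Wold/MA(inf) weights psi_j = phi^j, truncated at 2000 terms
psi = phi ** np.arange(2000)

def autocov_from_ma(tau):
    """gamma(tau) = sigma2 * sum_j psi_j * psi_{j+tau} (truncated sum)."""
    return sigma2 * np.sum(psi[: len(psi) - tau] * psi[tau:])

# Correlogram: rho(tau) = gamma(tau) / gamma(0) = phi^tau
rho = [ar1_autocov(t, phi, sigma2) / ar1_autocov(0, phi, sigma2) for t in range(6)]
```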

Lecture 11: MA Models, ARMA Models and Forecasting
Moving average process MA(q)
MA(q) process and its Lag polynomial

Properties of MA(q)
Mean

Variance

Autocovariance

Autocovariance of MA(q)
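For τ>q there is no overlap, so autocov=0

For 0≤τ≤q there is overlap, so autocov is non-zero: the terms of Yt with indices τ,...,q overlap with the terms of Yt-τ with indices 0,...,q-τ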

MA(q) is weakly stationary

Autocorr function: Correlogram of MA(1)

Example: MA(1)

Wold's theorem: MA(∞) representation of AR(1)
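The infinite-lag form of Yt is exactly an MA(∞)

Wold's theorem: essentially every weakly stationary process can be represented as an MA(∞)
Invertibility of MA(q)
Invertibility of MA(q)

Invertibility of MA(1): requires |θ|<1

Identification problem: invertible and non-invertible representations imply the same data distribution (same autocorrelations), but they differ in the analysis; the latter is used for "time to plan" settings

Autoregressive moving average ARMA(p,q) (note: the AR part has no intercept)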

lag polynomials

Motivations for ARMA
①Many economic time series are aggregates. Even if each of the series that enter the aggregate is AR(1), the sum will be an ARMA

②Many economic time series are measured with error. Even if the true series is AR(1) and the measurement error is a white noise, the available noisy data will follow ARMA.

③ From a purely statistical perspective, ARMA models with relatively small number of parameters may match the observed autocorrelation pattern in economic data well.
Sum of two ARMA processes

Optimal forecasting
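[Optimization problem] Find the f(Y_t, Y_t-1,...) that minimizes the MSFE; this is the optimal predictor of Y_t+τ

[Result] The conditional expectation is the optimal predictor

proof
step1: add and subtract E[Y_t+τ|g_t] inside the optimization problem

step2: expand the square; the L.I.E. eliminates the interaction term

step3: the resulting inequality implies the optimal solution

[Application] Optimal forecasting in AR(p) models

[Example] AR(2) example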
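For the MA(q) part of this lecture, the autocovariance function can be sketched as below, assuming the normalization Y_t=ɛ_t+θ_1ɛ_t-1+...+θ_qɛ_t-q with θ_0=1 (the function name and the parameter values are hypothetical). It returns zero for |τ|>q and σ²∑θ_jθ_j+τ otherwise.

```python
import numpy as np

def ma_autocov(theta, sigma2, tau):
    """Autocovariance of an MA(q) with coefficients theta_1..theta_q (theta_0 = 1)."""
    coefs = np.concatenate([[1.0], np.asarray(theta, dtype=float)])
    q = len(coefs) - 1
    tau = abs(tau)
    if tau > q:                            # no overlapping shocks
        return 0.0
    # overlapping terms: theta_j * theta_{j+tau}, j = 0..q-tau
    return sigma2 * float(np.sum(coefs[: q + 1 - tau] * coefs[tau:]))

# MA(1) example with hypothetical theta = 0.5, sigma^2 = 2
gamma0 = ma_autocov([0.5], 2.0, 0)        # sigma^2 (1 + theta^2)
gamma1 = ma_autocov([0.5], 2.0, 1)        # sigma^2 theta
gamma2 = ma_autocov([0.5], 2.0, 2)        # zero beyond lag q
rho1 = gamma1 / gamma0                    # theta / (1 + theta^2)
```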

Lecture 12: OLS with Stationary Data
Time series OLS; testing for serial correlation
Finite Distributed Lag models
Static and dynamic regression model
Static model: if x is 0 in period T-1 and 1 from period T onwards, then E[y] is higher by δ0 in T, T+1, ...

Dynamic model: E[y] is higher by δ0 in period T, and by δ0+δ1 from period T+1 onwards

Finite Distributed Lag (FDL) regression model

Properties of FDL(q)
Impact multiplier: δ0

Lag distribution

Long-run multiplier: if x is 0 in period T-1 and 1 from period T onwards, the long-run change is δ0+δ1+...+δq

FDL in matrix notation

OLS with stationary weakly dependent data
Strict exogeneity and no feedback
Motivation: an FDL model that satisfies the G-M assumptions is still BLUE, but in time series both the regressors and the regressand are series, so strict exogeneity may fail
Strict exogeneity: cond on past, current, and future

Feedback: strict exogeneity is violated if yt affects future xt (clearly, in an AR model yt becomes a future regressor)

Strict exogeneity fails for AR(p)

AR(1) Example

in matrix notation

Autoregressive distributed lag model ADL(p,q)

long run effect of per unit change

ADL(p,q) lag operator notation: stability depends on the roots of the AR part satisfying |z|>1

OLS in Time Series Regression [assumptions]

OLS-TS1: Strict stationarity (identical distribution) and weak dependence (in place of independence)

OLS-TS2: No perfect multicollinearity

OLS-TS3: Contemporaneous exogeneity

OLS-TS4: Contemporaneous homoskedasticity

OLS-TS5: No serial correlation

Guaranteed under dynamic completeness

OLS in Time Series Regression [results]
OLS ^β is consistent for β but biased in small samples.
OLS ^β is asy normally distributed and the usual s.e. are valid
Testing for serial correlation: assume ɛ follows an AR(q); H0: no serial corr

Method 1: F-test

step1: obtain ^ɛt
step2: regress ^ɛt on (xt) and (^ɛt-1, ^ɛt-2, ..., ^ɛt-q), then F-test the coefficients on (^ɛt-1, ^ɛt-2, ..., ^ɛt-q); reject if F exceeds the chisquare(q)/q critical value
Method 2: Breusch-Godfrey test (LM statistic)

step1: obtain ^ɛt
step2: regress ^ɛt on (xt) and (^ɛt-1, ^ɛt-2, ..., ^ɛt-q) to obtain R², compute LM stat=(T-q)R²; reject if LM exceeds the chisquare(q) critical value
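The two steps of the Breusch-Godfrey test can be sketched as follows, assuming simulated data with AR(1) errors (ρ=0.8, T=500 and the seed are hypothetical). The value 3.84 used in the note afterwards is the 5% critical value of chisquare(1).

```python
import numpy as np

rng = np.random.default_rng(6)            # hypothetical simulated data
T, q, rho = 500, 1, 0.8
x = rng.normal(size=T)
u = rng.normal(size=T)
eps = np.zeros(T)
for t in range(1, T):                     # AR(1) errors: serial correlation present
    eps[t] = rho * eps[t - 1] + u[t]
y = 1.0 + 0.5 * x + eps

# step1: OLS residuals
X = np.column_stack([np.ones(T), x])
e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]

# step2: auxiliary regression of e_t on x_t and q lagged residuals (uses T - q obs)
lags = np.column_stack([e[q - j: T - j] for j in range(1, q + 1)])
Z = np.column_stack([X[q:], lags])
e_dep = e[q:]
fit = Z @ np.linalg.lstsq(Z, e_dep, rcond=None)[0]
r2 = 1 - np.sum((e_dep - fit) ** 2) / np.sum((e_dep - e_dep.mean()) ** 2)
LM = (T - q) * r2                         # compare with a chisquare(q) critical value
```

With errors this persistent, LM is far above 3.84, so H0 of no serial correlation is rejected.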
Lecture 13: HAC Standard Errors, Cochrane-Orcutt FGLS
A remedy for relaxing OLS0/A4 (serial correlation)
Serial correlation [solutions]
① Run OLS and t-/F-tests as usual, but with HAC s.e.
Advantages: OLS estimator consistent, HAC s.e. consistent, HAC F- and t-tests asymptotically valid

Disadvantages: OLS can be very inefficient and HAC s.e. severely biased in small samples.

② FGLS approach
Prais-Winsten estimation
Cochrane-Orcutt procedure
③ Find a better model
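[Solution 1] HAC s.e. (Heteroskedasticity and Autocorrelation Consistent standard errors)
Motivation: the usual OLS s.e. are wrong because Var(ɛ|X) is no longer diagonal
HAC s.e. are derived from Avar(√T(^β-β))
convenient expression

LLN+CLT, where the asymptotic variance of the second factor is denoted V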

An explicit expression for V
step1: expand V=var[(1/√T)(∑xt ɛt)]=(1/T)∑∑E[ɛt xt x's ɛs]

step2: write out (ɛt xt x's ɛs) explicitly

step3: E[ɛt xt x's ɛs]=cov(xtɛt, xsɛs) depends only on m=t-s.

step4: the double sum ∑∑E[ɛt xt x's ɛs] can then be expanded by lag m

step5: V=(1/T)∑∑E[ɛt xt x's ɛs] is written as a het. part plus a serial corr. part

Estimating V by ^V

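This yields the Newey-West HAC standard errors
The Newey-West estimator (of var(^β)) is consistent

Drawback: biased in small samples
Alternative: with a small sample (where the bias matters) and strictly exogenous regressors, use FGLS
[Solution 2] Feasible GLS
GLS procedure
First specify the form of the serial correlation as AR(1)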

autocov function of AR(1)

Explicit form of the cov matrix Ω

Model transformation

transformation matrix P

GLS transformation (note that the first transformed observation is treated differently)

FGLS estimate
Prais-Winsten estimation

step1: OLS obtain ^ɛt
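step2: obtain ^ρ from ^ɛt and ^ɛt-1 (by AR(1))
step3: build P from ^ρ and transform the model

Note: the constant term must be transformed too

step4: run OLS on the transformed model

Cochrane-Orcutt procedure: drop the first equation of the PW estimation (discard the first observation)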
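A sketch of the AR(1) transformation behind Prais-Winsten and Cochrane-Orcutt, taking ρ as known for illustration (in FGLS it is replaced by ^ρ). Here P denotes the matrix that premultiplies the model, an assumed convention for this example; T, ρ and the innovation variance are hypothetical values.

```python
import numpy as np

T, rho, sigma_u2 = 6, 0.6, 1.0            # hypothetical values
idx = np.arange(T)

# Cov matrix of AR(1) errors: Omega[t, s] = sigma_u^2 * rho^|t-s| / (1 - rho^2)
Omega = sigma_u2 * rho ** np.abs(idx[:, None] - idx[None, :]) / (1 - rho ** 2)

# Prais-Winsten transformation matrix
P = np.eye(T)
P[0, 0] = np.sqrt(1 - rho ** 2)           # special treatment of the first observation
P[idx[1:], idx[:-1]] = -rho               # rows t >= 2: y_t - rho * y_{t-1}

transformed_cov = P @ Omega @ P.T         # should be sigma_u^2 * I (spherical errors)

# Cochrane-Orcutt: same transformation without the first row
P_co = P[1:]
transformed_cov_co = P_co @ Omega @ P_co.T
```

After the transformation the errors are spherical, so OLS on the transformed model (including the transformed constant) is GLS.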

Lecture 14: Vector Autoregression (VAR)
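Several AR equations estimated jointly: macro analysis
Vector Autoregression of order p VAR(p)
Form of VAR(p)

VAR(p) estimation

A VAR is a seemingly unrelated regression (SUR) system, which in principle calls for FGLS; SUR FGLS turns out to be equivalent to equation-by-equation OLS
All VAR equations have the same regressors, hence OLS equation by equation
(Weak) Stationarity of VAR(p)
Criterion: roots of the characteristic equation
Property: as long as the vector yt is jointly stationary, equation-by-equation OLS is consistent + asy normal

VAR(p) as VAR(1): Yt=FYt-1+ut

Stationarity criterion for the companion-form VAR(1): eigenvalues of F (|λ|<1)

Vector MA(∞) form of VAR(p): the vector version of Wold's theorem
Expand the companion-form VAR(1)

Under stationarity the initial term vanishes and only the VMA terms remain: ||F||^τ→0, analogous to φ^τ→0

Only the upper-left m×m block of Ψ matters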

Impulse response and forecast revision
Impulse Response function(IRF)
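Meaning of Ψs,ij, viewed as a function of s

The IRF describes the effect of a one-time shock ɛjt (to variable j at time t) on yi,t+s (the response of variable i at time t+s)

forecast revision
the best forecast of yt+h

forecast revision: e.g. knowing ɛ1t, the forecast can be revised, but there is a problem (the elements of ɛt are correlated with each other → not orthogonal)

Orthogonalized impulse response
The Cholesky decomposition transforms ɛt into orthogonal ut

The new ~Ψs,ij is the orthogonalized impulse response

Changing the ordering of the elements of ɛt changes the orthogonalized impulse responses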
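A sketch of the companion form, the eigenvalue stationarity check, and (orthogonalized) impulse responses for a bivariate VAR(2). The coefficient matrices A1, A2 and the shock covariance Sigma are hypothetical numbers chosen only for illustration.

```python
import numpy as np

# Hypothetical VAR(2): y_t = A1 y_{t-1} + A2 y_{t-2} + eps_t, with m = 2 variables
A1 = np.array([[0.5, 0.1],
               [0.0, 0.3]])
A2 = np.array([[0.2, 0.0],
               [0.1, 0.1]])
m = 2

# Companion (VAR(1)) form: Y_t = F Y_{t-1} + u_t
F = np.block([[A1, A2],
              [np.eye(m), np.zeros((m, m))]])

# Stationarity: all eigenvalues of F inside the unit circle
is_stationary = np.max(np.abs(np.linalg.eigvals(F))) < 1

def irf(s):
    """Psi_s: upper-left m x m block of F^s (response at t+s to a shock at t)."""
    return np.linalg.matrix_power(F, s)[:m, :m]

# Orthogonalized responses via Cholesky of a hypothetical shock covariance
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])
L = np.linalg.cholesky(Sigma)             # eps_t = L u_t with orthogonal u_t

def orth_irf(s):
    """Orthogonalized impulse response: Psi_s L (depends on variable ordering)."""
    return irf(s) @ L
```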

Lecture 25: Generalized Method of Moments
Method of Moments (MM)
#moments conditions = #parameters
the analogue principle
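population moments, e.g. the expectation E[Y]=μ

sample moments, e.g. the sample mean (1/n)∑Yi

MM example (2021 Final)

step1: express the parameters to be estimated in terms of the known moment conditions

step2: replace the population moments by the corresponding sample moments

step3: substitute into the expression for the parameters to obtain the MM estimates

The idea of GMM
Motivation: #moments conditions > #parameters. An example: under MM the same parameter θ has two estimators; which one to use?

General principle of GMM: combine several population moment equations in a way that minimizes the asymptotic variance of the estimator
From MM to GMM (using the OLS example with k regressors)

MM estimation of OLS
Moment condition: E[xiεi]=0, i.e. E[xi(yi-xi'β)]=0, i.e. E[xiyi]-E[xixi']β=0

i.e. the population moment
Writing the sample moment from the population moment shows that MM is OLS: (1/n)∑xiyi-(1/n)∑xixi'^β_MM=0

GMM estimation of OLS
Moment conditions (E[xiεi]=0, from OLS2): so far there are k conditions
GMM: OLS2 (E[εi|xi]=0) yields further conditions (i.e. E[f(xi)εi]=0); for instance, two examples

This choice of f shows the advantage of GMM: utilize more info and can improve OLS in case of heteroskedasticity
Formal definition of GMM
GMM [principle]: start from the moment condition E[g(wi,θ)]=0, where g has m dimensions (m conditions) and θ has k dimensions (k parameters)
just-identification (m=k): the vector θ has a unique solution; estimate by MM from E[g(wi,θ)]=0 (k equations)
over-identification (m>k): no θ makes all m sample moment conditions hold exactly; instead find the θ that minimizes the sum of squares g_n'(θ)g_n(θ) (≈0)
For generality, additionally scale by a weighting matrix (so that the conditions have the same st. dev.)
[GMM estimator]
^θ_GMM = arg min J(θ)=g_n'(θ)(^W_n)g_n(θ), with g_n(θ)=(1/n)∑g(wi,θ)
Ingredient 1: moment condition g
Ingredient 2: weight matrix ^W
Weighting matrix W

^Wn→W for W>0. Role: weighting; strong moment conditions (small variance) get more weight

Objects with subscript n are sample-based
Optimal GMM uses W=Ω^(-1), the inverse of the cov matrix Ω=E[g(wi,θ)g'(wi,θ)] (note: this is gg', not g'g)
Computing the optimal ^Wn
Objects with subscript n are sample-based, e.g. g_n=(1/n)∑g(wi,θ)
First use the identity matrix as a provisional weighting matrix and minimize g'_n(θ)(I)g_n(θ) to get ^θ*
Use ^θ* to construct ^Ω=(1/n)∑g(wi,^θ*)g'(wi,^θ*) as the estimator of Ω
Take the optimal ^Wn=[^Ω]^(-1), then minimize g'_n(θ)^Wn g_n(θ) to get ^θ_GMM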
2SLS as GMM
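Ingredient 1: moment conditions
Because of endogeneity, the original moment condition fails (E[xi(yi-xi'β)]≠0)

IV exogeneity gives a new moment condition E[zi(yi-xi'β)]=0; g(wi,β)=zi(yi-xi'β)
Ingredient 2: optimal weight
First find Ω
step1: Ω=E[g(wi,θ)g'(wi,θ)]=E[zi(yi-xi'β)²zi']=E[εi²zizi']

step2: assuming homoskedasticity (2SLS4), Ω=σε²E[zizi']

step3: estimator ^Ω=^σε²(1/n)∑[zizi'] =^σε²(1/n)Z'Z
This gives the optimal weight
Expression: (^Wn)=^Ω^(-1)=[^σε²(1/n)Z'Z]^(-1)

It is a consistent estimator of the true W: (^Wn)→W=Ω^(-1)

Relationship between GMM and 2SLS (under homoskedasticity)
Solving for ^β_GMM shows it coincides with 2SLS
Given: moment conditions + optimal weight

The optimization problem is equivalent to min (Y-Xβ)'Z(Z'Z)^(-1)Z'(Y-Xβ); the middle term Z(Z'Z)^(-1)Z' is Pz, so this is min (PzY-PzXβ)'(PzY-PzXβ)

^σε² and 1/n can be dropped in the optimization
Solution: ^β_GMM=[(PzX)'(PzX)]^(-1)(PzX)'(PzY)=^β_2SLS=[(^X)'(^X)]^(-1)(^X)'Y, so the two are equivalent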

GMM vs 2SLS (under heteroske)
Under heteroskedasticity GMM is more efficient than 2SLS; without heteroskedasticity the advantage is not apparent

GMM=GLS+2SLS
Why GMM is useful

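[consistent + asy norm] The GMM estimator is consistent and asymptotically normal
[efficiency] Once the weighting matrix ^Wn is chosen correctly, GMM is efficient (better than other estimators)
[no strong assumptions] holds under general conditions; normality need not be assumed
[only moment conditions needed] these can usually be obtained from economic theory
[other estimators are special cases of GMM] including OLS, 2SLS, MLE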
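The equivalence of GMM and 2SLS under the homoskedastic weight can be checked numerically. This is a sketch on simulated data (the data-generating process, instrument count and seed are hypothetical); since the moment function is linear in β, the GMM minimizer has a closed form.

```python
import numpy as np

rng = np.random.default_rng(7)            # hypothetical simulated data
n = 400
Z = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # r = 3 instruments
v = rng.normal(size=n)
x = Z @ np.array([0.5, 1.0, -1.0]) + v
eps = 0.8 * v + rng.normal(size=n)        # x is endogenous: corr(x, eps) != 0
X = np.column_stack([np.ones(n), x])      # k = 2 parameters (over-identified: r > k)
y = X @ np.array([1.0, 2.0]) + eps

# GMM with g_n(b) = Z'(y - Xb)/n and weight W = (Z'Z/n)^(-1);
# the sigma^2 factor in the optimal weight cancels in the minimization
W = np.linalg.inv(Z.T @ Z / n)
A = X.T @ Z / n
beta_gmm = np.linalg.solve(A @ W @ A.T, A @ W @ (Z.T @ y / n))

# 2SLS: first stage X_hat = Pz X, second stage regress y on X_hat
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
beta_2sls = np.linalg.lstsq(X_hat, y, rcond=None)[0]
```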
Lecture 24: Asymptotics of 2SLS, Testing Endogeneity and Overidentifying Restrictions
Asymptotic properties of 2SLS + tests for instrumental variables
Consistency and asymptotic normality of 2SLS
Large sample 2SLS assumptions
(2SLS1) (yi, zi', xi')' i.i.d

(2SLS2) Rank condition: E[zizi'] is non-singular + E[zixi'] full column rank

For 2SLS the rank condition has two parts: (i) no multicollinearity among the IVs (needed for stage 1); (ii) IV relevance
(2SLS3) IV Exogeneity

(2SLS4) E[ɛi²|zi]=σε²

(OLS4) Large outliers are unlikely

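Property 1: 2SLS consistency and its proof
step1: Convenient expression for consistency: expand ^β_2SLS=β+[X'PzX]^(-1)X'Pzε

step2: using Pz=Z(Z'Z)^(-1)Z', rewrite as ^β_2SLS=β+[X'Z(Z'Z)^(-1)Z'X]^(-1)X'Z(Z'Z)^(-1)Z'ε

step3: use LLN+Slutsky+instrument exogeneity to eliminate the last term of the expression, so ^β→β

Property 2: 2SLS asymptotic normality and its proof (steps shown in the figure)

step1: Convenient expressions for asy. norm.

step2: the CLT gives the distribution of the second factor of the expression; its mean and variance follow from the L.I.E.

step3: by 2SLS2+Slutsky, obtain the distribution of the whole expression and rearrange

Asymptotic variance estimator (estimating σ²): ^σ²=[1/(n-k-1)]∑(^εi)² (note: ^εi=yi-xi'^β)

Cov matrix estimator (estimating Var(^β)): ^Var(^β)=^σ²[X'Z(Z'Z)^(-1)Z'X]^(-1)=^σ²[X'PzX]^(-1)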
Testing for xi endogeneity
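Is the original regressor xi really endogenous? (relaxing OLS2)
Principle: the first-stage residual ei may be correlated with εi (in which case xi is endogenous)

Regress the error of the original regression on the first-stage residual: εi=δei+ui (δ≠0 indicates endogeneity: xi is correlated with εi)

Accordingly rewrite the original regression as yi=xi'β+δei+ui (where δei+ui is εi); H0: δ=0 (xi is exogenous), H1: δ≠0 (xi is endogenous)

Test procedure
step1: run the regression with the estimated residual: yi=xi'β+δ^ei+ui (where δ^ei+ui plays the role of εi)

step2:reject the null of exog in favour of endog if |t|>1.96

FWL interpretation of 2SLS: adding ^ei to the original regression removes the endogenous part of xi

Detecting weak instruments
Is zi a weak IV? (relaxing 2SLS2)
Weak IV [testing]
Principle: are the first-stage coefficients α small?

F-test of α=0

Weak IV [solutions]
① find better IVs
② if there are several IVs, drop the weaker ones
③ if there are only a few IVs and all are weak, use LIML
Motivation: under weak IVs the usual tests are invalid (see the consequences of weak IVs)
[Inference] given weak IVs: AR test, example

AR-test procedure (requires IV exogeneity)

Intuition: if β1* is indeed true, then yi* is yi net of the effect of the endogenous variable educ; regressing yi* on the exogenous variables and IVs should give zero coefficients on the IVs (because the endogenous variable educ they instrument for is no longer there)
step1: to test β1=β1*, compute yi*=yi-β1*educ
step2: regress yi* on the IVs and the other (exogenous) regressors
step3: F-test the IVs (here meduc and feduc)
step4: reject H0 if AR exceeds the chisquare(r)/r critical value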
Testing overidentifying restrictions (只能partially test endogeneity)
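Is zi an endogenous IV? (relaxing 2SLS3)
Motivation: an endogenous IV leads to inconsistency
H0: all IVs are exog ; H1: at least one is endog
Method 1: the Hausman test, numerical example

Requirement: only applicable with overidentifying restrictions (r>k), and it can only partially detect endogeneity

procedure: estimate with IV1 only to get ^β1 and with IV2 only to get ~β1; if (^β1-~β1) is large enough, there are three possibilities: IV1 endogenous / IV2 endogenous / both endogenous

Drawback: (^β1-~β1) may be small even though both IVs are endogenous, which goes undetected

Method 2: Regression-based test

① run 2SLS to obtain ^εi (note: use yi-xi'^β)
② regress ^εi on all exogenous variables and IVs to obtain R²
③ under the null that all IVs are exogenous, nR²~chisquare(q); reject the null if nR² exceeds the 95th percentile of chisquare(q)
q=r-k
2SLS with heteroskedasticity
Does E[ɛi²|zi]=σε² hold? (relaxing 2SLS4)
2SLS asymptotics under het. (analogous to OLS)
Note that ^X=PzX still holds, and ^β_2SLS-β=[(^X)'^X]^(-1)(^X)'ɛ

step1: 2SLS4 is violated, so the form of var((^X)'ɛ) changes and the CLT limit changes: (1/√n)(^X)'ɛ→N(0,V), where σε² can no longer be factored out of V=E[ɛ²(^X)'(^X)]

Use White's consistent hat_V to estimate V (the analogue of V=E[ɛ²xx'] in OLS)

step2: 2SLS2 + Slutsky, substituted into the convenient expression for asy. norm.
Analogous to the OLS form
het. robust s.e. (analogous to OLS)

Estimating each component of the Avar (i.e. finding the form of hat_Avar)

Use (1/n)X'X to estimate E[xx']

From the het. robust Avar to the s.e. (explains where the 1/n comes from)

The form of hat_Avar determines the standard error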

2SLS in time series regressions [assumptions]
Motivation: when is asymptotic inference for 2SLS valid in time series?
2SLS-TS1: (yt,xt',zt')' satisfies strict stationarity + weak dependence (so that the CLT and LLN hold)

2SLS-TS2: Rank condition (rank E[ztzt']=r, invertible) + instrument relevance (rank E[ztxt']=k, full column rank)

E[ztxt'] is an r×k matrix; rank=k means full column rank, i.e. IV relevance

2SLS-TS3: Instrument exogeneity (E[εt|zt]=0)

2SLS-TS4: Homoskedasticity (E[εt²|zt]=σε²)

2SLS-TS5: No serial correlation (E[εtεs|zt,zs]=0 for all t≠s)

Guaranteed under dynamic completeness

Lecture 23: Two-stage Least Squares
Motivation: k endogenous variables, r IVs (r≥k)

zi is r×1, Z is n×r
instrument exog: zi uncorr with εi for consistency
instrument relevance: zi corr with xi for consistency
No weak IV: zi is strongly corr with xi for good finite sample performance
IV estimator and identification
Order condition
Just-identified case: r = k. (^β_IV=(Z'X)^(-1)Z'Y)
Over-identified case: r > k.
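Rank condition: rank(Z'X)=k (which guarantees that Z'X is invertible, so ^β_IV exists)

In the over-identified case Z'X is not square, so the simple IV estimator cannot be used; use 2SLS instead
Asymptotically normal (general case: εi² is not factored out)

Under heteroskedasticity the sample cov matrix is ^Var(^β_IV)=(Z'X)^(-1)Z'(^Ω)Z(X'Z)^(-1) (this uses Var(ε|X)=Ω)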

Two Stage Least Squares (2SLS)
Definition of 2SLS: starting from a multiple-regression example

1st stage: extracts exog part of the endog var

2nd stage: use this "exog part" for regression

2SLS in matrix notation
1st stage: obtain the predicted ^x and ^X (^X=PzX)

Note: although only zi'^π looks like a predicted value, 1, exper and exper² are in fact predicted values too; they are their own IVs, so predicted=original

2nd stage: ^β_2SLS=[(^X)'X]^(-1)[(^X)'Y]=[(^X)'(^X)]^(-1)[(^X)'Y]

When r=k (just-identified, i.e. IV), ^X in the first expression can be replaced by Z
Given ^X=PzX=Z(Z'Z)^(-1)Z'X, rearranging gives ^β_2SLS=[(PzX)'X]^(-1)(PzX)'Y=[X'PzX]^(-1)X'PzY

How the two stages show up
Using the idempotency of Pz: ^β_2SLS=[(PzX)'PzX]^(-1)(PzX)'(PzY)

two-stage procedure.
1st stage: regress all (exog & endog) regressors on (exog & IV): ^X=PzX

2nd stage: reg Y on ^X

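Important: from the manual two-stage procedure only ^β_2SLS is reliable; the s.e. are wrong (they cannot be used for inference!)

(because yi-^xi'^β is the wrong residual; xi should be used rather than ^xi)
Example: comparing OLS, IV and 2SLS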

Lecture 22: Endogeneity and Instrumental Variables
[Motivation] In a causal regression, endogeneity affects consistency
OLS consistency expression: under the BLP interpretation E[xiεi]=0 comes for free

Endogeneity [definition]: under the causal interpretation, cov(xij,εi)≠0 for some j

Endogeneity [sources]
①Selection bias: non-random assignment of variables (not i.i.d. data)
②Omitted variables: relevant variables are hidden in the errors

③Measurement errors: observations of xi are not accurate.

④Simultaneous causality: yi affects xi , and vice-versa

Endogeneity [solutions]
Include proxy variables
Instrumental variables: extracts exog part of the endog explanatory var and use this "exog part" for regression

IV refers to m endogenous variables with r=m IVs; 2SLS refers to m endogenous variables with r>m IVs
IV for endogeneity: univariate regression example

The two IV conditions
instrument exogeneity, Cov(zi, εi)=0
instrument relevance, Cov(zi, xi)≠0
IV estimator
[population]: cov(zi,yi)=β1cov(zi, xi)+cov(zi, εi), i.e. β1=cov(zi,yi)/cov(zi, xi), which can be rewritten as βyz/βxz

^β1=^cov(zi,yi)/^cov(zi, xi) =∑(zi-zbar)(yi-ybar)/∑(zi-zbar)(xi-xbar)

the analogy principle (method of moments)
Example: using games as an IV for study hours

IV estimator Properties
Consistency in large samples
LLN: sample cov → pop cov

Slutsky:^β1_IV→β1

Usually biased in small samples

Cause: weak IV (sample cov(zi,xi)≈0)

Result: biased; E[^β1_IV] may not even exist

Asymptotically normal
convenient expression √n(^β1_IV-β1)

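Given that z satisfies the two IV conditions and E[εi²|zi]=σε², use LLN+CLT

IV exogeneity makes the cov term inside the ∑ equal to 0
Slutsky: √n(^β1_IV-β1)→N(0,σε²σz²/σzx²)

Variance of IV vs variance of OLS, the lesson of weak instruments: a weak IV (small ρzx²) produces a large Avar(^β1_IV)
Expanding Avar(^β1_IV) with ρzx: σε²σz²/(σzσxρzx)²=σε²/(σx²ρzx²), i.e. IV: √n(^β1_IV-β1)→N(0,σε²/(σx²ρzx²))
OLS under G-M: √n(^β1_OLS-β1)→N(0,σε²/σx²)
Note that ρzx=1 in OLS, where x is its own IV
Example: Angrist and Krueger (1991), the effect of school entry timing on earnings

Violations of the two basic IV conditions
[1] Weak Instrument (ρzx≈0)
Principle: analyse the LLN+CLT approximation of ^β1_IV-β1

Consequence: (^β1_IV-β1) tends to be large and √n(^β1_IV-β1) departs from the normal distribution; the normal asymptotic approximation will not work well. A common symptom is large s.e.

Hence use the AR test
[2] Exog failure (ρzε≠0)
Principle: an insufficiently exogenous IV makes the IV estimator inconsistent
Since exogeneity no longer holds, write ^β1_IV-β1={(1/n)∑(zi-zbar)(εi-εbar)/(1/n)∑(zi-zbar)²} /{(1/n)∑(zi-zbar)(xi-xbar)/(1/n)∑(zi-zbar)²}

by LLN ^β1_IV-β1→(ρzεσε)/(ρzxσx)

Note that in OLS x is its own IV, so ^β1_OLS-β1→(ρxεσε)/(ρxxσx) with ρxx=1
Comparing the two inconsistencies shows the effect of weak relevance ρzx: (^β1_IV-β1)/(^β1_OLS-β1)→ρzε/(ρxερzx), which is large when ρzx is small

Consequence: with such an IV the inconsistency of (^β1_IV-β1) can be even larger than that of OLS without IV
Solution: include a sufficient number of exogenous variables (e.g. adding gi removes gi from the error, making zi more exogenous)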
IV in the multiple regression model

Here OLS makes all βi inconsistent

pop β=E[zixi']^(-1)E[ziyi]

sample β (method of moments): ^β=(Z'X)^(-1)Z'Y

consistency
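A sketch connecting the two IV formulas in this lecture: the univariate ratio ^cov(z,y)/^cov(z,x) and the matrix form (Z'X)^(-1)Z'Y with a constant acting as its own instrument. The simulated data-generating process and seed are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(8)            # hypothetical simulated data
n = 1000
z = rng.normal(size=n)                    # instrument: relevant and exogenous here
u = rng.normal(size=n)
x = 0.8 * z + u
eps = 0.5 * u + rng.normal(size=n)        # x is endogenous through u
y = 1.0 + 2.0 * x + eps

# Univariate IV estimator: sample cov(z, y) / sample cov(z, x)
beta1_iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

# Matrix (method of moments) form: (Z'X)^(-1) Z'Y
Z = np.column_stack([np.ones(n), z])
X = np.column_stack([np.ones(n), x])
beta_iv = np.linalg.solve(Z.T @ X, Z.T @ y)

# OLS slope for comparison (inconsistent under endogeneity)
beta1_ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
```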

Lecture 21: Models for Limited Dependent Variables
Linear probability model
Conditional expectation function

LPM: P(Yi=1|Xi)=Xi'β

Partial effect is constant: βi

Problem: the fitted probabilities are not bounded within [0,1]

Probit
Motivation: Link function (CDF)

Probit: the standard normal distribution

Example of probit

Based on a latent variable model
Motivation: derive probit from an underlying U*i=Xi'β+ɛi, Yi=1 if U*i>0

From P(Yi=1|Xi)=P(U*i>0|Xi) and ɛi|Xi~N(0,1), obtain the link function Φ(Xi'β)

Interpretation of the coefficients (β is no longer the partial effect, because of the link function)
Note that the effect depends on X!
Partial effect
continuous variable: βjφ(Xi'β)

discrete variable

Partial effect at the Average PEA=βjφ(barXi'β)
Average partial effect APE=(1/n)∑βjφ(Xi'β)
The ratio of effects does not depend on X

Maximum Likelihood for Probit
MLE for Probit
Yi|Xi是Bernoulli

cond likelihood function

total likelihood function (note: the marginal f(Xi;β) can be dropped whenever it does not depend on β; even if it does, it is best dropped, because misspecification would make the MLE inconsistent)

Example: labour force participation model

PE for discrete variable: Estimated effect of a child

Delta method
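Motivation: for inference on a nonlinear function of ^β
Statement: given the asy. distribution of the original ^β, √n(^β-β_0)→N(0,σ²), if g(.) is differentiable with g'(β_0)≠0 then √n[g(^β)-g(β_0)]→N[0,(dg/dβ_0)²σ²]

Derivation: the nonlinear function F is approximated by a Taylor expansion

A prime denotes the transpose
By the asy. normality of ^β_ML, F(^β)≈N[F(β_0), (dF/dβ_0)'I(β_0)^(-1)(dF/dβ_0)], where I(β_0) is the Fisher information

This uses the MLE result ^θ_ML≈N(θ_0, I(θ_0)^(-1))
Logit
Link function: logistic distribution P(Yi=1|Xi)=Λ(Xi'β)=e^(Xi'β)/[1+e^(Xi'β)]

logistic distr: CDF Λ(x)=e^x/[1+e^x], PDF λ(x)=Λ'(x)=Λ(x)[1-Λ(x)]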

partial effects=βjΛ(Xi'β)[1-Λ(Xi'β)]
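A sketch of the partial-effect formulas for probit and logit: the PEA, the APE, and the logit effect βjΛ(1-Λ). The coefficient vector and the simulated regressors are hypothetical; in practice β would come from ML estimation.

```python
import math
import numpy as np

def Phi(t):
    """Standard normal CDF (probit link)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def phi(t):
    """Standard normal PDF."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def Lam(t):
    """Logistic CDF (logit link)."""
    return 1.0 / (1.0 + math.exp(-t))

rng = np.random.default_rng(9)            # hypothetical regressors
X = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])
beta = np.array([0.2, 0.8, -0.5])         # hypothetical coefficients
j = 1                                     # continuous regressor of interest
index = X @ beta                          # X_i' beta for each observation

# Probit: partial effect at the average and average partial effect
pea_probit = beta[j] * phi(float(X.mean(axis=0) @ beta))
ape_probit = beta[j] * np.mean([phi(t) for t in index])

# Logit: partial effect beta_j * Lambda * (1 - Lambda), observation by observation
pe_logit = beta[j] * np.array([Lam(t) * (1.0 - Lam(t)) for t in index])
ape_logit = pe_logit.mean()
```

The effects vary with Xi because of the link function, while the ratio of two partial effects, βj/βl, does not.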

Lecture 20: Likelihood-based Tests
The three asymptotically equivalent test statistics for MLE
notation
true parameter (restricted) θ0=θr
ML parameter (unrestricted) ^θ=^θML=θu
① Likelihood Ratio test: both logL(θr) and logL(θu) are estimated
=-2[logL(θr)-logL(θu)]: smaller minus larger, multiplied by -2, is positive
[LR statistic] -2logλ=-2[logL(θr)-logL(θu)]~chisquare(1)
Derivation/principle, likelihood ratio: the restricted L(θr) is smaller, so λ=L(θr)/L(θu)≤1

Example: LR test in linear regression (note: here -2logλ~chisquare(m))

Comparison with F-statistic: mF=(n-k-1)[RSSr/RSSu-1] ≈n log[RSSr/RSSu]=-2logλ

[Figure] Geometry of the LR test

Asymptotic distribution of the LR statistic: second-order Taylor expansion of logL(θ0) around ^θML
step1: Taylor expansion using FOC=0

step2: rearrange to get LR=-2logλ≈ -[logL(^θML)]''(θ0-^θML)²

step3: use the MLE properties -(1/n)[logL(^θML)]''→I1(θ0) and √n(^θML-θ0)→N[0,I1(θ0)^(-1)], where I1 is the Fisher information of one observation

step4: by Slutsky's theorem, [{-(1/n)[logL(^θML)]''}^(1/2)√n(^θML-θ0)]²→[N(0,1)]²=chisquare(1)

② Wald test: compute logL(^θML) + approximate logL(θ0) around ^θML
[W statistic] W=(^θML-θ0)²|[logL(^θML)]''|~chisquare(1)

Derivation/principle: logL(θu)-logL(θr)
Compute logL(^θML)

Approximate logL(θ0)

Subtracting the two gives W/2; the statistic has the same logic as LR

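[Figure] Geometry of the W test

Multivariate Wald test

③ Lagrange Multiplier test: compute logL(θ0) + approximate logL(^θML) around θ0
[LM statistic] LM=[S(θ0)]²|[logL(θ0)]''|^(-1)~chisquare(1)

Derivation/principle: logL(θu)-logL(θr)
Compute logL(θ0)

Approximate logL(^θML)

Subtracting the two gives LM/2

[Figure] Geometry of the LM test

Features of the three equivalent statistics
Asymptotically equivalent (same results) but different in small samples
LR requires both the restricted and the unrestricted ML; W requires only the unrestricted model; LM requires only the restricted model (so from a computational perspective LR is the hardest to implement)
The advantage of the LR test is invariance to re-parameterization, which W and LM lack
Relationship among the three statistics in linear regression
Expressions for the three statistics

W≥LR≥LM (W rejects most easily, LM least easily)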
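The ordering W≥LR≥LM in linear regression can be illustrated numerically. The expressions below are the common textbook forms in terms of restricted and unrestricted RSS, taken here as an assumption because the outline does not spell them out; the data and the tested restriction are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(10)           # hypothetical simulated data
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.3, 0.0]) + rng.normal(size=n)

def rss(X, y):
    """Residual sum of squares from an OLS fit."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    return e @ e

rss_u = rss(X, y)                         # unrestricted model
rss_r = rss(X[:, :1], y)                  # restricted model: both slopes set to zero

# Assumed textbook forms of the three statistics in the normal linear model
W = n * (rss_r - rss_u) / rss_u           # uses the unrestricted variance estimate
LR = n * np.log(rss_r / rss_u)            # uses both fits
LM = n * (rss_r - rss_u) / rss_r          # uses the restricted variance estimate
```

With a=RSSr/RSSu≥1 the ordering follows from a-1 ≥ log a ≥ 1-1/a.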

Lecture 19: Properties of MLE
Properties of the ML estimator
MLE requires a correct specification of the entire conditional distribution
Then the ML estimator is consistent, asymptotically normal, and asymptotically efficient.
Under misspecification, ML is inconsistent.
Why does ML work? Show that the true parameter θ0 is the one that maximizes the likelihood
step1: maximizing l(θ) is equivalent to maximizing (1/n)l(θ)-(1/n)l(θ0)

step2: LLN + Jensen's inequality → the limit of the objective is ≤log{E[f(θ)/f(θ0)]}

step3: log{E[f(θ)/f(θ0)]}=0, and the objective attains this maximum 0 at θ=θ0

Asymptotic normality of ML
Main idea: by the CLT the vertical shifts of the score function are normally distributed, hence so is the θ on the horizontal axis

First prove: asymptotic normality of the score
Mean of the score at θ0 is zero

Variance of the score at θ0
Expand var(S(θ0))

var(score) is called the Fisher information = minus the expected second derivative of l

Example: Fisher information

vector case: var=E[SS']=-E[H]

This gives asy. normality of the score

Then prove: the asymptotic normality of the MLE
step1: Taylor-expand the score and multiply by √n to obtain a convenient expression

step2: apply LLN+CLT to the convenient expression to get √n(^θML-θ0)→N[0,I1(θ0)^(-1)], where I1 is the Fisher information of one observation

multivariate MLE

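Key exam topic: The Cramer-Rao lower bound
Requirement: ^θ is an unbiased estimator of θ

CRLB: var(^θ)≥I(θ0)^(-1), i.e. at least the inverse of the Fisher information

vector case: the difference is a positive semi-definite matrix

Asymptotic efficiency of MLE
Shown so far: the MLE ^θ is consistent and attains the CRLB asymptotically

Hence the MLE is called asymptotically efficient, even though it may be biased in small samples

(it has the best variance among consistent estimators)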
Example with exponential distribution finished

Lecture 18: Maximum Likelihood Estimation
Motivation: in time series strict exogeneity often fails; non-linear models need other estimators
Likelihood of the i.i.d. data density f (y, x; θ)

Likelihood function: ∏f

Log-likelihood function: ∑logf

Example: exponential distribution

The conditional likelihood function

conditional log-likelihood

marginal log-likelihood

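Case 1: the marginal density does not depend on θ, so only the conditional density needs to be maximized

Case 2: the marginal density depends on θ, so the conditional + marginal densities must be maximized jointly

Likelihood in the time series context
motivation: observations are not i.i.d.
repeated factorization (note: the joint density is the likelihood)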

log-likelihood

Example
Gaussian AR(1)

Gaussian MA(1)

ARCH model

Lecture 17: Spurious Regression, Cointegration
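Unit roots cause the spurious regression problem; if a cointegrating relation exists, an ECM can be used as a remedy
Spurious Regression
Motivation: I(0) time series can be regressed directly (in levels), but regressing I(1) series in levels leads to the spurious regression problem
Note: if the I(1) series are cointegrated, use differences + ECM; if they are not cointegrated, use a regression in differences
Spurious regression: if x and y are both I(1) but have different stochastic trends, a "seemingly significant" regression arises even though the two are unrelated

The source of the problem
Suppose x and y are both ~I(1)

In y=βx+ɛ there is no β such that ɛ~I(0)

Rather, for any β, ɛ is itself I(1) (e.g. a random walk), which leads to an inconsistent estimator + nonstandard distribution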

Cointegration
定义:确存在β使得ɛ~I(0),则称y和x为coint cointegrating vector为(1,-β)
OLS estimator of cointegrating coefficients (例子:当y-βx~I(0)时,OLS结果)

①super-consistent est

②β non-standard asy distri (t统计量可能不再asy norm):

当ɛ和x相关(endog)或ɛ序列相关,这里的“可能”就起效
当error term 表现不良好的时候 (endog / serial corr)
结果
好消息:^β依然super consistent
坏消息:t统计量的asy distr是non-standard norm, 且^β biased in small sample
解决办法
对于endog使用 Dynamic OLS regression

对于serial corr使用Dynamic OLS + HAC s.e.

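Testing for no cointegration (H0: no cointegration). If there is in fact no cointegration but cointegration is assumed, the regression is spurious
When the cointegrating vector (1,-β) is known: apply the ADF unit root test to the error term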

When the cointegrating vector is unknown: Engle-Granger (EG) cointegration procedure (intercept-only model)
step 1: obtain ^β by OLS

step 2: use the ADF test to check whether ^ɛ has a unit root

Critical value table (a new table, different from the earlier DF tables). Rejecting the unit root = ɛ~I(0) = cointegration = rejecting H0
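The two Engle-Granger steps above can be sketched as follows. This is a simplified illustration: the simulated data, the true values and the function names are assumptions, and step 2 uses a plain Dickey-Fuller regression without lags or a constant on the residuals, whereas the procedure in the notes uses the ADF test.

```python
import numpy as np


def ols(y, X):
    """Return OLS coefficients and residuals."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef


def df_tstat_no_const(e):
    """t-statistic on theta in  d(e_t) = theta * e_{t-1} + u_t  (no lags, no constant)."""
    de, lag = np.diff(e), e[:-1]
    theta = (lag @ de) / (lag @ lag)
    u = de - theta * lag
    s2 = (u @ u) / (len(de) - 1)
    return theta / np.sqrt(s2 / (lag @ lag))


rng = np.random.default_rng(1)
T = 500
x = np.cumsum(rng.standard_normal(T))        # I(1) regressor
y = 2.0 + 0.5 * x + rng.standard_normal(T)   # cointegrated with x; beta = 0.5 is illustrative

# Step 1: OLS of y on a constant and x gives beta_hat and the residuals e_hat
coef, e_hat = ols(y, np.column_stack([np.ones(T), x]))
beta_hat = coef[1]

# Step 2: unit root test statistic on the residuals
eg_stat = df_tstat_no_const(e_hat)
print(beta_hat, eg_stat)
```

The statistic `eg_stat` has to be compared with the Engle-Granger critical values rather than the ordinary DF table, because the residuals come from an estimated ^β.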

Example

Error Correction Models (ECM)
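For regressions with integrated series, choose the model according to whether a cointegrating relationship exists
Dynamic model in differences: when y and x are both I(1) and not cointegrated, difference them to obtain I(0) series, then regress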

Error correction model: when y and x are both I(1) and cointegrated with vector (1,-β), regress in differences with the error correction term added

Engle-Granger ECM procedure
^β is super-consistent, so using it does not affect standard asymptotic inference
step 1: when the cointegrating coefficient is unknown, estimate it by OLS / Dynamic OLS
step 2: use (y_{t-1} - ^β*x_{t-1}) as the error correction term
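A minimal sketch of the two-step Engle-Granger ECM, under an assumed data-generating process: x is a random walk and y = x + z with z a stationary AR(1), so the implied ECM is Δy_t = Δx_t - 0.4 z_{t-1} + u_t. All numerical values and names are hypothetical.

```python
import numpy as np


def ols(y, X):
    """Return OLS coefficients."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


rng = np.random.default_rng(2)
T = 2000
x = np.cumsum(rng.standard_normal(T))   # I(1) regressor
u = rng.standard_normal(T)
z = np.zeros(T)                         # stationary equilibrium error, AR(1) with rho = 0.6
for t in range(1, T):
    z[t] = 0.6 * z[t - 1] + u[t]
y = 1.0 * x + z                         # cointegrating vector (1, -1)

# Step 1: static OLS in levels estimates the cointegrating coefficient
beta_hat = ols(y, np.column_stack([np.ones(T), x]))[1]

# Step 2: regression in differences with the lagged error correction term
ect = y[:-1] - beta_hat * x[:-1]        # y_{t-1} - beta_hat * x_{t-1}
dy, dx = np.diff(y), np.diff(x)
g = ols(dy, np.column_stack([np.ones(T - 1), dx, ect]))
gamma1_hat, delta_hat = g[1], g[2]
print(beta_hat, gamma1_hat, delta_hat)
```

A negative `delta_hat` is the error-correction mechanism: when y is above its long-run relation with x, the next change in y is pulled downward.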
Vector Error Correction Models

Lecture 16: Testing for unit roots
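Non-stationary series, e.g. those with a unit root, require extra steps
Among all the tests covered, only the unit root and no-cointegration tests take the “bad” case as H0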
Motivation

Dickey-Fuller test
In the usual test, ^ρ has a non-standard distribution under H0 (the alternative ρ<1 is natural, since an explosive ρ>1 is rare in practice)

the OLS estimator ^ρ is not normal.

Dickey-Fuller test: Intercept only (model type 1)

Rewrite in differences (subtract y_{t-1} from both sides), then test θ=0 where θ=ρ-1 (key assumption: ɛ is serially uncorrelated)

DF t-statistic table (type 1, intercept only) (key point: the H0 being tested implicitly sets α=0)

Why not test α=θ=0 jointly? Because the gain in power from the F-test using the information on α is offset by the loss in power from its two-sidedness
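The intercept-only Dickey-Fuller regression above can be sketched as follows. The function name and the simulated stationary AR(1) (ρ = 0.5) are assumptions for illustration; the 5% asymptotic critical value for this specification is commonly tabulated as about -2.86, but the table from the lecture should be the reference.

```python
import numpy as np


def df_tstat_intercept(y):
    """t-statistic on theta in  d(y_t) = alpha + theta * y_{t-1} + e_t."""
    dy, lag = np.diff(y), y[:-1]
    X = np.column_stack([np.ones(lag.size), lag])
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ coef
    s2 = resid @ resid / (dy.size - X.shape[1])
    se_theta = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return coef[1] / se_theta


# Stationary AR(1) with a nonzero mean, i.e. a series for which H1 is true
rng = np.random.default_rng(4)
T = 500
e = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 1.0 + 0.5 * y[t - 1] + e[t]

stat = df_tstat_intercept(y)
print("DF t-statistic:", stat)
```

The test is one-sided: H0 (unit root) is rejected only when the statistic is sufficiently negative relative to the DF table, not the standard normal table.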

Dickey-Fuller test: Intercept and trend (model type 2)

Rewrite in differences (subtract y_{t-1} from both sides), then test θ=0 where θ=ρ-1 (key assumption: ɛ is serially uncorrelated)

DF t-statistic table (type 2, intercept + trend) (key point: the H0 being tested implicitly sets β=0, while α remains)

Dickey-Fuller test: summary of specifications (despite the restrictions on α/β etc., we only test ρ=1)
RW / Zero mean stationary AR(1)

restrictive
Type 1: RW / Nonzero mean stationary AR(1)

RW with a drift α≠0 / Nonzero mean stationary AR(1)

Standard tests can be used
RW / Nonzero mean stationary AR(1) with a trend

Same as type 2, i.e. α is unrestricted
Type 2: RW with a drift α≠0 / Nonzero mean stationary AR(1) with a trend

Augmented Dickey-Fuller test (ADF test)
ADF procedure
Motivation: the error term ɛ is likely to be serially correlated
ADF model: add lags of Δy to remove the serial correlation in ɛ

DF t-statistic table (same as before for type 1 / type 2)
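The ADF regression described above adds p lagged differences to the Dickey-Fuller regression. A minimal sketch follows; the function name `adf_tstat`, its arguments, and the simulated stationary AR(2) are assumptions for illustration, and the lag length p has to be chosen by the user.

```python
import numpy as np


def adf_tstat(y, p=1, trend=False):
    """t-statistic on theta in
    d(y_t) = alpha [+ beta*t] + theta*y_{t-1} + sum_{j=1..p} gamma_j * d(y_{t-j}) + e_t.
    """
    dy = np.diff(y)
    n = dy.size - p                      # usable observations after losing p lags
    cols = [np.ones(n), y[p:-1]]         # constant and lagged level y_{t-1}
    if trend:
        cols.append(np.arange(n, dtype=float))
    for j in range(1, p + 1):
        cols.append(dy[p - j: dy.size - j])   # lagged difference d(y_{t-j})
    X = np.column_stack(cols)
    target = dy[p:]
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    s2 = resid @ resid / (n - X.shape[1])
    se_theta = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return coef[1] / se_theta


# Stationary AR(2): y_t = 0.5*y_{t-1} + 0.2*y_{t-2} + e_t, so theta = 0.5 + 0.2 - 1 = -0.3
rng = np.random.default_rng(5)
T = 1000
e = rng.standard_normal(T)
y = np.zeros(T)
for t in range(2, T):
    y[t] = 0.5 * y[t - 1] + 0.2 * y[t - 2] + e[t]

stat_const = adf_tstat(y, p=1)
stat_trend = adf_tstat(y, p=2, trend=True)
print(stat_const, stat_trend)
```

Setting `trend=True` corresponds to the intercept-and-trend specification, whose statistic is compared with the type 2 table.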

example: log consumption, logC

How to choose between intercept-only and trend

The key is the alternative hypothesis
Intercept-only specification: H1 stationary around a constant
Intercept & trend specification: H1 stationary around a linear time trend
Perron (1989): deterministic breaks
overspecify: the trend term βt is redundant

Consequence: β=0 under both H0 and H1 (an irrelevant variable is included); the power of the test falls, but its size is unchanged, so the test is still valid

underspecify: the trend term βt is omitted
Consequence: model misspecification can push the power of the test below its size; under some alternatives H0 becomes harder to reject → biased test

Example: An illustration. Including the trend is the safer choice

Lecture 15: Trends, Integrated Processes and Unit Roots
Properties of non-stationary time series
Linear trend (Non-stationary)
Linear time trend model yt=α0+α1t+ɛt

E[yt]: the first moment varies with t

var(yt): the second moment is constant

The OLS estimator of α is superconsistent (simple trend model without intercept)

OLS estimator of α

var(α-^α)

Avar(^α)=var[√T(α-^α)]→0, rather than converging to a constant as in standard OLS
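The superconsistency result can be derived in a few lines. The sketch assumes the simple trend model without intercept, y_t = αt + ε_t, with i.i.d. errors of variance σ²; the i.i.d. assumption is made here only to keep the variance calculation short.

```latex
\begin{aligned}
\hat\alpha &= \frac{\sum_{t=1}^{T} t\,y_t}{\sum_{t=1}^{T} t^2}
\quad\Longrightarrow\quad
\hat\alpha - \alpha = \frac{\sum_{t=1}^{T} t\,\varepsilon_t}{\sum_{t=1}^{T} t^2}, \\
\operatorname{var}(\hat\alpha - \alpha) &= \frac{\sigma^2}{\sum_{t=1}^{T} t^2}
= \frac{6\sigma^2}{T(T+1)(2T+1)} \approx \frac{3\sigma^2}{T^3}, \\
\operatorname{var}\!\big[\sqrt{T}\,(\hat\alpha - \alpha)\big] &\approx \frac{3\sigma^2}{T^2} \to 0,
\qquad
\operatorname{var}\!\big[T^{3/2}(\hat\alpha - \alpha)\big] \to 3\sigma^2 .
\end{aligned}
```

So ^α converges at rate T^{3/2} instead of the usual √T, which is why the √T-scaled variance collapses to zero.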

Detrending interpretation of regressions with t, using the Frisch-Waugh theorem (including a time trend = detrending)

Steps by FWL

[Interpretation] β captures the remaining association between yt and xt after removing the part due to the association of both yt and xt with t (partialling out)
OLS properties when a time trend is included: the same as the OLS time series results, provided OLS-TS1~5 hold
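The detrending interpretation can be checked numerically: the coefficient on x from the regression that includes a constant and t equals the coefficient from regressing detrended y on detrended x. The simulated data and coefficient values below are hypothetical.

```python
import numpy as np


def ols(y, X):
    """Return OLS coefficients."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


rng = np.random.default_rng(3)
T = 300
t = np.arange(1, T + 1, dtype=float)
x = 0.05 * t + rng.standard_normal(T)                   # trending regressor
y = 1.0 + 0.02 * t + 0.8 * x + rng.standard_normal(T)   # trending outcome

Z = np.column_stack([np.ones(T), t])                    # constant and linear trend

# Full regression: y on (1, t, x)
beta_full = ols(y, np.column_stack([Z, x]))[2]

# FWL: detrend y and x separately, then regress residual on residual
y_tilde = y - Z @ ols(y, Z)
x_tilde = x - Z @ ols(x, Z)
beta_fwl = (x_tilde @ y_tilde) / (x_tilde @ x_tilde)

print(beta_full, beta_fwl)
```

The two numbers agree up to floating-point error, which is the Frisch-Waugh-Lovell theorem applied to the trend.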

Same as lec12, with only αt added
OLS in Time Series Regression [Assumptions]

OLS-TS1: Strict stationarity (replaces identical distribution) and weak dependence (replaces independence)

OLS-TS2: No perfect multicollinearity

OLS-TS3: Contemporaneous exogeneity

OLS-TS4: Contemporaneous homoskedasticity

OLS-TS5: No serial correlation

Guaranteed under dynamic completeness

OLS in Time Series Regression [Results]
OLS ^β is consistent for β but biased in small samples.
OLS ^β is asymptotically normally distributed and the usual s.e. are valid
Random walk (an Integrated Process); in essence an AR(1) with ρ=1

Finite expansion of the RW as an AR(1): yt=y0+∑ɛi

RW Properties: NOT weakly stationary
Moments
E[yt]=y0

var(yt|y0)=tσ²

cov(yt,yt+h|y0)=tσ²

High persistence and no mean reversion
E[yt+h|yt]=yt: the current value has a persistent effect on future values

Contrast with a stable AR(1): mean reversion, E[yt+h|yt]→α/(1-ρ)=E[yt] as h→∞

Corr(yt,yt+h|y0)=√(t/(t+h))→1 as t→∞ (note: the process is no longer stationary, so the lec11 formula does not apply)
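The moments of the random walk follow directly from the finite expansion yt=y0+∑ɛi. The derivation below assumes i.i.d. shocks with mean zero and variance σ², and h ≥ 0.

```latex
\begin{aligned}
y_t &= y_0 + \sum_{i=1}^{t} \varepsilon_i, \qquad
\operatorname{var}(y_t \mid y_0) = \sum_{i=1}^{t} \operatorname{var}(\varepsilon_i) = t\sigma^2, \\
\operatorname{cov}(y_t, y_{t+h} \mid y_0)
&= \operatorname{cov}\!\Big(\sum_{i=1}^{t} \varepsilon_i,\ \sum_{i=1}^{t+h} \varepsilon_i\Big)
= t\sigma^2 \quad \text{(only the first } t \text{ shocks are shared)}, \\
\operatorname{Corr}(y_t, y_{t+h} \mid y_0)
&= \frac{t\sigma^2}{\sqrt{t\sigma^2 \cdot (t+h)\sigma^2}}
= \sqrt{\frac{t}{t+h}} \;\longrightarrow\; 1 \quad \text{as } t \to \infty .
\end{aligned}
```

Both the variance and the autocovariance depend on t, not only on h, which is why the process is not weakly stationary.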

RW as Integrated process
Definition: a process that accumulates or “integrates” past shocks
Notation: yt~I(1)

RW with a drift

Finite expansion of the RW with drift as an AR(1): yt=y0+tδ+∑ɛi (the AR(1) now carries a linear deterministic trend)

best prediction: E[yt+h|yt]=yt+hδ
Moments
E[yt]=y0+tδ
var(yt|y0)=tσ² (unchanged)
cov(yt,yt+h|y0)=tσ² (unchanged)
Corr(yt,yt+h|y0)=√(t/(t+h)) (unchanged)
Unit roots (RW with a drift)
Derivation: write the RW using the lag operator

Unit root criterion: whether the process is stable

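I(0) denotes a stationary time series; I(1) needs to be differenced once to become stationary
Turning a unit root process into a stationary process
Difference-stationary process: stationary after first differencing, also called an I(1) process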

Trend-stationary process: stationary after removing the trend term