Any instrumental variables (IV) estimator relies on two key assumptions in order to identify causal effects:

- That the excluded instrument or instruments only effect the dependent variable through their effect on the endogenous explanatory variable or variables (the exclusion restriction),
- That the correlation between the excluded instruments and the endogenous explanatory variables is strong enough to permit identification.

The first assumption is difficult or impossible to test, and shear belief plays a big part in what can be perceived to be a good IV. An interesting paper was published last year in the Review of Economics and Statistics by Conley, Hansen, and Rossi (2012), wherein the authors provide a Bayesian framework that permits researchers to explore the consequences of relaxing exclusion restrictions in a linear IV estimator. It will be interesting to watch research on this topic expand in the coming years.

Fortunately, it is possible to quantitatively measure the strength of the relationship between the IVs and the endogenous variables. The so-called weak IV problem was underlined in paper by Bound, Jaeger, and Baker (1995). When the relationship between the IVs and the endogenous variable is not sufficiently strong, IV estimators do not correctly identify causal effects.

The Bound, Jaeger, and Baker paper represented a very important contribution to the econometrics literature. As a result of this paper, empirical studies that use IV almost always report some measure of the instrument strength. A secondary result of this paper was the establishment of a literature that evaluates different methods of testing for weak IVs. Staiger and Stock (1997) furthered this research agenda, formalizing the relevant asymptotic theory and recommending the now ubiquitous “rule-of-thumb” measure: a first-stage partial-F test of less than 10 indicates the presence of weak instruments.

In the code below, I have illustrated how one can perform these partial F-tests in R. The importance of clustered standard errors has been highlighted on this blog before, so I also show how the partial F-test can be performed in the presence of clustering (and heteroskedasticity too). To obtain the clustered variance-covariance matrix, I have adapted some code kindly provided by Ian Gow. For completeness, I have displayed the clustering function at the end of the blog post.

# load packages library(AER) ; library(foreign) ; library(mvtnorm) # clear workspace and set seed rm(list=ls()) set.seed(100) # number of observations n = 1000 # simple triangular model: # y2 = b1 + b2x1 + b3y1 + e # y1 = a1 + a2x1 + a3z1 + u # error terms (u and e) correlate Sigma = matrix(c(1,0.5,0.5,1),2,2) ue = rmvnorm(n, rep(0,2), Sigma) # iv variable z1 = rnorm(n) x1 = rnorm(n) y1 = 0.3 + 0.8*x1 - 0.5*z1 + ue[,1] y2 = -0.9 + 0.2*x1 + 0.75*y1 +ue[,2] # create data dat = data.frame(z1, x1, y1, y2) # biased OLS lm(y2 ~ x1 + y1, data=dat) # IV (2SLS) ivreg(y2 ~ x1 + y1 | x1 + z1, data=dat) # do regressions for partial F-tests # first-stage: fs = lm(y1 ~ x1 + z1, data = dat) # null first-stage (i.e. exclude IVs): fn = lm(y1 ~ x1, data = dat) # simple F-test waldtest(fs, fn)$F[2] # F-test robust to heteroskedasticity waldtest(fs, fn, vcov = vcovHC(fs, type="HC0"))$F[2] #################################################### # now lets get some F-tests robust to clustering # generate cluster variable dat$cluster = 1:n # repeat dataset 10 times to artificially reduce standard errors dat = dat[rep(seq_len(nrow(dat)), 10), ] # re-run first-stage regressions fs = lm(y1 ~ x1 + z1, data = dat) fn = lm(y1 ~ x1, data = dat) # simple F-test waldtest(fs, fn)$F[2] # ~ 10 times higher! # F-test robust to clustering waldtest(fs, fn, vcov = clusterVCV(dat, fs, cluster1="cluster"))$F[2] # ~ 10 times lower than above (good)

Further “rule-of-thumb” measures are provided in a paper by Stock and Yogo (2005) and it should be noted that whole battery of weak-IV tests exist (for example, see the Kleinberg-Paap rank Wald F-statistic and Anderson-Rubin Wald test) and one should perform these tests if the presence of weak instruments represents a serious concern.

# R function adapted from Ian Gows' webpage: # http://www.people.hbs.edu/igow/GOT/Code/cluster2.R.html. clusterVCV <- function(data, fm, cluster1, cluster2=NULL) { require(sandwich) require(lmtest) # Calculation shared by covariance estimates est.fun <- estfun(fm) inc.obs <- complete.cases(data[,names(fm$model)]) # Shared data for degrees-of-freedom corrections N <- dim(fm$model)[1] NROW <- NROW(est.fun) K <- fm$rank # Calculate the sandwich covariance estimate cov <- function(cluster) { cluster <- factor(cluster) # Calculate the "meat" of the sandwich estimators u <- apply(est.fun, 2, function(x) tapply(x, cluster, sum)) meat <- crossprod(u)/N # Calculations for degrees-of-freedom corrections, followed # by calculation of the variance-covariance estimate. # NOTE: NROW/N is a kluge to address the fact that sandwich uses the # wrong number of rows (includes rows omitted from the regression). M <- length(levels(cluster)) dfc <- M/(M-1) * (N-1)/(N-K) dfc * NROW/N * sandwich(fm, meat=meat) } # Calculate the covariance matrix estimate for the first cluster. cluster1 <- data[inc.obs,cluster1] cov1 <- cov(cluster1) if(is.null(cluster2)) { # If only one cluster supplied, return single cluster # results return(cov1) } else { # Otherwise do the calculations for the second cluster # and the "intersection" cluster. cluster2 <- data[inc.obs,cluster2] cluster12 <- paste(cluster1,cluster2, sep="") # Calculate the covariance matrices for cluster2, the "intersection" # cluster, then then put all the pieces together. cov2 <- cov(cluster2) cov12 <- cov(cluster12) covMCL <- (cov1 + cov2 - cov12) # Return the output of coeftest using two-way cluster-robust # standard errors. return(covMCL) } }