texreg: A package for beautiful and easily customizable LaTeX regression tables from R

There was a very informative post last week showing how the R package stargazer is used to generate nice LaTeX tables from a number of R objects. This package looks very useful. However, I would like to extol the virtues of another R package that converts model objects in R into LaTeX code: texreg.

For me, the texreg package has one very important advantage compared to stargazer, which is (and please correct me if I am wrong on this point) that the regression model’s output is very easy to customize. There are a number of examples where this feature of texreg is important, such as the inclusion of robust/cluster-robust standard errors, or changing the coefficients of a generalized linear model to read as marginal effects.

In the example below, I have replicated (with some modifications) the code used for my last blog post showing the importance of clustering standard errors. Basically this code simulates a bunch of data, estimates a regression model, then duplicates the dataset three times, which in turn reduces the standard errors of the regression model. The last step shows how the clustered standard errors will not be reduced by duplicating the dataset. Once I perform this analysis, I use the output code to generate a texreg object that gives the code needed to create the regression table in LaTeX.

In this example, I have used the texreg override commands in the texreg function so that the standard errors and p-values (needed to calculate the stars in the regression table) reflect the clustering performed with the robust.se() function. The output from this, once the code is run through a LaTeX compiler (I use TeXworks), is shown in the image below (I have also adjusted the caption so that it is on the top of the table rather than the bottom). One can download the latest version of texreg from the packages R-forge project page. This page also features a very helpful and active forum.


library(lmtest) ; library(sandwich)

# data sim

x <- rnorm(1000)
y <- 5 + 2*x + rnorm(1000,0,40)
# regression
m1 <- lm(y~x)

# triple data
dat <- data.frame(x=c(x,x,x),y=c(y,y,y),g=c(1:1000,1:1000,1:1000))
# regressions
m2 <- lm(y~x, dat) # smaller StErrs

# cluster robust standard error function
robust.se <- function(model, cluster){
  M <- length(unique(cluster))
  N <- length(cluster)
  K <- model$rank
  dfc <- (M/(M - 1)) * ((N - 1)/(N - K))
  uj <- apply(estfun(model), 2, function(x) tapply(x, cluster, sum));
  rcse.cov <- dfc * sandwich(model, meat = crossprod(uj)/N)
  rcse.se <- coeftest(model, rcse.cov)
  return(list(rcse.cov, rcse.se))

m3 <- robust.se(m2,dat$g)[[2]] # StErrs now back to what they are

       caption="The Importance of Clustering Standard Errors",