The common approach to estimating a binary dependent variable regression model is to use either the logit or probit model. Both are forms of generalized linear models (GLMs), which can be seen as modified linear regressions that allow the dependent variable to originate from non-normal distributions.

The coefficients in a linear regression model are marginal effects, meaning that they can be treated as partial derivatives. This makes the linear regression model very easy to interpret. For example, the fitted linear regression model y=x*b tells us that a one unit increase in x increases y by b units.

Unfortunately, this is not the case in GLMs, because fitted GLMs take the form, y=G (x*b), where G(.) is the known link function (i.e. inverse logistic for logit). Since we want to calculate the slope of x which is inside the function G(.), calculating marginal effects that are comparable to their linear model counterparts involves using the chain rule.

Empirical economic research typically cites the marginal effects since they are intuitive and easy to digest. Therefore, it is important for researchers to be able to compute these results. A more formal treatment of this problem can be found in the following paper, where we see that the solution is to multiply the estimated GLM coefficients by the probability density function of the linked distribution (which is the derivative of the cumulative density function). Interestingly, the linked paper also supplies some R code which calculates marginal effects for both the probit or logit models. In the code below, I demonstrate a similar function that calculates ‘the average of the sample marginal effects’.

mfxboot <- function(modform,dist,data,boot=1000,digits=3){ x <- glm(modform, family=binomial(link=dist),data) # get marginal effects pdf <- ifelse(dist=="probit", mean(dnorm(predict(x, type = "link"))), mean(dlogis(predict(x, type = "link")))) marginal.effects <- pdf*coef(x) # start bootstrap bootvals <- matrix(rep(NA,boot*length(coef(x))), nrow=boot) set.seed(1111) for(i in 1:boot){ samp1 <- data[sample(1:dim(data)[1],replace=T,dim(data)[1]),] x1 <- glm(modform, family=binomial(link=dist),samp1) pdf1 <- ifelse(dist=="probit", mean(dnorm(predict(x1, type = "link"))), mean(dlogis(predict(x1, type = "link")))) bootvals[i,] <- pdf1*coef(x1) } res <- cbind(marginal.effects,apply(bootvals,2,sd),marginal.effects/apply(bootvals,2,sd)) if(names(x$coefficients[1])=="(Intercept)"){ res1 <- res[2:nrow(res),] res2 <- matrix(as.numeric(sprintf(paste("%.",paste(digits,"f",sep=""),sep=""),res1)),nrow=dim(res1)[1]) rownames(res2) <- rownames(res1) } else { res2 <- matrix(as.numeric(sprintf(paste("%.",paste(digits,"f",sep=""),sep="")),nrow=dim(res)[1])) rownames(res2) <- rownames(res) } colnames(res2) <- c("marginal.effect","standard.error","z.ratio") return(res2) }

This command also provides bootstrapped standard errors, which account for both the uncertainty in the predicted values and the estimated coefficients. The R package ‘erer’ also has a function that calculates these marginal effects. Illustrations of the above function, alongside code for a nice ggplot2 figure are displayed below.

library(AER) data(SwissLabor) mfx1 <- mfxboot(participation ~ . + I(age^2),"probit",SwissLabor) mfx2 <- mfxboot(participation ~ . + I(age^2),"logit",SwissLabor) mfx3 <- mfxboot(participation ~ . + I(age^2),"probit",SwissLabor,boot=100,digits=4) mfxdat <- data.frame(cbind(rownames(mfx1),mfx1)) mfxdat$me <- as.numeric(as.character(mfxdat$marginal.effect)) mfxdat$se <- as.numeric(as.character(mfxdat$standard.error)) # coefplot library(ggplot2) ggplot(mfxdat, aes(V1, marginal.effect,ymin = me - 2*se,ymax= me + 2*se)) + scale_x_discrete('Variable') + scale_y_continuous('Marginal Effect',limits=c(-0.5,1)) + theme_bw() + geom_errorbar(aes(x = V1, y = me),size=.3,width=.2) + geom_point(aes(x = V1, y = me)) + geom_hline(yintercept=0) + coord_flip() + opts(title="Marginal Effects with 95% Confidence Intervals")

Thanks for the post. The tricky part is of course to get the polynomial right which is part of the reason why one needs to be careful with these marginal effects. The other reason is that the effect might be far from linear over the range of the regressor. Personally I recommend using the effects package for computing and visualizing effects directly. If one uses poly (age, 2) to specify the quadratic polynomial in age then it also assesses its effect correctly. Then you can simply do something like plot (allEffects (obj), ask = FALSE, rescale = FALSE)).

Hey Z. You’re correct. I took this example from the ‘Econometrics with R’ book, and do not advocate using this approach for evaluating nonlinear relationships.

Personally, I recommend using the ‘mcgv’ package to model nonlinear relationships. However, I have had limited exposure to the ‘effects’ package.

Thanks for pointing this out!

I wasn’t referring to the polynomial being inappropriate for modeling the nonlinear effect of age, that is ok in this application. (And it certainly would also be ok to use a smooth GAM here, depending on what exactly you want to learn from the model.) The point was that in the linear predictor a * age + b * age^2 taking the derivative with respect to age yields a + 2 * b * age. Thus, it doesn’t make sense to take one derivative with respect to age (yielding a) and another one with respect to age^2 (yielding b), although this is often done in practice.

Now back at the computer (rather than just my mobile phone), I can also give a complete example for the effects visualization:

data(“SwissLabor”, package = “AER”)

m <- glm(participation ~ income + education + poly(age, 2, raw = TRUE) +

youngkids + oldkids + foreign, data = SwissLabor, family = binomial)

library("effects")

plot(allEffects(m), ask = FALSE, rescale.axis = FALSE)

The model m has the same coefficients as the age + I(age^2) approach but the effects package understands that both pertain to the same variable age. Furthermore, the visualization shows that for some regressors a "typical" slope as computed by marginal effects is ok while for other regressors (e.g., youngkids), the slope changes considerably over the range of "typical" values of the regressors.

Both aspects will be pointed out in more detail in the next edition of the AER book.

For more details on the effects package, see http://www.jstatsoft.org/v08/i15/ and http://www.jstatsoft.org/v32/i01/.

Thanks for the response Z. My previous response is somewhat ambiguous. I should have been clear — I agree that polynomials are not inappropriate for modeling nonlinear relationships, but an interpretation of their marginal effects requires extra consideration.

I must familiarize myself with the ‘effects package’. Another blogpost await I think!

Looking forward to the next AER book 🙂

Just one quick question, is the predict(x,…) on lines 15 and 16 supposed to be predixt(x1,…)?

Fixed. Good spot. Thanks!

Hi, I was just wondering – could you adapt this to multinomial logistic regression somehow? I am in the progress of trying (unsuccessfully) to convert the code – but hoping that maybe someone smarter would have some helpful suggestions! thanks so much in advance and the code is great, btw 🙂

Hi! Thanks for the comment. I am not familiar with multinomial models. However, I want to help you. Let me read up on this, and I will email you back.

Actually, have you seen the mlogit package? Looks like you can calculate marginal effects with the effects() function.

http://cran.r-project.org/web/packages/mlogit/

Hi! Wow, thanks for the quick reply. I in fact new of th package, but didn’t realize it has a command for marginal effects. Whooops. If I manage to do a bootstrap with it, just as you demonstrated with the logit/probit, I’ll let you know… Best

“knew” of course and not “new”

Hi again, sorry for the late reply but here is some code that takes the example marginal effects calculation from the mlogit package example, and runs this procedure through a bootstrapping function with the package boot. Hopefully, this will help you.

library(mlogit)

library(boot)

# function to obtain regression weights

data(“Fishing”, package = “mlogit”)

Fish <- mlogit.data(Fishing, varying = c(2:9), shape = "wide", choice = "mode")

m <- mlogit(mode ~ price | income | catch, data = Fish)

z <- with(Fish, data.frame(price = tapply(price, index(m)$alt, mean),

catch = tapply(catch, index(m)$alt, mean),

income = mean(income)))

# compute the marginal effects (the second one is an elasticity

effects(m, covariate = "income", data = z)

bs <- function(formula, data, indices) {

d <- data[indices,] # allows boot to select sample

mfx <- effects(formula, covariate = "income", data = d)

return(mfx)

}

boot(data=z, statistic=bs, R=1000, formula=m)

I modified some of this code to do some of the things I needed, and had a couple suggestions. First, while (I assume) glm() passes the dataframe in the “data=” argument to the actual glm function, many other functions like tobit() won’t do this–tobit(modform, data=mydata) merely instructs the survival package to look for a global dataframe called mydata. This means two things: 1) the dataframe you pass to mfxboot needs to be a global object, and 2) samp1 also needs to be made a global object,which is easy if you just add an extra carat: “samp1<<-mydata[sample(),]"

A bigger concern I have though is that this function merely calculates the average of the marginal effects for the data as given. It would be better to generalize the function to allow it to calculate marginal effects that are local to specific values of the regressors–for example, the average marginal effect for just women rather than for men and women. If I did it right, this merely requires modifying the use of the predict() function, which takes a "newdata=" argument where you can input data with specified characteristics to make the predictions on.

Hi Matthew,

Please see this package: http://cran.r-project.org/web/packages/mfx/index.html.

You have the complete package that compute cluster robust standard errors as well here,

http://www.google.fr/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0CCkQFjAB&url=http%3A%2F%2Fcran.r-project.org%2Fweb%2Fpackages%2Fmfx%2Fmfx.pdf&ei=iTmYU6Ac48jTBfj9gNAN&usg=AFQjCNGiA0vSASr9WOlegMehst1yamCm2A&sig2=ZSWT_CE8nnp5qwb9x9rTGQ&bvm=bv.68693194,d.d2k

I know, I wrote the package!

My apologies. Great job!

@diffuseprior: So I did stumble upon your mfxboot function and used for some of my work. Nonetheless, I have stumbled on some errors when running it lately. Here is the error message I get:

“Error in bootvals[i, ] <- pdf1 * coef(x1) :

number of items to replace is not a multiple of replacement length"

Any idea why that might be? Here is the link to the question I posted today on the matter: http://stats.stackexchange.com/questions/120917/mfxboot-function-for-marginal-effects-for-probit-regressions

I am particularly interested in your code as it features a nice ggplot code for plotting, which is quite convenient.

Thanks

Use the mfx package on CRAN instead.

I had to change the last line in the plot to

theme(title=element_text(‘Marginal Effects with 95% Confidence Intervals’))

to keep track with changes in ggplot2

thanks justin

Thanks so much for this post, here it is 3 years later and people (well me anyhow) are still finding it when searching this topic and finding it very helpful.

I have a very rudimentary question. I’m still confused on the interpretation of the ‘marginal effect’ of predictors in a logistic regression. In your example above (which I plugged into R and easily reproduced your results), the marginal effect of age is 0.687. But what does that mean? It can not mean that an increase in one year of age increases the probability of ‘participation’ by 0.687, because then an increase in 2 years of age would increase the probability by 1.37 and that can’t be.

I appreciate any clarity you can provide.

Hi Steven, sorry for the late reply. Age is quadratic so you have to take into account the squared term, something mfx does not do at the moment, but may do in the future. Look at John Foxes effects package for some help interpreting nonlinear regressors.