Whenever somebody mentions “The Simpsons” it always stirs up feelings of nostalgia in me. The characters, uproarious gags, zingy one-liners, and edgy animation all contributed towards making what is, arguably, the greatest TV show ever. However, it’s easy to forget that “The Simpsons” is still ongoing, and in its twenty-fourth season no less.
For me, and most others, the later episodes bear little resemblance to the older ones. The current incarnation of the show is stale, and has been for a long time. I haven’t watched a new episode in over ten years, and don’t intend to any time soon. When did this decline begin? Was it part of a slow secular trend, or was there a sudden drop in quality from which there was no recovery?
To answer these questions I use the Global Episode Opinion Survey (GEOS) episode ratings data, which are published online. A simple web scrape of the “all episodes” page provides me with 423 episode ratings, spanning from the first episode of season 1 to the third episode of season 20. After S20E03 the ratings become too sparse, which is probably a function of how bad the show is in its current condition. To detect changepoints in the show's ratings, I have used the R package changepoint. An informative introduction to both the package and changepoint analysis can be found in this accompanying vignette.
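To make the idea of a change in mean concrete, here is a minimal base-R sketch. It is not the PELT algorithm that the changepoint package uses; it is only a brute-force search for a single split that minimises the within-segment sum of squares. The data are simulated, and the helper name best_single_split is made up for illustration.

```r
# Simulated ratings: 60 episodes around 8, then 40 episodes around 6.5
# (illustrative numbers only, not GEOS data)
set.seed(1)
rate <- c(rnorm(60, mean = 8, sd = 0.3), rnorm(40, mean = 6.5, sd = 0.3))

# Hypothetical helper: try every split point and keep the one that
# minimises the total within-segment sum of squared deviations
best_single_split <- function(x) {
  n <- length(x)
  cost <- sapply(1:(n - 1), function(k) {
    left  <- x[1:k]
    right <- x[(k + 1):n]
    sum((left - mean(left))^2) + sum((right - mean(right))^2)
  })
  which.min(cost)   # index of the last observation in the first segment
}

k <- best_single_split(rate)
k
tapply(rate, seq_along(rate) > k, mean)   # segment means on either side of the split
```

A multiple-changepoint method such as PELT generalises this by searching over many splits at once and adding a penalty for each extra changepoint, so that it does not simply split the series at every episode.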
The figure above provides a summary of my results. Five breakpoints were detected. The first occurs in the first episode of the ninth season: The City of New York vs. Homer Simpson. Most will remember this one: Homer goes to New York to collect his clamped car and ends up going berserk. A good episode, although it essentially marked the beginning of the end.
According to the changepoint results, the decline occurred in three stages. The first lasted from the New York episode up until episode 11 in season 10, Wild Barts Can’t Be Broken, where the next breakpoint is detected. The shows in this stage have an average rating of about 7. The next stage roughly marks my own transition, as it is about this point that I stopped watching. This stage lasts as far as S15E09, whereupon the show suffers the further ignominy of another ratings downgrade. Things couldn’t possibly get any worse, and they don’t: the show earns a minor reprieve after the twentieth episode of season 18.
So now you know. R code for the analysis can be found below.
# packages
library(Hmisc) ; library(changepoint)
# clear ws
rm(list=ls())
# webscrape data
page1 = "http://www.geos.tv/index.php/list?sid=159&collection=all"
home1 = readLines(con<-url(page1)); close(con)
# pick out lines with ratings
means = '<td width="60px" align="right" nowrap>'
epis = home1[grep(means, home1)]
epis = epis[57:531]
epis = epis[49:474]
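# prune data: locate the closing </span> tag that follows each rating
loc = function(x) substring.location(x,"</span>")$first[1]
epis = data.frame(epis)
epis = cbind(epis,apply(epis, 1, loc))
epis$cut = NA
# the rating is the four characters immediately before </span>
for(i in seq_len(nrow(epis))){
  epis[i,3] = substr(epis[i,1], epis[i,2]-4, epis[i,2]-1)
}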
#create data frame
ts1 = data.frame(rate=epis$cut, episode=50:475)
# remove out of season shows and movie
ts1 = ts1[!(ts1$episode %in% c(178,450,451)),]
# make numeric
ts1$rate = as.numeric(as.character(ts1$rate))
# changepoint function
mean2.pelt = cpt.mean(ts1$rate,method='PELT')
# plot results
plot(mean2.pelt,type='l',cpt.col='red',xlab='Episode',
     ylab='Average Rating',cpt.width=2, main="Changepoints in 'The Simpsons' Ratings")
# what episodes ?
# The City of New York vs. Homer Simpson
# Wild Barts Can't Be Broken
# I, (Annoyed Grunt)-Bot
# Stop Or My Dog Will Shoot!
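
The step from a detected changepoint to a named episode can be sketched as follows. This is a toy illustration, not the post's actual output: the data frame toy and the position cp_pos are made up. The point is that a changepoint is reported as a position in the rating vector, which only matches the episode number if no rows were dropped.

```r
# Toy stand-in for ts1: episode numbers 50..60 with episode 55 removed,
# mirroring how the post drops out-of-season shows (values are made up)
toy <- data.frame(rate = c(8, 8, 8, 8, 8, 6, 6, 6, 6, 6),
                  episode = setdiff(50:60, 55))

# Suppose a changepoint is reported at position 5 of the rating vector
# (hypothetical value; with the post's objects it would come from
#  cpts(mean2.pelt) in the changepoint package)
cp_pos <- 5

# Position -> episode number of the last episode before the shift
toy$episode[cp_pos]
```

With the post's own objects, the equivalent lookup would presumably be ts1$episode[cpts(mean2.pelt)], assuming cpts() returns positions within ts1$rate.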

Interesting! I’m always interested in what data people like to analyze. (I might have to borrow this example for one of my classes.)
I’m not sure whether the ratings are repeated measures for the same people, or by different people, or both.
Anyway, I would be more comfortable with continuous changes rather than jumps (people gradually gaining or losing interest), and would try lowess for example here.
There’s probably also some short-term dependence that might be included by using time-series methods on these data.
I tried lowess and a time series model, though they don’t shed much light. Code, to add to the bottom of yours:
# lowess
plot(ts1$rate,type="l")
lines(lowess(ts1$rate))
# auto.arima
library(forecast)
ts1.aa=auto.arima(ts1$rate)
ts1.aa
plot(forecast(ts1.aa,h=100))
The lowess just shows a general downward trend, and the time series obtains an MA(1) model on the first differences.
I played with the lowess a little. Though the default f=2/3 usually works pretty well for me, I found that f=0.2 seemed to track the trend better. There’s always a tradeoff between insight and over-smoothing, though: is that really a decrease and increase at the end, or just chance?
Thanks for sharing your code, so that this mini-analysis took me about 2 minutes!
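The span tradeoff described in this comment can be sketched on simulated data. This is only an illustration under assumed numbers (a step down followed by a small late recovery), not the GEOS ratings; lowess() and its f argument are from base R.

```r
# Simulated series with a step down and a late partial recovery
# (illustrative only, not the GEOS ratings)
set.seed(42)
rate <- c(rnorm(150, 8, 0.4), rnorm(200, 6.8, 0.4), rnorm(75, 7.1, 0.4))

smooth_default <- lowess(rate)            # f = 2/3, the default span
smooth_narrow  <- lowess(rate, f = 0.2)   # narrower span, follows local changes

plot(rate, type = "l", col = "grey", xlab = "Episode index", ylab = "Rating")
lines(smooth_default, col = "blue", lwd = 2)
lines(smooth_narrow,  col = "red",  lwd = 2)
legend("topright", legend = c("f = 2/3", "f = 0.2"),
       col = c("blue", "red"), lwd = 2)
```

The narrower span follows the series more closely, but it will also chase noise, which is why a late dip and rise in the smooth is hard to tell apart from chance.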
Thanks for the reply Ken.
You are right, there are loads of different ways in which one could model this series. However, the question at hand is to pinpoint the episode where “The Simpsons” went sour, and to trace the show’s evolution from that point. For me, the changepoint technique is the most appropriate in this case.
Just a question. What is the meaning of these lines?
epis = epis[57:531]
epis = epis[49:474]
Those lines just prune the character vector taking out lines that do not contain ratings.
Interesting analysis
I just developed a fun Shiny App based on your code http://glimmer.rstudio.com/pssguy/TVShowRatings
with a brief blog post
http://www.premiersoccerstats.com/wordpress/?p=1380
That app is cool. Nice one!
Thanks. Apparently, it made the front page of digg.com today so I have added a line on it with a link to your article
Thanks for not hogging all the limelight!
Thanks for the interesting reproducible code… though, is the x-axis mislabeled? The episodes run 50-475, but the axis goes 0-425.
Hi Wes, That was just me being lazy. I think I just used the plot function that comes with the changepoint package. I thought this would be fine because the first 50 shows weren’t really “The Simpsons” per se, but more part of another show.
Hey there, are you using WordPress for your site platform? I’m new to the blog world but I’m trying to get started and create my own. Do you need any coding expertise to make your own blog? Any help would be really appreciated!
Yes, I am using a basic WordPress template. No programming experience is needed, so just register and get going.
I think the graphic is mislabeled. The GEOS information is correct, but it is not about ratings; it is the mean score that people gave to each episode. A more accurate title for the graphic would be “Changepoints in The Simpsons GEOS mean score”. If it were about ratings then, for example, the fall after the first episode of the ninth season would be from 10.5.
P.S. Thanks for the code and the data analysis, it has been very useful.
Fascinating analysis, and it’s the first I’ve seen that corresponds exactly with my feelings about The Simpsons’ decline. Season 8 had, in my opinion, a few of the best episodes of all time, but it also contained clear cracks. Not many. It was, on the whole, a great series. But the cracks are there. Series 9 was the opposite: mostly sub-par (although not terrible) with a few good ones. Series 10 onwards was dreadful for anyone who thinks Season 4 was back-to-back genius. The episode where it is revealed that Principal Skinner is not really who he claims to be is where it all fell apart for me. They even caveat the end of that episode, with the characters saying “Let’s never talk about this again.”
Great comment Andrew.
Thanks diffuseprior. I’ve been looking at the rest of your site and it’s really interesting.
As for The Simpsons, most (or all) other articles I’ve read generally point to the decline as being much later than I or my friends tend to think it began. So it was refreshing to see your analysis.