Yes yes yes... no!

Exploring the similarities and differences of shows that ended on a controversial note

There are some shows that are (or were) really popular, everyone is excited about them, and then they go down the drain in a way that’s both abrupt and spectacular. Some take their time over a whole season, others keep you hoping (and quite possibly in denial) until the end, and then they just kick you in the ol’ hope organ.

I was wondering (for no particular recent-eventsy kind of reason at all, I swear) if some of the shows I recall being considered “bad enders” have something in common, or, more interestingly, end badly in different ways.

Code: Data collection

library(tRakt)
library(kableExtra)
library(dplyr)
library(ggplot2) # needed for the plots further down

shows <- tribble(
  ~show, ~slug,
  "Dexter", "dexter",
  "Lost", "lost-2004",
  "How I Met Your Mother", "how-i-met-your-mother",
  "Scrubs", "scrubs",
  "Battlestar Galactica (2003)", "battlestar-galactica-2003",
  "Game of Thrones", "game-of-thrones"
)

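# cache_path and the three caching helpers used here are not defined in this
# post; presumably they are small custom helpers set up beforehand
if (file_not_cached(cache_path, episodes)) {
  # Fetch all seasons per show, including episode-level details
  episodes <- purrr::pmap_df(shows, ~{
    trakt.seasons.summary(.y, extended = "full", episodes = TRUE) %>%
      pull(episodes) %>%
      bind_rows() %>%
      select(-available_translations) %>%
      mutate(
        show = .x,
        season = as.character(season),
        # Running episode number across all seasons of a show
        episode_abs = seq_along(first_aired)
      )
  })

  cache_file(cache_path, episodes)
} else {
  episodes <- read_cache_file(cache_path, episodes)
}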
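Since `cache_path`, `file_not_cached()`, `cache_file()` and `read_cache_file()` aren’t defined anywhere in this post, the chunk above won’t run on its own. Here’s a minimal stand-in that would have to be defined first. It assumes each object is stored as an `.rds` file named after the variable; the file naming, the temp-dir location and the use of `saveRDS()` are assumptions, not necessarily how the original helpers work.

```r
# Hypothetical stand-ins for the caching helpers. The names match the calls
# above, but the implementation (one .rds file per object) is an assumption.
cache_path <- file.path(tempdir(), "trakt-cache")

# Build the cache file name from an object name, e.g. "episodes.rds"
cache_file_path <- function(cache_path, name) {
  file.path(cache_path, paste0(name, ".rds"))
}

# TRUE if there is no cached copy yet. x is only used for its name and never
# evaluated, so the object doesn't have to exist at this point.
file_not_cached <- function(cache_path, x) {
  name <- deparse(substitute(x))
  !file.exists(cache_file_path(cache_path, name))
}

# Write the object to the cache, creating the directory if needed
cache_file <- function(cache_path, x) {
  name <- deparse(substitute(x))
  dir.create(cache_path, showWarnings = FALSE, recursive = TRUE)
  saveRDS(x, cache_file_path(cache_path, name))
}

# Read the cached copy back; again, x is only used for its name
read_cache_file <- function(cache_path, x) {
  name <- deparse(substitute(x))
  readRDS(cache_file_path(cache_path, name))
}
```

Passing the bare variable name works because R evaluates arguments lazily: `substitute(x)` only looks at the expression, which is why `file_not_cached(cache_path, episodes)` can be called before `episodes` exists.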

Here are the highest-rated episodes per show to get started:


episodes %>%
  group_by(show) %>%
  top_n(3, rating) %>%
  arrange(rating, .by_group = TRUE) %>%
  mutate(
    rating = round(rating, 1)
  ) %>%
  select(
    show, season, episode, title, rating
  ) %>%
  kable(
    col.names = c("Show", "Season", "Episode", "Title", "Rating"),
    caption = "Top 3 episodes per show",
    digits = 2
  ) %>%
  kable_styling(bootstrap_options = c("condensed")) %>%
  collapse_rows(1)
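
Table 1: Top 3 episodes per show

| Show | Season | Episode | Title | Rating |
|---|---|---|---|---|
| Battlestar Galactica (2003) | 4 | 20 | Daybreak (2) | 8.5 |
| | 3 | 4 | Exodus (2) | 8.6 |
| | 3 | 20 | Crossroads (2) | 8.6 |
| Dexter | 7 | 12 | Surprise, Motherfucker! | 8.7 |
| | 6 | 12 | This is the Way the World Ends | 8.9 |
| | 4 | 12 | The Getaway | 8.9 |
| Game of Thrones | 7 | 4 | The Spoils of War | 8.8 |
| | 6 | 10 | The Winds of Winter | 8.8 |
| | 6 | 9 | Battle of the Bastards | 8.9 |
| How I Met Your Mother | 7 | 24 | The Magician’s Code: Part Two | 8.4 |
| | 2 | 9 | Slap Bet | 8.4 |
| | 8 | 12 | The Final Page: Part Two | 8.6 |
| Lost | 5 | 17 | The Incident (2) | 8.4 |
| | 4 | 5 | The Constant | 8.5 |
| | 3 | 23 | Through The Looking Glass (2) | 8.5 |
| Scrubs | 3 | 14 | My Screw Up | 8.4 |
| | 8 | 18 | My Finale | 8.4 |
| | 5 | 20 | My Lunch | 8.4 |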

Per-episode ratings are always neat to look at.

ggplot(episodes, aes(x = episode_abs, y = rating)) +
  geom_point(alpha = .75) +
  scale_y_continuous(breaks = 0:10, minor_breaks = seq(0, 10, .5)) +
  facet_wrap(~show, ncol = 1) +
  labs(
    title = "Episode Ratings per Show",
    subtitle = "Ratings on trakt.tv",
    x = "Absolute Episode #",
    y = "Rating (1-10)"
  )
Figure: Episode ratings per show

Since I’m primarily interested in the rating of the ending compared to the average for the specific show, we’ll standardize the ratings using each show’s mean and standard deviation. Just in case, we’ll get both centered and standardized ratings.

episodes <- episodes %>%
  group_by(show) %>%
  mutate(
    rating_c = rating - mean(rating),
    rating_z = rating_c / sd(rating)
  )
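To make the two transformations concrete, here’s a small sketch on made-up ratings (the shows and numbers are invented purely for illustration). Centering shifts every show to a mean of 0; standardizing additionally rescales to a standard deviation of 1, so a value of -2 means “two standard deviations below this show’s own average”.

```r
library(dplyr)

# Invented ratings for two hypothetical shows
toy <- tibble(
  show = rep(c("Show A", "Show B"), each = 4),
  rating = c(8, 8, 8, 6, 7, 8, 9, 8)
)

toy_scaled <- toy %>%
  group_by(show) %>%
  mutate(
    rating_c = rating - mean(rating), # distance from the show's own mean
    rating_z = rating_c / sd(rating)  # the same, in units of the show's sd
  ) %>%
  ungroup()

# Per show, centered ratings average to 0 and standardized ratings have sd 1
toy_scaled %>%
  group_by(show) %>%
  summarize(mean_c = mean(rating_c), sd_z = sd(rating_z))
```

The standardized version is what makes shows with tightly clustered ratings comparable to shows with more spread: the same drop in raw rating counts for more when a show’s ratings barely vary otherwise.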

episodes %>%
  group_by(show) %>%
  filter(rating == max(rating) | rating == min(rating)) %>%
  arrange(rating, .by_group = TRUE) %>%
  mutate_at(
    vars(starts_with("rating")), ~round(.x, 1)
  ) %>%
  select(
    show, season, episode, title, starts_with("rating")
  ) %>%
  kable(
    col.names = c("Show", "Season", "Episode", "Title", 
                  "Rating", "Rating (centered)", "Rating (standardized)"),
    caption = "Best and worst episode by show with centered/standardized ratings"
  ) %>%
  kable_styling(bootstrap_options = c("condensed")) %>%
  collapse_rows(1)
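
Table 2: Best and worst episode by show with centered/standardized ratings

| Show | Season | Episode | Title | Rating | Rating (centered) | Rating (standardized) |
|---|---|---|---|---|---|---|
| Battlestar Galactica (2003) | 2 | 14 | Black Market | 7.3 | -0.6 | -2.5 |
| | 3 | 20 | Crossroads (2) | 8.6 | 0.6 | 2.3 |
| Dexter | 8 | 12 | Remember the Monsters? | 6.7 | -1.5 | -6.0 |
| | 4 | 12 | The Getaway | 8.9 | 0.7 | 2.9 |
| Game of Thrones | 8 | 6 | The Iron Throne | 6.7 | -1.6 | -4.7 |
| | 6 | 9 | Battle of the Bastards | 8.9 | 0.7 | 2.1 |
| How I Met Your Mother | 9 | 11 | Bedtime Stories | 7.2 | -0.8 | -3.9 |
| | 8 | 12 | The Final Page: Part Two | 8.6 | 0.6 | 3.2 |
| Lost | 2 | 12 | Fire + Water | 7.5 | -0.4 | -2.4 |
| | 3 | 23 | Through The Looking Glass (2) | 8.5 | 0.5 | 3.1 |
| Scrubs | 9 | 12 | Our Driving Issues | 6.7 | -1.0 | -3.8 |
| | 5 | 20 | My Lunch | 8.4 | 0.7 | 2.5 |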

Plot them all together:

ggplot(episodes, aes(x = episode_abs, y = rating_c, fill = show)) +
  geom_point(alpha = .75, shape = 21) +
  scale_y_continuous(breaks = seq(-10, 10, .5), minor_breaks = seq(-10, 10, .25)) +
  labs(
    title = "Episode Ratings per Show",
    subtitle = "Centered Ratings",
    x = "Absolute Episode #",
    y = "Rating (centered)",
    fill = ""
  )
Figure: Centered episode ratings per show

We should also normalize the episode count, so we’ll take the absolute episode number and scale it to the interval (0, 100], where every show’s finale sits at exactly 100. Then we can interpret it as a percentage of total show run time.

episodes <- episodes %>%
  group_by(show) %>%
  mutate(
    episode_rel = (episode_abs / max(episode_abs)) * 100
  )
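As a quick sanity check of what this scaling does, here’s a tiny sketch for a hypothetical show with only four episodes. The alternative formula at the end is not used in this post; it’s only there to show what a mapping that starts at 0 would look like.

```r
# Hypothetical show with four episodes
episode_abs <- 1:4

# Scaling as used above: the finale lands on 100, the first episode on 100 / n
episode_rel <- (episode_abs / max(episode_abs)) * 100
episode_rel # 25, 50, 75, 100

# Alternative (not used here): map the first episode to 0 and the last to 100
episode_rel_alt <- (episode_abs - 1) / (max(episode_abs) - 1) * 100
episode_rel_alt
```

For shows with a hundred or more episodes the difference between the two is negligible, which is why the simpler version is good enough for plotting.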

ggplot(episodes, aes(x = episode_rel, y = rating_c, fill = show)) +
  geom_point(alpha = .75, shape = 21) +
  scale_y_continuous(breaks = seq(-10, 10, .5), minor_breaks = seq(-10, 10, .25)) +
  labs(
    title = "Episode Ratings per Show",
    subtitle = "Centered Ratings, normalized run time",
    x = "Relative Episode (% of Total Run)",
    y = "Rating (centered)",
    fill = ""
  )
Figure: Centered ratings, normalized episode number

For display purposes, we’ll label each show’s last season and last episode.

episodes <- episodes %>%
  group_by(show) %>%
  mutate(
    # season is stored as character, so convert both sides before comparing
    is_last_season = if_else(
      as.numeric(season) == max(as.numeric(season)), "Last Season", "Earlier Seasons"
    ),
    is_last_episode = if_else(episode_rel == 100, "Finale", "Earlier Episodes")
  )
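One thing to keep in mind when picking out the last season: `season` was converted to character during data collection, and the maximum of a character vector is determined by string comparison, not by numeric value. None of the shows here has ten or more seasons, so it makes no difference for this data, but here’s a small sketch of what would go wrong otherwise (the season vector is made up):

```r
# Season numbers stored as character, like the season column above
season <- c("1", "2", "9", "10")

max(season)             # "9": strings are compared character by character
max(as.numeric(season)) # 10: the actual last season

# Flag the last season using the numeric maximum
is_last_season <- ifelse(
  as.numeric(season) == max(as.numeric(season)),
  "Last Season", "Earlier Seasons"
)
is_last_season
```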

Now we’ll look at the previous plot, but highlight the last seasons of our shows:

ggplot(episodes, aes(x = episode_rel, y = rating_c, fill = is_last_season)) +
  geom_point(size = 2, alpha = .75, shape = 21) +
  scale_fill_brewer(palette = "Dark2") +
  #scale_y_continuous(breaks = 0:10, minor_breaks = seq(0, 10, .5)) +
  labs(
    title = "Episode Ratings per Show",
    subtitle = "All shows, centered ratings, normalized episode numbers",
    x = "Relative Episode (% of Total Run)",
    y = "Rating (centered)",
    fill = ""
  )
Figure: Centered episode ratings, colored by last/earlier season

Welp, not for all, but for most shows in the mix we’re seeing quite a noticeable dip at the end there.

ggplot(episodes, aes(x = is_last_season, y = rating_c, 
                     color = is_last_season, fill = is_last_season)) +
  geom_boxplot(alpha = .25) +
  geom_violin(alpha = .5) +
  geom_point(
    data = episodes %>% filter(is_last_episode == "Finale"),
    shape = 21, size = 4, color = "black", stroke = 1,
    key_glyph = "rect"
  ) +
  facet_wrap(~show, nrow = 1) +
  scale_x_discrete(breaks = NULL) +
  scale_fill_brewer(palette = "Dark2", aesthetics = c("color", "fill")) +
  labs(
    title = "Episode Ratings by Earlier/Last Season",
    subtitle = "The dot is the final episode",
    x = "", y = "Rating (centered)",
    color = "", fill = ""
  )
Figure: Last Seasons: A Boxplot

This is probably the most useful plot so far. Not only can we distinguish between the final season’s ratings and the remainder of the show, but we can also see if the finale itself was rated particularly differently.
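The same comparison can also be put into numbers. Below is a rough sketch of a per-show summary, assuming a data frame with the `show`, `rating_c`, `is_last_season` and `is_last_episode` columns created above. `summarize_endings()` and the toy data are hypothetical and only there so the example runs without hitting the API.

```r
library(dplyr)

# Hypothetical helper: per show, compare earlier seasons, last season and finale
summarize_endings <- function(episodes) {
  episodes %>%
    group_by(show) %>%
    summarize(
      mean_earlier = mean(rating_c[is_last_season == "Earlier Seasons"]),
      mean_last    = mean(rating_c[is_last_season == "Last Season"]),
      finale       = mean(rating_c[is_last_episode == "Finale"])
    )
}

# Invented data for a single made-up show
toy <- tibble(
  show = "Toy Show",
  rating_c = c(0.4, 0.2, 0.1, -0.2, -0.5),
  is_last_season = c(rep("Earlier Seasons", 3), rep("Last Season", 2)),
  is_last_episode = c(rep("Earlier Episodes", 4), "Finale")
)

endings <- summarize_endings(toy)
endings
```

Calling `summarize_endings(episodes)` on the real data should give one row per show, with a finale value well below `mean_last` pointing at a finale that was received worse than the season it concluded.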

Conclusion

I think it’s fair to say that “bad endings” and “controversial endings” are different categories. While BSG and Lost both have endings that left many people unsatisfied, they’re still not noticeably lower rated than the remainder of the show. On the contrary, they’re above average.

Then there’s the case of the bad last season. Scrubs didn’t have a bad ending per se; rather, the whole last season was too big a departure from what people liked about the show before, namely, well, the cast for one thing.

And then there’s the “well this is just bullshit” endings. Here we find Dexter, How I Met Your Mother, and of course, Game of Thrones.
These endings are special – they’re not “bad because I didn’t like it”-bad, they’re “bad because it doesn’t make any sense in the context of the hours and hours of previous material”.

At least that’s my hypothesis.

See also