Remember the X-Files?

With the return of the X-Files in the form of a miniseries, I was tempted to catch up on the original run of the show, since I had only seen the occasional episode in the late ’90s or early ’00s (my mom was a big fan). Being me, I had already looked up the X-Files episode ratings on trakt.tv to see if there was something interesting about them, but I didn’t think there was. However, when I listened to the Incomparable talking about the show, I learned that apparently the X-Files can be divided into the “myth arc” and regular, more stand-alone episodes. That’s when I realized I needed to get my TV show analysis boots on and see what I could do. To my delight, I noticed that the appropriate Wikipedia article neatly marks the myth arc episodes, ready for plucking.

And then I started plucking.

2015 TV Recap

I’m late to the game, I know, but I’ve been keeping a list of notable shows I watched in 2015 since spring, and I think I owe it to past-me to put that in blog-form.

Jessica Jones recap

So I’ve been watching Marvel’s Jessica Jones over the past couple of days, as one does, and I have opinions and stuff about it. However, since I believe that a plot is worth more than word stuff, I present to you my viewing experience in data.

Quickly Compare your TV show ratings to trakt.tv

library(tRakt) # install via devtools::install_github("jemus42/tRakt")
library(dplyr)
library(tidyr)
library(ggplot2)
get_trakt_credentials(username = "Your Username")
slug <- "dig" # Slug from trakt.tv show url
trakt.user.ratings(type = "episodes") %>%
  filter(show.slug == slug) %>%
  arrange(season, episode) %>%
  select(rating, season, episode, title) %>%
  mutate(season = factor(season, ordered = T)) %>%
  rename(user.rating = rating) %>%
  left_join((trakt.get_all_episodes(slug) %>% select(rating, title, epnum))) %>%
  gather("type", value = "rating", user.rating, rating) %>%
  ggplot(data = ., aes(x = epnum, y = rating, colour = type)) +
  geom_point(size = 6, colour = "black") +
  geom_point(size = 5) +
  ylim(c(5, 10)) +
  scale_colour_discrete(labels = c("My Rating", "Trakt.

Shows going down the drain lately

I don’t know if you’ve noticed, but lately I’ve done a lot of stuff with TV shows. Along the way, I noticed a trend with a few shows that seemed quite interesting to me: some of them were going straight down the drain, at least as far as their recent ratings are concerned. The projects I’m referring to are these two: 100 Popular Shows on trakt.tv, 100 Trending Shows on trakt.

So I threw R at a thousand(ish) TV shows

Analyzing TV shows seems to be what I do these days. So I wanted to keep my newfound calling going and sucked the data for about a thousand shows out of the trakt.tv API, which was nice enough to only fail on me, like, twice. So, after some time of intense data pulling, I found myself with the more or less complete data (show info, season info, episode data) for 988 shows (and that’s why I keep referring to 1000(ish)).

Overanalyzing TV Shows

Overanalyzing TV shows has kind of become my jam. So why not totally overdo it? Note that everything I describe in this blogpost is purely for the lulz, and I don’t pretend there’s any scientific merit to it. I just like throwing maths at data. After I more or less successfully plotted all the things, I wanted to go full-blown statisticy on the subject. While my knowledge of statistics isn’t nearly as extensive as I’d like it to be, I at least know a little about comparing groups.
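
For a rough idea of what "comparing groups" can look like here, this is a minimal sketch in base R that compares episode ratings across seasons. The ratings are made up for illustration (they are not real trakt.tv data), and the choice of a one-way ANOVA plus a Kruskal-Wallis test is an assumption about which tests fit, not necessarily what the post itself goes on to use.

```r
# Hypothetical ratings for three seasons of a made-up show (not real data).
ratings <- data.frame(
  season = factor(rep(1:3, each = 5)),
  rating = c(7.9, 8.1, 8.0, 8.3, 7.8,
             7.5, 7.7, 7.4, 7.6, 7.9,
             6.9, 7.1, 7.0, 6.8, 7.2)
)

# One-way ANOVA: do mean ratings differ between seasons?
fit <- aov(rating ~ season, data = ratings)
print(summary(fit))

# Rank-based alternative that does not assume normally distributed ratings.
kw <- kruskal.test(rating ~ season, data = ratings)
print(kw)
```

With real data, `ratings` would simply be replaced by a data frame with one row per episode, a `season` factor and a numeric `rating` column.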

Introducing tRakt

It’s been a while since I started working on a set of functions to pull data from trakt.tv. I documented part of the early process in an earlier blogpost, and since then I started aggregating my work into a proper package. Since trakt launched their new APIv2, I started to rewrite and enhance the package a little, also solidifying the whole authentication business. I have not implemented any OAuth2 methods, but since the purpose of this package is to pull a bunch of data and not to perform actions like checkins, I don’t think it’s a big deal.

Solving problems nobody ever has

Remember that last post? No? Good. Then don’t scroll down. Or do. Idunno. One thing I wanted for my more-or-less-automated TV show plots was appropriate colors to differentiate seasons. I assume that’s a problem we can all relate to. Of course in the R and ggplot2 bubble, there’s the RColorBrewer package that provides nice and easy color palettes of varying sizes. But that’s boring. Also, repetitive. So let’s fix that.
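
To make the problem concrete, here is a minimal sketch of one way to get a distinct color per season for any number of seasons, using only base R. `season_palette()` is a hypothetical helper made up for this example, and evenly spaced hues are just one possible approach, not necessarily the fix the post ends up with.

```r
# Sketch: one distinct colour per season, for any number of seasons.
# season_palette() is a hypothetical helper, not part of any package.
season_palette <- function(n_seasons) {
  stopifnot(n_seasons >= 1)
  # Evenly spaced hues around the colour wheel; the extra point is dropped
  # because 375 degrees wraps around to the same hue as 15 degrees.
  hues <- seq(15, 375, length.out = n_seasons + 1)[seq_len(n_seasons)]
  grDevices::hcl(h = hues, c = 100, l = 65)
}

# Ten seasons, ten hex colour strings.
season_colours <- season_palette(10)
print(season_colours)
```

In a ggplot2 plot, a vector like this could be passed to `scale_colour_manual(values = season_colours)`, assuming the season variable is a factor with as many levels as there are colors.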

I just wanted to rewatch Stargate

Stargate SG-1, while probably a mediocre show in the grand scheme of sci-fi shows, is the sci-fi show I grew up with, so I tend to enjoy rewatching parts of it occasionally. Well, at least I’ve rewatched it twice so far. The full thing. 10 seasons. Yep. Even those last two. So this time, I wanted to cherry-pick the good™ episodes, and of course efficient cherry-picking in 2014 involves R, the trakt.