If you’re like me, you probably have a serious issue with digital hoarding, refuse to delete anything and methodically collect and categorize all the things.
We need help, yes.
In the meantime, there’s the option of taking your carefully hand-encoded AVC video files and giving them the ol’ modernization treatment with our new best friend, HEVC.
If you’re confused at this point, here’s a short glossary:

  • AVC: Advanced Video Coding, a.k.a. H.264, most commonly encoded with x264: the video codec used in pretty much all modern video.
  • HEVC: High Efficiency Video Coding, a.k.a. H.265, most commonly encoded with x265: the new kid on the block and successor to AVC.

Your standard run-of-the-mill video file that obviously fell from a truck will usually come in either an mp4 or mkv container with AVC video and some kind of audio.
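
If you’re not sure what’s actually inside a given file, ffprobe (ffmpeg’s sibling tool, which I’ll lean on for examples throughout) will tell you. A generic sketch, not part of my test setup; input.mkv is a placeholder:

    # Print the codec of the first video stream, e.g. "h264" (AVC) or "hevc".
    ffprobe -v error -select_streams v:0 \
        -show_entries stream=codec_name \
        -of default=noprint_wrappers=1:nokey=1 input.mkv
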
If you want to reduce the file size of the video and keep the quality reasonably high, re-encoding the file into HEVC seems to be a valid plan. Obviously it would be technically better to rip the source medium again into HEVC directly instead of transcoding an existing lossily-encoded file, but… yeah, that would be more work and less bashscriptable.
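
In ffmpeg terms, that plan is a one-liner. Consider this a sketch: the filenames are placeholders, and picking the CRF and preset values is exactly what the rest of this post is about:

    # Re-encode the video track to HEVC; keep the audio untouched.
    ffmpeg -i input.mkv -c:v libx265 -crf 21 -preset medium -c:a copy output.mkv
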
On a side note, if you want a little primer on video compression, I heartily recommend Tom Scott’s video on a related issue.

Anyway, long story short:
I took the first episode of Farscape, which I had conveniently lying around in a neat little Matroska-packaged AVC+DTS combo, cut out 15 minutes, and then re-encoded those 15 minutes in a bunch of ways.
Why? Well, initially I wanted to figure out how best to transcode my existing Farscape rips to save the most space while maintaining a reasonable amount of quality, so I did the scientific(ish) thing and created a bunch of samples.
And yes, HEVC encoding without proper hardware support is a pain and I spent way too much CPU time on this little project, but soon™ we will have reached the point where HEVC is the new de facto standard, and when that point comes I will be ready.

Methodology i.e. “stuff I did”

Look, I’m not an expert on video encoding and I’m not familiar with the internals of the encoders, which means I know shit about the standards and the software implementations. I’m just some guy who wanted to save disk space and decided to do some testing in the process.
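
First, the sample itself. Carving 15 minutes out of the episode is a lossless stream-copy job; the filename and offsets below are placeholders, not my actual cut points:

    # Copy 15 minutes (900 s) starting at the 5-minute mark, without re-encoding.
    ffmpeg -ss 300 -i farscape_s01e01.mkv -t 900 -c copy farscape_sample.mkv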

I re-encoded the aforementioned video clip using the following settings (a scripted sketch follows the list):

  • Encoders: x264 and x265
  • CRF: 1, 17, 18, 20, 21, 25, 26, 51
  • Presets: placebo, slower, medium, veryfast, ultrafast
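
Scripted, that whole matrix boils down to a nested loop along these lines. A sketch, not my verbatim invocation: the flags are standard HandBrakeCLI options, the AAC 2.0 audio comes from HandBrake’s defaults, and the naming scheme just mimics the sample files shown below:

    #!/usr/bin/env bash
    # One encode per encoder/CRF/preset combination: 2 * 8 * 5 = 80 files.
    for encoder in x264 x265; do
      for crf in 1 17 18 20 21 25 26 51; do
        for preset in placebo slower medium veryfast ultrafast; do
          HandBrakeCLI -i farscape_sample.mkv \
            -o "farscape_sample_${encoder}.AAC2.0.CRF$(printf '%02d' "$crf")-${preset}.mkv" \
            -e "$encoder" -q "$crf" --encoder-preset "$preset"
        done
      done
    done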

The result is 80 files of varying quality and size.
Judging file size is pretty straightforward: Just compare the file sizes. Magic.
As for quality, that’s a difficult one, and since I lack a proper testing setup and about three dozen people to judge the subjective quality of each clip, I’ll just be using the SSIM as calculated by comparing each clip with the original clip, and see how far that gets me.
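
That measuring step is scriptable as well. A sketch under two assumptions: farscape_sample.mkv is my stand-in name for the untouched source clip, and GNU grep is available for the -oP flag. ffmpeg’s ssim filter logs per-frame scores on stderr and prints an aggregate “All:” value at the end:

    #!/usr/bin/env bash
    # Collect file size (MiB, rounded) and SSIM-vs-original per sample, as CSV.
    original=farscape_sample.mkv
    echo "file,size_mib,ssim"
    for f in farscape_sample_x26*.mkv; do
      size=$(du -m "$f" | cut -f1)
      ssim=$(ffmpeg -i "$f" -i "$original" -lavfi ssim -f null - 2>&1 |
             grep -oP 'All:\K[0-9.]+' | tail -n 1)
      echo "$f,$size,$ssim"
    done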

So to start, here are the first few rows:

file                                             codec  preset     CRF  size (MiB)      SSIM
farscape_sample_x264.AAC2.0.CRF01-medium.mkv     x264   medium       1     2097.20  0.935785
farscape_sample_x264.AAC2.0.CRF01-placebo.mkv    x264   placebo      1     1484.04  0.935858
farscape_sample_x264.AAC2.0.CRF01-slower.mkv     x264   slower       1     1530.13  0.935846
farscape_sample_x264.AAC2.0.CRF01-ultrafast.mkv  x264   ultrafast    1     3887.30  0.935909
farscape_sample_x264.AAC2.0.CRF01-veryfast.mkv   x264   veryfast     1     2315.65  0.935663
farscape_sample_x264.AAC2.0.CRF17-medium.mkv     x264   medium      17      280.49  0.933957

File Size

To start off, here are some tables:

File Size (MiB) of x264 re-encode

CRF    placebo  slower  medium  veryfast  ultrafast     Sum
  1       1484    1530    2097      2316       3887   11314
 17        259     271     280       254        546    1610
 18        223     233     242       216        481    1395
 20        169     176     184       161        372    1062
 21        149     155     162       140        328     934
 25         97      98     103        88        199     585
 26         89      89      93        79        177     527
 51         24      24      24        23         28     123
Sum       2494    2576    3185      3277       6018   17550

File Size (MiB) of x265 re-encode

CRF    placebo  slower  medium  veryfast  ultrafast     Sum
  1       1852    1726    1796      1869       1280    8523
 17        180     176     173       157        121     807
 18        156     153     150       138        109     706
 20        120     119     117       109         89     554
 21        106     106     105        97         81     495
 25         70      71      71        66         58     336
 26         64      65      65        61         53     308
 51         22      22      23        21         22     110
Sum       2570    2438    2500      2518       1813   11839

That’s… accurate, yet not very visually stimulating.
Needs more plot.

[Plot: sizes_by_codec]

Okay, let’s zoom in a little by ignoring CRF 51 and CRF 01, as they’re silly anyway.

[Plot: sizes_by_codec_subset]

Hm, yes, quite.
Now a breakdown to compare codecs across presets:

[Plot: sizes_by_preset]

As you might have noticed, absolute file sizes might not be as interesting and/or generalizable as relative size changes, so here we go:

[Plot: sizes_relative]

SSIM (Approximate Quality)

To start, let’s do the raw data table thing again:

SSIM (times 100) of x264 re-encode

CRF    placebo  slower  medium  veryfast  ultrafast   Sum
  1         94      94      94        94         94   470
 17         93      93      93        93         93   465
 18         93      93      93        93         93   465
 20         93      93      93        93         93   465
 21         93      93      93        93         93   465
 25         93      93      93        93         92   464
 26         93      93      93        93         92   464
 51         87      88      87        84         82   428
Sum        739     740     739       736        732  3686

SSIM (times 100) of x265 re-encode

CRF    placebo  slower  medium  veryfast  ultrafast   Sum
  1         94      94      94        94         94   470
 17         93      93      93        93         93   465
 18         93      93      93        93         93   465
 20         93      93      93        93         93   465
 21         93      93      93        93         93   465
 25         93      93      93        93         93   465
 26         93      93      93        93         93   465
 51         89      88      88        88         86   439
Sum        741     740     740       740        738  3699

Please note that I had to multiply the SSIM values by 100 to get them to display as something other than a flat 1 because rounding is hard, apparently.
Also, yes I know the “sum” column/row doesn’t make sense, but it’s the default and I couldn’t be bothered to try to remove it.

And now, the plotty thing.

[Plot: ssim_by_codec]
[Plot: ssim_by_preset]

Now let’s do that thing again where we compare all the CRF-by-preset cells in a grid, but now using SSIM as the metric:

[Plot: SSIM_relative]

Well that’s not very enlightening, is it?
Bummer.

Quality(ish) versus Size

[Plot: ssim_by_size]

I’ve tried log scales on this one, but it didn’t really help.
Let’s look at the subset of reasonable CRFs:

[Plot: ssim_by_size_subset]

Well, if there’s a lesson here, it’s that ultrafast is probably not the way to go.
Let’s take another look, ignoring the ultrafast data.

[Plot: ssim_by_size_subset2]

Conclusion

Keep in mind that this is not a scientific study.
The results might be limited to my version of HandBrake (1.0.7 (2017040900)), or to re-encoding an already lossily-encoded file, or to SD content; things might behave slightly differently with 4K material. My point is: I don’t know. I have no idea how generalizable these results are, but with the limited amount of certainty I can muster, I’ll give you this:

  • Don’t use ultrafast. veryfast is fast as well, and apparently better(ish).
  • Also, don’t use placebo. Why would you even do that to yourself?1
  • Keep your CRF around the 20s. Seems reasonable. (If you want it as a single command, see the sketch after this list.)
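
If I had to squeeze all of that into a single command, it would look roughly like this. To be clear: x265, CRF 21 and the medium preset are my reading of the results, not a measured optimum:

    # Takeaway settings: HEVC, CRF in the low 20s, a sane preset, default audio.
    HandBrakeCLI -i input.mkv -o output.mkv -e x265 -q 21 --encoder-preset medium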

¯\_(ツ)_/¯

Note: If you have anything else you want to try with the data, you can grab it here.


  1. If I do this again, I will track the encoding time. Seriously, don’t do placebo.