Digital video compression

Video streams are huge. They’re bigger than you think.

2 hours * 60 minutes/hour * 60 seconds/minute * 23.976 frames/second = 172627.2 frames.
2 hours * 60 minutes/hour * 60 seconds/minute * 29.97 frames/second = 215784 frames.

So, if the video wasn’t compressed, the video part of a 2-hour NTSC DVD would be over 208GB for 30fps material, and over 166GB for 24fps material.

And an uncompressed 2-hour NTSC BluRay would be over 1250GB for 30fps material, and over 1000GB for 24fps material.
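For anyone who wants to check the arithmetic, here is a minimal sketch of the calculation, assuming each frame is stored as a raw 24-bit RGB bitmap (3 bytes per pixel) with no container overhead. The function names are made up for illustration, and the "GB" figures only line up if a GB is counted as 1024^3 bytes, which is an assumption on my part.

```python
# Rough size of an uncompressed video stream, treated as a sequence of
# 24-bit RGB bitmaps (3 bytes per pixel), with no container overhead.

GIB = 1024 ** 3  # assumption: the "GB" figures use 1024^3-byte units


def frame_count(hours, fps):
    """Number of frames in a stream of the given length."""
    return hours * 60 * 60 * fps


def uncompressed_gb(width, height, hours, fps, bytes_per_pixel=3):
    """Size, in 1024^3-byte GB, of every frame stored as a raw bitmap."""
    total_bytes = frame_count(hours, fps) * width * height * bytes_per_pixel
    return total_bytes / GIB


if __name__ == "__main__":
    formats = {"DVD": (720, 480), "BluRay": (1920, 1080)}
    for name, (width, height) in formats.items():
        for fps in (29.97, 23.976):
            size = uncompressed_gb(width, height, 2, fps)
            print(f"{name} at {fps} fps: {size:.1f} GB")
```

If you count a GB as 1,000,000,000 bytes instead, the same streams come out roughly 7% larger in GB terms.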

How much they can be compressed is the big issue.

The JPEG format for images is a “lossy” scheme. You lose some picture information. But JPEG is also variable. At its best settings, you can’t visually see any difference between a JPEG image and the original uncompressed image. At its worst settings, the picture is quite “blocky” and looks exceptionally cruddy, but it does make for tiny files. Many image programs don’t allow you to vary the quality – they usually use a quality setting of around 75%. It’s a good compromise between image quality and file size. Some images can be compressed well past that without any noticeable difference, while others are already starting to show the losses at that setting.

Most digital video starts out as the video equivalent of storing a picture in .jpg format instead of in .bmp format. In fact, most video compression schemes essentially store the complete frames as JPEG images. This alone would reduce the size of the video stream to about 1/5, so our 208GB DVD would be down to about 40GB and our 1250GB BluRay would be down to about 250GB. The equivalent of the JPEG “quality setting”, for a DVD you purchase, is about 80%.

So the next trick is to not store every frame as a complete image. Most of the frames are stored only as what has changed from the frame before.

In the video world, the complete frames are called “I frames”, and the “what’s changed?” frames are called “P frames” and “B frames”. P frames describe changes relative to earlier frames, while B frames can also refer to later ones.

In theory, you could just have one I frame, and have the rest of the video stream made up of P frames and B frames, but any transmission/decoding errors would accumulate and the picture could quickly become garbage. So the video stream on your DVD is made up of one I frame and then 14 P and B frames, and then the sequence repeats. Any errors will get corrected at the next I frame, which is about every half-second.
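To see why the periodic I frames matter, here is a toy sketch. It is not how a real codec works: each "frame" is just one number, an "I" record stores the value itself, and a "D" record (a made-up label standing in for P and B frames) stores only the change from the previous frame. The 15-frame sequence length matches the DVD layout described above.

```python
# Toy model only: real codecs use motion-compensated prediction on blocks
# of pixels, not a single number per frame.

GOP_LENGTH = 15  # 1 I frame followed by 14 "what's changed?" frames


def encode(frames, gop_length=GOP_LENGTH):
    """Return a list of ("I", value) or ("D", delta) records."""
    records = []
    for i, value in enumerate(frames):
        if i % gop_length == 0:
            records.append(("I", value))  # complete frame
        else:
            records.append(("D", value - frames[i - 1]))  # change only
    return records


def decode(records):
    """Rebuild frames by applying each delta to the previously decoded frame."""
    frames = []
    for kind, payload in records:
        if kind == "I":
            frames.append(payload)
        else:
            frames.append(frames[-1] + payload)
    return frames


def wrong_frames(original, decoded):
    """Indexes of the frames that decoded incorrectly."""
    return [i for i, (a, b) in enumerate(zip(original, decoded)) if a != b]


if __name__ == "__main__":
    original = list(range(100, 145))         # 45 frames = 3 sequences
    records = encode(original)
    records[20] = ("D", records[20][1] + 7)  # simulate one damaged delta
    print(wrong_frames(original, decode(records)))
```

In this toy run, the damage at frame 20 carries through every following frame until the next I frame at frame 30, where the picture is correct again.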

So, by using P and B frames, the DVD’s video stream can now get down to the 6GB range and the BluRay’s video stream can get down to about 35GB.

But, to our benefit, there are still a few more tricks we can use to compress the video stream even further.

The first one, which is rather obvious, is to alter the picture size. “Scene groups” start with this one when recoding video. If you take a 720x480 image and reduce it to 512x384, it has only about 57% as many pixels, so it’s a little over half the size. So a change that’s not all that noticeable to most people has now cut our DVD down to about 3GB.
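The resize arithmetic is easy to check. This is only a rough sketch: it assumes the compressed size scales in direct proportion to the pixel count, which is a simplification, and `pixel_ratio` is a name invented for the example.

```python
def pixel_ratio(old_size, new_size):
    """Fraction of the original pixel count that remains after resizing."""
    old_width, old_height = old_size
    new_width, new_height = new_size
    return (new_width * new_height) / (old_width * old_height)


ratio = pixel_ratio((720, 480), (512, 384))
print(f"{ratio:.0%} of the original pixels")

# Assumption: stream size shrinks in proportion to the pixel count.
print(f"roughly {6 * ratio:.1f} GB from a 6 GB stream")
```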

The second one is also pretty obvious. If we make longer sequences of P and B frames, with fewer I frames, then our video stream will be smaller. But any decoding errors will linger for longer, and playback can only start at the beginning of a sequence, so if you make the sequences too long then random access, such as pause/resume, FF/REV and chaptering, starts behaving weirdly. In some cases, trying to fast-forward 3 seconds ends up backing the video up by 5 minutes, because playback resumes at the closest sequence boundary and its I frame, and those boundaries are just too far apart.
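The fast-forward oddity can be sketched like this. It assumes a simplified player that can only resume at the closest I frame at or before the requested position, with I frames spaced evenly; real players differ, and `resume_point` is a hypothetical helper, not any player's API.

```python
import math


def resume_point(target_seconds, keyframe_interval_seconds):
    """Time of the closest I frame at or before the requested position."""
    index = math.floor(target_seconds / keyframe_interval_seconds)
    return index * keyframe_interval_seconds


position = 900          # currently 15 minutes into the video
target = position + 3   # viewer asks to skip ahead 3 seconds

# DVD-style spacing: an I frame about every half-second.
print(resume_point(target, 0.5))

# Extreme spacing, chosen for illustration: one I frame every 10 minutes.
print(resume_point(target, 600))
```

With half-second spacing, playback lands where you asked. With the extreme 10-minute spacing, the nearest usable I frame is at the 600-second mark, so a 3-second skip forward actually moves you back 5 minutes.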

The third one is to lower the “bitrate”. This affects both the picture quality and the motion quality. Since each I frame requires a certain amount of information to store the picture, lower bitrates end up causing the JPEG “quality setting” to be reduced further for each I frame. As well, obviously, large amounts of motion are not well represented by a small amount of data – as the bitrate drops, some of the motion information will need to be ignored. This can lead to jerky video playback as ignored changes “reappear” with each I frame.
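To put some rough numbers on "bitrate", here is a sketch that turns a stream size and running time into an average bitrate, and then into an average data budget per frame. It assumes a GB of 1024^3 bytes and a megabit of 1,000,000 bits, and the function names are made up for the example. Real encoders spend far more bits on I frames than on P and B frames, so the per-frame figure is only an average.

```python
GIB = 1024 ** 3  # assumption: "GB" means 1024^3 bytes


def average_bitrate_mbps(size_gb, hours):
    """Average video bitrate in megabits per second."""
    bits = size_gb * GIB * 8
    return bits / (hours * 3600) / 1_000_000


def bytes_per_frame(bitrate_mbps, fps):
    """Average number of bytes available for each frame."""
    return bitrate_mbps * 1_000_000 / fps / 8


if __name__ == "__main__":
    raw_frame = 720 * 480 * 3  # one uncompressed DVD frame, in bytes
    for size_gb in (6, 3):
        rate = average_bitrate_mbps(size_gb, 2)
        budget = bytes_per_frame(rate, 29.97)
        print(f"{size_gb} GB over 2 hours: {rate:.2f} Mbit/s, "
              f"about {budget:,.0f} bytes per frame "
              f"(raw frame: {raw_frame:,} bytes)")
```

Halving the bitrate halves the average budget for every frame, which is why both the still-picture quality and the motion detail suffer together.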

The fourth way to shrink file sizes is to use newer compression schemes that use better algorithms in creating/expanding the P frames and B frames.

All of these come with trade-offs. Even the original DVD (or BluRay) lacks some information from the original film, and further compression necessarily throws out more information. There’s no magic rule that “shrinking a movie to _ x _ size, it will still look ok, but past that it will look cruddy”. How small you make your final video stream depends on what codec you use to compress it, and how much loss you’re willing to live with, and what you’re compressing in the first place. Most of the newer codecs achieve similar compression amounts and lead to similar visual results when using the same settings – there’s not really one that’s head-and-shoulders “better” than another.

For instance, using HandBrake to encode 2 episodes of a TV show from a DVD to an H.264 .mkv file, if you go with constant quality settings instead of trying to encode to a specified size, one episode might end up being only 400MB and another one might be 700MB or more. Or, if you try to encode them both to 400MB, obviously the second one needs more visual information discarded.

Usually the lower bitrates (and smaller files) work well enough for most people’s tastes for live-action film and television, but due to the JPEG limitations, gradual colour gradations (either in skies/shadows, or in animated works) tend to become quite blocky. Some of the newer encoders try to incorporate some form of “de-blocking”, but de-blocking and small bitrates/sizes pull in somewhat opposite directions. Either the video can’t be compressed quite as much as you’d intended, or other information will need to be discarded.

This is good info.

It still tickles me though that people complain about “compressing” their videos because they want the original “Non-Lossy” Blu-Ray rip.

Blu-Rays are about 90% lossy to begin with; an “UNCOMPRESSED” BluRay would occupy multiple TERABYTES of space.

Taking a BluRay and RECOMPRESSING it using recent codecs compresses by an additional 20 to 30% with NO “Psychovisual” loss of information.

You’re right, Tony, I did cheat a little in my numbers.  My numbers for the size of the uncompressed stream merely reflect how big a sequential collection of 720*480, or 1920*1080 .bmp images would be.  I didn’t add anything for the overhead within the actual formats of the stream, or the fact that “essentially a JPEG image” doesn’t mean it really is merely stored as a 24-bit RGB image, so the “true” video streams are even larger than I present them as being.  But it made the point well enough, for just how much information is “discarded”, even with the commercial offerings.

I just took the calculated number of frames and multiplied by 720*480*3 for a DVD, and by 1920*1080*3 for a BluRay.  And, for those wondering, it’s “times three” because just 720*480 would give the number of pixels, but each pixel has 24 bits of colour in the RGB world of .bmp and .jpg.  At 8 bits per byte, 720*480*24/8 = 720*480*3 bytes per image.