Image compression vs. video compression

Different compression standards use different methods to reduce data, and hence the results differ in bit rate, quality, and latency. Compression algorithms fall into two types: image compression and video compression.

Image compression uses intraframe coding technology. Data is reduced within each image frame by removing information that may not be noticeable to the human eye. Motion JPEG is an example of such a compression standard. Images in a Motion JPEG sequence are coded, or compressed, as individual JPEG images.

With the Motion JPEG format, the three images in the above sequence are coded and sent as separate unique images (I-frames) with no dependencies on each other.
Illustration of how Motion JPEG format works
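
As a minimal sketch of what intraframe-only coding means in practice, the following Python snippet compresses each frame as a stand-alone JPEG with no reference to its neighbors. It assumes the Pillow imaging library and a list of PIL Image frames; the function name is illustrative, not any product's API:

```python
from io import BytesIO

def encode_motion_jpeg(frames, quality=80):
    """Code every frame as an independent JPEG (intraframe coding only).

    `frames` is assumed to be a list of Pillow (PIL) Image objects and
    `quality` the standard Pillow JPEG quality setting. Every output is
    a self-contained I-frame with no dependency on its neighbors.
    """
    encoded = []
    for frame in frames:
        buf = BytesIO()
        frame.save(buf, format="JPEG", quality=quality)  # Pillow's JPEG encoder
        encoded.append(buf.getvalue())
    return encoded
```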

Video compression algorithms such as MPEG-4 and H.264 use interframe prediction to reduce video data between a series of frames. This involves techniques such as difference coding, where a frame is compared with a reference frame and only pixels that have changed with respect to the reference frame are coded. In this way, the number of pixel values that are coded and sent is reduced. When such an encoded sequence is displayed, the images appear as in the original video sequence.

With difference coding, only the first image (I-frame) is coded in its entirety. In the two following images (P-frames), references are made to the first image for the static elements, i.e. the house. Only the moving parts, i.e. the running man, are coded using motion vectors, reducing the amount of information that is sent and stored.
Video compression reducing the amount of data transmitted
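
A toy version of difference coding can be sketched in a few lines of Python with NumPy. The threshold and the coordinate-list representation below are simplifying assumptions of this illustration; real codecs operate on transformed blocks rather than individual raw pixels:

```python
import numpy as np

def difference_encode(reference, frame, threshold=8):
    """Toy difference coder: keep only the pixels that changed.

    `reference` and `frame` are assumed to be equally sized 2-D uint8
    (grayscale) arrays; `threshold` decides what counts as a change.
    Only the coordinates and new values of changed pixels are coded.
    """
    changed = np.abs(frame.astype(np.int16) - reference.astype(np.int16)) > threshold
    coords = np.argwhere(changed)   # positions of changed pixels
    values = frame[changed]         # their new values
    return coords, values

def difference_decode(reference, coords, values):
    """Rebuild the frame: start from the reference, patch the changes."""
    frame = reference.copy()
    frame[coords[:, 0], coords[:, 1]] = values
    return frame
```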

Other techniques such as block-based motion compensation can be applied to further reduce the data. Block-based motion compensation takes into account that much of what makes up a new frame in a video sequence can be found in an earlier frame, but perhaps in a different location. This technique divides a frame into a series of macroblocks (blocks of pixels). Block by block, a new frame can be composed or ‘predicted’ by looking for a matching block in a reference frame. If a match is found, the encoder codes the position where the matching block is to be found in the reference frame. Coding the motion vector, as it is called, takes up fewer bits than if the actual content of a block were to be coded.

Block-based motion compensation
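
The sketch below illustrates the block-matching idea in Python with NumPy. The exhaustive search window and the sum-of-absolute-differences (SAD) cost are illustrative assumptions, not the search strategy of any particular standard:

```python
import numpy as np

def find_motion_vector(reference, block, top, left, search=8):
    """Exhaustive block matching within a +/- `search` pixel window.

    `block` is a macroblock from the new frame at position (top, left);
    the function returns the (dy, dx) motion vector of the best-matching
    block in `reference`, scored by sum of absolute differences (SAD).
    """
    bh, bw = block.shape
    best_cost, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + bh > reference.shape[0] or x + bw > reference.shape[1]:
                continue  # candidate block would fall outside the reference frame
            candidate = reference[y:y + bh, x:x + bw].astype(np.int32)
            cost = int(np.abs(candidate - block.astype(np.int32)).sum())  # SAD
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    # Coding the two small numbers in best_mv takes far fewer bits than
    # coding the block's actual pixel values.
    return best_mv
```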

With interframe prediction, each frame in a sequence of images is classified as a certain type of frame, such as an I-frame, P-frame or B-frame.

An I-frame, or intra frame, is a self-contained frame that can be independently decoded without any reference to other images. The first image in a video sequence is always an I-frame. I-frames are needed as starting points for new viewers or as resynchronization points if the transmitted bit stream is damaged. I-frames can be used to implement fast-forward, rewind and other random access functions. An encoder will automatically insert I-frames at regular intervals or on demand if new clients are expected to join in viewing a stream. The drawback of I-frames is that they consume many more bits, but on the other hand, they do not generate many artifacts, which are caused by missing data.
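
That scheduling logic can be pictured with a small sketch; the function and parameter names below are illustrative, not any real encoder's API:

```python
def should_code_as_i_frame(frame_index, i_frame_interval=30, new_client=False):
    """Decide whether the next frame is coded as an I-frame (sketch).

    An I-frame goes out at a regular interval, so that new viewers and
    resynchronization after errors never wait too long, or on demand
    when a new client is expected to join the stream. Both parameters
    are illustrative assumptions.
    """
    return new_client or frame_index % i_frame_interval == 0
```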

A P-frame, which stands for predictive inter frame, makes references to parts of one or more earlier I- and/or P-frames to code the frame. P-frames usually require fewer bits than I-frames, but a drawback is that they are very sensitive to transmission errors because of their chain of dependencies on earlier P- and/or I-frames.

A B-frame, or bi-predictive inter frame, is a frame that makes references to both an earlier reference frame and a future frame. Using B-frames increases latency, since the future reference frame must be received and decoded before the B-frame itself can be reconstructed.
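
The latency cost is easiest to see in the frame reordering involved. The sketch below, which assumes each B-frame references the nearest following I- or P-frame, turns display order into the order in which frames must be transmitted and decoded:

```python
def decode_order(display_types):
    """Map display order to decode order for an I/P/B sequence (sketch).

    A B-frame needs a future I- or P-frame, so that future frame must be
    sent and decoded first; holding B-frames back until their forward
    reference arrives is where the extra latency comes from.
    """
    order, waiting_b = [], []
    for i, frame_type in enumerate(display_types):
        if frame_type == "B":
            waiting_b.append(i)       # must wait for the next I/P frame
        else:
            order.append(i)           # the reference frame goes out first
            order.extend(waiting_b)   # then the B-frames that needed it
            waiting_b = []
    order.extend(waiting_b)
    return order

# Display order I B B P becomes decode order I P B B.
print(decode_order(["I", "B", "B", "P"]))  # [0, 3, 1, 2]
```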

A typical sequence with I-, B- and P-frames. A P-frame may only reference preceding I- or P-frames, while a B-frame may reference both preceding and succeeding I- or P-frames.
I-, B- and P-frames
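
Such a typical pattern can be written down as a small generator. The GOP length of 12 and the two B-frames between reference frames are illustrative choices, not values fixed by any standard:

```python
def gop_pattern(num_frames, gop_size=12, b_between=2):
    """Assign I/P/B frame types in display order (illustrative sketch).

    Produces the classic pattern I B B P B B P ... repeated every
    `gop_size` frames, with `b_between` B-frames between reference frames.
    """
    types = []
    for i in range(num_frames):
        pos = i % gop_size
        if pos == 0:
            types.append("I")                  # self-contained resync point
        elif pos % (b_between + 1) == 0:
            types.append("P")                  # references earlier I/P
        else:
            types.append("B")                  # references both directions
    return types

print("".join(gop_pattern(13)))  # IBBPBBPBBPBBI
```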
