
Scalable video coding (SVC) is a technique that enables a video stream to be broken into multiple layers of resolution, quality and frame rate.

Scalable Video Coding (SVC) is the name for the Annex G extension of the H.264/MPEG-4 AVC video compression standard. SVC standardises the encoding of a high-quality video bitstream that also contains one or more subset bitstreams. A subset bitstream is derived by dropping packets from the larger bitstream, which reduces the bandwidth it requires. The subset bitstream can represent a lower spatial resolution (smaller screen), a lower temporal resolution (lower frame rate), or a lower quality video signal. H.264/MPEG-4 AVC was developed jointly by ITU-T and ISO/IEC JTC 1. These two groups created the Joint Video Team (JVT) to develop the H.264/MPEG-4 AVC standard.

The objective of the SVC standardisation has been to enable the encoding of a high-quality video bitstream that contains one or more subset bitstreams. Each subset bitstream can itself be decoded with a complexity and reconstruction quality similar to that achieved by the existing H.264/MPEG-4 AVC design using the same quantity of data as the subset bitstream. As noted above, a subset bitstream is derived by dropping packets from the larger bitstream.

A subset bitstream can represent a lower spatial resolution, or a lower temporal resolution, or a lower quality video signal (each separately or in combination) compared to the bitstream it is derived from. The following modalities are possible:

  • Temporal (frame rate) scalability: the motion compensation dependencies are structured so that complete pictures (i.e. their associated packets) can be dropped from the bitstream. (Temporal scalability is already enabled by H.264/MPEG-4 AVC. SVC has only provided supplemental enhancement information to improve its usage.)
  • Spatial (picture size) scalability: video is coded at multiple spatial resolutions. The data and decoded samples of lower resolutions can be used to predict data or samples of higher resolutions in order to reduce the bit rate to code the higher resolutions.
  • SNR/Quality/Fidelity scalability: video is coded at a single spatial resolution but at different qualities. The data and decoded samples of lower qualities can be used to predict data or samples of higher qualities in order to reduce the bit rate to code the higher qualities.
  • Combined scalability: a combination of the three scalability modalities described above.
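
The packet-dropping idea behind these modalities can be sketched in a few lines of Python. This is an illustrative model rather than a real SVC parser: the `Packet` class and the `extract_substream` function are hypothetical names, the byte sizes are made up, and the three layer fields simply borrow the identifiers `dependency_id`, `temporal_id` and `quality_id` that SVC carries for each packet. A real extractor would also have to respect the inter-layer dependencies signalled in the bitstream, which this sketch ignores.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Packet:
    """Hypothetical stand-in for one coded packet of a scalable stream."""
    dependency_id: int  # spatial layer index, 0 = lowest resolution
    temporal_id: int    # temporal layer index, 0 = lowest frame rate
    quality_id: int     # quality (SNR) layer index, 0 = base quality
    size_bytes: int     # payload size, used only to compare bandwidth


def extract_substream(
    packets: Iterable[Packet],
    max_dependency_id: int,
    max_temporal_id: int,
    max_quality_id: int,
) -> List[Packet]:
    """Keep packets at or below the requested layer in every dimension.

    Simplifying assumption: a layer only ever depends on lower layers,
    so dropping everything above the target leaves a decodable subset.
    """
    return [
        p for p in packets
        if p.dependency_id <= max_dependency_id
        and p.temporal_id <= max_temporal_id
        and p.quality_id <= max_quality_id
    ]


def total_bytes(packets: Iterable[Packet]) -> int:
    """Sum the payload sizes, as a rough proxy for required bandwidth."""
    return sum(p.size_bytes for p in packets)


# A toy stream: two spatial layers, two temporal layers, one quality refinement.
stream = [
    Packet(0, 0, 0, 1000),  # base layer: low resolution, low frame rate
    Packet(0, 1, 0, 400),   # temporal enhancement of the base resolution
    Packet(1, 0, 0, 3000),  # spatial enhancement
    Packet(1, 1, 0, 1200),  # spatial and temporal enhancement combined
    Packet(1, 0, 1, 1500),  # quality refinement of the higher resolution
]

base_only = extract_substream(stream, 0, 0, 0)
full_rate_low_res = extract_substream(stream, 0, 1, 0)
everything = extract_substream(stream, 1, 1, 1)
```

Each call selects a different operating point from the same encoded stream: `base_only` keeps a single packet, `full_rate_low_res` adds the temporal enhancement, and `everything` keeps the whole stream, so the required bandwidth grows with each layer retained.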

Profiles and levels

As a result of the Scalable Video Coding extension, the standard contains three additional scalable profiles: Scalable Baseline, Scalable High, and Scalable High Intra. Each profile is defined as a combination of the H.264/MPEG-4 AVC profile for the base layer (the second word in the scalable profile name) and the tools that achieve the scalable extension:

  • Scalable Baseline Profile: Mainly targeted for conversational, mobile, and surveillance applications.
    • A bitstream conforming to Scalable Baseline profile contains a base layer bitstream that conforms to a restricted version of Baseline profile of H.264/MPEG-4 AVC.
    • Supports B slices, weighted prediction, CABAC entropy coding, and 8×8 luma transform in enhancement layers (CABAC and the 8×8 transform are only supported for certain levels), although the base layer has to conform to the restricted Baseline profile, which does not support these tools. Coding tools for interlaced sources are not included.
    • Spatial scalable coding is restricted to resolution ratios of 1.5 and 2 between successive spatial layers in both the horizontal and vertical directions, and to macroblock-aligned cropping.
    • Quality and temporal scalable coding are supported without any restriction.

  • Scalable High Profile: Primarily designed for broadcast, streaming, storage and videoconferencing applications.
    • A bitstream conforming to Scalable High profile contains a base layer bitstream that conforms to High profile of H.264/MPEG-4 AVC.
    • Supports all tools specified in the Scalable Video Coding extension.
    • Spatial scalable coding is supported without any restriction, i.e., arbitrary resolution ratios and cropping parameters are allowed.
    • Quality and temporal scalable coding are supported without any restriction.

  • Scalable High Intra Profile: Mainly designed for professional applications.
    • Uses Instantaneous Decoder Refresh (IDR) pictures only. IDR pictures can be decoded without reference to previous frames.
    • A bitstream conforming to Scalable High Intra profile contains a base layer bitstream that conforms to High profile of H.264/MPEG-4 AVC with only IDR pictures allowed.
    • All scalability tools are allowed as in Scalable High profile but only IDR pictures are permitted in any layer.
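
The spatial-ratio restriction of the Scalable Baseline profile can be illustrated with a small check. The sketch below is only an assumption-laden example: the function name `ratios_allowed_in_scalable_baseline` and the `(width, height)` layer representation are hypothetical, and it tests only the rule stated above (a ratio of 1.5 or 2 between successive layers in each direction). It does not check macroblock-aligned cropping or any other profile or level constraint.

```python
from fractions import Fraction
from typing import List, Tuple

# Ratios permitted between successive spatial layers, per the text above.
ALLOWED_BASELINE_RATIOS = {Fraction(3, 2), Fraction(2, 1)}


def ratios_allowed_in_scalable_baseline(layers: List[Tuple[int, int]]) -> bool:
    """Check a ladder of (width, height) layers, ordered lowest to highest.

    Returns True when every step between successive layers scales both the
    width and the height by 1.5 or 2. Exact fractions avoid rounding errors.
    """
    for (low_w, low_h), (high_w, high_h) in zip(layers, layers[1:]):
        horizontal = Fraction(high_w, low_w)
        vertical = Fraction(high_h, low_h)
        if horizontal not in ALLOWED_BASELINE_RATIOS:
            return False
        if vertical not in ALLOWED_BASELINE_RATIOS:
            return False
    return True


# Example ladders: dyadic steps, a single 1.5x step, and a 3x step.
dyadic_ladder = [(320, 180), (640, 360), (1280, 720)]
one_and_a_half_step = [(640, 360), (960, 540)]
triple_step = [(640, 360), (1920, 1080)]
```

Under these assumptions the first two ladders satisfy the rule, while the 3x step does not. A ladder like the last one would need the Scalable High profile, which places no restriction on resolution ratios.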