C Specification
When calling vkGetPhysicalDeviceVideoCapabilitiesKHR to query the
capabilities for an AV1 encode profile, the
VkVideoCapabilitiesKHR::pNext chain must include a
VkVideoEncodeAV1CapabilitiesKHR structure that will be filled with the
profile-specific capabilities.
The VkVideoEncodeAV1CapabilitiesKHR structure is defined as:
// Provided by VK_KHR_video_encode_av1
typedef struct VkVideoEncodeAV1CapabilitiesKHR {
    VkStructureType                           sType;
    void*                                     pNext;
    VkVideoEncodeAV1CapabilityFlagsKHR        flags;
    StdVideoAV1Level                          maxLevel;
    VkExtent2D                                codedPictureAlignment;
    VkExtent2D                                maxTiles;
    VkExtent2D                                minTileSize;
    VkExtent2D                                maxTileSize;
    VkVideoEncodeAV1SuperblockSizeFlagsKHR    superblockSizes;
    uint32_t                                  maxSingleReferenceCount;
    uint32_t                                  singleReferenceNameMask;
    uint32_t                                  maxUnidirectionalCompoundReferenceCount;
    uint32_t                                  maxUnidirectionalCompoundGroup1ReferenceCount;
    uint32_t                                  unidirectionalCompoundReferenceNameMask;
    uint32_t                                  maxBidirectionalCompoundReferenceCount;
    uint32_t                                  maxBidirectionalCompoundGroup1ReferenceCount;
    uint32_t                                  maxBidirectionalCompoundGroup2ReferenceCount;
    uint32_t                                  bidirectionalCompoundReferenceNameMask;
    uint32_t                                  maxTemporalLayerCount;
    uint32_t                                  maxSpatialLayerCount;
    uint32_t                                  maxOperatingPoints;
    uint32_t                                  minQIndex;
    uint32_t                                  maxQIndex;
    VkBool32                                  prefersGopRemainingFrames;
    VkBool32                                  requiresGopRemainingFrames;
    VkVideoEncodeAV1StdFlagsKHR               stdSyntaxFlags;
} VkVideoEncodeAV1CapabilitiesKHR;
Members
- sType is a VkStructureType value identifying this structure.
- pNext is NULL or a pointer to a structure extending this structure.
- flags is a bitmask of VkVideoEncodeAV1CapabilityFlagBitsKHR indicating supported AV1 encoding capabilities.
- maxLevel is a StdVideoAV1Level value indicating the maximum AV1 level supported by the profile, as defined in section A.3 of the AV1 Specification.
- codedPictureAlignment indicates the alignment at which the implementation will code pictures. This capability does not impose any valid usage constraints on the application. However, depending on the codedExtent of the encode input picture resource, this capability may result in a change of the resolution of the encoded picture, as described in more detail below.
- maxTiles indicates the maximum number of AV1 tile columns and rows the implementation supports.
- minTileSize indicates the minimum extent of individual AV1 tiles the implementation supports.
- maxTileSize indicates the maximum extent of individual AV1 tiles the implementation supports.
- superblockSizes is a bitmask of VkVideoEncodeAV1SuperblockSizeFlagBitsKHR values indicating the supported AV1 superblock sizes.
- maxSingleReferenceCount indicates the maximum number of reference pictures the implementation supports when using single reference prediction mode.
- singleReferenceNameMask is a bitmask of supported AV1 reference names when using single reference prediction mode.
- maxUnidirectionalCompoundReferenceCount indicates the maximum number of reference pictures the implementation supports when using unidirectional compound prediction mode.
- maxUnidirectionalCompoundGroup1ReferenceCount indicates the maximum number of reference pictures the implementation supports when using unidirectional compound prediction mode from reference frame group 1, as defined in section 6.10.24 of the AV1 Specification.
- unidirectionalCompoundReferenceNameMask is a bitmask of supported AV1 reference names when using unidirectional compound prediction mode.
- maxBidirectionalCompoundReferenceCount indicates the maximum number of reference pictures the implementation supports when using bidirectional compound prediction mode.
- maxBidirectionalCompoundGroup1ReferenceCount indicates the maximum number of reference pictures the implementation supports when using bidirectional compound prediction mode from reference frame group 1, as defined in section 6.10.24 of the AV1 Specification.
- maxBidirectionalCompoundGroup2ReferenceCount indicates the maximum number of reference pictures the implementation supports when using bidirectional compound prediction mode from reference frame group 2, as defined in section 6.10.24 of the AV1 Specification.
- bidirectionalCompoundReferenceNameMask is a bitmask of supported AV1 reference names when using bidirectional compound prediction mode.
- maxTemporalLayerCount indicates the maximum number of AV1 temporal layers supported by the implementation.
- maxSpatialLayerCount indicates the maximum number of AV1 spatial layers supported by the implementation.
- maxOperatingPoints indicates the maximum number of AV1 operating points supported by the implementation.
- minQIndex indicates the minimum quantizer index value supported.
- maxQIndex indicates the maximum quantizer index value supported.
- prefersGopRemainingFrames indicates that the implementation’s rate control algorithm prefers the application to specify the number of frames in each AV1 rate control group remaining in the current group of pictures when beginning a video coding scope.
- requiresGopRemainingFrames indicates that the implementation’s rate control algorithm requires the application to specify the number of frames in each AV1 rate control group remaining in the current group of pictures when beginning a video coding scope.
- stdSyntaxFlags is a bitmask of VkVideoEncodeAV1StdFlagBitsKHR indicating capabilities related to AV1 syntax elements.
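As a rough illustration of how an application might consume a few of the members listed above, the following sketch checks a requested tile layout and quantizer index against the reported limits. It is not part of the Vulkan API: `Av1EncodeLimits`, `clamp_q_index`, and `tile_layout_supported` are hypothetical names, and the struct is a plain-C stand-in for values copied out of VkVideoEncodeAV1CapabilitiesKHR. The sketch also assumes that maxTiles.width holds the tile column limit and maxTiles.height the tile row limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for values copied from VkVideoEncodeAV1CapabilitiesKHR. */
typedef struct Av1EncodeLimits {
    uint32_t maxTileCols; /* assumed to come from maxTiles.width  */
    uint32_t maxTileRows; /* assumed to come from maxTiles.height */
    uint32_t minQIndex;   /* from minQIndex */
    uint32_t maxQIndex;   /* from maxQIndex */
} Av1EncodeLimits;

/* Clamp a requested quantizer index into the supported [minQIndex, maxQIndex] range. */
static uint32_t clamp_q_index(const Av1EncodeLimits *limits, uint32_t qIndex)
{
    assert(limits != NULL && limits->minQIndex <= limits->maxQIndex);
    if (qIndex < limits->minQIndex) {
        return limits->minQIndex;
    }
    if (qIndex > limits->maxQIndex) {
        return limits->maxQIndex;
    }
    return qIndex;
}

/* Return true if the requested tile column and row counts are within the limits. */
static bool tile_layout_supported(const Av1EncodeLimits *limits,
                                  uint32_t tileCols, uint32_t tileRows)
{
    assert(limits != NULL);
    return tileCols >= 1 && tileRows >= 1 &&
           tileCols <= limits->maxTileCols &&
           tileRows <= limits->maxTileRows;
}
```

An application would typically run checks like these once per video profile, after the capability query, and before choosing its tiling and rate control parameters.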
Description
singleReferenceNameMask,
unidirectionalCompoundReferenceNameMask, and
bidirectionalCompoundReferenceNameMask are encoded such that when bit
index i is set, it indicates support for the
AV1 reference name
STD_VIDEO_AV1_REFERENCE_NAME_LAST_FRAME + i.
Note
These masks indicate which elements of the
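The bit encoding of the reference name masks described above can be sketched as a small helper. This is an illustration only: `reference_name_supported` is a hypothetical function, and the two constants are assumed to mirror the values of STD_VIDEO_AV1_REFERENCE_NAME_LAST_FRAME (1) and STD_VIDEO_AV1_REFERENCE_NAME_ALTREF_FRAME (7) from the Vulkan Video AV1 std header, redefined locally so the sketch does not depend on the Vulkan headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: these mirror the StdVideoAV1ReferenceName enum values. */
enum {
    REF_NAME_LAST_FRAME   = 1, /* STD_VIDEO_AV1_REFERENCE_NAME_LAST_FRAME   */
    REF_NAME_ALTREF_FRAME = 7  /* STD_VIDEO_AV1_REFERENCE_NAME_ALTREF_FRAME */
};

/* Hypothetical helper: bit index i of the mask corresponds to the
 * reference name REF_NAME_LAST_FRAME + i. */
static bool reference_name_supported(uint32_t nameMask, uint32_t referenceName)
{
    assert(referenceName >= REF_NAME_LAST_FRAME &&
           referenceName <= REF_NAME_ALTREF_FRAME);
    uint32_t bitIndex = referenceName - REF_NAME_LAST_FRAME;
    return (nameMask & (UINT32_C(1) << bitIndex)) != 0;
}
```

The same helper applies to singleReferenceNameMask, unidirectionalCompoundReferenceNameMask, and bidirectionalCompoundReferenceNameMask, since all three use the same encoding.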
codedPictureAlignment describes implementation limitations on encoding
arbitrary resolutions.
In particular, some implementations may not be able to generate bitstreams
aligned to the requirements of the AV1 Specification (8x8).
In such cases, the implementation may override the width and height of the
bitstream in order to produce a bitstream compliant with the AV1 Specification.
If such an override occurs, the encoded resolution of the coded picture is
enlarged, and the texel values used for texel coordinates outside of
the bounds of the codedExtent of the encode input picture resource
are first governed by the rules regarding the
encode input picture granularity.
Any texel values outside of the region described by the encode input picture
granularity are implementation-defined.
Implementations should use well-defined values to minimize impact on the
produced encoded content.
Note
This capability does not impose additional application requirements. However, these overrides change the effective resolution of the bitstream and add padding pixels. Applications sensitive to such overrides can use this capability and the corresponding override behavior to compute the cropping needed to reproduce the original input of the encoding and transmit it in a side channel (e.g., by using cropping fields available in a container). Additionally, applications can explicitly consider this alignment in their coded extent, to avoid implementation-defined texel values being included in the encoded content.
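The cropping computation mentioned in the note above might look like the following sketch. It assumes the override simply rounds each dimension of the coded extent up to the next multiple of codedPictureAlignment, which is an interpretation rather than something stated here. `Extent2D` is a local stand-in for VkExtent2D, and `align_up`, `aligned_coded_extent`, and `padding_to_crop` are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for VkExtent2D so the sketch needs no Vulkan headers. */
typedef struct Extent2D {
    uint32_t width;
    uint32_t height;
} Extent2D;

/* Round value up to the next multiple of alignment (alignment must be non-zero). */
static uint32_t align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment > 0);
    uint32_t remainder = value % alignment;
    return remainder == 0 ? value : value + (alignment - remainder);
}

/* Resolution the bitstream would carry under the assumed round-up override. */
static Extent2D aligned_coded_extent(Extent2D codedExtent, Extent2D alignment)
{
    Extent2D result;
    result.width  = align_up(codedExtent.width,  alignment.width);
    result.height = align_up(codedExtent.height, alignment.height);
    return result;
}

/* Padding, in texels, that a consumer would crop from the right and bottom edges. */
static Extent2D padding_to_crop(Extent2D codedExtent, Extent2D alignment)
{
    Extent2D aligned = aligned_coded_extent(codedExtent, alignment);
    Extent2D padding;
    padding.width  = aligned.width  - codedExtent.width;
    padding.height = aligned.height - codedExtent.height;
    return padding;
}
```

For example, a 1920x1080 input with a 16x16 alignment would be coded as 1920x1088 under this assumption, leaving 8 rows to crop. An application that instead pads its own input to the aligned extent controls the padding texels itself.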
Document Notes
For more information, see the Vulkan Specification.
This page is extracted from the Vulkan Specification. Fixes and changes should be made to the Specification, not directly.