CDI (Cloud Digital Interface) Baseline Profile Video Format Specification

Document version

02.00

Document owner

Amazon Web Services

Summary

This document specifies the format of video payloads sent to and received from the CDI SDK.

Scope

The specifications in this document apply to video streams carried through the AVM (Audio, Video, Metadata) API (Application Programming Interface) of the AWS CDI SDK (Software Development Kit).

Status

current

Compatibility

CDI SDK version 1.0 and later support this version of the CDI baseline profile. Minor release number changes will only contain clarifications and corrections and therefore will maintain backwards compatibility with the SDK. The major number will be incremented when new features are added or any other incompatibility is introduced. A corresponding update to the CDI SDK will be required to support the changes documented in these future baseline profile documents.

Configuration Structure

The CdiAvmConfig structure is used to pass media format information through the CDI SDK for each stream. It contains three parts: a URI, a data array, and a data array size. The URI for this specification is defined to be https://cdi.elemental.com/specs/baseline-video.

The bytes of the array are ASCII characters forming <name>=<value> pairs or standalone <name> entries. Each entry is terminated by a semicolon and is separated from the next entry by a space character, as specified in section 7.1 of ST 2110-20-2017. None of the other parts that form the SDP are included. data_size is the total number of characters comprising the string of names, values, and separators. No terminating carriage return or NUL character shall be included.
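As an illustration of the encoding rules above, the following Python sketch assembles the data array and its size from a list of entries. The helper name build_config_data is hypothetical, and the sample entries (width, height, and a standalone interlace flag) are illustrative only; they are not a complete or validated parameter set.

```python
def build_config_data(entries):
    """Hypothetical helper: encode (name, value) entries as the data array.

    A value of None marks a standalone <name> entry.
    Returns the ASCII bytes and their length (the data_size).
    """
    parts = []
    for name, value in entries:
        text = name if value is None else f"{name}={value}"
        parts.append(text + ";")  # each entry is terminated by a semicolon
    # Entries are separated by a single space; no CR or NUL is appended.
    data = " ".join(parts).encode("ascii")
    return data, len(data)


data, data_size = build_config_data(
    [("width", 1920), ("height", 1080), ("interlace", None)]
)
print(data, data_size)
```

Encoding with the ascii codec makes the sketch fail loudly if a non-ASCII character slips into a name or value.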

Note: the CDI SDK provides CdiAvmMakeBaselineConfiguration() for generating an appropriate CdiAvmConfig structure to pass to CdiAvmTxPayload(), and CdiAvmParseBaselineConfiguration() for parsing CdiAvmConfig structures from the receive payload callback function. These functions remove the need for application programs to deal directly with the CdiAvmConfig structure for CDI baseline media types, including video.

Section 7 of ST 2110-20-2017, "Session Description Protocol (SDP) Considerations", is used as the basis for specifying the parameters applicable to video payloads. For the sake of compatibility, some parameters have fixed values in CDI and others have a subset of the allowed values. This document also specifies new parameters that go beyond those available in ST 2110.

The rectangle described by the partial_frame parameter must fall completely within the bounds of the frame whose dimensions are specified by the width and height parameters. Additional restrictions on the values are described below.
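The bounds rule can be expressed as a simple check. This is a minimal sketch that assumes the rectangle has already been parsed into a horizontal offset, a vertical offset, a width, and a height, all in pixels with the origin at the top-left corner of the frame. The function name partial_frame_fits is hypothetical, and the sketch covers only the containment rule, not the additional restrictions.

```python
def partial_frame_fits(x, y, rect_w, rect_h, frame_w, frame_h):
    """Return True if the rectangle lies completely inside the frame.

    Assumption: x and y are offsets from the top-left corner of the frame,
    and all values are in pixels.
    """
    if x < 0 or y < 0 or rect_w <= 0 or rect_h <= 0:
        return False
    return x + rect_w <= frame_w and y + rect_h <= frame_h


print(partial_frame_fits(0, 0, 1920, 1080, 1920, 1080))
print(partial_frame_fits(960, 540, 1000, 540, 1920, 1080))
```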

Payload Format

Example 1

Example 2

Data must be placed into memory buffers in the proper format before transmission so that the receiving end can consume it. The application is solely responsible for this formatting. The layout follows Sample Row Data Segments as specified in ST 2110-20-2017, section 6.2. Video payloads transported by the CDI SDK shall only contain sample data. Sample Row Data Headers shall not be present in the buffer. Sample order and layout in memory are as shown in Table 1 for YCbCr-4:2:2 and in Table 2 for YCbCr-4:4:4 of ST 2110-20-2017. Unlike in ST 2110-20, both fields of interlaced video are sent in the same payload; the second field's first sample follows immediately after the last byte of the last sample of the first field while honoring padding rules.

Interlaced video affects how the partial_frame configuration parameter is interpreted. The vertical offset and height specified must fall within the range of lines for the first field. The corresponding lines of the second field are implicitly included in the resulting rectangle.
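A sketch of the interlaced vertical check follows. It assumes that the height parameter gives the full frame height, that this height is even, and that the two fields split the lines equally; none of these assumptions is stated in this document. The function name is hypothetical.

```python
def interlaced_partial_frame_fits(y, rect_h, frame_h):
    """Check the vertical extent of partial_frame against the first field.

    Assumptions: frame_h is the full (even) frame height and each field
    holds frame_h // 2 lines. y and rect_h are expressed in field lines.
    """
    first_field_lines = frame_h // 2
    return y >= 0 and rect_h > 0 and y + rect_h <= first_field_lines


# For a 1080-line interlaced frame the first field has 540 lines.
print(interlaced_partial_frame_fits(0, 540, 1080))
print(interlaced_partial_frame_fits(100, 500, 1080))
```

Under the same assumptions, a rectangle that passes this check covers rect_h lines in each field, so 2 * rect_h lines of the frame in total.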

Pgroups are honored by the transmit-side packetizer, which guarantees that scatter-gather list entries on the receive side do not have boundaries that fall within samples.
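A receiver that wants to confirm this property could use a check along the following lines. This is only a sketch: it models the scatter-gather list as a plain list of entry sizes in bytes, assumes the payload is a single run of whole pgroups starting at offset 0, and takes the pgroup size as an argument. The example uses 5 octets, the pgroup size ST 2110-20 defines for 10-bit YCbCr-4:2:2. The function name is hypothetical.

```python
def sgl_respects_pgroups(entry_sizes, pgroup_bytes):
    """Return True if every boundary between entries is pgroup-aligned.

    entry_sizes: sizes in bytes of the scatter-gather list entries, in order.
    pgroup_bytes: size of one pgroup in bytes (assumed constant for the payload).
    """
    offset = 0
    # The end of the last entry is the end of the payload, not a boundary
    # between two entries, so it is not checked here.
    for size in entry_sizes[:-1]:
        offset += size
        if offset % pgroup_bytes != 0:
            return False
    return True


print(sgl_respects_pgroups([1000, 500, 20], 5))
print(sgl_respects_pgroups([1000, 502, 18], 5))
```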

An additional consideration for using the CDI SDK is whether an alpha channel is enabled. If so, the primary samples are placed first in the memory buffer, according to the usual layout rules, followed immediately by the alpha samples (luma only) with the same bit depth and layout as the primary samples.

When an alpha channel is used with interlaced video, the order is:

  1. primary field 1
  2. primary field 2
  3. alpha field 1
  4. alpha field 2
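The ordering rules above can be sketched as a function that lists each region of the buffer with its byte offset. The function name region_offsets is hypothetical. The sketch assumes that the caller supplies the size of one primary field and one alpha field in bytes, that those sizes already include any padding the layout rules require, and that both fields of a kind have the same size.

```python
def region_offsets(primary_bytes, alpha_bytes=None, interlaced=False):
    """Return a list of (name, offset, size) tuples in buffer order.

    primary_bytes: size of one primary field (or of the whole progressive frame).
    alpha_bytes: size of one alpha field, or None if alpha is not enabled.
    Sizes are assumed to include any required padding.
    """
    fields = 2 if interlaced else 1
    regions = []
    offset = 0
    # All primary samples come first, followed immediately by alpha samples.
    for kind, size in (("primary", primary_bytes), ("alpha", alpha_bytes)):
        if size is None:
            continue
        for n in range(1, fields + 1):
            name = f"{kind} field {n}" if interlaced else kind
            regions.append((name, offset, size))
            offset += size
    return regions


for region in region_offsets(100, 40, interlaced=True):
    print(region)
```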

References

  1. IETF RFC 4566: https://tools.ietf.org/html/rfc4566

  2. ST 2110-20:2017 - SMPTE Standard - Professional Media Over Managed IP Networks: Uncompressed Active Video, 27 Nov. 2017, doi: 10.5594/SMPTE.ST2110-20.2017: https://ieeexplore.ieee.org/document/8167389