H264 in RTMP - dalmatele/ffmpeg-libav-tutorial GitHub Wiki

In this document, I'll note some info about how to package an H264 frame in an RTMP message.

How to send H264 frame through RTMP

Suppose we already have an encoded H264 frame.

To send it through the RTMP protocol, we need to package it into the FLV format.

To package a frame into the FLV format, the first step is to put the frame in an AMF video data item, which means prefixing and suffixing it with some data based on its NALU content.

This is pseudo code for these steps:

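if idr
flv[0] = 0x17 // 0x10 key frame; 0x07 h264 codec id
flv[1] = 0x01 // 0 sequence header; 1 nalu; 2 end of seq
flv[2] = 0 // presentation (composition time) offset
flv[3] = 0 // presentation offset cont
flv[4] = 0 // presentation offset cont
flv[5] = (encoded.length >> 24) & 0xFF // NAL size, big-endian
flv[6] = (encoded.length >> 16) & 0xFF // size cont
flv[7] = (encoded.length >> 8) & 0xFF // size cont
flv[8] = encoded.length & 0xFF // size cont

else if coded slice
flv[0] = 0x27 // 0x20 inter frame; 0x07 h264 codec id
flv[1] = 0x01 // 0 sequence header; 1 nalu; 2 end of seq
flv[2] = 0 // presentation (composition time) offset
flv[3] = 0 // presentation offset cont
flv[4] = 0 // presentation offset cont
flv[5] = (encoded.length >> 24) & 0xFF // NAL size, big-endian
flv[6] = (encoded.length >> 16) & 0xFF // size cont
flv[7] = (encoded.length >> 8) & 0xFF // size cont
flv[8] = encoded.length & 0xFF // size cont

else if PPS or SPS
.... skipped here because it is really complicated; this is the h264/AVC configuration data

copy(encoded, 0, flv, 9, encoded.length) // NAL data starts right after the 9-byte prefix

flv[flv.length - 1] = 0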
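
The pseudo code above can be sketched in Python roughly as follows. This is only a minimal sketch: the function name `flv_video_body` and its parameters are my own, it assumes the NAL unit is passed in without an Annex B start code, and it writes the real NAL length into the 4-byte size field. It also leaves out the trailing zero byte of the pseudo code, on the assumption that the video data ends with the NAL bytes.

```python
import struct

NAL_IDR = 5  # nal_unit_type of an IDR slice


def flv_video_body(nal: bytes, composition_time: int = 0) -> bytes:
    """Wrap one H264 NAL unit (no start code) as FLV video data."""
    nal_type = nal[0] & 0x1F
    # 0x10 = key frame, 0x20 = inter frame; low nibble 0x07 = AVC codec id
    frame_type = 0x10 if nal_type == NAL_IDR else 0x20
    header = bytes([frame_type | 0x07, 0x01])      # 0x01 = AVC NALU packet
    cts = struct.pack(">i", composition_time)[1:]  # signed 24-bit, big-endian
    size = struct.pack(">I", len(nal))             # 4-byte NAL size
    return header + cts + size + nal
```

For an IDR NAL unit the result starts with `0x17 0x01`, for any other coded slice with `0x27 0x01`, which matches the two branches of the pseudo code.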

The next step is to package the AMF video data into an RTMP message and send it. You can see the details here
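
As a rough illustration of that step, the sketch below serializes one video message into RTMP chunks: a type 0 chunk header (timestamp, message length, message type id 9 for video, message stream id), followed by type 3 continuation chunks when the payload is larger than the chunk size. The function name, the default `stream_id` and `chunk_stream_id` values are hypothetical choices for the example, and it assumes the default chunk size of 128 bytes and a timestamp small enough to need no extended timestamp field.

```python
import struct

VIDEO_MESSAGE_TYPE = 9    # RTMP message type id for video data
DEFAULT_CHUNK_SIZE = 128  # used until a Set Chunk Size message changes it


def rtmp_video_chunks(payload: bytes, timestamp: int, stream_id: int = 1,
                      chunk_stream_id: int = 6,
                      chunk_size: int = DEFAULT_CHUNK_SIZE) -> bytes:
    """Serialize one video message as a type 0 chunk plus type 3 chunks."""
    if not 2 <= chunk_stream_id <= 63:
        raise ValueError("sketch only handles 1-byte basic headers")
    if timestamp >= 0xFFFFFF:
        raise ValueError("extended timestamps are not handled in this sketch")
    header = bytes([chunk_stream_id])               # fmt 0 in the top two bits
    header += struct.pack(">I", timestamp)[1:]      # 3-byte timestamp
    header += struct.pack(">I", len(payload))[1:]   # 3-byte message length
    header += bytes([VIDEO_MESSAGE_TYPE])
    header += struct.pack("<I", stream_id)          # little-endian stream id
    out = bytearray(header + payload[:chunk_size])
    for pos in range(chunk_size, len(payload), chunk_size):
        out.append(0xC0 | chunk_stream_id)          # fmt 3: basic header only
        out += payload[pos:pos + chunk_size]
    return bytes(out)
```

The `payload` here is the FLV video data built in the previous step; the returned bytes are what would be written to the socket after the RTMP handshake and connect/publish commands have completed.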

Frame type in H264

  • I-frames: Also known as key frames, I-frames are completely self-referential and don't use information from any other frames. These are the largest frames of the three, and the highest-quality, but the least efficient from a compression perspective.

  • P-frames: P-frames are "predicted" frames. When producing a P-frame, the encoder can look backwards to previous I or P-frames for redundant picture information. P-frames are more efficient than I-frames, but less efficient than B-frames.

  • B-frames: B-frames are bi-directional predicted frames. When producing B-frames, the encoder can look both forwards and backwards for redundant picture information. This makes B-frames the most efficient frame of the three. Note that B-frames are not available when encoding with H.264's Baseline Profile.

  • SPS (sequence parameter set): a NAL unit that contains parameters that apply to a series of consecutive coded video pictures, referred to as a "coded video sequence" in the H.264 standard. See more here

  • PPS (picture parameter set): a NAL unit that contains parameters that apply to the decoding of one or more individual pictures inside a coded video sequence.
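
The SPS and PPS are what goes into the h264/AVC configuration data that is sent as a sequence header before the first frame. A minimal sketch of building it is shown below, assuming exactly one SPS and one PPS, both without start codes, and 4-byte NAL sizes. The function name is my own, and the sketch omits the extra fields that the configuration record defines for the High profiles.

```python
import struct


def avc_sequence_header(sps: bytes, pps: bytes) -> bytes:
    """Build FLV video data carrying one SPS and one PPS."""
    record = bytes([
        0x01,    # configurationVersion
        sps[1],  # AVCProfileIndication (profile_idc)
        sps[2],  # profile_compatibility
        sps[3],  # AVCLevelIndication (level_idc)
        0xFF,    # reserved bits + lengthSizeMinusOne = 3 (4-byte NAL sizes)
        0xE1,    # reserved bits + number of SPS = 1
    ])
    record += struct.pack(">H", len(sps)) + sps
    record += bytes([0x01]) + struct.pack(">H", len(pps)) + pps  # one PPS
    # 0x17: key frame + AVC; 0x00: sequence header; 3 bytes composition time
    return bytes([0x17, 0x00, 0x00, 0x00, 0x00]) + record
```

The profile and level bytes are copied from bytes 1 to 3 of the SPS, right after its one-byte NAL header.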

NAL unit type

See here
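
The NAL unit type is stored in the low 5 bits of the first byte of a NAL unit. A small sketch for reading it, listing only a few common types (the helper names are my own):

```python
NAL_TYPE_NAMES = {
    1: "coded slice (non-IDR)",
    5: "coded slice (IDR)",
    6: "SEI",
    7: "SPS",
    8: "PPS",
    9: "access unit delimiter",
}


def nal_unit_type(nal: bytes) -> int:
    """Return nal_unit_type of a NAL unit given without its start code."""
    return nal[0] & 0x1F


def describe_nal(nal: bytes) -> str:
    """Return a readable name for the NAL unit type, if it is in the table."""
    return NAL_TYPE_NAMES.get(nal_unit_type(nal), "other")
```

For example, a NAL unit starting with byte `0x67` is an SPS and one starting with `0x65` is an IDR slice.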

How to write a NAL into a chunk:

This is the format of a chunk:

NAL size = sizeof(NAL data)

NAL size is a 4-byte field.
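
Encoders often output NAL units in Annex B form, separated by `00 00 01` or `00 00 00 01` start codes. The sketch below, with helper names of my own, splits such a stream and writes each NAL unit with the 4-byte size in front of it. It assumes the size is big-endian and relies on the fact that a NAL unit never ends with a zero byte, so trailing zeros can be treated as part of the next start code.

```python
import struct


def split_annexb(stream: bytes) -> list:
    """Split an Annex B byte stream into NAL units without start codes."""
    nals = []
    for part in stream.split(b"\x00\x00\x01"):
        nal = part.rstrip(b"\x00")  # drop the leading zero of a 4-byte start code
        if nal:
            nals.append(nal)
    return nals


def length_prefix(nal: bytes) -> bytes:
    """Prefix a NAL unit with its size as a 4-byte big-endian integer."""
    return struct.pack(">I", len(nal)) + nal
```

Usage: `b"".join(length_prefix(n) for n in split_annexb(stream))` gives the size-prefixed form of every NAL unit in the stream.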

What is decode delay:
