H264 in RTMP - dalmatele/ffmpeg-libav-tutorial GitHub Wiki
In this document, I'll note how to package an H264 frame in an RTMP message.
Suppose we already have an encoded H264 frame.
To send it over the RTMP protocol, we need to package it in FLV format.
The first step is to put the frame in an AMF video data item, which means prefixing and suffixing it with some data based on its NALU content.
This is the pseudocode for these steps:
    size = encoded.length // NAL size in bytes
    if idr
        flv[0] = 0x17 // 0x10 key frame; 0x07 h264 codec id
        flv[1] = 0x01 // 0 sequence header; 1 nalu; 2 end of seq
        flv[2] = 0 // pres offset
        flv[3] = 0 // pres offset
        flv[4] = 0 // pres offset
        flv[5] = (size >> 24) & 0xff // size, most significant byte first
        flv[6] = (size >> 16) & 0xff // size cont
        flv[7] = (size >> 8) & 0xff // size cont
        flv[8] = size & 0xff // size cont
    else if coded slice
        flv[0] = 0x27 // 0x20 inter frame; 0x07 h264 codec id
        flv[1] = 0x01 // 0 sequence header; 1 nalu; 2 end of seq
        flv[2] = 0 // pres offset
        flv[3] = 0 // pres offset
        flv[4] = 0 // pres offset
        flv[5] = (size >> 24) & 0xff // size, most significant byte first
        flv[6] = (size >> 16) & 0xff // size cont
        flv[7] = (size >> 8) & 0xff // size cont
        flv[8] = size & 0xff // size cont
    else if PPS or SPS
        .... skipping this here as it's really complicated, this is the h264/AVC configuration data
    copy(encoded, 0, flv, 9, encoded.length)
    flv[flv.length - 1] = 0
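The pseudocode above can be sketched in Python roughly as follows. This is only an illustration under a few assumptions: the function name `flv_video_data` and its parameters are made up for this note, the input is a single NAL unit without a start code, and the four size bytes carry the NAL length as a big-endian integer. The sketch leaves out the trailing zero byte that the pseudocode writes, because as far as I can tell the AVC video packet layout in the FLV specification ends with the NAL data.

```python
import struct

NAL_SLICE = 1  # coded slice of a non-IDR picture
NAL_IDR = 5    # coded slice of an IDR picture


def flv_video_data(nalu: bytes, composition_time_ms: int = 0) -> bytes:
    """Wrap one H.264 NAL unit (no start code) in an FLV AVC video payload."""
    nal_type = nalu[0] & 0x1F  # NAL unit type is the low 5 bits of the first byte
    if nal_type == NAL_IDR:
        first = 0x17  # 0x10 key frame + 0x07 AVC codec id
    elif nal_type == NAL_SLICE:
        first = 0x27  # 0x20 inter frame + 0x07 AVC codec id
    else:
        raise ValueError(f"NAL type {nal_type} is not handled by this sketch")
    header = bytes([first, 0x01])  # 0x01 = AVC NALU packet
    # Presentation (composition time) offset: signed 24-bit, big-endian
    cts = composition_time_ms.to_bytes(3, "big", signed=True)
    size = struct.pack(">I", len(nalu))  # 4-byte NAL size, big-endian
    return header + cts + size + nalu
```

For example, `flv_video_data(bytes([0x65, 0xAA, 0xBB]))` starts with `0x17 0x01`, followed by three zero offset bytes, the size `00 00 00 03`, and then the NAL bytes.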
The next step is to package the AMF video data into an RTMP message and send it. You can see the details here
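To make the packaging step a bit more concrete, here is a minimal sketch of serializing one video message into RTMP chunks. It assumes the chunk layout from the RTMP specification: a type 0 chunk header for the first chunk, one-byte type 3 headers for the continuation chunks, message type 9 for video, and the default chunk size of 128 bytes. The function name `rtmp_video_chunks`, the chunk stream id 6, and the message stream id 1 are arbitrary choices for illustration. The handshake, the connect/publish commands, and extended timestamps are not covered.

```python
import struct


def rtmp_video_chunks(payload: bytes, timestamp_ms: int, stream_id: int = 1,
                      chunk_stream_id: int = 6, chunk_size: int = 128) -> bytes:
    """Serialize one RTMP video message as a type 0 chunk plus type 3 continuations."""
    if not 2 <= chunk_stream_id <= 63:
        raise ValueError("this sketch only handles 1-byte basic headers")
    if not 0 <= timestamp_ms < 0xFFFFFF:
        raise ValueError("extended timestamps are not handled here")
    out = bytearray()
    out.append((0 << 6) | chunk_stream_id)   # basic header: fmt=0 + chunk stream id
    out += timestamp_ms.to_bytes(3, "big")   # timestamp
    out += len(payload).to_bytes(3, "big")   # message length
    out.append(9)                            # message type id 9 = video
    out += struct.pack("<I", stream_id)      # message stream id (little-endian)
    for offset in range(0, len(payload), chunk_size):
        if offset:
            out.append((3 << 6) | chunk_stream_id)  # fmt=3 continuation header
        out += payload[offset:offset + chunk_size]
    return bytes(out)
```

The payload here would be the video data built in the previous step. A 300-byte payload becomes a 12-byte header, 128 bytes of data, and then two continuation chunks, each introduced by the single byte `0xC6`.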
- I-frames: Also known as key frames, I-frames are completely self-referential and don't use information from any other frames. These are the largest frames of the three, and the highest-quality, but the least efficient from a compression perspective.
- P-frames: P-frames are "predicted" frames. When producing a P-frame, the encoder can look backwards to previous I or P-frames for redundant picture information. P-frames are more efficient than I-frames, but less efficient than B-frames.
- B-frames: B-frames are bi-directional predicted frames. As you can see in Figure 5, this means that when producing B-frames, the encoder can look both forwards and backwards for redundant picture information. This makes B-frames the most efficient frame of the three. Note that B-frames are not available when encoding with H.264's Baseline Profile.
- SPS: a unit containing parameters that apply to a series of consecutive coded video pictures, referred to as a “coded video sequence” in the H.264 standard. See more here
- PPS: a unit containing parameters that apply to the decoding of one or more individual pictures inside a coded video sequence
See here
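The SPS and PPS are what goes into the h264/AVC configuration data that the pseudocode skips. A rough sketch of that packet is below, assuming a single SPS and a single PPS (both without start codes) and the basic AVCDecoderConfigurationRecord layout. The function name `avc_sequence_header` is made up for this note, and the extra fields that the record can carry for High and related profiles are left out, so treat this as a starting point rather than a complete implementation.

```python
def avc_sequence_header(sps: bytes, pps: bytes) -> bytes:
    """Build the FLV video payload carrying the AVC configuration (one SPS, one PPS)."""
    record = bytes([
        0x01,      # configurationVersion
        sps[1],    # AVCProfileIndication (profile_idc from the SPS)
        sps[2],    # profile_compatibility (constraint flags)
        sps[3],    # AVCLevelIndication (level_idc)
        0xFC | 3,  # reserved bits + lengthSizeMinusOne = 3 (4-byte NAL sizes)
        0xE0 | 1,  # reserved bits + number of SPS = 1
    ])
    record += len(sps).to_bytes(2, "big") + sps                # 2-byte SPS length + SPS
    record += bytes([1]) + len(pps).to_bytes(2, "big") + pps   # PPS count, length, PPS
    # 0x17 = key frame + AVC, 0x00 = sequence header, then a zero presentation offset
    return bytes([0x17, 0x00, 0, 0, 0]) + record
```

This packet is normally sent once before the first frame, so that the receiver knows the NAL sizes are 4 bytes long and can set up its decoder.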
How to write a NAL into a chunk:
This is the format of a chunk.
NAL size = sizeof(NAL data)
NAL size is a 4-byte component, stored most significant byte first
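Encoders often output NAL units in Annex B form, separated by `00 00 01` or `00 00 00 01` start codes, so the start codes have to be replaced with the 4-byte NAL size described above. Here is a small sketch of that conversion. The helper names `split_annexb` and `length_prefixed` are hypothetical, and the code assumes the whole stream is already in memory.

```python
import re
import struct
from typing import List


def split_annexb(stream: bytes) -> List[bytes]:
    """Split an Annex B byte stream into NAL units without start codes."""
    parts = re.split(b"\x00\x00\x01", stream)
    nalus = []
    for part in parts[1:]:          # parts[0] is whatever precedes the first start code
        nalu = part.rstrip(b"\x00")  # drop the leading zero of a following 4-byte start code
        if nalu:
            nalus.append(nalu)
    return nalus


def length_prefixed(nalus: List[bytes]) -> bytes:
    """Prefix every NAL unit with its size as a 4-byte big-endian integer."""
    return b"".join(struct.pack(">I", len(n)) + n for n in nalus)
```

Stripping trailing zero bytes should be safe here because a NAL unit does not end with a zero byte, so any zeros left at the end of a part belong to the next start code or to padding.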