Latency Break Down - Nixes/wifibroadcast GitHub Wiki

A full description of how video is processed within the Raspberry Pi, including the limitations of the pipeline. From here

To give some background on what is happening in video mode:
- The sensor runs at the requested frame rate. No ifs, no buts.
- There are normally 3 buffers allocated to handle those incoming Bayer frames - one being filled, one being fed into the ISP, and one spare. The control software deals with buffer swapping.
- The ISP can run at about 120-150MPix/s depending on how the frame is broken up.
- It writes to a buffer from a pool of buffers. This is the pool that has been increased in size. If there isn't a buffer available, then the ISP can't start the frame. If the camera starts producing a new frame before the last has been passed into the ISP, then the old one is dropped.
- When the ISP completes a frame, if there hasn't been a request for the buffer from the layer above (the IL/MMAL component) then again the frame is dropped.
- The MMAL/IL layer will request a frame when the previous one has been passed on to whatever is downstream.

- If it is the video encoder that is downstream, it can queue up a number of jobs. In other words it will soak up all those extra buffers if it is running behind, and only then will it start causing the camera component to drop frames. This is mainly of benefit during startup: opening the codec can take around 60ms, so it absorbs the buffers and normally has the capacity to catch up again once going.

raspividyuv is obviously pulling frames back to the ARM, so another place for stalls is the ARM not delivering buffers back to the GPU fast enough to keep the system running. Adding in extra buffers (increasing VIDEO_OUTPUT_BUFFERS_NUM, by the looks of it) means that there should always be extras available on the GPU and it shouldn't stall.
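For reference, this is roughly how the raspivid-family applications size that pool (a sketch from memory of the RaspiVid/RaspiVidYUV source, not a drop-in patch; `video_port` is assumed to be the camera component's video output port):

```c
/* In RaspiVid.c / RaspiVidYUV.c the pool size is a compile-time constant;
 * raising it gives the GPU more slack when the ARM is slow returning buffers. */
#define VIDEO_OUTPUT_BUFFERS_NUM 3   /* stock value; try e.g. 6 */

/* ... later, when configuring the camera's video output port ... */
video_port->buffer_num = VIDEO_OUTPUT_BUFFERS_NUM;
```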

There are subtleties involved on the GPU side. Format conversions and some other image processing are done using the Vector Register File (VRF), which is the heart of the SIMD side of the GPU. There are two of them, but the RTOS doesn't dynamically shift processes between cores. The main things that use the VRF are the format conversion on the output of the component (particularly if you ask for RGB), the tuner algorithms (AGC, AWB, etc.), and the denoise algorithm. You can disable denoise using MMAL_PARAMETER_VIDEO_DENOISE if you think that is what is causing your issue, but there's no debug available to tell you what the loading/contention on the VRF is - suck it and see.
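Disabling denoise with that parameter looks roughly like this via the MMAL utility API (a sketch only; `camera` is assumed to be an already-created camera component, and error handling is elided):

```c
#include "interface/mmal/mmal.h"
#include "interface/mmal/util/mmal_util_params.h"

/* Turn off the GPU denoise stage to reduce VRF contention.
 * 'camera' must point at a created camera component. */
static MMAL_STATUS_T disable_denoise(MMAL_COMPONENT_T *camera)
{
    return mmal_port_parameter_set_boolean(camera->control,
                                           MMAL_PARAMETER_VIDEO_DENOISE,
                                           MMAL_FALSE);
}
```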

Breakdown of latency composition (based on some experiments):

- ~40ms Image Acquisition (buffering 2 frames?)
- ~10ms Frame Encode (H.264)
- ~10ms Wi-Fi + FEC
- Reception + FEC Decoding + Display: remaining ~50-100ms