FFmpegOutputHandler - shibotsu/obs-clone GitHub Wiki

FFmpegRecorder

FFmpegRecorder is a C++ class designed to handle real-time video and audio recording using the FFmpeg library. It supports encoding video (e.g., H.264) and audio (e.g., AAC) and writing them to a file or streaming endpoint.

This recorder handles frame timing, resampling, color format conversion, and interleaving streams into a muxed output file.

✅ Features

  • Video encoding to H.264 (YUV420P)
  • Audio encoding to AAC (FLT input converted to the encoder's sample format)
  • Automatic resampling and pixel format conversion
  • Output to files (.mp4, .mkv, etc.) or network streams (e.g., rtmp://)
  • Time synchronization and timestamping
  • FIFO buffering for smoother audio input
  • Adjustable parameters (resolution, FPS, sample rate, etc.)

🔧 Initialization

bool initialize(const char* outputFile, int width, int height, int fps, int sampleRate, int channels);

This method configures the output context, selects appropriate encoders based on file/stream destination, and sets up:

  • Video and audio codec contexts
  • Resampling (SwrContext) and pixel format conversion (SwsContext)
  • FIFO buffer for audio
  • File or stream I/O
  • FFmpeg muxing headers

Arguments:

  • outputFile: Target file path or stream URL
  • width, height: Video resolution
  • fps: Video frame rate
  • sampleRate: Audio sample rate (e.g., 44100)
  • channels: Number of audio channels

🟢 Recording

bool sendVideoFrame(unsigned char* rgbaData, int64_t timestamp);

Converts incoming RGBA video data to YUV420P and sends it for encoding.

  • Rescales the incoming timestamp to the stream's time_base using FFmpeg's scaling functions.
  • Allocates and prepares an AVFrame for encoding.
  • Writes the encoded packets to the output stream.
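The timestamp conversion mentioned above can be sketched without FFmpeg as plain integer arithmetic. This is a minimal illustration, not the recorder's actual code: `Rational` and `rescaleTimestamp` are hypothetical names, and the sketch assumes non-negative timestamps small enough that the intermediate products fit in 64 bits. FFmpeg's `av_rescale_q` performs this kind of conversion on `AVRational` time bases and adds the overflow handling omitted here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for FFmpeg's AVRational (numerator / denominator).
struct Rational {
    int64_t num;
    int64_t den;
};

// Rescale `value` from time base `from` to time base `to`, rounding to the
// nearest integer. Assumes value >= 0 and no 64-bit overflow in the products.
int64_t rescaleTimestamp(int64_t value, Rational from, Rational to) {
    assert(value >= 0);
    assert(from.num > 0 && from.den > 0 && to.num > 0 && to.den > 0);
    const int64_t numerator = value * from.num * to.den;
    const int64_t denominator = from.den * to.num;
    return (numerator + denominator / 2) / denominator;
}
```

For example, with a millisecond clock (`{1, 1000}`) and a 30 fps video time base (`{1, 30}`), a timestamp of 1000 ms maps to frame tick 30.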

⏱ Timestamping

int64_t getRecordingTimestamp();

Returns the number of milliseconds since recording started using a high-resolution clock. Used to maintain sync between audio and video.
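A millisecond recording clock of this kind could look roughly like the sketch below. `RecordingClock`, `start`, and `elapsedMs` are hypothetical names chosen for illustration; the sketch assumes `std::chrono::steady_clock`, which is monotonic and therefore unaffected by wall-clock adjustments during a recording.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: measures elapsed time since start() in milliseconds.
class RecordingClock {
public:
    // Mark the moment recording begins.
    void start() {
        m_start = std::chrono::steady_clock::now();
        m_running = true;
    }

    // Milliseconds elapsed since start(); start() must have been called.
    int64_t elapsedMs() const {
        assert(m_running);
        const auto elapsed = std::chrono::steady_clock::now() - m_start;
        return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count();
    }

private:
    std::chrono::steady_clock::time_point m_start{};
    bool m_running = false;
};
```

Stamping both audio and video against one such clock gives the two streams a shared time origin.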

🎵 Audio Flow

The audio input functions are designed to:

  • Accept interleaved AV_SAMPLE_FMT_FLT samples
  • Use SwrContext to resample to a codec-compatible format
  • Accumulate samples in a FIFO until a full frame (e.g., 1024 samples) can be encoded
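The accumulation step in the list above can be sketched as a small buffer that hands out fixed-size frames only once enough samples have arrived. This is an illustrative stand-in under stated assumptions: `SampleFifo` and its methods are hypothetical, samples are interleaved floats, and resampling is left out. FFmpeg provides `AVAudioFifo` (`av_audio_fifo_write`, `av_audio_fifo_read`) for the same purpose; this page does not say which buffer type the recorder uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical FIFO for interleaved float samples.
class SampleFifo {
public:
    SampleFifo(std::size_t frameSize, std::size_t channels)
        : m_frameSize(frameSize), m_channels(channels) {
        assert(frameSize > 0 && channels > 0);
    }

    // Append `sampleCount` samples per channel of interleaved input.
    void write(const float* interleaved, std::size_t sampleCount) {
        m_buffer.insert(m_buffer.end(), interleaved,
                        interleaved + sampleCount * m_channels);
    }

    // Number of samples per channel currently buffered.
    std::size_t size() const { return m_buffer.size() / m_channels; }

    // Move exactly one full frame into `out`; returns false if too few
    // samples are buffered, in which case `out` is left untouched.
    bool readFrame(std::vector<float>& out) {
        const std::size_t needed = m_frameSize * m_channels;
        if (m_buffer.size() < needed) {
            return false;
        }
        const auto end = m_buffer.begin() + static_cast<std::ptrdiff_t>(needed);
        out.assign(m_buffer.begin(), end);
        m_buffer.erase(m_buffer.begin(), end);
        return true;
    }

private:
    std::size_t m_frameSize;
    std::size_t m_channels;
    std::vector<float> m_buffer;
};
```

With a frame size of 1024 and capture callbacks delivering 480 samples at a time, the first two writes yield no frame, and the third yields one frame with 416 samples left over for the next.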

Key Members (Audio):

  • m_aacFrameSize: Frame size for AAC
  • m_audioFifo: Sample buffer to align input to required frame size
  • m_audioSamplesProcessed, m_totalAudioSamplesQueued: Running sample counters kept for statistics
  • m_audioTimingInitialized: Ensures sync logic starts from a stable point

🧠 Internal Design Notes

Stream Format Decision:

  • Chooses container and codec formats based on outputFile (e.g., flv for RTMP URLs, guessed from the file extension for paths such as .mp4)

Encoder Hints:

  • Preset: "medium"
  • CRF: "23" for good quality and compression
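The stream format decision described above might be expressed as a small helper like the one below. It is a sketch under assumptions: `chooseContainerFormat` is a hypothetical name, and only the `rtmp://` prefix is checked because that is the only streaming scheme this page mentions. An empty result stands for "no forced format"; a caller could then pass a null format name to `avformat_alloc_output_context2`, which guesses the container from the file name in that case.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns the container short name to force, or an
// empty string when the container should be guessed from the file name.
std::string chooseContainerFormat(const std::string& outputFile) {
    assert(!outputFile.empty());
    const std::string rtmpPrefix = "rtmp://";
    if (outputFile.compare(0, rtmpPrefix.size(), rtmpPrefix) == 0) {
        return "flv";  // RTMP endpoints carry FLV-muxed streams
    }
    return "";  // e.g. "capture.mp4": let FFmpeg infer the container
}
```

Keeping the decision in one function makes it straightforward to add further URL schemes later without touching the rest of the setup code.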

Robust Logging:

  • Uses av_strerror to provide clear FFmpeg error messages

🛑 Dependencies

You need FFmpeg development libraries installed:

  • libavformat
  • libavcodec
  • libswscale
  • libswresample
  • libavutil