FFmpegOutputHandler - shibotsu/obs-clone GitHub Wiki
FFmpegRecorder
FFmpegRecorder is a C++ class designed to handle real-time video and audio recording using the FFmpeg library. It supports encoding video (e.g., H.264) and audio (e.g., AAC) and writing them to a file or streaming endpoint.
This recorder handles frame timing, resampling, color format conversion, and interleaving streams into a muxed output file.
✅ Features
- Video encoding to H.264 (YUV420P)
- Audio encoding to AAC (interleaved FLT input converted to the encoder's sample format)
- Automatic resampling and pixel format conversion
- Output to files (.mp4, .mkv, etc.) or network streams (e.g., rtmp://)
- Time synchronization and timestamping
- FIFO buffering for smoother audio input
- Adjustable parameters (resolution, FPS, sample rate, etc.)
🔧 Initialization
bool initialize(const char* outputFile, int width, int height, int fps, int sampleRate, int channels);
This method configures the output context, selects appropriate encoders based on file/stream destination, and sets up:
- Video and audio codec contexts
- Resampling (SwrContext) and pixel format conversion (SwsContext)
- FIFO buffer for audio
- File or stream I/O
- FFmpeg muxing headers
Arguments:
- outputFile: Target file path or stream URL
- width, height: Video resolution
- fps: Video frame rate
- sampleRate: Audio sample rate (e.g., 44100)
- channels: Number of audio channels
🟢 Recording
bool sendVideoFrame(unsigned char* rgbaData, int64_t timestamp);
Converts incoming RGBA video data to YUV420P and sends it for encoding.
- Timestamps are converted to stream time_base using FFmpeg scaling.
- Allocates and prepares an AVFrame for encoding.
- Encoded frames are written to the output stream.
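The RGBA to YUV420P step can be illustrated without FFmpeg. The following is a minimal sketch, not the recorder's actual code: the class performs this conversion through SwsContext, whereas the sketch does it by hand with the common BT.601 limited-range integer approximation. The names `Yuv420Image` and `rgbaToYuv420p` are hypothetical, and the input is assumed to be tightly packed with even width and height.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Planar YUV420P image: full-resolution Y, quarter-resolution U and V.
// (Hypothetical type for illustration; FFmpeg uses AVFrame instead.)
struct Yuv420Image {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> y, u, v;
};

static std::uint8_t clampToByte(int value) {
    return static_cast<std::uint8_t>(value < 0 ? 0 : (value > 255 ? 255 : value));
}

// Converts tightly packed RGBA (4 bytes per pixel) to YUV420P using
// BT.601 limited-range integer coefficients. Assumes even width and height.
Yuv420Image rgbaToYuv420p(const std::uint8_t* rgba, int width, int height) {
    Yuv420Image out;
    out.width = width;
    out.height = height;
    out.y.resize(static_cast<std::size_t>(width) * height);
    out.u.resize(static_cast<std::size_t>(width / 2) * (height / 2));
    out.v.resize(out.u.size());

    for (int row = 0; row < height; row += 2) {
        for (int col = 0; col < width; col += 2) {
            int sumR = 0, sumG = 0, sumB = 0;
            // Compute luma per pixel of the 2x2 block; accumulate RGB for chroma.
            for (int dy = 0; dy < 2; ++dy) {
                for (int dx = 0; dx < 2; ++dx) {
                    const std::size_t px =
                        static_cast<std::size_t>(row + dy) * width + (col + dx);
                    const int r = rgba[px * 4 + 0];
                    const int g = rgba[px * 4 + 1];
                    const int b = rgba[px * 4 + 2];  // alpha (index 3) is ignored
                    out.y[px] = clampToByte(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
                    sumR += r;
                    sumG += g;
                    sumB += b;
                }
            }
            // One U and one V sample per 2x2 block, from the averaged color.
            const int r = sumR / 4, g = sumG / 4, b = sumB / 4;
            const std::size_t c =
                static_cast<std::size_t>(row / 2) * (width / 2) + col / 2;
            out.u[c] = clampToByte(((-38 * r - 74 * g + 112 * b + 128) >> 8) + 128);
            out.v[c] = clampToByte(((112 * r - 94 * g - 18 * b + 128) >> 8) + 128);
        }
    }
    return out;
}
```

The chroma planes hold a quarter as many samples as the luma plane, which is why each 2x2 block of pixels shares a single U and V value. A production path would normally leave this to libswscale, which also handles row padding and other color matrices.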
⏱ Timestamping
int64_t getRecordingTimestamp();
Returns the number of milliseconds elapsed since recording started, measured with a high-resolution clock. The value is used to keep audio and video in sync.
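A millisecond clock of this kind, together with the conversion of its values into a stream time base, might look roughly like the sketch below. `RecordingClock` and `msToTimeBase` are hypothetical names, and the choice of `std::chrono::steady_clock` is an assumption; the page does not say which clock the class uses. In FFmpeg code the rescaling is usually delegated to `av_rescale_q`; the plain-arithmetic version here only illustrates the idea for non-negative timestamps.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: measures milliseconds elapsed since construction,
// taken here to be the moment recording starts.
class RecordingClock {
public:
    RecordingClock() : m_start(std::chrono::steady_clock::now()) {}

    std::int64_t elapsedMs() const {
        const auto elapsed = std::chrono::steady_clock::now() - m_start;
        return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count();
    }

private:
    std::chrono::steady_clock::time_point m_start;
};

// Rescales a non-negative millisecond timestamp into ticks of a time base
// of tbNum/tbDen seconds per tick, rounding to the nearest tick.
std::int64_t msToTimeBase(std::int64_t ms, std::int64_t tbNum, std::int64_t tbDen) {
    // ticks = (ms / 1000) / (tbNum / tbDen) = ms * tbDen / (1000 * tbNum)
    const std::int64_t divisor = 1000 * tbNum;
    return (ms * tbDen + divisor / 2) / divisor;
}
```

With a time base of 1/30 (one tick per frame at 30 fps), a timestamp of 1000 ms maps to tick 30. A monotonic clock is used in the sketch so that system clock adjustments cannot make timestamps jump backwards.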
🎵 Audio Flow
The audio functions are designed to:
- Accept interleaved AV_SAMPLE_FMT_FLT samples
- Use SwrContext to resample to a codec-compatible format
- Accumulate samples in a FIFO until a full frame (e.g., 1024 samples) can be encoded
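The accumulate-then-encode pattern in the last step can be sketched as follows. FFmpeg provides AVAudioFifo for this purpose; the sketch instead uses a plain `std::deque` so that it compiles without FFmpeg, and `SampleFifo` is a hypothetical name, not part of the recorder.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for an audio FIFO: collects interleaved float
// samples and hands them out in fixed-size frames.
class SampleFifo {
public:
    SampleFifo(int channels, int frameSize)
        : m_channels(channels), m_frameSize(frameSize) {}

    // Appends `sampleCount` samples per channel of interleaved input.
    void push(const float* interleaved, int sampleCount) {
        m_buffer.insert(m_buffer.end(), interleaved,
                        interleaved + static_cast<std::size_t>(sampleCount) * m_channels);
    }

    // Number of samples per channel currently buffered.
    int available() const {
        return static_cast<int>(m_buffer.size() / m_channels);
    }

    // Pops one full frame into `out` and returns true, or returns false
    // (leaving the buffer untouched) if fewer than frameSize samples are queued.
    bool popFrame(std::vector<float>& out) {
        if (available() < m_frameSize) {
            return false;
        }
        const std::ptrdiff_t count =
            static_cast<std::ptrdiff_t>(m_frameSize) * m_channels;
        out.assign(m_buffer.begin(), m_buffer.begin() + count);
        m_buffer.erase(m_buffer.begin(), m_buffer.begin() + count);
        return true;
    }

private:
    int m_channels;
    int m_frameSize;
    std::deque<float> m_buffer;
};
```

Capture devices often deliver chunks that do not match the encoder's frame size (for example 480 samples against a 1024-sample frame), so a caller would typically push every chunk and then loop on `popFrame` until it returns false, encoding each frame it receives.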
Key Members (Audio):
- m_aacFrameSize: Frame size for AAC
- m_audioFifo: Sample buffer to align input to required frame size
- m_audioSamplesProcessed, m_totalAudioSamplesQueued: Stats
- m_audioTimingInitialized: Ensures sync logic starts from a stable point
🧠 Internal Design Notes
Stream Format Decision:
- Chooses container and codec formats based on outputFile (e.g., flv for RTMP, guessed for .mp4)
Encoder Hints:
- Preset: "medium"
- CRF: "23" for good quality and compression
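The stream format decision described above could be sketched like this. `chooseContainerFormat` is a hypothetical helper, and the exact rules the class applies are not documented here; the sketch only covers the two cases the notes mention. In FFmpeg code, the returned name would typically be passed as the format name argument of `avformat_alloc_output_context2`, with a null name leaving the muxer to be guessed from the file name.

```cpp
#include <string>

// Returns a muxer short name to force, or an empty string to let the
// muxer be guessed from the file extension. (Hypothetical helper.)
std::string chooseContainerFormat(const std::string& outputFile) {
    const std::string rtmpPrefix = "rtmp://";
    if (outputFile.compare(0, rtmpPrefix.size(), rtmpPrefix) == 0) {
        return "flv";  // RTMP endpoints expect FLV-packaged streams
    }
    return "";  // e.g. "out.mp4" or "out.mkv": guess from the extension
}
```

For example, `chooseContainerFormat("rtmp://live.example.com/app/key")` yields `"flv"`, while `chooseContainerFormat("out.mp4")` yields an empty string.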
Robust Logging:
- Uses av_strerror to provide clear FFmpeg error messages
🛑 Dependencies
You need the FFmpeg development libraries installed:
- libavformat
- libavcodec
- libswscale
- libswresample
- libavutil