
🧪 Code Profiling: sendVideoFrame()

In this section, we'll compare two versions of a function responsible for sending video frames with FFmpeg. The first contains poor practices that highlight several performance and memory issues, while the second demonstrates a clean, production-ready implementation.


โŒ The original version: sendVideoFrame()

bool FFmpegRecorder::sendVideoFrame(unsigned char* rgbaData, int64_t timestamp) {
    if (!rgbaData || !videoCodecContext) {
        fprintf(stderr, "Invalid input for sendVideoFrameWorse\n");
        return false;
    }

    SwsContext* tempSwsContext = sws_getContext(
        videoCodecContext->width, videoCodecContext->height, AV_PIX_FMT_BGRA,
        videoCodecContext->width, videoCodecContext->height, AV_PIX_FMT_YUV420P,
        SWS_BICUBIC, nullptr, nullptr, nullptr);

    if (!tempSwsContext) {
        fprintf(stderr, "Failed to allocate SWS context in sendVideoFrameWorse\n");
        return false;
    }

    AVFrame* frame = av_frame_alloc();
    if (!frame) {
        fprintf(stderr, "Failed to allocate video frame in worse version\n");
        return false;
    }

    frame->width = videoCodecContext->width;
    frame->height = videoCodecContext->height;
    frame->format = videoCodecContext->pix_fmt;

    int ret = av_frame_get_buffer(frame, 1);
    if (ret < 0) {
        char errBuf[AV_ERROR_MAX_STRING_SIZE] = { 0 };
        av_strerror(ret, errBuf, AV_ERROR_MAX_STRING_SIZE);
        fprintf(stderr, "Failed to allocate video frame buffer: %s\n",
        errBuf);
        av_frame_free(&frame);
        sws_freeContext(tempSwsContext);
        return false;
    }

    uint8_t* srcData[1] = { rgbaData };
    int srcLinesize[1] = { 4 * videoCodecContext->width };

    sws_scale(tempSwsContext, srcData, srcLinesize, 0,
        videoCodecContext->height, frame->data, frame->linesize);

    frame->pts = av_rescale_q(timestamp, {1, 1000}, videoCodecContext->time_base);

    bool success = (encodeVideoFrame(frame) >= 0);

    av_frame_free(&frame);

    // โŒ Forgot to free tempSwsContext
    return success;
}

๐Ÿ” Why this version is problematic

  1. Re-initializing sws_getContext on every frame
    • sws_getContext is computationally expensive and meant to be created once and reused (see the sketch after this list)
    • High CPU usage, frame drops, and overall poor performance
  2. Allocating AVFrame and buffer for every frame
    • Repeatedly calling av_frame_alloc and av_frame_get_buffer without reuse
    • Increased memory churn and slower execution
  3. Missing sws_freeContext() on failure paths
    • Memory leak if error occurs before freeing tempSwsContext
    • Steady memory growth → eventual crash in long-running apps
  4. Poor buffer alignment with av_frame_get_buffer(frame, 1)
    • 1-byte alignment is insufficient for many SIMD operations
    • Slower pixel processing
  5. No call to av_frame_make_writable()
    • If frames were pooled/reused, they might be read-only
    • May silently fail or corrupt data
  6. Using external timestamp directly for PTS
    • timestamp might be jittery or non-monotonic
    • Playback glitches, A/V desync, or encoder errors
  7. Memory leak: forgot to sws_freeContext()
    • tempSwsContext is never released
    • Massive memory leak across many frames
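
The improved version below relies on a scaler context that is created once during initialization and released once during shutdown. As a point of reference, here is a minimal sketch of what that setup could look like; the members swsContext and videoCodecContext mirror the improved code, but initializeScaler() and releaseScaler() are hypothetical helper names, not functions taken from the project.

#include <cstdio>
extern "C" {
#include <libswscale/swscale.h>
}

// Hypothetical one-time setup: the scaler context is created once and reused for every frame.
bool FFmpegRecorder::initializeScaler() {
    swsContext = sws_getContext(
        videoCodecContext->width, videoCodecContext->height, AV_PIX_FMT_BGRA,
        videoCodecContext->width, videoCodecContext->height, AV_PIX_FMT_YUV420P,
        SWS_BICUBIC, nullptr, nullptr, nullptr);
    if (!swsContext) {
        fprintf(stderr, "Failed to allocate SWS context\n");
        return false;
    }
    return true;
}

// Hypothetical one-time teardown: sws_freeContext() accepts nullptr, so calling this twice is safe.
void FFmpegRecorder::releaseScaler() {
    sws_freeContext(swsContext);
    swsContext = nullptr;
}

With a shared context like this, sendVideoFrame() only has to call sws_scale(), which removes the per-frame setup cost described in point 1 and the leak described in point 7.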

✅ The Improved Version: sendVideoFrame()

bool FFmpegRecorder::sendVideoFrame(unsigned char* rgbaData, int64_t timestamp) {
    if (!rgbaData || !videoCodecContext || !swsContext) {
        fprintf(stderr, "Invalid input for sendVideoFrame\n");
        return false;
    }

    int64_t currentRecordingRelativeMs = getRecordingTimestamp();
    int64_t video_pts_val = av_rescale_q(currentRecordingRelativeMs,
        AVRational{ 1, 1000 },
        videoCodecContext->time_base);

    AVFrame* frame = av_frame_alloc();
    if (!frame) {
        fprintf(stderr, "Failed to allocate video frame\n");
        return false;
    }

    frame->width = videoCodecContext->width;
    frame->height = videoCodecContext->height;
    frame->format = videoCodecContext->pix_fmt;

    int ret = av_frame_get_buffer(frame, 32);
    if (ret < 0) {
        char errBuf[AV_ERROR_MAX_STRING_SIZE] = { 0 };
        av_strerror(ret, errBuf, AV_ERROR_MAX_STRING_SIZE);
        fprintf(stderr, "Failed to allocate video frame buffer: %s\n", errBuf);
        av_frame_free(&frame);
        return false;
    }

    ret = av_frame_make_writable(frame);
    if (ret < 0) {
        char errBuf[AV_ERROR_MAX_STRING_SIZE] = { 0 };
        av_strerror(ret, errBuf, AV_ERROR_MAX_STRING_SIZE);
        fprintf(stderr, "Failed to make video frame writable: %s\n", errBuf);
        av_frame_free(&frame);
        return false;
    }

    uint8_t* srcData[1] = { rgbaData };
    int srcLinesize[1] = { 4 * videoCodecContext->width };

    sws_scale(swsContext, srcData, srcLinesize, 0,
        videoCodecContext->height, frame->data, frame->linesize);

    frame->pts = video_pts_val;
    videoFrameCount++;

    bool success = (encodeVideoFrame(frame) >= 0);

    av_frame_free(&frame);
    return success;
}

✅ Why this version is better

  • Reuses swsContext, initialized once in initialize().
  • Manages memory properly: allocates and frees per frame without leaks.
  • Uses 32-byte buffer alignment, which satisfies common SIMD alignment requirements.
  • Calls av_frame_make_writable() to prevent undefined behavior on reused frames.
  • Uses a consistent timestamp source (getRecordingTimestamp()), which ensures monotonicity (see the sketch after this list).
  • Clean encoding pipeline: receives data → converts → encodes → frees.
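
The benefit of the consistent timestamp source depends on getRecordingTimestamp() never moving backwards. The project's actual implementation isn't shown on this page; as a rough sketch under that assumption, it could simply report the milliseconds elapsed on a steady clock since recording started (recordingStartTime being an assumed member set once when recording begins).

#include <chrono>

// Hypothetical sketch: milliseconds since recording started, taken from a steady
// (monotonic) clock, so consecutive calls can never return a smaller value.
int64_t FFmpegRecorder::getRecordingTimestamp() const {
    using namespace std::chrono;
    return duration_cast<milliseconds>(steady_clock::now() - recordingStartTime).count();
}

Rescaling such a value with av_rescale_q(ms, AVRational{1, 1000}, videoCodecContext->time_base) then yields non-decreasing PTS values; for example, 1500 ms with a 1/30 time base becomes pts = 45.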