14: DESIGN YOUTUBE - swchen1234/systemDesign GitHub Wiki

Because YouTube is so widely used and popular, it relies on a large number of complex technologies.

Step 1 - Understand the problem and establish design scope

Questions to ask:

  • important features
  • what kind of clients? mobile app/web/tv?
  • average user usage
  • global?
  • resolution
  • encryption
  • video size limit?
  • can we leverage existing cloud infra?

Back of the envelope estimation

  • Assume the product has 5 million daily active users (DAU).
  • Users watch 5 videos per day.
  • 10% of users upload 1 video per day.
  • Assume the average video size is 300 MB.
  • Total daily storage space needed: 5 million * 10% * 300 MB = 150 TB
  • CDN cost
    • The CDN charges come to roughly $150,000/day
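The arithmetic above can be reproduced with a short script. This is only a sketch of the estimate: the constants come from the list above, except `CDN_PRICE_PER_GB_USD`, which is an assumed price of $0.02 per GB streamed (the notes do not state the price, but this value reproduces the $150,000/day figure). Decimal units (1 TB = 1,000,000 MB) are assumed.

```python
# Back-of-the-envelope estimate; values mirror the assumptions listed above.
DAU = 5_000_000                  # daily active users
VIDEOS_WATCHED_PER_USER = 5      # videos watched per user per day
UPLOADER_PERCENT = 10            # percent of users uploading 1 video per day
AVG_VIDEO_SIZE_MB = 300          # average video size
CDN_PRICE_PER_GB_USD = 0.02      # ASSUMPTION: not stated in the notes


def daily_storage_tb() -> float:
    """New storage needed per day, in TB (decimal units)."""
    uploads_per_day = DAU * UPLOADER_PERCENT // 100
    return uploads_per_day * AVG_VIDEO_SIZE_MB / 1_000_000


def daily_cdn_cost_usd() -> float:
    """Streaming cost per day if every watched video is served by the CDN."""
    gb_streamed = DAU * VIDEOS_WATCHED_PER_USER * AVG_VIDEO_SIZE_MB / 1000
    return gb_streamed * CDN_PRICE_PER_GB_USD


if __name__ == "__main__":
    print(f"storage: {daily_storage_tb():.0f} TB/day")
    print(f"CDN cost: ${daily_cdn_cost_usd():,.0f}/day")
```

Changing a single constant (for example the uploader percentage) shows quickly which assumption dominates the cost.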

Step 2 - Propose high-level design and get buy-in

We will leverage existing cloud services such as a CDN and blob storage, because:

  • Building scalable blob storage or CDN is extremely complex and costly.
  • System design interviews are not about building everything from scratch

  • Client: You can watch YouTube on your computer, mobile phone, and smart TV.
  • CDN: Videos are stored in the CDN. When you press play, a video is streamed from the CDN.
  • API servers: Everything except video streaming goes through the API servers. This includes feed recommendation, generating the video upload URL, updating the metadata database and cache, user signup, etc.

Video uploading flow

The whole uploading flow can be split into two processes that run in parallel:

Flow a: upload the actual video

Flow b: update the metadata

While a file is being uploaded to the original storage, the client sends a request in parallel to update the video metadata. (Open question: I don't follow this. Doesn't flow a say that the metadata update is done by the completion handler?)
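A minimal sketch of the two flows running side by side, assuming in-memory dicts stand in for the original (blob) storage and the metadata database. All function names and the metadata fields are hypothetical and only illustrate the "in parallel" part; transcoding and the completion handler are not modeled.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_video(storage: dict, video_id: str, video_bytes: bytes) -> int:
    """Flow a (hypothetical): write the raw bytes to the original storage."""
    storage[video_id] = video_bytes
    return len(video_bytes)


def update_metadata(metadata_db: dict, video_id: str, metadata: dict) -> None:
    """Flow b (hypothetical): record the metadata sent by the client."""
    metadata_db[video_id] = dict(metadata)


def upload(video_id, video_bytes, metadata, storage, metadata_db) -> int:
    """Start both flows at once and wait until both have finished."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        flow_a = pool.submit(upload_video, storage, video_id, video_bytes)
        flow_b = pool.submit(update_metadata, metadata_db, video_id, metadata)
        flow_b.result()          # re-raises any error from flow b
        return flow_a.result()   # number of bytes stored by flow a


if __name__ == "__main__":
    storage, metadata_db = {}, {}
    stored = upload("v1", b"raw video bytes", {"title": "demo"}, storage, metadata_db)
    print(stored, metadata_db["v1"])
```

Neither flow waits for the other, so the metadata written by the client can be available before the (much larger) video upload completes.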

Video streaming flow

Unlike downloading, when you watch streaming videos, your client loads a little bit of data at a time so you can watch videos immediately and continuously.

  • Streaming protocol is a standardized way to control data transfer for video streaming.
  • Different streaming protocols support different video encodings and playback players.
  • Videos are streamed from the CDN directly. The edge server closest to you delivers the video, so there is very little latency.
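The "a little bit of data at a time" idea can be sketched with a generator. This is not an implementation of any real streaming protocol (such as HLS or MPEG-DASH); the function names and the fixed `chunk_size` are assumptions used only to show incremental loading.

```python
from typing import Iterator


def stream_chunks(video: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield the video in small pieces instead of one large download."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(video), chunk_size):
        yield video[offset:offset + chunk_size]


def play(video: bytes, chunk_size: int) -> bytes:
    """Consume chunks as they arrive; playback can start after the first one."""
    played = bytearray()
    for chunk in stream_chunks(video, chunk_size):
        played.extend(chunk)  # a real player would decode and render here
    return bytes(played)


if __name__ == "__main__":
    for chunk in stream_chunks(b"abcdefghij", 4):
        print(chunk)
```

The client never needs the whole file in hand before playback begins, which is the key difference from a plain download.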

Step 3 - Design deep dive

Video transcoding

When you record a video, the device (usually a phone or camera) gives the video file a certain format. If you want the video to be played smoothly on other devices, the video must be encoded into compatible bitrates and formats. Bitrate is the rate at which bits are processed over time. A higher bitrate generally means higher video quality. High bitrate streams need more processing power and fast internet speed.
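To make the bitrate definition concrete, the size of an encoded stream is roughly bitrate times duration. The helper below is a small illustrative sketch; the function name and the example bitrate are assumptions, and container overhead is ignored.

```python
def stream_size_mb(bitrate_kbps: float, duration_s: float) -> float:
    """Approximate size in MB (decimal) of a stream at a constant bitrate."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000


if __name__ == "__main__":
    # A 10-minute video at an assumed 5000 kbps
    print(stream_size_mb(5000, 600), "MB")
```

The same relation explains the bandwidth requirement: to play without stalling, the connection must sustain at least the stream's bitrate.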

Why video transcoding is important:

  • Raw video consumes large amounts of storage space. An hour-long high definition video recorded at 60 frames per second can take up a few hundred GB of space.
  • Many devices and browsers only support certain types of video formats.
  • To ensure users watch high-quality videos while maintaining smooth playback, it is a good idea to deliver higher resolution video to users who have high network bandwidth and lower resolution video to users who have low bandwidth.
  • Network conditions can change, especially on mobile devices. To ensure a video is played continuously, switching video quality automatically or manually based on network conditions is essential for smooth user experience.
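The last two points describe adaptive quality selection. A minimal sketch, assuming a made-up rendition ladder and a simple safety margin; real players use more elaborate heuristics, and none of the numbers below come from the notes.

```python
# Hypothetical rendition ladder: (label, bandwidth needed in kbps), best first.
RENDITIONS = [("1080p", 5000), ("720p", 2500), ("480p", 1000), ("360p", 600)]


def pick_rendition(measured_kbps: float, safety_factor: float = 0.8) -> str:
    """Pick the best rendition that fits within a fraction of the bandwidth."""
    budget = measured_kbps * safety_factor
    for label, required_kbps in RENDITIONS:
        if required_kbps <= budget:
            return label
    # Nothing fits: fall back to the lowest quality rather than stalling.
    return RENDITIONS[-1][0]


if __name__ == "__main__":
    for bandwidth in (10_000, 3_500, 500):
        print(bandwidth, "kbps ->", pick_rendition(bandwidth))
```

Calling `pick_rendition` again whenever the measured bandwidth changes gives the automatic switching described above; this only works if the video was transcoded into every rendition ahead of time.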

Most encoding formats contain two parts:

  • Container: This is like a basket that contains the video file, audio, and metadata. You can tell the container format by the file extension, such as .avi, .mov, or .mp4.
  • Codecs: These are compression and decompression algorithms that aim to reduce the video size while preserving the video quality.

Directed acyclic graph (DAG) model

Different content creators may have different video processing requirements. To support different video processing pipelines and maintain high parallelism, it is important to add some level of abstraction and let client programmers define what tasks to execute. For example, Facebook’s streaming video engine uses a directed acyclic graph (DAG) programming model, which defines tasks in stages so they can be executed sequentially or in parallel.
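A small sketch of the DAG idea using Python's standard `graphlib` module (Python 3.9+). The task names in `PIPELINE` are hypothetical and are not taken from any real engine; the point is that tasks whose dependencies are all finished form a stage that could run in parallel, while stages themselves run one after another.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
PIPELINE = {
    "split": set(),
    "video_encoding": {"split"},
    "audio_encoding": {"split"},
    "thumbnail": {"split"},
    "merge": {"video_encoding", "audio_encoding", "thumbnail"},
}


def run_in_stages(dag: dict) -> list:
    """Group tasks into stages; tasks within one stage are independent."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        stages.append(ready)
        sorter.done(*ready)  # a real engine would execute the tasks first
    return stages


if __name__ == "__main__":
    for number, stage in enumerate(run_in_stages(PIPELINE), start=1):
        print(f"stage {number}: {stage}")
```

Because the pipeline is plain data, a different creator's requirements (for example, adding a watermark task) only change the dictionary, not the scheduler.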

Video transcoding architecture

Error handling

The details are too complex to summarize here; please see the original book.

Step 4 - Wrap up

Additional points:

  • Scale the API tier
  • Scale the database
  • Live streaming
    • Live streaming has a higher latency requirement, so it might need a different streaming protocol.
    • Live streaming has a lower requirement for parallelism because small chunks of data are already processed in real-time.
    • Live streaming requires different sets of error handling. Any error handling that takes too much time is not acceptable.
  • Video takedowns: objectionable or illegal videos must be taken down.