Multimodal - AshokBhat/ml GitHub Wiki

About

  • Systems that can understand multiple types of input, such as text, speech, images, and video.

See also

  • GPT-4 | Gemini | Llama 3.2