Multimodal

About

Systems that can understand different input types, such as text, speech, images, and videos.

See also

[[GPT-4]] | [[Gemini]] | Llama 3.2