Brainstorming - AIMLOps-C4-G16/aimlops-capstone-project GitHub Wiki
Possible prototypes
- Image captioning from Google Cloud (link)
- Microsoft Image Captioning (link)
- Hugging Face (link)
- Text-to-image search using the standard OpenAI CLIP model + FAISS vector store (link) (link)
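To make the CLIP + FAISS idea concrete, here is a minimal sketch of the retrieval step only. It assumes embeddings already exist: the vectors below are hand-written placeholders, not real CLIP outputs, and the names `normalize` and `search` are hypothetical. The brute-force cosine search stands in for what an exact inner-product index (such as FAISS's `IndexFlatIP` over L2-normalized vectors) would compute.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def search(query_vec, image_vecs, k=2):
    """Return the k (index, score) pairs with the highest cosine similarity."""
    q = normalize(query_vec)
    scores = [
        (i, sum(a * b for a, b in zip(q, normalize(v))))
        for i, v in enumerate(image_vecs)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


# Placeholder embeddings (assumption): real ones would come from CLIP's
# image encoder, and the query from CLIP's text encoder.
image_vecs = [
    [1.0, 0.0, 0.0],  # hypothetical image 0
    [0.0, 1.0, 0.0],  # hypothetical image 1
    [0.9, 0.1, 0.0],  # hypothetical image 2, close to image 0
]
query_vec = [1.0, 0.05, 0.0]  # hypothetical text-query embedding

results = search(query_vec, image_vecs, k=2)
top_ids = [idx for idx, _ in results]
```

In a prototype, the list scan would likely be swapped for a vector index once the image collection grows; the ranking logic stays the same.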
Architectures
- YOLOv3 encoder + Llama 2 decoder
- Fine-tune an existing VLM, e.g. Llama 3.2 Vision Instruct
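One possible way to connect a vision encoder to a text decoder is sketched below: project the encoder's feature vectors to the decoder's embedding width and prepend them to the text token embeddings. This is an assumption about how the pairing could be wired, not a decided design; the function names, toy sizes, and weights are all hypothetical, and plain lists stand in for tensors.

```python
def project(features, weight):
    """Linearly map each visual feature vector to the decoder's embedding width.

    features: list of vectors of width d_in
    weight:   d_in x d_out matrix given as a list of rows
    """
    columns = list(zip(*weight))
    return [
        [sum(f * w for f, w in zip(feat, col)) for col in columns]
        for feat in features
    ]


def build_decoder_input(visual_features, weight, text_embeddings):
    """Prepend projected visual tokens to the text token embeddings."""
    return project(visual_features, weight) + text_embeddings


# Toy sizes (assumptions): 2 visual tokens of width 3, decoder width 2.
visual_features = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # would be learned in training
text_embeddings = [[0.1, 0.2]]  # one hypothetical text token

decoder_input = build_decoder_input(visual_features, weight, text_embeddings)
```

With this layout only the projection needs training at first, which keeps the experiment cheap compared with fine-tuning a full VLM.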