LLM stacks \ 4.5.1 External data RAG - terrytaylorbonn/auxdrone GitHub Wiki
25.0717 Lab notes (Gdrive), Git
- Agent calls RAG DB API ✅. Why does the agent decide to call RAG?
    - It depends on the implementation! In simple_rag.py, the agent always calls RAG first
- More sophisticated agents might:
- Check if question seems factual first
- Try RAG, then fallback to general knowledge
- Let the model decide: "Do I need to search documents?"
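The decision options above can be sketched as a simple gate. This is a minimal, hypothetical heuristic (the function name `should_use_rag` and the cue words are assumptions, not code from simple_rag.py): factual-sounding questions go to the RAG DB first, everything else falls back to the model's general knowledge.

```python
# Hypothetical sketch: gating retrieval on a cheap heuristic.
# simple_rag.py always retrieves first; this shows the "check if the
# question seems factual" variant mentioned above.

FACTUAL_CUES = ("who", "what", "when", "where", "which", "how many")

def should_use_rag(question: str) -> bool:
    """Crude heuristic: factual-sounding questions trigger a RAG lookup."""
    q = question.lower()
    return q.startswith(FACTUAL_CUES) or "according to" in q

print(should_use_rag("When was the drone spec updated?"))  # True
print(should_use_rag("Write me a poem about drones."))     # False
```

A real agent would more likely use the "let the model decide" option, e.g. a tool-calling model that emits a search request, but the control flow is the same: a yes/no branch before retrieval.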
- 1 Agent vectorizes prompt ✅
- Agent uses embedding model (e.g., "nomic-embed-text") to convert prompt to vectors
- Same embedding model was used to create the RAG database vectors
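The key constraint in step 1 is that the query and the documents must pass through the *same* embedding function. The sketch below uses a toy hash-based stand-in for a real embedding model such as nomic-embed-text (the `embed` function and its 8-dim output are assumptions for illustration only):

```python
import hashlib

def embed(text: str, dim: int = 8) -> list[float]:
    """Toy stand-in for an embedding model like nomic-embed-text:
    hashes each word into a fixed-size vector. The point being
    illustrated: the SAME function must embed both the RAG database
    documents and the incoming prompt, or similarity is meaningless."""
    vec = [0.0] * dim
    for word in text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    return vec

query_vec = embed("battery specs for the quadcopter")
print(len(query_vec))  # 8
```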
- 2 Agent finds best match and retrieves ORIGINAL TEXT ✅
- "Original text" is the actual readable text that was vectorized
- The vectors are just for finding the text
    - For example: `return results["documents"][0]` ← this returns readable text
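Step 2 in miniature: each DB entry pairs a vector with its original text, nearest-neighbor search runs on the vectors, but what comes back is the text. (The `db` contents and `retrieve` helper below are illustrative assumptions, not the actual simple_rag.py / ChromaDB code.)

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Each entry stores the vector AND the original readable text;
# the vector is only the lookup key, the text is what gets returned.
db = [
    ([1.0, 0.0, 0.0], "The quadcopter battery is a 4S 5000 mAh LiPo."),
    ([0.0, 1.0, 0.0], "Firmware updates are flashed over USB-C."),
]

def retrieve(query_vec: list[float]) -> str:
    best = max(db, key=lambda entry: cosine(entry[0], query_vec))
    return best[1]  # ORIGINAL TEXT, analogous to results["documents"][0]

print(retrieve([0.9, 0.1, 0.0]))  # battery sentence wins on similarity
```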
- 3 Agent gives the retrieved text to the model, along with the original human prompt.
    - For example:
        - `context = relevant_docs[0][:200]  # Retrieved text`
        - `prompt = f"Based on this: '{context}' Answer: {question}"  # Both context + original question`
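Putting step 3 together as a runnable fragment (the `relevant_docs` contents and `question` text are made-up sample data; only the truncate-then-format pattern mirrors the snippet above):

```python
# Sample data standing in for the retrieval result and user input.
relevant_docs = ["The quadcopter battery is a 4S 5000 mAh LiPo pack rated for 20 minutes of flight."]
question = "What battery does the quadcopter use?"

context = relevant_docs[0][:200]  # retrieved text, truncated to keep the prompt small
prompt = f"Based on this: '{context}' Answer: {question}"  # both context + original question

print(prompt)
```

The truncation to 200 characters is a blunt way to keep the augmented prompt inside the model's context window; production code would usually budget by tokens instead.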