LLM stacks \ 4.5.1 External data RAG - terrytaylorbonn/auxdrone GitHub Wiki

25.0717 Lab notes (Gdrive), Git

  • Agent calls RAG DB API ✅. Why does the agent decide to call RAG?
    • Depends on implementation! In simple_rag.py, it always calls RAG first
    • More sophisticated agents might:
      • Check if question seems factual first
      • Try RAG, then fallback to general knowledge
      • Let the model decide: "Do I need to search documents?"
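The decision logic above can be sketched as a tiny dispatcher. This is a minimal illustration, not code from simple_rag.py; the helper names (`looks_factual`, `search_docs`, `ask_model`) and the wh-word heuristic are all assumptions:

```python
def looks_factual(question: str) -> bool:
    # Crude stand-in heuristic: factual questions often open with a wh-word.
    wh_words = ("what", "who", "when", "where", "which", "how many")
    return question.lower().strip().startswith(wh_words)

def answer(question, search_docs, ask_model):
    # Try RAG first for factual-looking questions; fall back to the
    # model's general knowledge when retrieval finds nothing.
    if looks_factual(question):
        docs = search_docs(question)
        if docs:  # retrieval succeeded -> ground the answer in a document
            return ask_model(question, context=docs[0])
    return ask_model(question, context=None)  # general-knowledge fallback
```

A production agent might instead ask the model itself ("Do I need to search documents?") and branch on its reply; the shape of the dispatch stays the same.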
  • 1 Agent vectorizes prompt ✅
    • Agent uses embedding model (e.g., "nomic-embed-text") to convert prompt to vectors
    • Same embedding model was used to create the RAG database vectors
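To show why the same embedding model must be used on both sides, here is a toy bag-of-words embedder. The vocabulary and function are illustrative stand-ins; a real setup would call an embedding model such as "nomic-embed-text" via its API instead:

```python
from collections import Counter

# Toy vocabulary (assumption, for illustration only). A real embedding
# model learns a dense vector space, but the principle is identical.
VOCAB = ["drone", "battery", "camera", "motor", "flight"]

def embed(text: str) -> list[float]:
    # Stand-in for the embedding call: count vocabulary words.
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in VOCAB]
```

The same `embed()` has to be applied both when building the RAG database and when vectorizing the incoming prompt; otherwise the query vector and the stored document vectors live in different spaces and are not comparable.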
  • 2 Agent finds best match and retrieves ORIGINAL TEXT ✅
    • "Original text" is the actual readable text that was vectorized
    • The vectors are just for finding the text
    • For example: `return results["documents"][0]` ← this returns readable text
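The matching step above can be sketched with plain cosine similarity. This is a minimal nearest-neighbor search, not the vector DB's actual internals; the function names are assumptions:

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors of equal length.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec, doc_vecs, documents):
    # The vectors only pick the winner; what gets handed back is the
    # ORIGINAL readable text, not the vector itself.
    best = max(range(len(documents)),
               key=lambda i: cosine(query_vec, doc_vecs[i]))
    return documents[best]
```

A vector DB like Chroma does the same thing at scale (with indexing tricks), which is why `results["documents"][0]` comes back as plain text.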
  • 3 Agent gives the retrieved text to the model, along with the original human prompt.
    • For example:
      • `context = relevant_docs[0][:200]  # Retrieved text`
      • `prompt = f"Based on this: '{context}' Answer: {question}"  # Both context + original question`
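Wrapping those two lines in a function makes the prompt-assembly step concrete (the function name `build_prompt` is an illustrative assumption, not from simple_rag.py):

```python
def build_prompt(question: str, relevant_docs: list[str]) -> str:
    # Truncate the retrieved text, then splice it together with the
    # original human question, exactly as in the snippet above.
    context = relevant_docs[0][:200]  # retrieved text
    return f"Based on this: '{context}' Answer: {question}"
```

The model then sees both the grounding context and the untouched question in a single prompt, which is what lets it answer from the retrieved document instead of from its general training data.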