3 ChatMAQ - terrytaylorbonn/auxdrone GitHub Wiki

25.0223 (0214) (Gdrive)

4.5 RAG

(was 3 Chat (CusSup) (MAQ/LLM/RAG))

Why RAG?

(many of the ideas below are my own non-expert concepts; I am not sure they are correct)

I want the local docs that I create to be "injected" into ChatGPT, so that a user can ask GPT instead of reading my docs. I was not sure this was possible.

After a few days of searching the internet, I think I found what I am looking for: RAG.

  • G = Generate a (GPT) response.
  • A = Augment the prompt with my own custom docs, so that the response draws on them.
  • R = Retrieve my content (which I put into a custom Supabase DB using my own crawler; demo'd in #299).
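
The three steps can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions: the `DOCS` list stands in for the Supabase DB, retrieval uses plain word overlap instead of embeddings, and `generate` calls a stub instead of a real GPT API. All function names here are hypothetical.

```python
import re

# Hypothetical stand-in for the custom doc store (the real demo uses a Supabase DB).
DOCS = [
    "The auxdrone flight controller is configured over USB.",
    "Battery packs must be balanced before each flight.",
    "The wiki is generated from docx lab notes.",
]

def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, docs, k=2):
    """R: rank docs by word overlap with the query (toy scoring, not embeddings)."""
    q = tokenize(query)
    ranked = sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def augment(query, chunks):
    """A: put the retrieved chunks into the prompt as context."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def generate(prompt, llm=None):
    """G: call the model. `llm` is any callable str -> str; the default is a stub."""
    if llm is None:
        llm = lambda p: "STUB: " + p.splitlines()[-1]
    return llm(prompt)

def rag_answer(query, docs=DOCS, llm=None):
    """Run the steps in the order they actually execute: R, then A, then G."""
    return generate(augment(query, retrieve(query, docs)), llm)
```

Note that the steps run in the order R, A, G: the content is retrieved first, then added to the prompt, and only then does the model generate a response. To try it with a real model, pass any function that takes a prompt string and returns a string as `llm`.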



My diagrams (first draft)

The normal method.

image


With RAG, which adds local context to the response.

image

image

* "scrape" = use AI tools to crawl over info sources (planning docs, .docx files, lab notes) to collect info.

About using AI: If you automate too much of dev and writing with AI, then you might end up like someone with an AI-driven (battery-powered) car:

  • You have no control over what's under the hood.
  • Only a real expert can fix anything.
  • The AI may fail (and leave you stranded or worse).

I want to use only as much AI as is currently reasonable (with an eye on the future).

image




#299 Demo test

Doc #299_(OK)_rag_ai_pydantic_COLE_.docx describes how I did the RAG (retrieval-augmented generation) demo from Cole Medin's video "The Future of RAG is Agentic - Learn this Strategy NOW" @ https://www.youtube.com/watch?v=_R-ff4ZMLC8.

With RAG (localhost)

image

image

Without RAG

image

image




1 QUERY #317

1.1 local DB

image

1.2 sentence transformer

image

1.3 generated embeddings

image

image
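
A rough sketch of what steps 1.2 and 1.3 amount to: each DB record gets a vector stored next to its text. The record layout (`text`, `embedding`), the function names, and the model name `all-MiniLM-L6-v2` are my assumptions for illustration, not necessarily what #317 used. `toy_embed` is a stand-in with no real semantics so the sketch runs without extra packages; `real_embed` shows the equivalent call with the sentence-transformers package.

```python
import hashlib
import math

def toy_embed(text, dim=16):
    """Deterministic stand-in embedder: hashes each word into one of `dim` buckets.
    It only illustrates the shape of the data; it carries no real meaning."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]  # unit length, like many real embedders

def real_embed(text, model_name="all-MiniLM-L6-v2"):
    """Same job with sentence-transformers (must be installed separately).
    Assumption: the model name is only an example. This loads the model on
    every call; in real use, load it once and reuse it."""
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(model_name).encode(text).tolist()

def add_embeddings(records, embed=toy_embed, field="embedding"):
    """Return copies of the records with an embedding of their "text" attached."""
    return [{**r, field: embed(r["text"])} for r in records]
```

For example, `add_embeddings([{"_id": 1, "text": "drone battery notes"}])` returns the same record with an extra `embedding` list. Swapping in `embed=real_embed` would produce real sentence vectors instead of the toy ones.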

1.4 mongo vector search index

image
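
For reference, a sketch of what a vector search index definition and the matching query can look like, assuming this is MongoDB Atlas Vector Search. The index name `vector_index`, the field names `embedding` and `text`, and the 384 dimensions are my assumptions (384 matches small sentence-transformer models such as `all-MiniLM-L6-v2`); the real values in #317 may differ. The functions only build plain dicts, so nothing here connects to a database.

```python
def vector_index_definition(path="embedding", dims=384, similarity="cosine"):
    """Definition for an Atlas Vector Search index on one vector field.
    Assumption: field name and dimension count are illustrative only."""
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": dims,
                "similarity": similarity,
            }
        ]
    }

def vector_search_pipeline(query_vector, index="vector_index", path="embedding",
                           limit=5, num_candidates=100):
    """Aggregation pipeline that runs a $vectorSearch and returns text plus score."""
    return [
        {
            "$vectorSearch": {
                "index": index,                 # hypothetical index name
                "path": path,                   # field that holds the stored vectors
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            "$project": {
                "_id": 0,
                "text": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

With pymongo, the pipeline would be passed to `collection.aggregate(...)`. The query vector must come from the same embedding model, and have the same number of dimensions, as the stored vectors.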

1.5 generate embedding for query and return results

Note: the results are limited because I only generated vectors for a limited number of DB entries (to avoid costs), but it works.

image
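
To make the query step less of a black box, here is a small in-memory sketch of roughly what a similarity search computes: embed the query, compare it against each stored vector, return the best matches. It uses plain cosine similarity and is not the Mongo implementation (whose scores may be scaled differently). The record layout and function names are my assumptions. Records without a vector are skipped, which mirrors why the results above are limited.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query_vec, records, k=3, field="embedding"):
    """Return the k (text, score) pairs most similar to the query vector.
    Records that have no stored vector cannot match and are skipped."""
    candidates = [r for r in records if field in r]
    scored = [(r["text"], cosine(query_vec, r[field])) for r in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Tiny hand-made example: two records have vectors, one does not.
records = [
    {"text": "battery notes", "embedding": [1.0, 0.0]},
    {"text": "motor notes", "embedding": [0.0, 1.0]},
    {"text": "no vector yet"},
]
results = search([1.0, 0.1], records, k=2)
```

Here `results` lists "battery notes" first, because its vector points in nearly the same direction as the query vector, and "no vector yet" never appears.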
