Course :: Develop RAG application (Retrieval‐Augmented Generation) in practice - up1/training-courses GitHub Wiki

Develop a RAG application (Retrieval-Augmented Generation) in practice

  • 3 days

Outline

Day 1 : Foundations of RAG & Setup

  • Introduction to Retrieval-Augmented Generation (RAG)
    • What is RAG?
    • Why is it important?
    • Use cases and real-world applications
    • Comparison of RAG vs. Traditional LLM responses (Prompt engineering)
  • Understanding the Components of a RAG System
    • Document Retrieval (Vector Databases, Keyword Search)
    • Embeddings & Similarity Search
    • Large Language Models (LLMs) and their role in RAG
  • Building a Simple RAG Pipeline
    • Data ingestion & preprocessing
      • Structured data
        • Database
      • Unstructured data
        • PDF files
    • Generating embeddings with an LLM provider
      • Embeddings
      • Vector database
    • Implementing a basic retrieval system
    • Workshop
      • Creating a basic RAG system with a local document store
      • Use cases
        • Chatbot
        • Question answering over a knowledge base (database and PDF files)
        • Log analysis
        • Data analysis
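
The Day 1 pipeline (ingest, embed, retrieve, then prompt the LLM) can be sketched in a few lines. This is a minimal illustration, not the course's workshop code: the `embed` function is a toy bag-of-words stand-in for a real embedding model from an LLM provider, the in-memory `documents` list stands in for a vector database, and all function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding (assumption): term counts instead of a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, embed(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the augmented prompt that would be sent to an LLM."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\nQuestion: {query}"
    )


# Hypothetical local document store (sample texts are made up for illustration).
documents = [
    "The server returned error 500 at midnight.",
    "Invoices are stored in the billing database.",
    "RAG combines retrieval with generation.",
]

question = "Where are invoices stored?"
prompt = build_prompt(question, retrieve(question, documents, k=1))
print(prompt)
```

In a real pipeline the toy `embed` would be replaced by calls to an embedding model, and the sorted scan by a query against a vector database; the final prompt would then be passed to an LLM.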

Day 2 : Enhancing RAG with Advanced Techniques

  • Deep Dive into Embeddings and Vector Search
    • Embedding models and providers (OpenAI, Hugging Face, and Amazon Bedrock)
    • Understanding vector search algorithms and the tools that implement them (FAISS, Pinecone, ChromaDB)
    • Workshop
      • Implementing vector search using ChromaDB
      • Use cases
        • Chatbot
        • Question answering over a knowledge base (database and PDF files)
        • Log analysis
        • Data analysis
  • Optimizing Retrieval in RAG
    • What is chunking in RAG?
      • Improved Accuracy
      • Enhanced Efficiency
      • Preserved Context
      • Information Access
    • Chunking strategies for better retrieval (Chunking Considerations)
      • Fixed Size Chunking
      • Recursive Chunking
      • Document Based Chunking
      • Semantic Chunking
      • Agentic Chunking
    • Hybrid search
      • Combining keyword-based and vector search
    • Ranking and filtering techniques for retrieved documents
    • Workshop
      • Implementing a hybrid search
      • Use cases
        • Chatbot
        • Question answering over a knowledge base (database and PDF files)
        • Log analysis
        • Data analysis
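
Two of the Day 2 topics, fixed-size chunking and hybrid search, can be illustrated with a short sketch. The chunker splits text into character windows with an optional overlap. The fusion step uses reciprocal rank fusion (RRF), which is one common way to combine a keyword ranking with a vector ranking; the outline does not say which method the workshop uses, so treat it as an assumption. The function names are hypothetical, and `k=60` is the conventional RRF constant rather than a tuned value.

```python
def chunk_fixed(text: str, size: int, overlap: int = 0) -> list[str]:
    """Split text into character windows of `size`, each sharing `overlap` characters with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the current window already reaches the end of the text
        start += step
    return chunks


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in;
    documents ranked well by several retrievers rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results from a keyword search and a vector search.
keyword_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_c", "doc_a"]

print(chunk_fixed("abcdefghij", size=4, overlap=2))
print(reciprocal_rank_fusion([keyword_ranking, vector_ranking]))
```

RRF only needs the rank positions from each retriever, so it avoids having to normalize keyword scores and vector distances onto a common scale.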

Day 3 : Deploying & Scaling RAG Applications

  • Integrating RAG with APIs and Web Applications
    • Exposing RAG as a REST API using FastAPI
    • Frontend integration with web apps (Streamlit)
    • Workshop
      • Deploying a simple RAG-based application
        • Chatbot
        • Question answering over a knowledge base (database and PDF files)
        • Log analysis
        • Data analysis
  • Scaling and Performance Optimization
    • Caching responses for faster results
    • Distributed search and multi-vector index strategies
    • Handling large-scale document ingestion
    • Workshop
      • Optimizing a RAG pipeline for high performance
      • Use cases
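
As an illustration of the "caching responses" topic, the sketch below wraps a RAG pipeline in a small in-memory cache with least-recently-used eviction. It is a sketch under stated assumptions: `ResponseCache` and `slow_pipeline` are hypothetical names, the cache key is an exact match on the whitespace- and case-normalized question, and a production deployment would more likely use a shared store and possibly semantic (embedding-based) matching.

```python
import hashlib
from collections import OrderedDict
from typing import Callable


class ResponseCache:
    """In-memory LRU cache in front of an answer function (hypothetical helper)."""

    def __init__(self, answer_fn: Callable[[str], str], max_entries: int = 1024):
        self._answer_fn = answer_fn
        self._max_entries = max_entries
        self._store = OrderedDict()  # cache key -> cached answer, oldest first
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(query: str) -> str:
        # Normalize case and whitespace so trivially different queries share a key.
        normalized = " ".join(query.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def ask(self, query: str) -> str:
        key = self._key(query)
        if key in self._store:
            self.hits += 1
            self._store.move_to_end(key)  # mark as most recently used
            return self._store[key]
        self.misses += 1
        answer = self._answer_fn(query)
        self._store[key] = answer
        if len(self._store) > self._max_entries:
            self._store.popitem(last=False)  # evict the least recently used entry
        return answer


def slow_pipeline(query: str) -> str:
    """Stand-in (assumption) for the full retrieve-then-generate pipeline."""
    return f"answer to: {query}"


cache = ResponseCache(slow_pipeline, max_entries=100)
cache.ask("What is RAG?")
cache.ask("  what is   RAG? ")  # served from the cache after normalization
print(cache.hits, cache.misses)
```

Cached answers can go stale when the underlying documents change, so a real deployment would also need an invalidation or expiry policy, which this sketch leaves out.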