AI Agent - MacKittipat/note-developer GitHub Wiki

AI Agent

Understanding LLMs (Level 1):

  • LLMs (like ChatGPT, Google Gemini, Claude) are the foundation of many AI applications.
  • They are excellent at generating and editing text based on their training data.
  • Key Traits of LLMs:
    • Limited Knowledge of Proprietary Information. They do not have access to personal or internal company data (e.g., a personal calendar).
    • Passive. They require a human prompt to produce an output.

Understanding AI Workflows (Level 2):

  • AI workflows build on LLMs by allowing them to interact with external tools and data.
  • Humans define the path, or "control logic", that the LLM follows.
  • The LLM executes steps in this predefined path, which can involve retrieving information or using other models (e.g., accessing a calendar, using a weather API, employing a text-to-audio model).
  • Key Trait of AI Workflows:
    • They can only follow the predefined paths set by humans. If a follow-up question requires information or an action outside of that defined path, the workflow will fail.
  • Retrieval-Augmented Generation (RAG): A "fancy term" for a process that lets AI models look things up before answering (e.g., accessing a calendar or weather service). RAG is a type of AI workflow.
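
The fixed path described above can be sketched roughly as follows. This is a minimal illustration, not a real integration: `fake_llm`, `fetch_calendar`, `CALENDAR`, and `calendar_workflow` are all hypothetical names, and the "LLM" is a stub that only echoes its prompt. The retrieval step stands in for the "look things up" part of RAG.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a real LLM call; it only echoes the prompt it receives.
def fake_llm(prompt: str) -> str:
    return f"[LLM answer based on] {prompt}"

# Hypothetical private data that an LLM would not know from its training data.
CALENDAR: Dict[str, str] = {"2025-01-06": "Team sync at 10:00"}

def fetch_calendar(date: str) -> str:
    """Retrieval step: look up events for a date in the private calendar."""
    return CALENDAR.get(date, "No events")

def calendar_workflow(question: str, date: str,
                      llm: Callable[[str], str] = fake_llm) -> str:
    """Human-defined path: 1) retrieve calendar events, 2) ask the LLM.

    The path never changes, whatever the question is.
    """
    events = fetch_calendar(date)                      # step 1: retrieve
    prompt = f"Context: {events}\nQuestion: {question}"
    return llm(prompt)                                 # step 2: generate

on_path = calendar_workflow("What is on my calendar?", "2025-01-06")
off_path = calendar_workflow("What will the weather be?", "2025-01-06")
```

In this sketch the weather question still receives only calendar context, because no human added a weather step to the path. That is the limitation noted under Key Trait of AI Workflows.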

Understanding AI Agents (Level 3):

  • AI agents represent a significant advancement over workflows because the LLM becomes the decision-maker. Instead of following a predefined path, the AI agent is given a goal.
  • Key Traits of AI Agents:
    • Reasoning: The AI agent autonomously determines the best approach to achieve the goal, thinking about the necessary steps and tools.
    • Acting: The AI agent takes action using tools to execute its plan.
    • Iterating: AI agents can observe their interim results and autonomously decide if adjustments or repetitions of steps are needed to improve the output and meet the goal. This replaces the human trial-and-error process seen in workflows.
  • The Defining Change: The human decision-maker in a workflow is replaced by an LLM.
  • ReAct Framework: Described as the "most common configuration" for AI agents. The name combines Reasoning and Acting, the two core functions listed above.
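
The reason, act, and iterate cycle above can be sketched as a small loop. This is only an illustration under stated assumptions: `scripted_policy` is a hypothetical, hard-coded stand-in for the LLM's reasoning step, and `search_calendar`, `get_weather`, `TOOLS`, and `run_agent` are invented for the example. A real agent would ask an LLM to choose the next action.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tools; a real agent would wrap APIs or other models here.
def search_calendar(query: str) -> str:
    return "Team sync at 10:00"

def get_weather(query: str) -> str:
    return "Sunny, 24 C"

TOOLS: Dict[str, Callable[[str], str]] = {
    "calendar": search_calendar,
    "weather": get_weather,
}

def scripted_policy(goal: str, observations: List[str]) -> Tuple[str, str]:
    """Stand-in for the LLM's reasoning step (hard-coded, not a real model).

    Returns (action, argument), where action is a tool name or "finish".
    """
    if len(observations) == 0:
        return ("calendar", goal)
    if len(observations) == 1:
        return ("weather", goal)
    return ("finish", " | ".join(observations))

def run_agent(goal: str,
              policy: Callable[[str, List[str]], Tuple[str, str]] = scripted_policy,
              max_steps: int = 5) -> str:
    """Loop: reason about the next step, act with a tool, observe, repeat."""
    observations: List[str] = []
    for _ in range(max_steps):
        action, arg = policy(goal, observations)   # Reasoning
        if action == "finish":
            return arg
        result = TOOLS[action](arg)                # Acting
        observations.append(result)                # Observing, then iterating
    return "Stopped: step limit reached"

answer = run_agent("Plan my morning")
```

The difference from a workflow is that the order of tool calls comes from the policy at run time and is not fixed in the control logic. The `max_steps` cap is a design choice added here so the loop always terminates.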