Agent-based systems

Lecture 6.

Basics

Agent

  • An “intelligent” system that interacts with some “environment”
    • Physical environments: robot, autonomous car, …
    • Digital environments: DQN for Atari, Siri, AlphaGo, …
    • Humans as environments: chatbot
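
The definition above (a system that repeatedly observes and acts on an environment) can be sketched as a small loop. This is only an illustration: `CounterEnv`, `policy`, and `run_episode` are hypothetical names invented for this note, and the hand-written rule in `policy` stands in for whatever "intelligent" component (an LLM, a learned controller) a real agent would use.

```python
from dataclasses import dataclass


@dataclass
class CounterEnv:
    """Hypothetical toy environment: reach a target integer with +1/-1 moves."""
    target: int
    state: int = 0

    def observe(self) -> int:
        # The observation is the signed distance from the state to the target.
        return self.target - self.state

    def step(self, action: int) -> None:
        # Applying an action changes the environment's state.
        self.state += action


def policy(observation: int) -> int:
    """Stand-in for the agent's decision making: move toward the target."""
    if observation > 0:
        return 1
    if observation < 0:
        return -1
    return 0


def run_episode(env: CounterEnv, max_steps: int = 20) -> int:
    """Observe, act, repeat until the goal is reached; return steps taken."""
    steps = 0
    while env.observe() != 0 and steps < max_steps:
        env.step(policy(env.observe()))
        steps += 1
    return steps


env = CounterEnv(target=3)
print(run_episode(env), env.state)  # prints "3 3"
```

Swapping `CounterEnv` for a robot, a game, or a human conversation partner changes the environment but not the shape of the loop.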

Why LLM Agents?

  • Solving real-world tasks typically involves a trial-and-error process
  • Using external tools and retrieving external knowledge expands an LLM’s capabilities
  • Task decomposition: subtasks can be allocated to specialized modules

Challenges

List

  • Reasoning and planning: LLM agents tend to make mistakes when performing complex tasks end-to-end
  • Embodiment and learning from environment feedback
    • LLM agents are not yet efficient at recovering from mistakes for long-horizon tasks
    • Continuous learning, self-improvement
    • Multimodal understanding, grounding and world models
  • Safety and privacy: LLMs are susceptible to adversarial attacks, can emit harmful messages, and can leak private data
  • Human-agent interaction and ethics: how to control LLM agent behavior effectively, and how to design the interaction mode between humans and LLM agents

Domains

  • Code generation: Cursor, GitHub Copilot, Devin, Replit...
  • Workflow automation: Microsoft Copilot, MultiOn
  • Personal assistant: Google Astra, OpenAI GPT-4o
  • Robotics: Figure AI, Tesla Optimus
  • Education
  • Law
  • Finance
  • Healthcare
  • Cybersecurity

https://rdi.berkeley.edu/

Tool Use

Tools could be

  • search engines
  • calculators
  • task-specific models
  • APIs

Tool calls are written in an unnatural text format, so the model requires task- or tool-specific fine-tuning to produce them
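
To make the idea of an "unnatural format" concrete, here is a minimal sketch of how inline tool calls in generated text might be detected and executed. The bracket syntax `[Tool(argument)]` is only loosely modeled on the style used in Toolformer and is an assumption of this sketch, as are the `TOOLS` registry and the function names; real systems define their own formats.

```python
import ast
import operator
import re

# Arithmetic operators the toy calculator accepts.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expr: str) -> str:
    """Evaluate +, -, *, / over numeric literals without calling eval()."""
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError(f"unsupported expression: {expr!r}")
    return str(ev(ast.parse(expr, mode="eval").body))


# Hypothetical tool registry: tool name -> function from string to string.
TOOLS = {"Calculator": calculator}

# Assumed call format: [ToolName(argument)]
CALL = re.compile(r"\[(\w+)\((.*?)\)\]")


def execute_tool_calls(text: str) -> str:
    """Replace each recognized tool call in the text with the tool's result."""
    def run(match: re.Match) -> str:
        tool = TOOLS.get(match.group(1))
        # Calls to unknown tools are left in the text unchanged.
        return tool(match.group(2)) if tool else match.group(0)
    return CALL.sub(run, text)


print(execute_tool_calls("The total is [Calculator(12*7)]."))
# prints "The total is 84."
```

A base model does not emit markers like `[Calculator(12*7)]` on its own, which is why approaches in this family rely on fine-tuning for each tool or task.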

Papers

  • TALM: Tool Augmented Language Models
  • Toolformer: Language Models Can Teach Themselves to Use Tools

Frameworks:

Prompting techniques

Reflexion: Language Agents with Verbal Reinforcement Learning, 2023

FireAct: Toward Language Agent Fine-tuning, 2023

AutoGen: multi-agent conversation programming // seems popular

LangGraph: graph-based control flow

CrewAI: high-level static agent-task workflow

More multi-agent frameworks
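
As a rough illustration of what "graph-based control flow" means, the sketch below runs a tiny agent workflow as a graph in plain Python. It deliberately does not use the LangGraph API; `NODES`, `EDGES`, `END`, and `run_graph` are hypothetical names chosen for this note, and the `draft` and `review` steps are stubs standing in for LLM calls.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]
END = "END"  # hypothetical sentinel marking the end of the workflow


def draft(state: State) -> State:
    """Stub for a generation step: count the attempt and record an answer."""
    state["attempts"] = state.get("attempts", 0) + 1
    state["answer"] = f"draft {state['attempts']}"
    return state


def review(state: State) -> State:
    """Stub for a critique step: approve from the second attempt onward."""
    state["approved"] = state["attempts"] >= 2
    return state


# Nodes transform the shared state; edges choose the next node from the state.
NODES: Dict[str, Callable[[State], State]] = {"draft": draft, "review": review}
EDGES: Dict[str, Callable[[State], str]] = {
    "draft": lambda s: "review",
    "review": lambda s: END if s["approved"] else "draft",  # conditional edge
}


def run_graph(start: str, state: State, max_steps: int = 10) -> State:
    """Walk the graph from `start` until END or the step budget is exhausted."""
    node = start
    for _ in range(max_steps):
        if node == END:
            break
        state = NODES[node](state)
        node = EDGES[node](state)
    return state


print(run_graph("draft", {}))
# prints "{'attempts': 2, 'answer': 'draft 2', 'approved': True}"
```

The conditional edge out of `review` is what lets the workflow loop back, in contrast to a fixed, static sequence of tasks.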

Courses

Large Language Model Agents

https://llmagents-learning.org/f24

Benchmarks

benchmark

Hackathons

https://rdi.berkeley.edu/llm-agents-hackathon/

Organizations

https://rdi.berkeley.edu/