ForagerRL - gama-platform/gama GitHub Wiki

The Smart Forager — Reinforcement Learning in GAMA

By Killian Trouillet

Welcome to the comprehensive tutorial on Reinforcement Learning with the GAMA platform. You will build a forager agent that learns to navigate toward food while avoiding obstacles, moving from a simple grid world to a continuous environment trained with Deep RL.


Part 1: Internal RL (GAML only)

Build a tabular Q-Learning agent entirely in GAML, step by step:

  1. Step 1: The Grid World — Create the 10×10 environment with food and obstacles.
  2. Step 2: The Forager Agent — Define a simple agent that moves randomly.
  3. Step 3: Rewards and Episodes — Implement the reward system and simulation resets.
  4. Step 4: The Q-Table — Set up the agent's memory using map<string, float>.
  5. Step 5: Q-Learning Algorithm — Implement the Bellman equation and ε-greedy policy.
  6. Step 6: Visualization & Automatic Test — Add charts, heatmaps, and evaluate the learned policy.

Expected console output after training and testing
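As a preview of the mechanics Steps 4 and 5 implement in GAML, the same ideas (a Q-table keyed by state-action strings, mirroring GAML's map<string, float>, plus ε-greedy action choice and the Bellman update) can be sketched in plain Python. The grid size, reward values, and hyperparameters below are illustrative placeholders, not the tutorial's actual settings:

```python
import random

# Hypothetical 5x5 grid: agent starts at (0, 0), food at (4, 4).
ACTIONS = ["north", "south", "east", "west"]
GRID = 5
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2   # learning rate, discount, exploration

q_table = {}  # key "x,y:action" -> float, mirroring GAML's map<string, float>

def q(state, action):
    return q_table.get(f"{state[0]},{state[1]}:{action}", 0.0)

def step(state, action):
    dx, dy = {"north": (0, -1), "south": (0, 1),
              "east": (1, 0), "west": (-1, 0)}[action]
    nx = min(max(state[0] + dx, 0), GRID - 1)
    ny = min(max(state[1] + dy, 0), GRID - 1)
    done = (nx, ny) == (GRID - 1, GRID - 1)
    reward = 10.0 if done else -0.1   # small step penalty, big food reward
    return (nx, ny), reward, done

def choose(state):
    # epsilon-greedy: explore with probability EPSILON, otherwise exploit
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q(state, a))

for episode in range(500):
    state, done = (0, 0), False
    while not done:
        action = choose(state)
        nxt, reward, done = step(state, action)
        best_next = max(q(nxt, a) for a in ACTIONS)
        # Bellman update: Q(s,a) += alpha * (r + gamma * max Q(s',a') - Q(s,a))
        key = f"{state[0]},{state[1]}:{action}"
        q_table[key] = q(state, action) + ALPHA * (reward + GAMMA * best_next - q(state, action))
        state = nxt
```

After training, following the greedy policy (always picking the action with the highest Q-value) walks the agent straight to the food, which is exactly what Step 6's automatic test checks in GAMA.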


Part 2: Deep RL with Gymnasium — Continuous Forager

In this part, we move from the grid world to a continuous environment and train a neural network using PPO via the gama-gymnasium Python bridge.

  1. Step 7: Introduction & The Continuous World — Why Deep RL? Architecture overview. Continuous world setup.
  2. Step 8: The GymAgent Bridge — The bridge species, spaces, and GAMA↔Python communication.
  3. Step 9: Sensors, Movement & Rewards — Ray-cast sensors, velocity actions, reward shaping. Complete GAML model.
  4. Step 10: Headless Training with PPO — Python script, PPO explained, training process.
  5. Step 11: Testing in GAMA GUI — Load and visualize the trained policy. Summary.
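The bridge built in Step 8 exposes the GAMA simulation through Gymnasium's standard reset/step interface, which is what lets PPO train against it in Step 10. The sketch below mimics that API shape in plain Python, without the gymnasium library or GAMA itself; the world size, speed, observation encoding, and reward shaping are placeholder assumptions, not the tutorial's actual model:

```python
import math
import random

class ContinuousForagerEnv:
    """Minimal sketch following the Gymnasium API shape (reset/step).
    The real gama-gymnasium bridge forwards these calls to a running
    GAMA simulation; everything here is illustrative."""

    WORLD = 100.0   # assumed world size
    SPEED = 2.0     # assumed max displacement per step

    def reset(self, seed=None):
        rng = random.Random(seed)
        self.agent = [rng.uniform(0, self.WORLD), rng.uniform(0, self.WORLD)]
        self.food = [rng.uniform(0, self.WORLD), rng.uniform(0, self.WORLD)]
        self.steps = 0
        return self._obs(), {}

    def _dist(self):
        return math.hypot(self.food[0] - self.agent[0], self.food[1] - self.agent[1])

    def _obs(self):
        # unit vector toward the food plus normalized distance
        d = self._dist() or 1.0
        return [(self.food[0] - self.agent[0]) / d,
                (self.food[1] - self.agent[1]) / d,
                self._dist() / self.WORLD]

    def step(self, action):
        # action: (vx, vy) in [-1, 1], scaled to SPEED and clipped to the world
        before = self._dist()
        self.agent[0] = min(max(self.agent[0] + action[0] * self.SPEED, 0), self.WORLD)
        self.agent[1] = min(max(self.agent[1] + action[1] * self.SPEED, 0), self.WORLD)
        self.steps += 1
        after = self._dist()
        terminated = after < 2.0       # reached the food
        truncated = self.steps >= 500  # episode time limit
        # shaped reward: progress toward the food each step, bonus on success
        reward = 10.0 if terminated else (before - after) * 0.1
        return self._obs(), reward, terminated, truncated, {}
```

Because the environment follows the standard five-tuple step contract, any Gymnasium-compatible trainer (such as a PPO implementation) can drive it without knowing that a simulation sits behind it.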

Part 3: Multi-Agent Deep RL with PettingZoo — Cooperative Foragers

In this part, we extend the continuous world to multiple foragers that must cooperate: using gama-pettingzoo and independent PPO models, both agents must learn to reach the food together.

  1. Step 12: From Single Agent to Multi-Agent — PettingZoo Parallel API, PetzAgent bridge, cooperative reward design.
  2. Step 13: The Multi-Agent GAML Model — Multi-forager species, observation sharing, reward logic.
  3. Step 14: Training Multiple Agents — Independent PPO, PetzSingleAgentEnv Gymnasium wrapper, alternating training rounds.
  4. Step 15: Testing & Tutorial Summary — GUI testing, success criteria, recap of all 3 parts.
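The key shift in Step 12 is from Gymnasium's single observation/action to PettingZoo's Parallel API, where every call takes and returns per-agent dictionaries. The plain-Python sketch below shows that dictionary shape and a cooperative reward (success only when both foragers are near the food); the agent names, distances, and reward values are illustrative, not the tutorial's actual design, and the real gama-pettingzoo bridge runs these steps inside GAMA:

```python
import math
import random

class CoopForagerParallelEnv:
    """Sketch of a PettingZoo Parallel-API-style environment in plain Python.
    Observations, rewards, terminations, and truncations are all dicts
    keyed by agent name, as in the Parallel API."""

    WORLD, SPEED = 100.0, 2.0
    agents = ["forager_0", "forager_1"]   # illustrative agent names

    def reset(self, seed=None):
        rng = random.Random(seed)
        self.pos = {a: [rng.uniform(0, self.WORLD), rng.uniform(0, self.WORLD)]
                    for a in self.agents}
        self.food = [rng.uniform(0, self.WORLD), rng.uniform(0, self.WORLD)]
        self.steps = 0
        return {a: self._obs(a) for a in self.agents}, {a: {} for a in self.agents}

    def _dist(self, a):
        p = self.pos[a]
        return math.hypot(self.food[0] - p[0], self.food[1] - p[1])

    def _obs(self, a):
        d = self._dist(a) or 1.0
        p = self.pos[a]
        return [(self.food[0] - p[0]) / d, (self.food[1] - p[1]) / d,
                self._dist(a) / self.WORLD]

    def step(self, actions):
        # actions: dict agent -> (vx, vy) in [-1, 1]
        for a, (vx, vy) in actions.items():
            p = self.pos[a]
            p[0] = min(max(p[0] + vx * self.SPEED, 0), self.WORLD)
            p[1] = min(max(p[1] + vy * self.SPEED, 0), self.WORLD)
        self.steps += 1
        # cooperative success: BOTH foragers must be near the food at once
        success = all(self._dist(a) < 3.0 for a in self.agents)
        truncated = self.steps >= 1000
        rewards = {a: (10.0 if success else -0.01) for a in self.agents}  # shared reward
        return ({a: self._obs(a) for a in self.agents},
                rewards,
                {a: success for a in self.agents},
                {a: truncated for a in self.agents},
                {a: {} for a in self.agents})
```

Because each agent sees its own observation dict entry, Step 14's PetzSingleAgentEnv-style wrapper can present one agent's slice of this environment as an ordinary Gymnasium environment and train an independent PPO model per forager.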