ForagerRL_step16 - gama-platform/gama GitHub Wiki

16. Stress Test — 10 Random Agents

By Killian Trouillet


Step 16: Stress Test with 10 Agents

Content

In Steps 12–15 we trained and tested a cooperative policy with 2 foragers. Because we used parameter sharing, the single trained network can be applied to any number of agents: each one independently runs its own forward pass through the same weights.

In this last step, we check how well that holds without any re-training: we create a new GAML model with 10 foragers placed across the map and run the trained model by flipping one flag in the test script.
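To make the parameter-sharing idea concrete, here is a minimal sketch in plain Python. It is not the tutorial's PPO network: `shared_weights` is a random stand-in for the trained weights, and `N_ACTIONS`, `act` and `act_all` are hypothetical names chosen for illustration. Only the observation size of 15 comes from this tutorial.

```python
import random

OBS_DIM = 15     # observation size used in this tutorial
N_ACTIONS = 5    # hypothetical action count, for illustration only

random.seed(0)
# Stand-in for the trained network: one weight matrix shared by every agent.
shared_weights = [
    [random.gauss(0.0, 1.0) for _ in range(N_ACTIONS)] for _ in range(OBS_DIM)
]

def act(observation):
    """Greedy action for one agent, computed from the shared weights."""
    logits = [
        sum(observation[i] * shared_weights[i][a] for i in range(OBS_DIM))
        for a in range(N_ACTIONS)
    ]
    return max(range(N_ACTIONS), key=lambda a: logits[a])

def act_all(observations):
    """One independent forward pass per agent, always through the same weights."""
    return {agent_id: act(obs) for agent_id, obs in observations.items()}

def random_observations(n_agents):
    """Fake observations, only used to exercise the sketch."""
    return {
        f"forager_{i}": [random.random() for _ in range(OBS_DIM)]
        for i in range(n_agents)
    }

actions_2 = act_all(random_observations(2))    # training-time setup
actions_10 = act_all(random_observations(10))  # stress-test setup, same weights
```

Nothing in `act` depends on how many agents exist, which is why the agent count can change at test time. Whether the resulting behaviour is still good with 10 agents is what this step measures.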


What Changes

| Aspect | Step 15 (Normal Test) | Step 16 (Stress Test) |
|---|---|---|
| Agents | 2 (fixed positions) | 10 (spread across the map) |
| GAML model | forager_petz.gaml | forager_petz_stress.gaml |
| Experiment | petz_env | petz_stress_env |
| Python script | test_forager_petz.py | same script, with STRESS_TEST = True |
| Model | Shared PPO | Same shared PPO |
| max_steps | 300 | 500 (extra time for 10 agents) |
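As an illustration of how a single flag can switch between the two setups, here is a hedged sketch. The file names, experiment names, agent counts and step limits are taken from the table above; `RunConfig` and `select_config` are hypothetical helpers, and the real test_forager_petz.py may organise this differently.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for the settings that differ between tests."""
    gaml_file: str
    experiment: str
    n_agents: int
    max_steps: int

def select_config(stress_test: bool) -> RunConfig:
    """Pick the normal or the stress-test settings from one boolean flag."""
    if stress_test:
        return RunConfig("forager_petz_stress.gaml", "petz_stress_env", 10, 500)
    return RunConfig("forager_petz.gaml", "petz_env", 2, 300)

STRESS_TEST = True
config = select_config(STRESS_TEST)
print(config.gaml_file, config.experiment, config.n_agents, config.max_steps)
```

Keeping every difference behind one flag means the trained model file and the rest of the test loop stay identical in both runs.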

The New GAML Model

forager_petz_stress.gaml is a self-contained model derived from forager_petz.gaml with five differences.

10 Agent IDs

list<string> agent_ids <- [
    "forager_0", "forager_1", "forager_2", "forager_3", "forager_4",
    "forager_5", "forager_6", "forager_7", "forager_8", "forager_9"
];
list<bool> agents_at_food <- [
    false, false, false, false, false,
    false, false, false, false, false
];

10 Varied Starting Positions

list<point> start_positions <- [
    {5.0,  5.0},   // top-left corner
    {15.0, 5.0},
    {5.0, 15.0},
    {50.0, 5.0},   // top-center
    {5.0, 50.0},   // left-center
    {50.0, 50.0},  // center
    {5.0, 85.0},   // bottom-left
    {50.0, 85.0},  // bottom-center
    {85.0, 5.0},   // top-right
    {85.0, 50.0}   // right-center
];

Positions are chosen to be free of obstacles and spread across the map.
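A quick sanity check on the start positions can be sketched in Python. The coordinates are copied from the GAML list above; `WORLD_SIZE = 100.0` is an assumption (the model's actual `world_size` is not shown in this step), and the helper names are hypothetical. Obstacle-freeness cannot be checked here because the obstacle layout lives in the GAML model.

```python
import math
from itertools import combinations

# Coordinates copied from start_positions in forager_petz_stress.gaml.
start_positions = [
    (5.0, 5.0), (15.0, 5.0), (5.0, 15.0), (50.0, 5.0), (5.0, 50.0),
    (50.0, 50.0), (5.0, 85.0), (50.0, 85.0), (85.0, 5.0), (85.0, 50.0),
]
WORLD_SIZE = 100.0  # assumption: square world of side 100

def all_inside(points, world_size):
    """True if every point lies within the square world."""
    return all(0.0 <= x <= world_size and 0.0 <= y <= world_size for x, y in points)

def min_pairwise_distance(points):
    """Smallest Euclidean distance between any two start positions."""
    return min(math.dist(a, b) for a, b in combinations(points, 2))

distinct = len(set(start_positions)) == len(start_positions)
inside = all_inside(start_positions, WORLD_SIZE)
closest_pair = min_pairwise_distance(start_positions)
```

If you edit the list, a check like this catches duplicated or out-of-bounds positions before you launch GAMA.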

Longer Episode Limit

int max_steps <- 500;  // 300 in the 2-agent model

With 10 agents spread across the map, some start far from the food. 500 steps gives even distant agents a fair chance.

Observation: Nearest Teammate

The trained model still expects a 15-value observation, 2 of which encode a teammate's position. With 10 agents, compute_observation reports the nearest other agent instead of the first one:

forager nearest <- (forager where (each.agent_id != agent_id)) closest_to self;
if (nearest != nil) {
    other_x <- nearest.location.x / world_size;
    other_y <- nearest.location.y / world_size;
}

This is a zero-shot adaptation — the trained weights are unchanged.
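The same nearest-teammate logic can be sketched in Python to show what the two observation slots end up containing. This mirrors the GAML snippet above rather than replacing it; `nearest_teammate_features` is a hypothetical name and `WORLD_SIZE = 100.0` is an assumption.

```python
import math

WORLD_SIZE = 100.0  # assumption: the GAML model's world_size is not shown here

def nearest_teammate_features(agent_id, positions, world_size=WORLD_SIZE):
    """Normalised (x, y) of the closest other agent, or None if there is none."""
    me = positions[agent_id]
    others = [pos for other_id, pos in positions.items() if other_id != agent_id]
    if not others:
        return None  # the GAML code simply leaves other_x / other_y untouched
    nearest = min(others, key=lambda pos: math.dist(me, pos))
    return (nearest[0] / world_size, nearest[1] / world_size)

positions = {
    "forager_0": (5.0, 5.0),
    "forager_1": (15.0, 5.0),
    "forager_5": (50.0, 50.0),
}
features = nearest_teammate_features("forager_0", positions)
```

Because only one teammate is ever encoded, the observation keeps its original size no matter how many foragers are on the map.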

Aspect: 10 Distinct Colours

list<rgb> palette <- [
    #blue, rgb(0,180,180), rgb(0,180,0), rgb(180,0,180), rgb(180,100,0),
    rgb(0,100,200), rgb(200,0,100), rgb(100,200,0), rgb(80,80,200), rgb(200,80,80)
];
rgb agent_color <- palette[agent_index mod length(palette)];
draw circle(0.8) color: at_food ? #orange : agent_color;

Each forager gets a distinct colour. When it reaches the food it turns orange, just as in the 2-agent case.
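The colour selection is a plain modulo lookup, which can be sketched in Python as follows. Colours are kept as labels copied from the GAML palette rather than real RGB objects, and `display_color` is a hypothetical helper.

```python
# Labels copied from the GAML palette, in the same order.
PALETTE = [
    "#blue", "rgb(0,180,180)", "rgb(0,180,0)", "rgb(180,0,180)", "rgb(180,100,0)",
    "rgb(0,100,200)", "rgb(200,0,100)", "rgb(100,200,0)", "rgb(80,80,200)", "rgb(200,80,80)",
]

def display_color(agent_index, at_food):
    """Orange once the agent is at the food, otherwise its palette colour."""
    if at_food:
        return "#orange"
    return PALETTE[agent_index % len(PALETTE)]

colors = [display_color(i, at_food=False) for i in range(10)]
```

With exactly 10 palette entries, the 10 foragers all get different colours. The modulo only matters if you add more agents than colours, in which case colours repeat instead of raising an index error.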


Running the Stress Test

Step 1 – Open GAMA GUI

Open GAMA normally (port 1000).

Step 2 – Enable STRESS_TEST

Open test_forager_petz.py and change the one flag at the top:

STRESS_TEST = True   # ← was False

Step 3 – Run

cd models/petz
python test_forager_petz.py

Expected Console Output

=======================================================
  Smart Forager — MARL Test (gama-pettingzoo GUI)
=======================================================
Model loaded (shared by both foragers)
  Connecting to GAMA (attempt 1/8)...
  Connected!

Running 1 cooperative test episodes...

  Episode 1/1: ✓ COOPERATIVE SUCCESS! | Steps: 134
    forager_0: reward = 86.3
    forager_1: reward = 84.7
    forager_2: reward = 79.1
    forager_3: reward = 81.4
    forager_4: reward = 77.8
    forager_5: reward = 82.2
    forager_6: reward = 74.5
    forager_7: reward = 78.9
    forager_8: reward = 83.6
    forager_9: reward = 80.1

=======================================================
  Test Results Summary
=======================================================
  Episodes    : 1
  Success Rate: 100%
  Avg Steps   : 134
  forager_0 avg reward: 86.3
  ...
=======================================================

Note: with 10 agents all needing to reach the food, success requires the most distant agent to navigate the full map. Episodes that hit the 500-step timeout count as failures and lower the success rate; this is expected and does not mean the network failed to learn.
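To show how a timeout affects the summary, here is a small sketch of the success-rate and average-steps computation. It assumes that an episode counts as a cooperative success only when every agent is at the food; `summarise` is a hypothetical helper, and the second episode below is made-up illustrative data, not a measured result.

```python
def summarise(episodes):
    """Return (success rate in percent, average steps) over a list of episodes."""
    n = len(episodes)
    successes = sum(1 for ep in episodes if all(ep["agents_at_food"]))
    success_rate = 100.0 * successes / n
    avg_steps = sum(ep["steps"] for ep in episodes) / n
    return success_rate, avg_steps

episodes = [
    # All 10 agents reached the food in 134 steps (as in the sample output).
    {"steps": 134, "agents_at_food": [True] * 10},
    # Hypothetical timeout: one agent still away from the food at step 500.
    {"steps": 500, "agents_at_food": [True] * 9 + [False]},
]
success_rate, avg_steps = summarise(episodes)
print(f"Success Rate: {success_rate:.0f}%  Avg Steps: {avg_steps:.0f}")
```

A single straggler is enough to mark the whole episode as failed, so a lower success rate with 10 agents can coexist with nine agents behaving well.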


Key Files

| File | Description |
|---|---|
| models/petz/forager_petz_stress.gaml | 10-agent stress model |
| models/petz/test_forager_petz.py | Test script (set STRESS_TEST = True) |
| models/petz/train_forager_petz.py | Training script (unchanged, 2 agents) |