ForagerRL_step16 - gama-platform/gama GitHub Wiki
By Killian Trouillet
In Steps 12–15 we trained and tested a cooperative policy with 2 foragers. Because we used parameter sharing, the single trained network generalises to any number of agents — each one independently runs its own forward pass through the same weights.
In this last step, we evaluate that claim without any re-training: we create a new GAML model with 10 foragers placed across the map and run the trained model by flipping one flag in the test script.
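The scaling argument can be sketched in a few lines of plain Python. This is not the repository's code: the 4-action space and the random weights are assumptions for illustration; only the 15-value observation size comes from the tutorial. The point is that a single set of weights serves any number of agents:

```python
# Sketch (not the repository's code) of why parameter sharing scales
# from 2 to N agents with no re-training: one network, one set of
# weights, and each agent independently runs its own forward pass.
import random

random.seed(0)

OBS_SIZE, N_ACTIONS = 15, 4  # 15-value observation as in this tutorial; 4 actions assumed

# One shared weight matrix (random here, trained by PPO in the tutorial).
shared_weights = [[random.gauss(0, 1) for _ in range(N_ACTIONS)]
                  for _ in range(OBS_SIZE)]

def act(observation):
    """One greedy forward pass through the SAME shared weights."""
    logits = [sum(o * w for o, w in zip(observation, col))
              for col in zip(*shared_weights)]
    return logits.index(max(logits))

def random_obs():
    return [random.random() for _ in range(OBS_SIZE)]

# The same function serves 2 agents...
actions_2 = [act(random_obs()) for _ in range(2)]
# ...and, unchanged, 10 agents.
actions_10 = [act(random_obs()) for _ in range(10)]
print(len(actions_2), len(actions_10))  # 2 10
```

Nothing about `act` depends on how many agents exist, which is exactly why the stress test needs no re-training.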
| Aspect | Step 15 (Normal Test) | Step 16 (Stress Test) |
|---|---|---|
| Agents | 2 (fixed positions) | 10 (spread across the map) |
| GAML model | `forager_petz.gaml` | `forager_petz_stress.gaml` |
| Experiment | `petz_env` | `petz_stress_env` |
| Python script | `test_forager_petz.py` | same script, `STRESS_TEST = True` |
| Model | Shared PPO | Same shared PPO |
| `max_steps` | 300 | 500 (extra time for 10 agents) |
`forager_petz_stress.gaml` is a self-contained model derived from `forager_petz.gaml`, with three differences.
```
list<string> agent_ids <- [
    "forager_0", "forager_1", "forager_2", "forager_3", "forager_4",
    "forager_5", "forager_6", "forager_7", "forager_8", "forager_9"
];
list<bool> agents_at_food <- [
    false, false, false, false, false,
    false, false, false, false, false
];
list<point> start_positions <- [
    {5.0, 5.0},   // top-left corner
    {15.0, 5.0},
    {5.0, 15.0},
    {50.0, 5.0},  // top-center
    {5.0, 50.0},  // left-center
    {50.0, 50.0}, // center
    {5.0, 85.0},  // bottom-left
    {50.0, 85.0}, // bottom-center
    {85.0, 5.0},  // top-right
    {85.0, 50.0}  // right-center
];
```
Positions are chosen to be free of obstacles and spread across all four map quadrants.
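That spread claim is easy to verify by hand. The quick check below is illustrative only; it assumes the 100×100 world implied by the `{50.0, 50.0} // center` comment:

```python
# Illustrative sanity check on the start positions above, assuming a
# 100x100 world (implied by the "center" comment at {50.0, 50.0}).
start_positions = [
    (5.0, 5.0), (15.0, 5.0), (5.0, 15.0), (50.0, 5.0), (5.0, 50.0),
    (50.0, 50.0), (5.0, 85.0), (50.0, 85.0), (85.0, 5.0), (85.0, 50.0),
]

def quadrant(x, y, size=100.0):
    """Map a point to one of the four map quadrants."""
    return ("left" if x < size / 2 else "right",
            "top" if y < size / 2 else "bottom")

covered = {quadrant(x, y) for x, y in start_positions}
print(len(start_positions), len(covered))  # 10 4
```

All four quadrants contain at least one starting agent, so no region of the map is left untested.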
```
int max_steps <- 500; // 300 in the 2-agent model
```
With 10 agents spread across the map, some start far from the food. 500 steps gives even distant agents a fair chance.
The trained network still expects a 15-value observation, two of which encode a teammate position. With 10 agents, `compute_observation` reports the nearest other agent instead of always the first one:
```
forager nearest <- (forager where (each.agent_id != agent_id)) closest_to self;
if (nearest != nil) {
    other_x <- nearest.location.x / world_size;
    other_y <- nearest.location.y / world_size;
}
```
This is a zero-shot adaptation — the trained weights are unchanged.
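For readers more at home in Python than GAML, the `closest_to` lookup can be mirrored like this. It is an illustrative sketch: the dictionaries and the `WORLD_SIZE` value are assumptions, not the project's data structures.

```python
# Python mirror of the GAML nearest-teammate lookup (illustrative;
# agent dicts and WORLD_SIZE are assumptions, not the project's code).
import math

WORLD_SIZE = 100.0  # assumed map size, matching the normalisation above

def nearest_other(me, agents):
    """Return the closest agent whose id differs from `me`'s, or None."""
    others = [a for a in agents if a["id"] != me["id"]]
    if not others:
        return None
    return min(others, key=lambda a: math.dist(me["pos"], a["pos"]))

agents = [
    {"id": "forager_0", "pos": (5.0, 5.0)},
    {"id": "forager_1", "pos": (15.0, 5.0)},
    {"id": "forager_5", "pos": (50.0, 50.0)},
]
me = agents[0]
nearest = nearest_other(me, agents)

# Fill the teammate slot of the observation, normalised as in the GAML:
other_x = nearest["pos"][0] / WORLD_SIZE
other_y = nearest["pos"][1] / WORLD_SIZE
print(nearest["id"], other_x, other_y)  # forager_1 0.15 0.05
```

Because the observation still has exactly one teammate slot, the policy sees the 10-agent world through the same 15-value interface it was trained on.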
```
list<rgb> palette <- [
    #blue, rgb(0,180,180), rgb(0,180,0), rgb(180,0,180), rgb(180,100,0),
    rgb(0,100,200), rgb(200,0,100), rgb(100,200,0), rgb(80,80,200), rgb(200,80,80)
];
rgb agent_color <- palette[agent_index mod length(palette)];
draw circle(0.8) color: at_food ? #orange : agent_color;
```
Each forager gets a distinct colour. When it reaches the food it turns orange, just as in the 2-agent case.
Open GAMA normally (port 1000).
Open test_forager_petz.py and change the one flag at the top:
```python
STRESS_TEST = True  # ← was False
```

Then run the script:

```shell
cd models/petz
python test_forager_petz.py
```

```
=======================================================
Smart Forager — MARL Test (gama-pettingzoo GUI)
=======================================================
Model loaded (shared by both foragers)
Connecting to GAMA (attempt 1/8)...
Connected!
Running 1 cooperative test episodes...
Episode 1/1: ✓ COOPERATIVE SUCCESS! | Steps: 134
forager_0: reward = 86.3
forager_1: reward = 84.7
forager_2: reward = 79.1
forager_3: reward = 81.4
forager_4: reward = 77.8
forager_5: reward = 82.2
forager_6: reward = 74.5
forager_7: reward = 78.9
forager_8: reward = 83.6
forager_9: reward = 80.1
=======================================================
Test Results Summary
=======================================================
Episodes : 1
Success Rate: 100%
Avg Steps : 134
forager_0 avg reward: 86.3
...
=======================================================
```
Note: with 10 agents all needing to reach the food, success requires the most distant agent to cross most of the map. Episodes that hit the 500-step timeout count as failures and lower the success rate; this is expected and does not mean the network failed to learn.
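A small sketch of the arithmetic behind that note, with made-up episode lengths (the five-episode run below is hypothetical, and it assumes a timeout counts as a failed episode):

```python
# Hypothetical 5-episode run to show how timeouts shape the summary
# numbers. The episode lengths are made up, not real test results.
MAX_STEPS = 500

episode_steps = [134, 500, 241, 500, 180]      # two episodes hit the cap
successes = [s < MAX_STEPS for s in episode_steps]

success_rate = 100 * sum(successes) / len(successes)
avg_steps = sum(s for s, ok in zip(episode_steps, successes) if ok) / sum(successes)
print(f"Success Rate: {success_rate:.0f}%")    # Success Rate: 60%
print(f"Avg Steps   : {avg_steps:.0f}")        # Avg Steps   : 185
```

A 60% rate here reflects map geometry and the step budget, not a regression in the shared policy.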
| File | Description |
|---|---|
| `models/petz/forager_petz_stress.gaml` | 10-agent stress model |
| `models/petz/test_forager_petz.py` | Test script; set `STRESS_TEST = True` |
| `models/petz/train_forager_petz.py` | Training script (unchanged, 2 agents) |