AI Agent Harnesses - spinningideas/resources GitHub Wiki

"Agentic Harness" Frameworks

Below is a collection of multi-agent orchestration systems for creating and running a "harness" that powers agentic development flows.

High Level TLDR: https://addyosmani.com/blog/agent-harness-engineering/ + https://ai.gopubby.com/harness-engineering-what-every-ai-engineer-needs-to-know-in-2026-0ab649e5686a

https://github.com/walkinglabs/awesome-harness-engineering

Summary

A coding agent is two things: the model, and the harness wrapped around it. The model reasons. The harness gives it your context, your tools, and your process. One teardown of Claude Code found roughly 98% of it is the harness, not the model. The harness is your true leverage; the model is becoming a commodity.

There are two levels to building a harness.

1) AI layer

The first level is a single coding agent session: the wrapper around one instance of Claude Code, Codex, or whatever you use.

The AI layer consists of your rules, skills, MCP servers, hooks, LSP, and subagents. Think BMAD or GitHub Spec Kit. It sounds a lot like context engineering, but what separates it is the mindset. When the agent does something dumb, you don't just blame the model - you improve the harness! If it missed a convention, that becomes a rule. If it ran something destructive, a hook now blocks it. Every mistake becomes a permanent upgrade to your system. See https://github.com/spinningideas/resources/wiki/AI-Assisted-Product-Development#general-ai-assisted-development-toolkits
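The "a hook now blocks it" idea can be sketched as a small guard that inspects a shell command before the agent is allowed to run it. This is a minimal illustration only: `DESTRUCTIVE_PATTERNS` and `check_command` are hypothetical names, the deny-list is an assumed starting point, and the sketch does not reflect the hook interface of Claude Code or any other tool.

```python
import re

# Hypothetical deny-list (an assumption for illustration). The idea is to add
# one pattern each time the agent runs something destructive.
DESTRUCTIVE_PATTERNS = [
    r"\brm\s+-(rf|fr)\b",            # recursive force delete
    r"\bgit\s+push\b.*--force",      # force push (also catches --force-with-lease)
    r"\bgit\s+reset\s+--hard\b",     # discard local changes
    r"\bdrop\s+(table|database)\b",  # destructive SQL
]


def check_command(command: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a shell command the agent wants to run."""
    for pattern in DESTRUCTIVE_PATTERNS:
        if re.search(pattern, command, flags=re.IGNORECASE):
            return False, f"blocked by rule: {pattern}"
    return True, "ok"


if __name__ == "__main__":
    for cmd in ["git status", "rm -rf build/", "git push origin main --force"]:
        allowed, reason = check_command(cmd)
        print(f"{cmd!r}: allowed={allowed} ({reason})")
```

In a real harness this check would be wired into whatever pre-execution hook mechanism your agent tool provides; the pattern list is the part that grows as mistakes are discovered.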

2) Orchestration layer

The second level is orchestrating multiple coding agent sessions into one workflow (think Ralph loop - see https://github.com/spinningideas/resources/wiki/AI-Ralph-Technique). You do not hand a massive PRD to a single session and hope it can dissect it all. That approach burns tokens, and the model gets overwhelmed no matter how good your AI layer is. Instead, you give each agent session one focused job.

One plans, one implements, one validates, and if everything passes you open a pull request. Automate those handoffs and you get something like the Ralph loop, splitting a big spec into tasks and running fresh sessions until the work is done. That is how you take on larger tasks reliably without babysitting every step.
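The plan / implement / validate handoff described above can be sketched as a loop over tasks, where every step is a fresh session. This is a rough sketch under stated assumptions: `run_session` is a hypothetical callable you would supply (for example, a wrapper that launches your coding agent with a role-specific prompt), and the convention that a validator's output starts with "PASS" is invented here for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical signature: (role, prompt) -> text output of one fresh agent session.
RunSession = Callable[[str, str], str]


@dataclass
class TaskResult:
    task: str
    passed: bool
    attempts: int


def run_pipeline(
    tasks: list[str], run_session: RunSession, max_attempts: int = 3
) -> list[TaskResult]:
    """Run plan -> implement -> validate for each task, retrying on failure."""
    results = []
    for task in tasks:
        passed = False
        attempts = 0
        while not passed and attempts < max_attempts:
            attempts += 1
            # Each call stands for a brand-new session with one focused job.
            plan = run_session("plan", task)
            change = run_session("implement", plan)
            verdict = run_session("validate", change)
            # Assumed convention: the validator's output starts with "PASS".
            passed = verdict.strip().upper().startswith("PASS")
        results.append(TaskResult(task, passed, attempts))
    return results


def ready_for_pull_request(results: list[TaskResult]) -> bool:
    """Open a pull request only if every task passed validation."""
    return all(r.passed for r in results)
```

Passing `run_session` in as a parameter keeps the loop independent of any particular agent tool, and lets you exercise the orchestration logic with a stub before spending tokens on real sessions.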

That second level is the real future of agentic engineering. The models and the tools keep getting better, and the way you scale alongside them is the harness you build around them.

Frameworks

There are a number of "Agentic Harness" frameworks:

General Links

https://www.anthropic.com/engineering/harness-design-long-running-apps
https://www.youtube.com/watch?v=nBH07G-zayk - video explaining high-level usage of "Agentic Harness" frameworks
https://www.youtube.com/watch?v=ulNsa0sD8N0 - Cole Medin explaining high-level usage of "Agentic Harness" frameworks
https://github.com/walkinglabs/learn-harness-engineering
https://github.com/earendil-works/pi (pi agent based)
https://github.com/yzddp/harnesscode

AI-Assisted-Product-Development

https://github.com/spinningideas/resources/wiki/AI-Assisted-Product-Development

AI-Assisted-UI-Development

https://github.com/spinningideas/resources/wiki/AI-Assisted-UI-Development

AI-Assisted-Development-Tools

https://github.com/spinningideas/resources/wiki/AI-Assisted-Coding-Tools