Article Notes

AI Agents by Sloth Bytes

🤖 What are AI Agents?

🦥 Sloth's definition

Agents are AI systems that reason, plan, and act in a continuous loop on their own until they either complete their task or encounter an error.

OpenAI's Definition

Agents are systems that independently accomplish tasks on your behalf.

Nvidia's Definition

AI agents are advanced AI systems designed to autonomously reason, plan, and execute complex tasks based on high-level goals.

Anthropic's Definition

LLMs that dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks.

So Basically...

Agents are models using tools in a fancy for loop with multiple conditions.
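
To make that "fancy for loop" concrete, here's a minimal sketch of the loop in Python. `call_model` and the `TOOLS` table are hypothetical stand-ins, not any real vendor's API; the fake model just scripts one tool call and then answers.

```python
# A minimal agent loop: call the model, run whatever tool it asks for,
# feed the result back, and stop on an answer or an error.

def call_model(messages):
    # Stand-in for a real LLM call. This fake "model" requests one
    # search, then answers once it sees a tool result.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "search_web", "input": messages[0]["content"]}
    return {"content": "done: " + messages[-1]["content"]}

TOOLS = {
    "search_web": lambda query: f"results for {query!r}",
}

def run_agent(task, max_steps=10):
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):               # the "fancy for loop"
        reply = call_model(messages)
        if "tool" in reply:                  # condition 1: model wants a tool
            result = TOOLS[reply["tool"]](reply["input"])
            messages.append({"role": "tool", "content": result})
        elif "error" in reply:               # condition 2: something broke
            return reply["error"]
        else:                                # condition 3: task complete
            return reply["content"]
    return "gave up after max_steps"

print(run_agent("find today's top AI news"))
```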

🤖 When is Something an AI Agent?

Whether something counts as an AI agent depends on how independent it is. There are three main levels:

Non-agentic LLMs (The Basics)

  • These are basically your normal LLMs/AIs. ChatGPT, Gemini, Claude, Llama, etc.
  • What they are: AI systems that can only respond with text based on their training data.
  • How they work: They analyze questions you provide and generate a response using patterns from their training.
    • Obviously it's more detailed than that, but this isn't about LLMs.
  • Real-world example: Using basic ChatGPT to answer a question.
  • Limitations:
    • Can't look up new information online
    • Doesn't remember conversations after you close the chat
    • Can't use tools or take actions in the real world
    • It's like talking to someone who's knowledgeable but cut off from the outside world
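
For contrast with the two levels below, the non-agentic case is a single call with nothing around it. A bare-bones sketch, with `llm` as a hypothetical placeholder:

```python
def llm(prompt):
    # Stand-in for a plain model call: text in, text out,
    # generated purely from training data.
    return "(answer based only on training data)"

# No tools, no memory, no loop: one question, one response.
print(llm("What is the capital of France?"))
```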

AI Workflows (The Middle Ground)

These are like LLMs with special abilities. They have access to specific tools and information.

  • What they are: LLMs connected to specific tools and data sources.
  • How they work: They can use predetermined tools when instructed, but they follow fixed patterns (see the sketch after this list).
  • Real-world examples:
    • ChatGPT with browsing capability looking up today's weather
    • Claude accessing your Google Drive to find a specific document when you ask
  • Key features:
    • Can access external information when specifically asked.
    • May remember previous conversations
    • Use tools, but usually need direct instructions for which tool to use: "Search the internet to find ___"
    • AI workflows are like a helpful assistant who can look things up and use basic tools, but needs step-by-step guidance
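
Here's the sketch promised above. At the workflow level the code, not the model, decides which tool runs and when; the model only fills in the language. `llm` and `fetch_weather` are hypothetical placeholders:

```python
def llm(prompt):
    # Stand-in for a real model call.
    return f"(model's summary of: {prompt})"

def fetch_weather(city):
    # Stand-in for a real weather API.
    return {"city": city, "forecast": "sunny, 22°C"}

def weather_workflow(city):
    # Fixed pattern: fetch first, summarize second, every single time.
    # The model never chooses whether, or which, tool to use.
    data = fetch_weather(city)
    return llm(f"Summarize this forecast for the user: {data}")

print(weather_workflow("Berlin"))
```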

AI Agents (The Advanced Systems)

These are like digital assistants with initiative. They can decide what to do and how to do it (there's a sketch after the list below).

  • What they are: Systems that can observe, plan, decide, and act toward goals with minimal supervision
  • How they work: They break down complex tasks, decide which tools to use, and learn from results
  • Real-world example:
    • An AI system that, when asked to "research vacation options for my family," automatically searches travel sites, checks your calendar for available dates, compares prices, and presents options.
    • Claude Code, which can plan and execute complex coding tasks from a simple request.
  • Key features:
    • Can create and follow multi-step plans without guidance at each step
    • Chooses appropriate tools autonomously
    • Adapts when the initial approach doesn't work
    • Maintains memory of what it has learned and accomplished
    • Like having a proactive assistant who understands your goals and figures out how to achieve them on their own
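
And the sketch promised for this level: an agent adds a self-made plan, per-step tool choice, and a memory of results. `make_plan` and the `TOOLS` table are hypothetical placeholders; in a real agent the model would generate the plan and pick each tool itself.

```python
TOOLS = {
    "check_calendar": lambda goal: "free June 10-20",
    "search_travel_sites": lambda goal: f"flight and hotel options for {goal!r}",
    "compare_prices": lambda goal: "option B is cheapest",
}

def make_plan(goal):
    # Stand-in: a real agent asks the model to break the goal down.
    return ["check_calendar", "search_travel_sites", "compare_prices"]

def pursue_goal(goal):
    memory = []                      # what the agent has learned so far
    for step in make_plan(goal):     # multi-step plan, no human per step
        result = TOOLS[step](goal)   # tool chosen per step, not hard-wired
        memory.append((step, result))
    return memory

print(pursue_goal("research vacation options for my family"))
```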

My AI Skeptic Friends Are All Nuts by Thomas Ptacek

Level Setting

People coding with LLMs today use agents. Agents get to poke around your codebase on their own. They author files directly. They run tools. They compile code, run tests, and iterate on the results. They also:

  • pull in arbitrary code from the tree, or from other trees online, into their context windows,
  • run standard Unix tools to navigate the tree and extract information,
  • interact with Git,
  • run existing tooling, like linters, formatters, and model checkers, and
  • make essentially arbitrary tool calls (that you set up) through MCP (Model Context Protocol).

The code in an agent that actually “does stuff” with code is not, itself, AI. This should reassure you. It’s surprisingly simple systems code, wired to ground truth about programming in the same way a Makefile is. You could write an effective coding agent in a weekend. Its strengths would have more to do with how you think about and structure builds and linting and test harnesses than with how advanced o3 or Sonnet have become.
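
As a rough illustration of how little of that "does stuff" code involves a model, here's a sketch of the test-and-iterate plumbing, assuming a pytest project; only the two stubbed functions would ever touch an LLM:

```python
import subprocess

def run_tests():
    # Ground truth the same way a Makefile gets it: exit codes and output.
    proc = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr

def ask_model_for_patch(failure_output):
    raise NotImplementedError("the only part that needs an LLM")

def apply_patch(patch):
    raise NotImplementedError("write files / call git here")

def iterate(max_rounds=5):
    for _ in range(max_rounds):
        ok, output = run_tests()
        if ok:
            return "tests pass"
        apply_patch(ask_model_for_patch(output))
    return "gave up"
```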

If you’re making requests on a ChatGPT page and then pasting the resulting (broken) code into your editor, you’re not doing what the AI boosters are doing. No wonder you’re talking past each other.

The Positive Case

LLMs can write a large fraction of all the tedious code you’ll ever need to write. And most code on most projects is tedious. LLMs drastically reduce the number of things you’ll ever need to Google. They look things up themselves. Most importantly, they don’t get tired; they’re immune to inertia.

Think of anything you wanted to build but didn’t. You tried to home in on some first steps. If you’d been in the limerent phase of a new programming language, you’d have started writing. But you weren’t, so you put it off, for a day, a year, or your whole career.

There’s a downside. Sometimes, gnarly stuff needs doing. But you don’t wanna do it. So you refactor unit tests, soothing yourself with the lie that you’re doing real work. But an LLM can be told to go refactor all your unit tests. An agent can occupy itself for hours putzing with your tests in a VM and come back later with a PR. If you listen to me, you’ll know that. You’ll feel worse yak-shaving. You’ll end up doing… real work.

But you have no idea what the code is...

Are you a vibe coding YouTuber? Can you not read code? If so: astute point. Otherwise: what the fuck is wrong with you?

You’ve always been responsible for what you merge to main. You were five years go. And you are tomorrow, whether or not you use an LLM.

If you build something with an LLM that people will depend on, read the code. In fact, you’ll probably do more than that. You’ll spend 5-10 minutes knocking it back into your own style. LLMs are showing signs of adapting to local idiom, but we’re not there yet.

People complain about LLM-generated code being “probabilistic”. No it isn’t. It’s code. It’s not Yacc output. It’s knowable. The LLM might be stochastic. But the LLM doesn’t matter. What matters is whether you can make sense of the result, and whether your guardrails hold.

Reading other people’s code is part of the job. If you can’t metabolize the boring, repetitive code an LLM generates: skills issue! How are you handling the chaos human developers turn out on a deadline?

For the last month or so, Gemini 2.5 has been my go-to †. Almost nothing it spits out for me merges without edits. I’m sure there’s a skill to getting a SOTA model to one-shot a feature-plus-merge! But I don’t care. I like moving the code around and chuckling to myself while I delete all the stupid comments. I have to read the code line-by-line anyways.