DEVELOPING WITH AGENTIC WORKFLOWS · HANDS-ON WORKSHOP · JUNE 2026

Getting airborne

Demystifying agentic engineering, so we can leave the "ground" and "take flight" on the AI Adoption Continuum

RISHI DEAN · github.com/rishidean/tutorials-agent-loop
HOW TO READ THIS · SELF-PACED

Mapping the flow.

Read straight through, or simply jump to the part you need.

Part 1 · The frame: Where we are, and why it has to be you.
Part 2 · The double loop: The system, taken apart piece by piece.
Part 3 · Your turn: Clone it and run the loop yourself.
Part 4 · Looking ahead: What airborne means, and the one ask.
Appendix · jump ahead
Why it works: The theory under the loop, in five ideas.
Extending this model: From solo to org, in four moves.
Making it run: The actual files, and how to run them.

Every claim here is runnable — the repo builds a real app, unattended.

github.com/rishidean/tutorials-agent-loop  ·  Self-paced
A quick caveat

This is one way.
Not the way.

There are many ways to build agentic loops. This is a simplified version of what I do. The tools are moving fast, and some of this may even be outdated by the time you finish reading this deck!

Take the principles, not the syntax. The shape of the loop matters more than which file does what.

THE PARADOX

Two realities coexist in the world

The measured reality
95% of businesses fail to see meaningful returns on AI.

The lived reality
10×

Solo builders ship in a week what used to take a quarter.

How can both of these be true?

Both are true. The difference is how they move, not how hard they work. That's the map, next.

[Image: F1 cockpit and steering wheel]
THE OPERATOR VS. THE TOOL

Same car. Different driver.

“AI hallucinates.”  “It can't do X.”  “We've seen it fail at other companies.”

Put an automatic-only driver in this car and they spin it into the wall, then tell you why it's undrivable. Put a pro in the same seat, and it's the fastest thing on Earth. Both are correct.

The car you drive in two years is up to you.

Part one

The frame

An introduction to the AI Adoption Continuum: where we currently sit, and how we advance, starting today.

What this answers: where we are, where the gains hide, and why it's on you.
THE AI ADOPTION CONTINUUM

Think of AI adoption as modes of transport.

THE AI ADOPTION CONTINUUM

Six ways to move. Only some leave the ground.

GROUND: 95% of teams live here
AIR: the 10× live here
Ground: same work, faster
Airborne: a different process
How you operate

The shift from player, to coach, to conductor.

Autocomplete · Player
You → AI → AI → You
Linear. You in every step.
Agentic · Coach
You set goal → Agent loops (Plan → Act → Verify) → You review
Cyclical. You manage the loop.
Orchestration · Conductor
You set strategy → Agent A, Agent B, Agent C, Agent D → You verify outcome
Systemic. You conduct the fleet.
PLACING THE MAJORITY

Most of us are still in the ground transport phase.

Driving gets you ~10–20% faster, but not the 10× solo builders see. The difference lies in the mode of transport.

Why the gap persists

What sounds like rigor is often a skills gap.

The most responsible-sounding teams are often the least practiced.

THE DOUBT LOOP
1. Low proficiency: before the reps are in
2. Output disappoints: a hallucination, some slop
3. Distrust hardens: must be the tool
4. Confirmation bias: knew I couldn't rely on it
5. Retreat to old ways: known beats unknown

Every stage of this loop masquerades as good judgment, but it is really a signal of low proficiency and low agency.
Distrust is a status report on your scaffolding.

And the hallucination fear assumes the model is always guessing. It isn't. Hold that thought.

THE SHIFT

Transitioning from ground to air.

Airborne = restructuring how you plan, build, verify, and ship around what AI can do (not "typing faster").

Ground transport
Airborne
WHY US · WHY NOW

Only you can fly the helicopter

Walking → Driving
IT

Buys the tools. Rolls them out. Provides the foundation and infrastructure.

Helicopter
Engineering

The people who write the code. The org can't get airborne until we do.

Jet → Rocket
CTO / CEO

System-wide change — but only after Helicopter is proven.

What you'll walk away with

Get airborne.

By the end of this deck you'll understand a multi-sprint, self-verifying, autonomous development loop, and have the means to run it yourself.

Part two

The double loop

Some files and a bash script. We'll take it apart piece by piece, and then watch it build an app, unattended.

What this answers: what the system is, and how it runs unattended.
THE OPEN SECRET

Agentic Orchestration is a fancy term for some files + a loop.

WHERE THE INTELLIGENCE LIVES

We're shifting the burden from the model to specification & scaffolding.

Model
The engine
Raw capability. Necessary, but the part everyone talks about.
+
Files
The judgment
Standards, specs, and memory. Where your engineering lives.
+
Loop
The discipline
Build, verify, document (repeat until done).

Take any one away and it stops working. And most of what makes it good isn't the model. It's the files, which is what we'll cover in the rest of this deck.

WHAT WE MEAN BY AGENTIC

We're going to orchestrate the model to run in a loop.

Many still think of this as "autocomplete," but the two are very different.

It is — a loop
Sets its own intermediate steps
Calls tools to act, not just answer
Checks its own work
Loops until the goal is met
It isn't — a turn
Autocomplete on steroids
A chatbot you babysit turn by turn
“AI that writes code for you,” one prompt at a time
TWO THINGS MODELS DO

It can generate, or it can call.

Only one of these can hallucinate. Knowing which is what makes the loop trustworthy.

Generate · can hallucinate
Text from weights
The model writes the most plausible next tokens from what it learned. Brilliant for synthesis and code — and the only place a guess can sneak in.
prompt → model → plausible text
Call · returns a fact
A tool runs, for real
The model emits a structured request; a deterministic system runs it — code, a search, a test — and hands back ground truth the model then reads.
request → tool runs → fact → model reads

In a loop, most of the work is tool calls returning ground truth, not the model free-associating. run-qa’s Playwright test actually ran; the model didn’t decide it passed.

THE PRIMITIVES

A few words to help speak the language.

New to Claude Code? Everything that follows is just some form of these terms.

Session
One run of Claude Code. Fresh memory every time. Nothing carries over.
Context window
The working memory a session holds at once. Finite. Fill it and quality drops.
CLAUDE.md
A context file Claude reads automatically at the start of every session.
Task tool
How a session hands a job to a subagent — and gets the result back. It's how run-qa calls its three agents.
Skill
A reusable procedure Claude can invoke by name — like run-qa.
Command
A saved prompt you trigger with a slash — like /review.
Subagent
A scoped helper for one narrow job. Often a cheaper, faster model.
THE SYSTEM

Three parts that mirror what we do IRL.

Overview · the three stages
This map recurs — the next three slides zoom into one stage each.
THE TOPOLOGY

One loop inside another.

The outer loop marches through sprints with a fresh brain each time. The inner loop is one session: build → verify → document.

GREEN = passed, ship it · RED = failed, fix & re-run (max 3)
Outer loop · run.sh — marches through sprints, fresh brain each turn
Sprint 1 → Sprint 2 → Sprint 3 → … until done
Inner loop · one session
Build (the feature) → Verify (QA crew → GREEN / RED) → Document (tick manifest, handoff)
↺  RED → fix → re-verify  (max 3 attempts, then stop)
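
The inner loop's retry rule can be sketched in a few lines of bash. This is a minimal illustration, not the repo's implementation: `build_feature` and `run_qa` are hypothetical stubs standing in for the Builder session and the run-qa skill, and the stub is rigged to fail twice before passing.

```bash
#!/usr/bin/env bash
# Sketch of the inner loop: build, verify, retry on RED, stop after 3 attempts.
# build_feature and run_qa are hypothetical stubs, not part of the repo.
set -u

MAX_ATTEMPTS=3
attempt=0
verdict="RED"

build_feature() { :; }            # stub: the Builder writes or patches code

run_qa() {                        # stub: fails on attempts 1 and 2, passes on 3
  (( attempt >= 3 ))
}

while (( attempt < MAX_ATTEMPTS )); do
  attempt=$((attempt + 1))
  build_feature
  if run_qa; then
    verdict="GREEN"
    break
  fi
  echo "attempt $attempt: RED, fixing and re-running"
done

if [[ "$verdict" == "GREEN" ]]; then
  status="[x]"                    # Document step: tick the manifest
else
  status="[!]"                    # still RED: mark blocked, hand back to a human
fi
echo "verdict=$verdict attempts=$attempt status=$status"
```

The loop is bounded by construction: either a GREEN verdict breaks out early, or the attempt counter runs out and the sprint is marked blocked.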
THE TOPOLOGY · BY FILE

How the loop is built in practice.

Same loop as the previous slide — now wired to the real filenames.

Outer loop · marches through sprints, fresh brain each turn
[script] run.sh
roadmap.md → the manifest
prompt.md → read each turn
Sprint 1 → Sprint 2 → Sprint 3 → … until done
Inner loop · one session
the Builder → a fresh Claude Code session
Build: the feature (CLAUDE.md, specs/sprint-N.md)
Verify: QA crew → GREEN / RED ([skill] run-qa, which calls [agent] linter, [agent] code-reviewer, [agent] playwright-tester)
Document: tick manifest, handoff (roadmap.md → [x], PROGRESS.md → handoff)
↺  RED → fix → re-run [skill] run-qa  (max 3 attempts, then stop)
THE SYSTEM

The setup stage is for our specifications and agreements

Stage 1 / 3 · Setup
Why first: the system can't act until it knows the rules.
Setup · What it knows: Tech Lead, PM, Design, QA Lead
Actors · Who does the work: The Builder, QA Team (x3), QA Manager
Orchestrator · What keeps it moving: Eng Manager, PM
BROAD CONTEXT (CLAUDE.MD · PROMPT.MD)

How we work.

Everything true for every sprint: architecture, conventions, and the definition of done. One constraint forces quality —

"Not done until run-qa returns GREEN."
This is the scaffolding that earns the trust.
CLAUDE.md
# Project architecture & conventions
# Marching orders, every session
## Definition of done
Not done until run-qa is GREEN.
prompt.md
Read the manifest. Pick the next sprint.
Build it. Run QA. Update state.
# same instructions, every time
THE MANIFEST (ROADMAP.MD)

What needs to get done, where we're at.

Three characters per line — that's your state machine. The checkbox ticks after each sprint. Stuck work marks [!] and stops.

This file is your project status.
[ ]  to do
[x]  done
[!]  blocked, stops for a human
roadmap.md
- [x] Sprint 1: Core clicker — harvest + persistent score
- [x] Sprint 2: Farmhands — buy auto-harvesters
- [ ] Sprint 3: Scaling cost — price curve + spend guard
- [ ] Sprint 4: Polish — number formatting + click pulse
- [!] Sprint 5: Cloud sync — blocked, needs API key
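
Because the state is just three characters per line, plain `grep` is enough to read it. A minimal sketch, assuming a roadmap in the format above (the lines are abridged from the slide, and the temp file stands in for a real roadmap.md):

```bash
#!/usr/bin/env bash
# Sketch: read the manifest's state with grep.
# The temp file is a stand-in for roadmap.md; lines are abridged from the slide.
set -euo pipefail

roadmap="$(mktemp)"
cat > "$roadmap" <<'EOF'
- [x] Sprint 1: Core clicker
- [x] Sprint 2: Farmhands
- [ ] Sprint 3: Scaling cost
- [ ] Sprint 4: Polish
- [!] Sprint 5: Cloud sync
EOF

# grep -c prints the number of matching lines. It exits 1 when the count is
# zero, so "|| true" keeps set -e from aborting the script in that case.
todo=$(grep -c '^- \[ \]' "$roadmap" || true)
done_count=$(grep -c '^- \[x\]' "$roadmap" || true)
blocked=$(grep -c '^- \[!\]' "$roadmap" || true)

# The next sprint is simply the first unchecked line.
next=$(grep -m1 '^- \[ \]' "$roadmap" || true)

echo "todo=$todo done=$done_count blocked=$blocked"
echo "next: $next"
rm -f "$roadmap"
```

The same two patterns, `[ ]` and `[!]`, are what the orchestrator checks to decide whether to continue or stop.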
How to decompose · roadmap.md

Slice it vertically, not horizontally.

Weak · horizontal layers
# Roadmap
- [ ] Build all data models
- [ ] Build all API endpoints
- [ ] Build all UI screens
- [ ] Wire it together
- [ ] Test everything
Nothing runs until the very end. No verifiable step — nothing for QA to gate.
Good · vertical slices
# Roadmap · vertical slices
- [x] S1 Click to earn (ships a page)
- [x] S2 Earnings chart (on top of S1)
- [ ] S3 Hire farmhands (needs S1)
- [ ] S4 Idle income (needs S3)
Each slice ships a working app, ordered by dependency. [ ] [x] [!] legible at a glance.
SPECIFIC CONTEXT (SPRINT-X.MD)

What to build this turn.

One spec per sprint: acceptance criteria, technical approach, edge cases. This is where your engineering judgment lives.

At Driving you write code. At Helicopter you write specs.
specs/sprint-1.md
# Sprint 1 — Core clicker
## Acceptance criteria
· Click #harvest → #score +1
· Score persists on reload
## Approach
· Single index.html, localStorage `clickfarm` { score, farmhands }
## Edge cases
· Score stays a non-negative integer
Where judgment lives · sprint-3.md

A good spec makes the build trivial.

Thin · vague
# Sprint 3: Hiring

Let players hire farmhands
that help out automatically.
No criteria. No edge cases. The build is a guess; QA has nothing to assert.
Fat · explicit
# Sprint 3: Scaling Cost
## Acceptance Criteria
- #cost rises per the formula
- score < cost → #hire does nothing
- hire at score == cost succeeds
 
## Watch out for
- a hire you can't afford MUST be blocked
- score must never go below 0
Explicit criteria + the trap to avoid. The build is trivial; QA is decisive.
THE SYSTEM

Now, it's time to build.

Stage 2 / 3 · Actors
Why next: someone has to do the work — and check it.
THE BUILDER
builder · sprint session
$ claude -p "$(cat prompt.md)"
reading context…
building feature…
writing index.html

Executes the spec for a single feature or sprint. Then done.

The Builder is just a Claude Code session — no special framework, no persistent agent. Spun up fresh for one sprint, then thrown away. Its power is that nothing carries over.

Stateless
Clean brain every turn — no context rot from the last sprint.
Self-checking
It calls run-qa on its own work before it ever declares done.
Hands off
Writes its memory to PROGRESS.md so the next session can pick up.
THE QA MANAGER

When code is complete, the manager delegates the QA tasks

One call — run-qa — runs all three QA agents and returns a single verdict: GREEN or RED. RED three times and it stops for a human.

run-qa: Lint → Review → Test → GREEN
↺  RED → read failures → fix → re-run
max 3 attempts, then mark blocked & stop
.claude/skills/run-qa/SKILL.md
---
name: run-qa
description: Full QA pass — lint, review,
  e2e. Returns one GREEN/RED verdict.
---

# run-qa
Run these in order, each as a subagent:
1. Delegate to linter.
2. Delegate to code-reviewer.
3. Delegate to playwright-tester.

Verdict: any lint error, CRITICAL
finding, or failed spec → RED.
Otherwise → GREEN.

If RED: fix, re-run from step 1.
Stop after 3 attempts; report to human.
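
The verdict rule is a simple fold over three results. As a rough sketch, assuming each check reports success through its exit status (the `lint`, `review`, and `e2e` functions below are hypothetical stubs, not the real subagents):

```bash
#!/usr/bin/env bash
# Sketch: fold three check results into one GREEN/RED verdict.
# The check functions are hypothetical stubs; exit status 0 means "passed".
set -u

lint()   { return 0; }   # stub: no lint errors
review() { return 0; }   # stub: no CRITICAL findings
e2e()    { return 1; }   # stub: one failed spec

run_qa() {
  local failures=()
  lint   || failures+=("lint")
  review || failures+=("review")
  e2e    || failures+=("e2e")
  if (( ${#failures[@]} == 0 )); then
    echo "GREEN"
  else
    # Return the failures with the verdict so the fix can target them.
    echo "RED: ${failures[*]}"
  fi
}

verdict=$(run_qa)
echo "$verdict"
```

Any single failure is enough for RED, and the verdict names what failed, which is what lets the Builder patch a specific problem rather than guess.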
THE QUALITY TEAM

Each team member has a very specific job.

.claude/agents/
They check, they don't build. Cheaper, faster models.
This is the scaffolding that earns the trust — the answer to the doubt loop from Part 1.
Judgment · latent vs. Verification · deterministic
The crew runs a test, not a vibe check. That's why it catches the bug.
THE INNER LOOP

RED doesn't stop the loop — it routes back to the Builder.

On RED, the verdict carries the actual failures — the Builder reads them, patches, and re-runs. Three attempts, then it stops. RED isn't an opinion — it's a failed assertion; the fix targets a fact, not a feeling.

RED → read failures · fix · re-run
The Builder
Write & fix
Builds the feature. On a RED return, reads the failures and patches the code.
run-qa
Verify
Lint · review · test. Returns one verdict — and the failures with it.
RED loops back ↑
GREEN exits → Document
Three attempts, then it stops. Still RED after the third try → mark the sprint [!] blocked and hand back to a human. Autonomous, never infinite.
THE SYSTEM

Close the sprint, and move to the next.

Stage 3 / 3 · Orchestrator
Why last: something has to keep the sprints moving, hands-free.
KNOWLEDGE HANDOFF

On GREEN, the session writes itself down.

QA GREEN → three writes: roadmap = state · PROGRESS = memory · CLAUDE = learning
01 roadmap.md · State
Tick the box. The state machine advances.
- [ ] Sprint 3: scaling cost   (before)
- [x] Sprint 3: scaling cost   (after)
02 PROGRESS.md · Memory
Append a handoff for the next session.
## Sprint 3
scaled hire cost.
spend bug → fixed
(guard the spend)
03 CLAUDE.md · Learning
Promote durable learnings to permanent context.
## Gotchas
+ score must never go
+ negative — guard spend
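
In the loop these writes are made by the Builder session itself. Done by hand, the first two look roughly like the sketch below, which works in a temp directory with made-up file contents in the slide's format:

```bash
#!/usr/bin/env bash
# Sketch: the State and Memory writes on GREEN, done by hand.
# File contents are illustrative, in the format shown on the slide.
set -euo pipefail

work="$(mktemp -d)"
cd "$work"
printf '%s\n' '- [x] Sprint 2: farmhands' '- [ ] Sprint 3: scaling cost' > roadmap.md
printf '# Progress Log\n' > PROGRESS.md

# 01 State: tick only the first unchecked box. awk keeps this portable,
# since sed -i takes different arguments on GNU and BSD systems.
awk '!ticked && /^- \[ ]/ { sub(/\[ ]/, "[x]"); ticked = 1 } { print }' \
  roadmap.md > roadmap.tmp && mv roadmap.tmp roadmap.md

# 02 Memory: append a handoff entry for the next session.
cat >> PROGRESS.md <<'EOF'

## Sprint 3
scaled hire cost. spend bug fixed (guard the spend).
EOF

cat roadmap.md
```

Both writes are append-or-flip operations on plain text, which is why a fresh session can reconstruct where things stand by reading two files.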
THE ORCHESTRATOR · run.sh

Ten lines. Dead simple by design.

#!/usr/bin/env bash
set -euo pipefail
MAX=4; n=0
while grep -q '^- \[ \]' roadmap.md && (( n < MAX )); do
  n=$((n+1)); echo "=== sprint $n ==="
  claude -p "$(cat prompt.md)" || { echo "session error"; break; }
  # --dangerously-skip-permissions omitted for readability
  grep -q '^- \[!\]' roadmap.md && { echo "blocked"; break; }
done
echo "done — $n sprint(s)"
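
To see the control flow terminate without spending a real session, the loop can be dry-run against a stub. In the sketch below the `claude` shell function is a hypothetical stand-in that shadows the real CLI inside this script only; all it does is tick the first open box, pretending a sprint went GREEN.

```bash
#!/usr/bin/env bash
# Dry run of the outer loop with a stubbed `claude`.
# The stub is hypothetical: it ticks the first open box and nothing else.
set -euo pipefail

work="$(mktemp -d)"; cd "$work"
printf '%s\n' '- [ ] Sprint 1' '- [ ] Sprint 2' '- [ ] Sprint 3' > roadmap.md
echo "Read the manifest. Pick the next sprint." > prompt.md

claude() {   # shadows the real CLI; arguments are ignored
  awk '!t && /^- \[ ]/ { sub(/\[ ]/, "[x]"); t = 1 } { print }' \
    roadmap.md > roadmap.tmp && mv roadmap.tmp roadmap.md
}

# Same loop shape as the slide: continue while open boxes remain, cap at MAX.
MAX=4; n=0
while grep -q '^- \[ \]' roadmap.md && (( n < MAX )); do
  n=$((n+1)); echo "=== sprint $n ==="
  claude -p "$(cat prompt.md)" || { echo "session error"; break; }
  grep -q '^- \[!\]' roadmap.md && { echo "blocked"; break; }
done
echo "done: $n sprint(s)"
```

With three open boxes the loop runs three times and exits because no `[ ]` lines remain, not because it hit the cap. Changing the stub to write `[!]` instead exercises the blocked path.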
Part three

Your turn

Enough theory. Six files, one command, four sprints — here's how you run it yourself, now or later.

What this answers — how to run the loop on your own machine.
STEP 1 · SET UP · STEP 1 / 4

Clone it. Set up. Three minutes.

git clone https://github.com/rishidean/tutorials-agent-loop
cd tutorials-agent-loop
./setup.sh # Playwright + Chromium, prereq checks
# auth: run `claude` once, or export ANTHROPIC_API_KEY
The repo, top to bottom
your-service/
├── CLAUDE.md          # stack + "done = run-qa GREEN"
├── prompt.md          # marching orders, every session
├── roadmap.md         # the manifest / state machine
├── PROGRESS.md        # handoff memory (starts empty)
├── run.sh             # the orchestrator (~10 lines)
├── specs/
│   └── sprint-1.md    # one spec per sprint
└── .claude/
    ├── skills/run-qa/SKILL.md  # the run-qa gate
    └── agents/           # linter · reviewer · tester
Three steps to your first run
1. Drop a CLAUDE.md into one service: your stack, your conventions, and the one line "done = run-qa GREEN".
2. Write a single spec for that mind-numbing weekly task.
3. Run ./run.sh and watch it go.
Before you run
Runtime 10–20 min after setup
Real Claude Code sessions + subagents
Workshop → DEMO_MODE=planted ./run.sh
STEP 2 · RUN · STEP 2 / 4

One command. The loop writes the code.

$ ./run.sh # emergent run
$ DEMO_MODE=planted ./run.sh # guaranteed Sprint-3 RED → GREEN

A fresh session spins up for each sprint — it builds the feature, runs the QA gate, ticks the roadmap, and hands off. Then the next sprint starts with a clean brain. You write specs; the loop writes code.

Stuck or dirty repo?
$ ./reset-demo.sh # start over from Sprint 1
TROUBLESHOOTING.md # auth, model, Playwright, stuck terminal
WHILE IT RUNS  ·  ~10 MINS

Now go hydrate.

A full run takes up to ten minutes. Watch the console scroll by if you like, or step away, stretch, refill your water, and let the loop cook. It doesn't need you for this part.

// back in a few
STEP 3 · WATCH · STEP 3 / 4

What one full run produces — no human at the keyboard.

$ DEMO_MODE=planted ./run.sh
 
  Sprint 1: build -> run-qa ...
  Sprint 1: GREEN after 1m 04s
 
  Sprint 2: build -> run-qa ...
  Sprint 2: GREEN after 0m 58s
 
  Sprint 3: build -> run-qa ...
  (planted bug is seeded and fixed during QA)
  Sprint 3: GREEN after 1m 47s
 
  Sprint 4: build -> run-qa ...
  Sprint 4: GREEN after 1m 09s
 
  ALL SPRINTS COMPLETE - 4 / 4 GREEN
The run, recorded · Sprint 3 RED → GREEN
STEP 4 · PLAY · STEP 4 / 4

You didn't write a line of it.

$ open index.html # or reference/index.html for the finished build
Play the live build  →
Click Farm — the finished build

Click to harvest. Hire farmhands. Watch idle income roll in. Four sprints, zero keystrokes from you.

DEBRIEF

What just happened.

01 Stateless sessions, real progress. Continuity came from the files, not memory.
02 The gate caught a real bug. Sprint 3 let you overspend; the tester caught it, the Builder fixed it. The loop found it: not a model, not a human.
03 Four sprints, zero keystrokes. You wrote specs; it wrote, tested, fixed, and documented.
LOW FLOOR · HIGH CEILING · EXTENDING THE LOOP

This is the seed. Extend it forever.

Every extension is the same shape — one small file in .claude/. Four moves take you from solo to enterprise; the depth is in the appendix.

Encode judgment
the loop writes its own inputs
Add a gate
raise the bar on GREEN
Add structure
scale breadth
Close the loop
durable & auditable
The loop
the seed
Helicopter (this repo) · Jet (a team) · Rocket (an org)
The four moves in full: “Extending this model” (appendix)
Part four

Looking ahead

What being airborne means for you, the team, and the org — and the single ask that starts it.

What this answers — what airborne means next, and the one ask.
WHAT'S NEXT

How it spreads from here.

WHAT AIRBORNE LOOKS LIKE

The secret is that there is no secret. It's just putting in the work. 

Driving — the old way
Helicopter — what the loop does
THE ACCELERATION

Each shift arrives faster than the last.

Walking: ~ years
Biking: ~ a year
Driving: months
Helicopter: now
Jet
The advantage compounds. The gap between Driving and Helicopter isn't closing — it's widening.
MY REQUEST OF YOU
90 days.

Every service gets a CLAUDE.md and a run-qa skill.

Pick the most mundane, repetitive task you do every week. That's your first candidate to automate. 

Where to start github.com/rishidean/tutorials-agent-loop
TECH REFERENCES

Appendices

Optional reference, for those implementing. 

APPENDIX A · THE THEORY

Why it works

Five ideas under the loop — optional reading. Every one resolves to a file you already ran.

Why it works · 01 / 05

A skill stores a process, not content.

A skill file holds a procedure, not facts — how to do a recurring job: run the three checks, return one verdict. The method becomes reusable and versioned, kept separate from any single task it runs on.

That’s the skill: SKILL.md / run-qa
Why it works · 02 / 05

Subagents keep context clean.

Each subagent gets its own context window and exactly one narrow job. The reviewer never sees the builder’s scratch work; the tester starts fresh. Narrow scope means a cheaper model and no cross-contamination.

That’s the QA crew: linter · reviewer · tester
Why it works · 03 / 05

Fresh sessions beat a full context.

A model’s context is finite and degrades as it fills. So every sprint launches a new session with an empty window — the roadmap and PROGRESS files carry state forward, not the model’s fading memory.

That’s the clean brain per sprint: roadmap.md + PROGRESS.md
Why it works · 04 / 05

The loop improves itself.

When a session learns something durable — a gotcha, a convention — it writes it back into the broad context. The next session starts smarter. The system quietly edits its own instructions.

That’s the Document step: promotes → CLAUDE.md
Why it works · 05 / 05

Generate is latent. Verify is deterministic.

Generation is fuzzy by nature — synthesis, design, judgment. Verification isn’t: the assertion passes or it doesn’t. You never trust the model’s opinion of its own work; you run a tool that returns a fact.

That’s why QA is a test, not a vibe check
APPENDIX B · GOING FURTHER

Extending this model

The four moves in full — same shape every time, one small file in .claude/.

Appendix · Going further · The four moves

Same shape, four directions.

Encode judgment
the loop writes its own inputs
·generate-roadmap
·spec-sprint
·plan-sprint  plan before build
Add a gate
raise the bar on GREEN
·security / a11y / perf
·planning approval
·PR-per-sprint + CI
Add structure
scale breadth
·epics → sprints → features
·parallel builders
·multi-repo
Close the loop
durable & auditable
·session-start / wrapup
·structured logs + cost
·knowledge base
Helicopter (this repo) · Jet (a team) · Rocket (an org)
Full catalog: EXTENDING.md
Appendix · Going further · Encode judgment · in depth

Add a PLAN phase.

Plan before you build — and, going further, before you spec.

pre-spec (runs once): intent → PLAN · author → roadmap.md + specs/ → approve
post-spec (per sprint): spec → PLAN · build → plan.md → approve → build → QA → document
The artifact · concrete
# specs/sprint-3.plan.md
 
files to touch: index.html (hire, cost)
order: cost formula → guard → UI
approach: check affordability first
risks: negative score if unguarded
Four files, four jobs
roadmap = state
plan = intent
PROGRESS = memory
CLAUDE = learning
A plan is intent — its own slot, not jammed into memory.

Judgment moves up a rung — you approve plans instead of writing specs. That's the Jet step.

Appendix · The principles

The principles, not the syntax.

Commands and flags change. What makes the output good doesn’t.

01 Plan before build. Decide the slice and the spec before a line of code.
02 Verify before done. “Done” means the gate is GREEN, never a feeling.
03 One job per subagent. Narrow scope, cheap model, clean context.
04 Fix, don’t ask. On RED, read the failure and patch; stop only when truly stuck.
05 Promote learnings. Write durable lessons back into the files so the next run is smarter.
This is my way — the syntax will change, but these don’t.
APPENDIX C · THE FILES

Making it run

The key files for this model, and what you need to run the exercises.

APPENDIX · THE FILES · File 01 / 13

run.sh

The orchestrator — the outer loop
#!/usr/bin/env bash
set -euo pipefail
MAX=4; n=0
# planted mode: DEMO_MODE=planted swaps in a buggy build
# (fixtures/index.sprint2-buggy.html) before Sprint 3 so QA goes RED

echo "THE LOOP — starting..."
cat roadmap.md

while grep -q '^- \[ \]' roadmap.md && (( n < MAX )); do
  n=$((n+1))
  echo "SPRINT $n — launching fresh session..."
  claude -p "$(cat prompt.md)" \
    --dangerously-skip-permissions \
    --model opus \
    || { echo "session error"; break; }
  cat roadmap.md
  if grep -q '^- \[!\]' roadmap.md; then
    echo "BLOCKED — human needed."; break
  fi
  echo "Sprint $n complete."
done

echo "DONE — $n sprint(s) attempted"
cat PROGRESS.md
APPENDIX · THE FILES · File 02 / 13

CLAUDE.md

Broad context — true for every sprint
# Click Farm

## Identity & Mission
You maintain the Click Farm game. Done = the feature
works, QA is GREEN, nothing else broke.

## Map
One index.html (inline CSS + JS). State lives in
localStorage. Specs are in specs/.
State: localStorage key clickfarm = { score, farmhands }.
Stable ids: #score, #harvest, #hire, #farmhands, #cost.

## Conventions
Vanilla HTML/CSS/JS — no frameworks, no build step.
Mobile-first, system fonts, one accent color.

## Running it
Open index.html in a browser to play. run-qa to check.
No package manager.

## Guardrails
Never add dependencies or a build tool. Don't edit
other sprints' specs. Ask before changing the schema.

## Definition of Done
Not done until run-qa returns GREEN.
If RED, read it, fix it, re-run. Never ship on RED.
APPENDIX · THE FILES · File 03 / 13

prompt.md

Marching orders — read at every session start
# Sprint Turn
Read roadmap.md and pick the first sprint marked [ ].
Read that sprint's spec and PROGRESS.md for context.

Build the feature per the spec.
Then run the run-qa skill until GREEN (max 3 attempts).

When GREEN:
- Mark the sprint [x] in roadmap.md.
- Append a handoff to PROGRESS.md: what you built,
  key decisions, gotchas, what the next sprint needs.

If you can't reach GREEN after 3 attempts:
- Mark the sprint [!] in roadmap.md, then stop.

Updating the status is your final action — never skip it.
APPENDIX · THE FILES · File 04 / 13

roadmap.md

The manifest — three characters per line
# Click Farm — Roadmap

- [ ] Sprint 1: Core clicker — harvest + persistent score    — specs/sprint-1.md
- [ ] Sprint 2: Farmhands — buy auto-harvesters             — specs/sprint-2.md
- [ ] Sprint 3: Scaling cost — price curve + spend guard    — specs/sprint-3.md
- [ ] Sprint 4: Polish — number formatting + click pulse    — specs/sprint-4.md
APPENDIX · THE FILES · File 04 / 13

roadmap.md

One file, nested — Epics → Sprints → Features
# Click Farm — Roadmap

## Epics  «1»
- [x] E1  Core game       shipped
- [ ] E2  AI Bridge       active
- [ ] E3  Multiplayer     planned

## Epic 2 · AI Bridge  «2»
- [x] S1  Tool library        epic2/sprint1-tools.md
- [ ] S2  Command-mode UI      epic2/sprint2-command.md
- [ ] S3  Multi-intent parse   epic2/sprint3-intents.md
  gate: Epic 1 stable + dogfooded 1 week   «3»

## Current Status  «4»
Active   E2 · S2 — Command-mode UI
Shipped  E2 · S1 — tool library, tests green
Next     Command-mode UI -> multi-intent
Gate     2-week dogfood window before E3
«1» Epic dashboard. One row per epic: the bird's-eye view at a glance.
«2» Same checkboxes, nested. Epics → Sprints → Features. Each sprint still points at one spec.
«3» A gate can be human. Not just code: the loop stops and waits for a checkpoint.
«4» The living handoff. Current Status is rewritten at the end of every session.
APPENDIX · THE FILES · File 05 / 13

PROGRESS.md

The memory — the Builder writes it, sprint by sprint
# Progress Log
// starts empty; one handoff entry appended per sprint:
// what was built · key decisions · gotchas · next-up
APPENDIX · THE FILES · File 06 / 13

specs/sprint-1.md

Core clicker
# Sprint 1: Core Clicker

## What to build
A big Harvest button. Clicking it adds 1 to the
score. Score persists across reloads.

## Acceptance Criteria
- Clicking #harvest increases #score by 1
- #score persists after a reload (localStorage)
- Score is a non-negative integer

## Technical Notes
- Single index.html, inline CSS + JS
- localStorage key "clickfarm" → { score, farmhands }
- Stable ids: #score, #harvest
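The persistence rule in this spec can be sketched as a pair of load and save functions around the "clickfarm" key. This is one possible shape, not the repo's implementation: `loadState`, `saveState`, `harvest`, and the in-memory `memoryStorage` stand-in are all hypothetical names, and the storage object is passed in so the same code runs in a browser (with `localStorage`) or under Node.

```javascript
// Sketch of the Sprint 1 persistence rule. `storage` is any object with
// getItem/setItem, so the browser's localStorage can be passed in directly.
const STORAGE_KEY = 'clickfarm';

function loadState(storage) {
  const fallback = { score: 0, farmhands: 0 };
  try {
    const saved = JSON.parse(storage.getItem(STORAGE_KEY));
    if (!saved) return fallback;
    // Coerce to non-negative integers so a corrupt save cannot break the game.
    const clean = (n) => (Number.isInteger(n) && n >= 0 ? n : 0);
    return { score: clean(saved.score), farmhands: clean(saved.farmhands) };
  } catch {
    return fallback; // unparseable JSON: start fresh
  }
}

function saveState(storage, state) {
  storage.setItem(STORAGE_KEY, JSON.stringify(state));
}

// One click on #harvest adds exactly 1 to the score.
function harvest(state) {
  return { ...state, score: state.score + 1 };
}

// In-memory stand-in for localStorage (hypothetical, for running under Node).
function memoryStorage() {
  const data = new Map();
  return {
    getItem: (k) => (data.has(k) ? data.get(k) : null),
    setItem: (k, v) => data.set(k, String(v)),
  };
}

const store = memoryStorage();
saveState(store, harvest(loadState(store)));
console.log(loadState(store));
```

Reading the state back through `loadState` on every start is what makes the "persists after a reload" criterion testable.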
APPENDIX · THE FILES · File 07 / 13

specs/sprint-2.md

Farmhands — buy auto-harvesters
# Sprint 2: Farmhands (auto-harvest)

## What to build
A "Hire Farmhand" button (cost 10). Hiring deducts
10 and adds a farmhand. Each farmhand earns
+1 score/sec automatically.

## Acceptance Criteria
- Clicking #hire with score ≥ 10 deducts 10 and increments #farmhands by 1
- With ≥1 farmhand, #score rises ~1/sec on its own
- #cost shows the hire cost (10)

## Technical Notes
- New ids: #hire, #farmhands, #cost
- setInterval 1000ms; persist state on each tick
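The hire and auto-harvest rules can be sketched as pure functions on the `{ score, farmhands }` state, with the timer kept separate. All function names here are hypothetical. The affordability check in `hire` is an assumption added for safety, since this sprint's criteria only cover the case where score ≥ 10.

```javascript
// Sketch of the Sprint 2 rules as pure functions on { score, farmhands }.
const HIRE_COST = 10; // flat cost in this sprint

function hire(state) {
  // Assumption: an unaffordable hire leaves the state untouched.
  if (state.score < HIRE_COST) return state;
  return { score: state.score - HIRE_COST, farmhands: state.farmhands + 1 };
}

// One tick represents one second: each farmhand earns +1 score.
function tick(state) {
  return { ...state, score: state.score + state.farmhands };
}

// Hypothetical wiring: `onChange` would persist the state and re-render.
function startAutoHarvest(getState, setState, onChange) {
  return setInterval(() => {
    setState(tick(getState()));
    onChange();
  }, 1000);
}
```

Keeping `tick` pure means the "~1/sec" criterion can be checked without waiting on a real timer.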
APPENDIX · THE FILES · File 08 / 13

specs/sprint-3.md

Scaling cost — QA catches the bug
# Sprint 3: Scaling Cost

## What to build
Farmhands get more expensive the more you own.
cost = floor(10 * 1.15 ^ farmhands). #cost
shows the NEXT price, updates after each hire.

## Acceptance Criteria
- #cost increases after each hire (per the formula)
- When score < cost, clicking #hire does nothing
  (score & farmhands unchanged)
- Score never goes negative; hiring at score == cost succeeds

## Technical Notes
- Integer math, Math.floor
- Guard lives in the acceptance test; the bug is
  planted via the fixture swap, not the spec
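The price curve and the spend guard fit in a few lines. The sketch below follows the formula in the spec; `hireCost` and `hire` are hypothetical names, and this shows the correct behavior rather than the planted bug.

```javascript
// cost = floor(10 * 1.15 ^ farmhands), per the spec.
function hireCost(farmhands) {
  return Math.floor(10 * Math.pow(1.15, farmhands));
}

// Spend guard: if score < cost, nothing changes. Hiring at score == cost succeeds.
function hire(state) {
  const cost = hireCost(state.farmhands);
  if (state.score < cost) return state;
  return { score: state.score - cost, farmhands: state.farmhands + 1 };
}

console.log([0, 1, 2, 3].map(hireCost)); // [ 10, 11, 13, 15 ]
```

Because the guard compares with `<` and not `<=`, a score exactly equal to the cost buys the farmhand and leaves the score at 0, never below it.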
APPENDIX · THE FILES · File 09 / 13

specs/sprint-4.md

Polish — formatting + feedback
# Sprint 4: Polish

## What to build
Number formatting + click feedback.

## Acceptance Criteria
- Score ≥ 1000 shows abbreviated (1.2K / 3.4M / 5.6B);
  below 1000 shows the integer
- Same formatting on #cost
- Underlying state stays an exact integer (display-only)
- #harvest pulses briefly on click

## Technical Notes
- Format on display only — never round the stored value
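A display-only formatter for these criteria might look like the sketch below. `formatNumber` is a hypothetical name, and truncating (not rounding) to one decimal is an assumption the spec does not settle; the stored integer is never modified.

```javascript
// Display-only abbreviation: 1234 -> "1.2K". The input integer is not changed.
function formatNumber(n) {
  const units = [
    [1e9, 'B'],
    [1e6, 'M'],
    [1e3, 'K'],
  ];
  for (const [size, suffix] of units) {
    if (n >= size) {
      // Multiply before dividing so the tenths digit comes from integer math.
      const tenths = Math.floor((n * 10) / size);
      return (tenths / 10).toFixed(1) + suffix;
    }
  }
  return String(n); // below 1000: show the plain integer
}

console.log([999, 1234, 3400000, 5600000000].map(formatNumber));
```

Truncation keeps the display from ever showing a value the player has not yet reached, which matters when the same formatter is used on #cost.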
APPENDIX · THE FILES · File 10 / 13

.claude/skills/run-qa/SKILL.md

run-qa — the quality gate
---
name: run-qa
description: Full QA pass — lint, review, e2e.
  Returns one GREEN/RED verdict.
---

# run-qa
Run these in order. Each is a subagent; use its
returned message as the result.

1. Delegate to linter.
2. Delegate to code-reviewer.
3. Delegate to playwright-tester.

Verdict:
- Any lint error, any CRITICAL finding, or any
  failed spec → RED. List what failed.
- Otherwise → GREEN.

If RED: fix the issues, re-run from step 1.
Stop after 3 attempts; then report to the human.
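The verdict rule and the three-attempt cap are expressed as prose for the agent, but the same logic can be written out to make the control flow explicit. This is an illustrative sketch only: `verdict`, `runQa`, and the `runPass` and `fix` callbacks are hypothetical stand-ins for the skill and its subagents, and the result shape is assumed.

```javascript
// Combine the three subagent results into one verdict, per the rules above.
function verdict({ lintErrors, findings, failedSpecs }) {
  const failures = [
    ...lintErrors.map((e) => `lint: ${e}`),
    ...findings
      .filter((f) => f.severity === 'CRITICAL')
      .map((f) => `review: ${f.issue}`),
    ...failedSpecs.map((s) => `e2e: ${s}`),
  ];
  return { status: failures.length === 0 ? 'GREEN' : 'RED', failures };
}

// Re-run the full pass after each fix; give up after `maxAttempts`.
// `runPass` and `fix` are hypothetical callbacks standing in for the subagents.
function runQa(runPass, fix, maxAttempts = 3) {
  let result;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    result = verdict(runPass());
    if (result.status === 'GREEN') return { ...result, attempts: attempt };
    if (attempt < maxAttempts) fix(result.failures);
  }
  // Still RED after the last attempt: hand the failures to the human.
  return { ...result, attempts: maxAttempts, escalate: true };
}
```

WARN and NIT findings are reported but do not turn the verdict RED, which matches the "any CRITICAL finding" wording.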
APPENDIX · THE FILES · File 11 / 13

.claude/agents/linter.md

Subagent 1 — structure & syntax
---
name: linter
description: Validates HTML structure and JS errors.
tools: Read, Edit, Bash
model: haiku
---

Run validation on index.html:
1. Check HTML is well-formed (tags, nesting)
2. Run node --check on inline script content
3. Check for unclosed strings, missing semicolons,
   undefined variables

Auto-fix trivial violations (formatting, whitespace).
For judgment calls, leave it and report it.

Return "CLEAN" if no errors remain, else the list
of remaining errors with location.
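The syntax-check step can be approximated in a few lines of Node. Note the swap: this sketch compiles each inline script in-process with `vm.Script` instead of shelling out to `node --check`, and it uses a simple regex to pull out script bodies instead of a real HTML parser. `inlineScripts` and `lintInlineScripts` are hypothetical names. It catches syntax errors only; undefined variables would need a real linter.

```javascript
const vm = require('node:vm');

// Collect the bodies of <script> tags that have no src attribute.
function inlineScripts(html) {
  const re = /<script\b(?![^>]*\bsrc\s*=)[^>]*>([\s\S]*?)<\/script>/gi;
  return [...html.matchAll(re)].map((m) => m[1]);
}

// Compile (but do not run) each inline script; report syntax errors.
function lintInlineScripts(html) {
  const errors = [];
  inlineScripts(html).forEach((code, i) => {
    try {
      new vm.Script(code, { filename: `inline-script-${i + 1}.js` });
    } catch (err) {
      errors.push(`inline script ${i + 1}: ${err.message}`);
    }
  });
  return errors.length === 0 ? 'CLEAN' : errors;
}

console.log(lintInlineScripts('<button id="harvest"></button><script>let score = 0;</script>'));
```

Returning either "CLEAN" or a list mirrors the return contract in the agent file, so the caller can treat any non-"CLEAN" result as lint errors.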
APPENDIX · THE FILES · File 12 / 13

.claude/agents/code-reviewer.md

Subagent 2 — bugs & logic
---
name: code-reviewer
description: Reviews changes for bugs and UX issues.
tools: Read, Grep, Glob
model: sonnet
---

Review index.html against the sprint spec.
Do not change any files.

Check for:
- Logic bugs (especially localStorage read/write)
- State issues (stale data, race conditions)
- UX problems (unaffordable hires, negative balances)
- Missing acceptance criteria

Return findings as:
- CRITICAL — bugs, data corruption, broken function
- WARN — likely problems, edge cases
- NIT  — style, naming
For each: the issue + a one-line fix. Else "CLEAN".
APPENDIX · THE FILES · File 13 / 13

.claude/agents/playwright-tester.md

Subagent 3 — behavior & acceptance
---
name: playwright-tester
description: Runs behavioral tests against the app.
tools: Read, Write, Bash
model: sonnet
---

Write and run a Playwright test for the sprint's
acceptance criteria.
1. Create test-farm.spec.js from the sprint spec
2. Open index.html via file:// or a local server
3. Test each acceptance criterion

Key scenarios for Sprint 3:
- Click → score increments
- Hire with enough score → cost deducted, +1 farmhand
- Hire when you can't afford it → blocked, no change
- Hire at score == cost → succeeds, score ≥ 0
- Score never goes below 0
- State persists across refresh

Return "GREEN" if all pass, else "RED" plus, per
failure: test name, failed assertion, actual result.