DEVELOPING WITH AGENTIC WORKFLOWS
    HANDS-ON WORKSHOP · JUNE 2026
  

Getting
airborne

Demystifying agentic engineering, so we can leave the "ground" and "take flight" on the AI Adoption Continuum.

    
    RISHI DEAN · github.com/rishidean/tutorials-agent-loop
  

    HOW TO READ THIS · SELF-PACED
  

Mapping the flow.

Read straight through, or simply jump to the part you need.

Part 1

The frame

Where we are, and why it has to be you.

→

Part 2

The double loop

The system, taken apart piece by piece.

→

Part 3

Your turn

Clone it and run the loop yourself.

→

Part 4

Looking ahead

What airborne means — and the one ask.

Appendix · jump ahead

Why it works→

The theory under the loop — five ideas.

Extending this model→

From solo to org — the four moves.

Making it run→

The actual files, and how to run it.

Every claim here is runnable — the repo builds a real app, unattended.

github.com/rishidean/tutorials-agent-loop  ·  Self-paced

A quick caveat

This is one way.
Not the way.

There are many ways to build agentic loops. This is a simplified version of what I do. The tools are moving fast, and some of this may even be outdated by the time you finish reading this deck!

Take the principles, not the syntax. The shape of the loop matters more than which file does what.


    THE PARADOX
  

Two realities coexist in the world

The measured reality

95%

of businesses fail to see meaningful returns on AI.

The lived reality

10×

Solo builders ship in a week what used to take a quarter.

How can both of these be true?

Both are true. The difference is how they move, not how hard they work — that's the map, next →


THE OPERATOR VS. THE TOOL

Same car. Different driver.

“AI hallucinates.” “It can't do X.” “We've seen it fail at other companies.”

Put an automatic-only driver in this car and they spin it into the wall, then tell you why it's undrivable. Put a pro in the same seat, and it's the fastest thing on Earth. Both reports are accurate.

The car you drive in two years is up to you.


Part one

The frame

An introduction to the AI Adoption Continuum: where we currently sit, and how we advance, starting today.

What this answers: where we are, where the gains hide, and why it's on you.


    THE AI ADOPTION CONTINUUM
  

Six ways to move. Only some leave the ground.

95% of teams live here

THE 10X LIVE HERE

GROUND        AIR


      Ground — same work, faster
      Airborne — a different process
    


How you operate

The shift from player, to coach, to conductor.

Autocomplete

Player

Linear. You in every step.

Agentic

Coach

Cyclical. You manage the loop.

Orchestration

Conductor

Systemic. You conduct the fleet.


    PLACING THE MAJORITY
  

Most of us are still in the ground transport phase.

Driving gets you ~10–20% faster, but not the 10× solo builders see. The difference lies in the mode of transport.



Why the gap persists

What sounds like rigor is often a skills gap.

The most responsible-sounding teams are often the least practiced.

Low proficiency
before the reps are in

Output disappoints
a hallucination, some slop

Distrust hardens
must be the tool

Confirmation bias
knew I couldn't rely on it

Retreat to old ways
known beats unknown

Every stage of this loop masquerades as good judgment, but it is really a signal of low proficiency and low agency. Distrust is a status report on your scaffolding.

And the hallucination fear assumes the model is always guessing. It isn't. Hold that thought.


    THE SHIFT
  

Transitioning from ground to air.

Airborne = restructuring how you plan, build, verify, and ship around what AI can do (not "typing faster").

Ground transport

Airborne


    WHY US · WHY NOW
  

Only you can fly the helicopter

Walking → Driving

Buys the tools. Rolls them out. Provides foundation & infra

Helicopter

Engineering

The people who write the code. The org can't get airborne until we do.

Jet → Rocket

CTO / CEO

System-wide change — but only after Helicopter is proven.


What you'll walk away with

Get
airborne.

By the end of this deck you'll understand a multi-sprint, self-verifying, autonomous development loop, and you'll have the means to run it yourself.


Part two

The double loop

Some files and a bash script. We'll take it apart piece by piece, and then watch it build an app, unattended.

What this answers: what the system is, and how it runs unattended.


WHERE THE INTELLIGENCE LIVES

We're shifting the burden from the model to specification & scaffolding.

Model

The engine

Raw capability. Necessary, but the part everyone talks about.

Files

The judgment

Standards, specs, and memory. Where your engineering lives.

Loop

The discipline

Build, verify, document (repeat until done).

Take any one away and it stops working. And most of what makes it good isn't the model. It's the files, which the rest of this deck covers.


WHAT WE MEAN BY AGENTIC

We're going to orchestrate the model to run in a loop.

Many still think of this as "autocomplete", but the two are very different.

↻ It is — a loop

●Sets its own intermediate steps

●Calls tools to act, not just answer

●Checks its own work

●Loops until the goal is met

→ It isn't — a turn

○Autocomplete on steroids

○A chatbot you babysit turn by turn

○“AI that writes code for you,” one prompt at a time


TWO THINGS MODELS DO

It can generate, or it can call.

Only one of these can hallucinate. Knowing which is what makes the loop trustworthy.

Generate can hallucinate

Text from weights

The model writes the most plausible next tokens from what it learned. Brilliant for synthesis and code — and the only place a guess can sneak in.

        prompt→model→plausible text
      

Call returns a fact

A tool runs, for real

The model emits a structured request; a deterministic system runs it — code, a search, a test — and hands back ground truth the model then reads.

        request→tool runs→fact→model reads
      

In a loop, most of the work is tool calls returning ground truth, not the model free-associating. run-qa’s Playwright test actually ran; the model didn’t decide it passed.
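One way to see the difference in practice: in the sketch below the verdict comes from a command's exit status, which the model can read but cannot argue with. This is a minimal illustration, not code from the repo. `verify` is a hypothetical helper, and `false` stands in for a real check such as a failing Playwright run.

```shell
# verify CMD [ARGS...]: run a real command and report its outcome as a fact.
# Hypothetical helper for illustration; it is not part of the repo.
verify() {
  if "$@" >/dev/null 2>&1; then
    echo GREEN   # the command exited 0
  else
    echo RED     # the command exited non-zero
  fi
}

claimed="All tests pass!"          # generated text: plausible, unverified
echo "model says: $claimed"
echo "tool says:  $(verify false)" # `false` stands in for a failing test run
```

The generated claim and the tool result can disagree; the loop only ever acts on the second one.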


    THE PRIMITIVES
  

A few words to help speak the language.

New to Claude Code? Everything that follows is just some form of these terms.

Session

One run of Claude Code. Fresh memory every time. Nothing carries over.

Context window

The working memory a session holds at once. Finite. Fill it and quality drops.

CLAUDE.md

A context file Claude reads automatically at the start of every session.

Task tool

How a session hands a job to a subagent — and gets the result back. It's how run-qa calls its three agents.

Skill

A reusable procedure Claude can invoke by name — like run-qa.

Command

A saved prompt you trigger with a slash — like /review.

Subagent

A scoped helper for one narrow job. Often a cheaper, faster model.


    THE TOPOLOGY
  

One loop inside another.

The outer loop marches through sprints with a fresh brain each time. The inner loop is one session: build → verify → document.

    GREEN = passed, ship it
    ·
    RED = failed, fix & re-run (max 3)
  

Outer loop · run.sh — marches through sprints, fresh brain each turn

Sprint 1

→

Sprint 2

→

Sprint 3

→

…until done

Inner loop · one session

Build

the feature

→

Verify

QA crew → GREEN / RED

→

Document

tick manifest, handoff

↺  RED → fix → re-verify  (max 3 attempts, then stop)


    THE TOPOLOGY · BY FILE
  

How the loop is built in practice.

Same loop as the previous slide — now wired to the real filenames.

Outer loop · marches through sprints, fresh brain each turn

[script] run.sh roadmap.md → the manifest prompt.md → read each turn

Sprint 1

→

Sprint 2

→

Sprint 3

→

…until done

Inner loop · one session

the Builder → a fresh Claude Code session

Build

the feature

CLAUDE.md specs/sprint-N.md

→

Verify

QA crew → GREEN / RED

[skill] run-qa

[agent] linter [agent] code-reviewer [agent] playwright-tester

→

Document

tick manifest, handoff

roadmap.md → [x] PROGRESS.md → handoff

↺  RED → fix → re-run [skill] run-qa  (max 3 attempts, then stop)


    THE SYSTEM
  

The setup stage is for our specifications and agreements

Stage 1 / 3 · Setup

Why first: the system can't act until it knows the rules.

Setup

What it knows

●Tech Lead

●PM

●Design

●QA Lead

Actors

Who does the work

●The Builder

●QA Team (x3)

●QA Manager

Orchestrator

What keeps it moving

●Eng Manager

●PM


BROAD CONTEXT (CLAUDE.MD · PROMPT.MD)

How we work.

Everything true for every sprint: architecture, conventions, and the definition of done. One constraint forces quality —

"Not done until run-qa returns GREEN."

This is the scaffolding that earns the trust.

CLAUDE.md

# Project architecture & conventions
# Marching orders, every session
## Definition of done
Not done until run-qa is GREEN.

prompt.md

Read the manifest. Pick the next sprint.
Build it. Run QA. Update state.
# same instructions, every time


THE MANIFEST (ROADMAP.MD)

What needs to get done, where we're at.

Three characters per line: that's your state machine. The checkbox is ticked after each sprint. Stuck work is marked [!] and the loop stops.

This file is your project status.

      [ ]  to do
      [x]  done
      [!]  blocked — stops for a human
    

roadmap.md

- [x] Sprint 1: Core clicker — harvest + persistent score
- [x] Sprint 2: Farmhands — buy auto-harvesters
- [ ] Sprint 3: Scaling cost — price curve + spend guard
- [ ] Sprint 4: Polish — number formatting + click pulse
- [!] Sprint 5: Cloud sync — blocked, needs API key


How to decompose · roadmap.md

Slice it vertically, not horizontally.

Weak · horizontal layers✕

# Roadmap
- [ ] Build all data models
- [ ] Build all API endpoints
- [ ] Build all UI screens
- [ ] Wire it together
- [ ] Test everything

Nothing runs until the very end. No verifiable step — nothing for QA to gate.

Good · vertical slices✓

# Roadmap · vertical slices
- [x] S1  Click to earn  ships a page
- [x] S2  Earnings chart on top of S1
- [ ] S3  Hire farmhands needs S1
- [ ] S4  Idle income    needs S3

Each slice ships a working app, ordered by dependency. [ ] [x] [!] legible at a glance.


SPECIFIC CONTEXT (SPRINT-X.MD)

What to build this turn.

One spec per sprint: acceptance criteria, technical approach, edge cases. This is where your engineering judgment lives.

At Driving you write code. At Helicopter you write specs.

specs/sprint-1.md

# Sprint 1 — Core clicker
## Acceptance criteria
· Click #harvest → #score +1
· Score persists on reload
## Approach
· Single index.html, localStorage `clickfarm` { score, farmhands }
## Edge cases
· Score stays a non-negative integer


Where judgment lives · sprint-3.md

A good spec makes the build trivial.

Thin · vague✕

# Sprint 3: Hiring

Let players hire farmhands
that help out automatically.

No criteria. No edge cases. The build is a guess; QA has nothing to assert.

Fat · explicit✓

# Sprint 3: Scaling Cost
## Acceptance Criteria
- #cost rises per the formula
- score < cost → #hire does nothing
- hire at score == cost succeeds
 
## Watch out for
- a hire you can't afford MUST be blocked
- score must never go below 0

Explicit criteria + the trap to avoid. The build is trivial; QA is decisive.


    THE SYSTEM
  

Now, it's time to build.

Stage 2 / 3 · Actors

Why next: someone has to do the work — and check it.



THE BUILDER

builder · sprint session

$ claude -p "$(cat prompt.md)"
reading context…
building feature…
writing index.html

Executes the spec for a single feature / sprint. Then done.

The Builder is just a Claude Code session — no special framework, no persistent agent. Spun up fresh for one sprint, then thrown away. Its power is that nothing carries over.

Stateless

Clean brain every turn — no context rot from the last sprint.

Self-checking

It calls run-qa on its own work before it ever declares done.

Hands off

Writes its memory to PROGRESS.md so the next session can pick up.


    THE QA MANAGER
  

When code is complete, the manager delegates the QA tasks.

One call — run-qa — runs all three QA agents and returns a single verdict: GREEN or RED. RED three times and it stops for a human.

run-qa

→

Lint

→

Review

→

Test

→

GREEN

↺  RED → read failures → fix → re-run
max 3 attempts, then mark blocked & stop

.claude/skills/run-qa/SKILL.md

---
name: run-qa
description: Full QA pass — lint, review,
  e2e. Returns one GREEN/RED verdict.
---

# run-qa
Run these in order, each as a subagent:
1. Delegate to linter.
2. Delegate to code-reviewer.
3. Delegate to playwright-tester.

Verdict: any lint error, CRITICAL
finding, or failed spec → RED.
Otherwise → GREEN.

If RED: fix, re-run from step 1.
Stop after 3 attempts; report to human.
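The verdict rule at the end of the skill is simple enough to sketch. This assumes each of the three subagents reports a count (lint errors, CRITICAL findings, failed specs). The `verdict` function and its arguments are illustrative only: in the repo the session applies this rule by following SKILL.md, not by calling a script.

```shell
# verdict LINT_ERRORS CRITICAL_FINDINGS FAILED_SPECS
# Hypothetical helper: any non-zero count turns the whole gate RED.
verdict() {
  if [ "$1" -gt 0 ] || [ "$2" -gt 0 ] || [ "$3" -gt 0 ]; then
    echo RED
  else
    echo GREEN
  fi
}

echo "clean run:       $(verdict 0 0 0)"
echo "one failed spec: $(verdict 0 0 1)"
```

There is no partial credit: one failed check from any of the three agents is enough to block "done".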


    THE QUALITY TEAM
  

Each team member has a very specific job.

.claude/agents/ · They check, they don't build. Cheaper, faster models.

This is the scaffolding that earns the trust — the answer to the doubt loop from Part 1.

Judgment · latent  vs  Verification · deterministic

The crew runs a test, not a vibe check. That's why it catches the bug.



    THE INNER LOOP
  

RED doesn't stop the loop — it routes back to the Builder.

On RED, the verdict carries the actual failures — the Builder reads them, patches, and re-runs. Three attempts, then it stops. RED isn't an opinion — it's a failed assertion; the fix targets a fact, not a feeling.

RED → read failures · fix · re-run

The Builder

Write & fix

Builds the feature. On a RED return, reads the failures and patches the code.

→

run-qa

Verify

Lint · review · test. Returns one verdict — and the failures with it.

→

RED loops back ↑

GREEN exits → Document

Three attempts, then it stops. Still RED after the third try → mark the sprint [!] blocked and hand back to a human. Autonomous, never infinite.
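The "autonomous, never infinite" rule is a bounded retry. Below is a minimal sketch of that shape, assuming a stand-in gate: `run_gate` and `attempts_needed` are hypothetical and simulate a bug that takes two fixes to clear. In the real loop the gate is the run-qa skill and the fix is the Builder patching code.

```shell
# Stand-in for run-qa: passes only once the attempt number reaches attempts_needed.
# Hypothetical; it simulates a bug that needs two rounds of fixes.
attempts_needed=3
run_gate() { [ "$1" -ge "$attempts_needed" ]; }

# inner_loop: build/verify with a hard cap of 3 attempts.
inner_loop() {
  max=3
  attempt=1
  while [ "$attempt" -le "$max" ]; do
    if run_gate "$attempt"; then
      echo "GREEN after $attempt attempt(s)"
      return 0
    fi
    echo "RED on attempt $attempt: read failures, fix, re-run"
    attempt=$((attempt + 1))
  done
  echo "still RED after $max attempts: mark [!] and stop"
  return 1
}

inner_loop
```

Set `attempts_needed=4` and the same function gives up and returns non-zero, which is the point where the sprint would be marked `[!]` for a human.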


    THE SYSTEM
  

Close the sprint, and move to the next.

Stage 3 / 3 · Orchestrator

Why last: something has to keep the sprints moving, hands-free.



    KNOWLEDGE HANDOFF
  

On GREEN, the session writes itself down.

    QA GREEN
    →
    three writes — roadmap = state · PROGRESS = memory · CLAUDE = learning
  

01 roadmap.md

Tick the box. The state machine advances.

State

- [ ] Sprint 3: scaling cost
- [x] Sprint 3: scaling cost

02 PROGRESS.md

Append a handoff for the next session.

Memory

## Sprint 3
scaled hire cost.
spend bug → fixed
(guard the spend)

03 CLAUDE.md

Promote durable learnings to permanent context.

Learning

## Gotchas
+ score must never go
+ negative — guard spend


    THE ORCHESTRATOR · run.sh
  

Ten lines. Dead simple by design.

#!/usr/bin/env bash
set -euo pipefail
MAX=4; n=0
while grep -q '^- \[ \]' roadmap.md && (( n < MAX )); do
  n=$((n+1)); echo "=== sprint $n ==="
  claude -p "$(cat prompt.md)" || { echo "session error"; break; }
  # --dangerously-skip-permissions omitted for readability
  grep -q '^- \[!\]' roadmap.md && { echo "blocked"; break; }
done
echo "done — $n sprint(s)"
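The loop has three exits: no `[ ]` lines left, the sprint cap reached, or a `[!]` appearing. To watch those exits without spending a real session, here is a dry-run sketch that swaps the `claude` call for a stub. `fake_session` and `outer_loop` are hypothetical; the stub just ticks the next open box, which is an assumption about what a successful session leaves behind.

```shell
# fake_session FILE: stand-in for `claude -p "$(cat prompt.md)"`.
# Hypothetical stub: it ticks the first open checkbox and does nothing else.
fake_session() {
  tmp=$(mktemp)
  awk '!ticked && /^- \[ \]/ { sub(/\[ \]/, "[x]"); ticked = 1 } { print }' "$1" > "$tmp" \
    && mv "$tmp" "$1"
}

# outer_loop FILE: same control flow as run.sh, with the stub in place of the session.
outer_loop() {
  max=4
  n=0
  while grep -q '^- \[ \]' "$1" && [ "$n" -lt "$max" ]; do
    n=$((n + 1))
    echo "=== sprint $n ==="
    fake_session "$1"
    if grep -q '^- \[!\]' "$1"; then echo "blocked"; break; fi
  done
  echo "done: $n sprint(s)"
}

roadmap=$(mktemp)
printf '%s\n' '- [ ] Sprint 1: A' '- [ ] Sprint 2: B' '- [ ] Sprint 3: C' > "$roadmap"
outer_loop "$roadmap"
```

Note that the script never tracks which sprint is current; each pass re-reads the file, so the roadmap is the only source of truth.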


Part three

Your turn

Enough theory. Six files, one command, four sprints — here's how you run it yourself, now or later.

What this answers — how to run the loop on your own machine.


    STEP 1 / 4 · SET UP
  

Clone it. Set up. Three minutes.

git clone https://github.com/rishidean/tutorials-agent-loop
cd tutorials-agent-loop
./setup.sh          # Playwright + Chromium, prereq checks
# auth: run `claude` once, or export ANTHROPIC_API_KEY

The repo, top to bottom

your-service/
├── CLAUDE.md          # stack + "done = run-qa GREEN"
├── prompt.md          # marching orders, every session
├── roadmap.md         # the manifest / state machine
├── PROGRESS.md        # handoff memory (starts empty)
├── run.sh             # the orchestrator (~10 lines)
├── specs/
│   └── sprint-1.md    # one spec per sprint
└── .claude/
    ├── skills/run-qa/SKILL.md  # the run-qa gate
    └── agents/           # linter · reviewer · tester

Three steps to your first run

Drop a CLAUDE.md into one service — your stack, your conventions, and the one line: done = run-qa GREEN.

Write a single spec for that mind-numbing weekly task.

./run.sh — and watch it go.

Before you run

Runtime 10–20 min after setup
Real Claude Code sessions + subagents
Workshop → DEMO_MODE=planted ./run.sh

→

Clone the template

github.com/rishidean/tutorials-agent-loop


    STEP 2 / 4 · RUN
  

One command. The loop writes the code.

$ ./run.sh                      # emergent run
$ DEMO_MODE=planted ./run.sh    # guaranteed Sprint-3 RED → GREEN

A fresh session spins up for each sprint — it builds the feature, runs the QA gate, ticks the roadmap, and hands off. Then the next sprint starts with a clean brain. You write specs; the loop writes code.

Stuck or dirty repo?

$ ./reset-demo.sh       # start over from Sprint 1
TROUBLESHOOTING.md      # auth, model, Playwright, stuck terminal


WHILE IT RUNS  ·  ~10 MINS

Now go hydrate.

A full run takes up to ten minutes. Watch the console scroll by if you like, or step away, stretch, refill your water, and let the loop cook. It doesn't need you for this part.

// back in a few


    STEP 3 / 4 · WATCH
  

What one full run produces — no human at the keyboard.

$ DEMO_MODE=planted ./run.sh
 
  Sprint 1: build -> run-qa ...
  Sprint 1: GREEN after 1m 04s
 
  Sprint 2: build -> run-qa ...
  Sprint 2: GREEN after 0m 58s
 
  Sprint 3: build -> run-qa ...
  (planted bug is seeded and fixed during QA)
  Sprint 3: GREEN after 1m 47s
 
  Sprint 4: build -> run-qa ...
  Sprint 4: GREEN after 1m 09s
 
  ALL SPRINTS COMPLETE - 4 / 4 GREEN

The run, recorded · Sprint 3 RED → GREEN


    STEP 4 / 4 · PLAY
  

You didn't write a line of it.

$ open index.html        # or reference/index.html for the finished build

Play the live build →

Click to harvest. Hire farmhands. Watch idle income roll in. Four sprints, zero keystrokes from you.


    DEBRIEF
  

What just happened.

01

Stateless sessions, real progress. Continuity came from the files, not memory.

02

The gate caught a real bug. Sprint 3 let you overspend; the tester caught it, the Builder fixed it. The loop found it, not a model on its own and not a human.

03

Four sprints, zero keystrokes. You wrote specs; it wrote, tested, fixed, and documented.


    LOW FLOOR  ·  HIGH CEILING
    EXTENDING THE LOOP
  

This is the seed. Extend it forever.

Every extension is the same shape — one small file in .claude/. Four moves take you from solo to enterprise; the depth is in the appendix.

Encode judgment

the loop writes its own inputs

Add a gate

raise the bar on GREEN

Add structure

scale breadth

Close the loop

durable & auditable

The loop

↻

the seed

      Helicopter (this repo) → Jet (a team) → Rocket (an org)
    

the four moves in full → “Extending this model” (appendix)


Part four

Looking ahead

What being airborne means for you, the team, and the org — and the single ask that starts it.

What this answers — what airborne means next, and the one ask.


    WHAT AIRBORNE LOOKS LIKE
  

The secret is that there is no secret. It's just putting in the work.


Driving — the old way

Helicopter — what the loop does


    THE ACCELERATION
  

Each shift arrives faster than the last.

Walking

~ years

Biking

~ a year

Driving

months

Helicopter

now

Jet

The advantage compounds. The gap between Driving and Helicopter isn't closing — it's widening.


MY REQUEST OF YOU

days.

Every service gets a CLAUDE.md and a run-qa skill.

Pick the most mundane, repetitive task you do every week. That's your first candidate to automate.

    Where to start
    github.com/rishidean/tutorials-agent-loop


    Why it works · 01 / 05
  

A skill stores a process, not content.

A skill file holds a procedure, not facts — how to do a recurring job: run the three checks, return one verdict. The method becomes reusable and versioned, kept separate from any single task it runs on.

that’s the skill → SKILL.md / run-qa


    Why it works · 02 / 05
  

Subagents keep context clean.

Each subagent gets its own context window and exactly one narrow job. The reviewer never sees the builder’s scratch work; the tester starts fresh. Narrow scope means a cheaper model and no cross-contamination.

that’s the QA crew → linter · reviewer · tester


    Why it works · 03 / 05
  

Fresh sessions beat a full context.

A model’s context is finite and degrades as it fills. So every sprint launches a new session with an empty window — the roadmap and PROGRESS files carry state forward, not the model’s fading memory.

that’s the clean brain per sprint → roadmap.md + PROGRESS.md


    Why it works · 04 / 05
  

The loop improves itself.

When a session learns something durable — a gotcha, a convention — it writes it back into the broad context. The next session starts smarter. The system quietly edits its own instructions.

that’s the Document step → promotes → CLAUDE.md


    Why it works · 05 / 05
  

Generate is latent. Verify is deterministic.

Generation is fuzzy by nature — synthesis, design, judgment. Verification isn’t: the assertion passes or it doesn’t. You never trust the model’s opinion of its own work; you run a tool that returns a fact.

that’s why QA is a test → a test, not a vibe check


    Appendix · Going further · The four moves
  

Same shape, four directions.

Encode judgment

the loop writes its own inputs

·generate-roadmap
·spec-sprint
·plan-sprint  plan before build

Add a gate

raise the bar on GREEN

·security / a11y / perf
·planning approval
·PR-per-sprint + CI

Add structure

scale breadth

·epics → sprints → features
·parallel builders
·multi-repo

Close the loop

durable & auditable

·session-start / wrapup
·structured logs + cost
·knowledge base

    Helicopter (this repo) → Jet (a team) → Rocket (an org)
    full catalog → EXTENDING.md
  


    Appendix · Going further · Encode judgment · in depth
  

Add a PLAN phase.

Plan before you build — and, going further, before you spec.

pre-spec
runs once

intent

→

PLAN · author

→

roadmap.md + specs/

→

◇approve

post-spec
per sprint

spec

→

PLAN · build

→

plan.md

→

◇approve

→

build

→

document

The artifact · concrete

# specs/sprint-3.plan.md
 
files to touch — index.html (hire, cost)
order — cost formula → guard → UI
approach — check affordability first
risks — negative score if unguarded

Four files, four jobs

roadmap = state

plan = intent

PROGRESS = memory

CLAUDE = learning

A plan is intent — its own slot, not jammed into memory.

Judgment moves up a rung — you approve plans instead of writing specs. That's the Jet step.


Appendix · The principles

The principles, not the syntax.

Commands and flags change. What makes the output good doesn’t.

01

Plan before build. Decide the slice and the spec before a line of code.

02

Verify before done. “Done” means the gate is GREEN — never a feeling.

03

One job per subagent. Narrow scope, cheap model, clean context.

04

Fix, don’t ask. On RED, read the failure and patch — stop only when truly stuck.

05

Promote learnings. Write durable lessons back into the files so the next run is smarter.

This is my way — the syntax will change, but these don’t.


    APPENDIX · THE FILES · File 01 / 13
  

run.sh

The orchestrator — the outer loop

#!/usr/bin/env bash
set -euo pipefail
MAX=4; n=0
# planted mode: DEMO_MODE=planted swaps in a buggy build
# (fixtures/index.sprint2-buggy.html) before Sprint 3 so QA goes RED

echo "THE LOOP — starting..."
cat roadmap.md

while grep -q '^- \[ \]' roadmap.md && (( n < MAX )); do
  n=$((n+1))
  echo "SPRINT $n — launching fresh session..."
  claude -p "$(cat prompt.md)" \
    --dangerously-skip-permissions \
    --model opus \
    || { echo "session error"; break; }
  cat roadmap.md
  if grep -q '^- \[!\]' roadmap.md; then
    echo "BLOCKED — human needed."; break
  fi
  echo "Sprint $n complete."
done

echo "DONE — $n sprint(s) attempted"
cat PROGRESS.md


    APPENDIX · THE FILES · File 02 / 13
  

CLAUDE.md

Broad context — true for every sprint

# Click Farm

## Identity & Mission
You maintain the Click Farm game. Done = the feature
works, QA is GREEN, nothing else broke.

## Map
One index.html (inline CSS + JS). State lives in
localStorage. Specs are in specs/.
State: localStorage key clickfarm = { score, farmhands }.
Stable ids: #score, #harvest, #hire, #farmhands, #cost.

## Conventions
Vanilla HTML/CSS/JS — no frameworks, no build step.
Mobile-first, system fonts, one accent color.

## Running it
Open index.html in a browser to play. run-qa to check.
No package manager.

## Guardrails
Never add dependencies or a build tool. Don't edit
other sprints' specs. Ask before changing the schema.

## Definition of Done
Not done until run-qa returns GREEN.
If RED, read it, fix it, re-run. Never ship on RED.


    APPENDIX · THE FILES · File 03 / 13
  

prompt.md

Marching orders — read at every session start

# Sprint Turn
Read roadmap.md and pick the first sprint marked [ ].
Read that sprint's spec and PROGRESS.md for context.

Build the feature per the spec.
Then run the run-qa skill until GREEN (max 3 attempts).

When GREEN:
- Mark the sprint [x] in roadmap.md.
- Append a handoff to PROGRESS.md: what you built,
  key decisions, gotchas, what the next sprint needs.

If you can't reach GREEN after 3 attempts:
- Mark the sprint [!] in roadmap.md, then stop.

Updating the status is your final action — never skip it.


    APPENDIX · THE FILES · File 04 / 13
  

roadmap.md

The manifest — three characters per line

# Click Farm — Roadmap

- [ ] Sprint 1: Core clicker — harvest + persistent score    — specs/sprint-1.md
- [ ] Sprint 2: Farmhands — buy auto-harvesters             — specs/sprint-2.md
- [ ] Sprint 3: Scaling cost — price curve + spend guard    — specs/sprint-3.md
- [ ] Sprint 4: Polish — number formatting + click pulse    — specs/sprint-4.md


    APPENDIX · THE FILES · File 04 / 13
  

roadmap.md

One file, nested — Epics → Sprints → Features

# Click Farm — Roadmap

## Epics  «1»
- [x] E1  Core game       shipped
- [ ] E2  AI Bridge       active
- [ ] E3  Multiplayer     planned

## Epic 2 · AI Bridge  «2»
- [x] S1  Tool library        epic2/sprint1-tools.md
- [ ] S2  Command-mode UI      epic2/sprint2-command.md
- [ ] S3  Multi-intent parse   epic2/sprint3-intents.md
  gate: Epic 1 stable + dogfooded 1 week   «3»

## Current Status  «4»
Active   E2 · S2 — Command-mode UI
Shipped  E2 · S1 — tool library, tests green
Next     Command-mode UI -> multi-intent
Gate     2-week dogfood window before E3

«1» Epic dashboard. One row per epic: the bird's-eye view at a glance.

«2» Same checkboxes, nested. Epics → Sprints → Features. Each sprint still points at one spec.

«3» A gate can be human, not just code: the loop stops and waits for a checkpoint.

«4» The living handoff. Current Status is rewritten at the end of every session.


    APPENDIX · THE FILES · File 05 / 13
  

PROGRESS.md

The memory — the Builder writes it, sprint by sprint

# Progress Log

// starts empty — one handoff entry appended per sprint:
// what was built · key decisions · gotchas · next-up


    APPENDIX · THE FILES · File 06 / 13
  

specs/sprint-1.md

Core clicker

# Sprint 1: Core Clicker

## What to build
A big Harvest button. Clicking it adds 1 to the
score. Score persists across reloads.

## Acceptance Criteria
- Clicking #harvest increases #score by 1
- #score persists after a reload (localStorage)
- Score is a non-negative integer

## Technical Notes
- Single index.html, inline CSS + JS
- localStorage key "clickfarm" → { score, farmhands }
- Stable ids: #score, #harvest


    APPENDIX · THE FILES · File 07 / 13
  

specs/sprint-2.md

Farmhands — buy auto-harvesters

# Sprint 2: Farmhands (auto-harvest)

## What to build
A "Hire Farmhand" button (cost 10). Hiring deducts
10 and adds a farmhand. Each farmhand earns
+1 score/sec automatically.

## Acceptance Criteria
- Clicking #hire with score ≥ 10 deducts 10 and adds 1 to #farmhands
- With ≥1 farmhand, #score rises ~1/sec on its own
- #cost shows the hire cost (10)

## Technical Notes
- New ids: #hire, #farmhands, #cost
- setInterval 1000ms; persist state on each tick
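
Sprint 2's rules, sketched as pure functions so they can be checked without a browser. Assumptions: the flat cost of 10 from the spec, the `{ score, farmhands }` state shape from Sprint 1, and the hypothetical names `hire` and `tick`. The affordability check is read off the "score ≥ 10" wording in the first criterion.

```javascript
const HIRE_COST = 10; // flat cost in Sprint 2

// Clicking #hire: deduct the cost and add one farmhand, if affordable.
function hire(state) {
  if (state.score < HIRE_COST) return state; // cannot afford: no change
  return { score: state.score - HIRE_COST, farmhands: state.farmhands + 1 };
}

// One second passes: each farmhand earns +1 score.
function tick(state) {
  return { ...state, score: state.score + state.farmhands };
}

let state = { score: 12, farmhands: 0 };
state = hire(state); // score 2, farmhands 1
state = tick(state); // score 3, farmhands 1
console.log(state);
```

In the page, `tick` would run inside `setInterval(..., 1000)`, with the state saved after each call as the Technical Notes ask.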


    APPENDIX · THE FILES · File 08 / 13
  

specs/sprint-3.md

Scaling cost — QA catches the bug

# Sprint 3: Scaling Cost

## What to build
Farmhands get more expensive the more you own.
cost = floor(10 * 1.15 ^ farmhands). #cost
shows the NEXT price, updates after each hire.

## Acceptance Criteria
- #cost increases after each hire (per the formula)
- When score < cost, clicking #hire does nothing
  (score & farmhands unchanged)
- Score never goes negative; hiring at score == cost succeeds

## Technical Notes
- Integer math, Math.floor
- Guard lives in the acceptance test; the bug is
  planted via the fixture swap, not the spec
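
The formula and the guard, sketched in JavaScript. One thing to watch: the spec writes the exponent as `^`, but in JavaScript `^` is bitwise XOR, so the sketch uses `**`. `nextCost` and `hire` are hypothetical names, and the state shape is assumed from Sprint 1.

```javascript
// Price of the NEXT farmhand, given how many are already owned.
function nextCost(farmhands) {
  return Math.floor(10 * 1.15 ** farmhands);
}

// Guarded hire: does nothing when score < cost, succeeds when score == cost.
function hire(state) {
  const cost = nextCost(state.farmhands);
  if (state.score < cost) return state; // same object back: nothing changed
  return { score: state.score - cost, farmhands: state.farmhands + 1 };
}

console.log([0, 1, 2, 3].map(nextCost)); // [ 10, 11, 13, 15 ]
```

Because the guard compares with `<`, not `<=`, hiring at exactly `score == cost` goes through and leaves the score at 0, never below.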


    APPENDIX · THE FILES · File 09 / 13
  

specs/sprint-4.md

Polish — formatting + feedback

# Sprint 4: Polish

## What to build
Number formatting + click feedback.

## Acceptance Criteria
- Score ≥ 1000 shows abbreviated (1.2K / 3.4M / 5.6B);
  below 1000 shows the integer
- Same formatting on #cost
- Underlying state stays an exact integer (display-only)
- #harvest pulses briefly on click

## Technical Notes
- Format on display only — never round the stored value
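
A minimal sketch of display-only formatting. `formatNumber` is a hypothetical name, and truncating to one decimal is an assumption: the spec only gives the 1.2K / 3.4M / 5.6B pattern and does not say how to round.

```javascript
const UNITS = [
  [1e9, "B"],
  [1e6, "M"],
  [1e3, "K"],
];

// Display-only formatting; the stored score stays an exact integer.
function formatNumber(n) {
  for (const [size, suffix] of UNITS) {
    if (n >= size) {
      // Truncate (not round) to one decimal so 999999 reads 999.9K, never 1000.0K.
      const tenths = Math.floor((n * 10) / size);
      return (tenths / 10).toFixed(1) + suffix;
    }
  }
  return String(n); // below 1000: show the integer as-is
}

console.log(formatNumber(999), formatNumber(1234), formatNumber(3400000));
```

The function takes the number and returns a string, so the same call can feed both #score and #cost without ever touching the saved state.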


    APPENDIX · THE FILES · File 10 / 13
  

.claude/skills/run-qa/SKILL.md

run-qa — the quality gate

---
name: run-qa
description: Full QA pass — lint, review, e2e.
  Returns one GREEN/RED verdict.
---

# run-qa
Run these in order. Each is a subagent; use its
returned message as the result.

1. Delegate to linter.
2. Delegate to code-reviewer.
3. Delegate to playwright-tester.

Verdict:
- Any lint error, any CRITICAL finding, or any
  failed spec → RED. List what failed.
- Otherwise → GREEN.

If RED: fix the issues, re-run from step 1.
Stop after 3 attempts; then report to the human.
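
The skill is a prompt, not a program, but its decision rule is small enough to write down. This sketch only illustrates that rule; it is not how the agent executes the skill. The result shape, `verdict`, and `runQa` are hypothetical, and "3 attempts" is read here as three full QA passes.

```javascript
// Combine the three subagent results into one verdict, per the skill's rule.
// Assumed shape (hypothetical): each field is an array of short strings.
function verdict({ lintErrors, criticalFindings, failedSpecs }) {
  const failures = [
    ...lintErrors.map((e) => "lint: " + e),
    ...criticalFindings.map((f) => "CRITICAL: " + f),
    ...failedSpecs.map((s) => "failed spec: " + s),
  ];
  return { status: failures.length === 0 ? "GREEN" : "RED", failures };
}

// On RED: fix, then re-run from step 1. Stop after maxAttempts passes.
function runQa(runChecks, fix, maxAttempts = 3) {
  let result;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    result = verdict(runChecks());
    if (result.status === "GREEN") return { ...result, attempt };
    if (attempt < maxAttempts) fix(result.failures);
  }
  return { ...result, attempt: maxAttempts, escalate: true }; // report to the human
}

// Usage with stand-ins: the first pass finds one CRITICAL, the second is clean.
let runs = 0;
const out = runQa(
  () => ({
    lintErrors: [],
    criticalFindings: runs++ === 0 ? ["hire allows a negative score"] : [],
    failedSpecs: [],
  }),
  () => {} // stand-in for the Builder's fix
);
console.log(out.status, out.attempt);
```

WARN and NIT findings never reach `verdict`: per the rule above, only lint errors, CRITICAL findings, and failed specs can turn the run RED.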


    APPENDIX · THE FILES · File 11 / 13
  

.claude/agents/linter.md

Subagent 1 — structure & syntax

---
name: linter
description: Validates HTML structure and JS errors.
tools: Read, Edit, Bash
model: haiku
---

Run validation on index.html:
1. Check HTML is well-formed (tags, nesting)
2. Run node --check on inline script content
3. Check for unclosed strings, missing semicolons,
   undefined variables

Auto-fix trivial violations (formatting, whitespace).
For judgment calls, leave it and report it.

Return "CLEAN" if no errors remain, else the list
of remaining errors with location.


    APPENDIX · THE FILES · File 12 / 13
  

.claude/agents/code-reviewer.md

Subagent 2 — bugs & logic

---
name: code-reviewer
description: Reviews changes for bugs and UX issues.
tools: Read, Grep, Glob
model: sonnet
---

Review index.html against the sprint spec.
Do not change any files.

Check for:
- Logic bugs (especially localStorage read/write)
- State issues (stale data, race conditions)
- UX problems (unaffordable hires, negative balances)
- Missing acceptance criteria

Return findings as:
- CRITICAL — bugs, data corruption, broken function
- WARN — likely problems, edge cases
- NIT  — style, naming
For each: the issue + a one-line fix. Else "CLEAN".


    APPENDIX · THE FILES · File 13 / 13
  

.claude/agents/playwright-tester.md

Subagent 3 — behavior & acceptance

---
name: playwright-tester
description: Runs behavioral tests against the app.
tools: Read, Write, Bash
model: sonnet
---

Write and run a Playwright test for the sprint's
acceptance criteria.
1. Create test-farm.spec.js from the sprint spec
2. Open index.html via file:// or a local server
3. Test each acceptance criterion

Key scenarios for Sprint 3:
- Click → score increments
- Hire with enough score → cost deducted, +1 farmhand
- Hire when you can't afford it → blocked, no change
- Hire at score == cost → succeeds, score ≥ 0
- Score never goes below 0
- State persists across refresh

Return "GREEN" if all pass, else "RED" plus, per
failure: test name, failed assertion, actual result.

