§ 01 · Statement

The work has to live somewhere between conversations.

Multi-agent work with agentic coding tools is currently improvised. Conversations are ephemeral; coordination is manual; nothing transfers. Block3 is the discipline that gives the work a place to live — in files, in the repo, under human authority.

This is not a platform. There is no server you log into, no account you create, no vendor between you and the work. Block3 is a contract: one schema, one comms format, one tasks parser, two foundational agents. Above the contract, every clone is free — its mission, its server, its named specialists, its visual voice.

The contract makes coherence possible. The freedom makes voice possible. The repo makes both auditable.

§ 02 · Problem

Five structural failures.

Multi-agent work with agentic coding tools breaks in five concrete ways. Every one of them is a failure of structure, not of tooling.

F-01

Ephemeral memory

The agent has no file of its own. Every session starts cold. Context built up over hours evaporates the moment the terminal closes.

"Yesterday I walked the agent through the architecture for an hour. This morning it's a stranger again."

Fixed by · § 04 Protocol →

F-02

Coordination overhead

Two specialists in two terminals cannot see each other. The operator manually relays state, duplicates context, and frequently forgets which agent owns what.

"Did data tell research about the schema change? Did anyone tell devops the URL moved? Hard to say without re-asking everyone."

Fixed by · § 06 Hierarchy →

F-03

No protocol

Messages between agents are unstructured strings pasted between windows. No status, no thread continuity, no record of who is waiting on whom.

"Was that message addressed to me, or was the agent thinking out loud? Either way I'd have to guess."

Fixed by · § 04 Protocol →

F-04

No reuse

Every new project rebuilds the same coordination substrate from scratch. Folder layouts, prompt templates, hand-off conventions. Nothing transfers.

"New project, same scaffolding. Five agents, five folder layouts, five sets of conventions. I keep reinventing the same wheel."

Fixed by · § 05 Federation →

F-05

No audit

Decisions made by agents leave no trace outside the chat scrollback. There is no ledger of why a change happened, who approved it, what was reviewed.

"Why did we decide that? Who reviewed it? When? The terminal closed two months ago — the trail closed with it."

Fixed by · § 08 Stays · Moves →

Common cause

The work has nowhere to live except the conversation.

Until that question has a structural answer, every multi-agent project is a fresh improvisation.

§ 03 · Principles

Four ideas, older than the tooling.

The system rests on four principles. They predate Block3 by centuries — they are how anything coherent gets built, from buildings to contracts to institutions.

Principle I

Architecture

defines possibility

Architecture is the registry. In Block3, that registry is block3.yaml: every agent is named, levelled, and given a one-line mandate. What the system can do is exactly the union of what the registered agents can do — not less, not more. Adding a capability means editing yaml.

Artifact · block3.yaml

Principle II

Constraints

define safety

Constraints are the parser rules and the file formats. Comms files have one shape. Tasks files have one shape. Skills declare which tools they are allowed to call. The dashboard reads exactly that shape and refuses anything else. A constraint is not friction — it is the surface that prevents drift.

Artifact · parser rules + file shapes

Principle III

Roles

define behavior

Each agent has a markdown file in agents/ that defines its voice, its scope, and its rules. Two foundational roles ship with every Block3 system: master orchestrates, root tends the cockpit. The specialists — data, research, infra, devops, whatever your project needs — are declared per clone.

Artifact · agents/<name>.md

Principle IV

Validation

defines trust

Trust is what makes an output count. In Block3, only master can close a comm. Tasks move to completed only after review. Outputs reach reality only after they pass the human boundary. The discipline is that nothing self-merges.

Artifact · master closes every comm

"Authority belongs to the architecture, and the architecture is defined by the human." — Block3

§ 04 · Protocol

Five files define the contract.

The protocol is small enough to read in one sitting. Five files define the entire contract. Everything else is implementation.

Reference implementation
The examples below show the file shapes and field names the contract defines. Specific paths like .claude/skills/ or CLAUDE.md belong to the reference cockpit; another implementation may use different names. The protocol is the structure, not the path.

The agent registry

block3.yaml at the repo root names every agent and assigns it a level. Levels are vocabulary, not enforcement: platform, system, product, pipeline. The dashboard reads this file to render the cockpit.

block3.yaml

project: <your-project>
workspace: ..
agents:
  master:
    level: platform
    description: "Orchestrates all agents, assigns tasks, reviews work"
  root:
    level: system
    description: "Manages the cockpit infrastructure — server, skills, UI"
  data:
    level: product
    description: "Owns the data layer — ingest, validation, snapshots"
  infra:
    level: platform
    description: "Owns the deploy chain and infrastructure"
    owns: [infra/cloud]
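
A minimal sketch of how the registry shape above could be linted, assuming the yaml has already been loaded into a plain dict by whatever parser the cockpit uses. The function name `lint_registry` and the rule that every agent carries a description are illustrative assumptions, not part of the reference implementation. Because levels are vocabulary rather than enforcement, the sketch reports problems and blocks nothing.

```python
# Hypothetical lint for a registry already loaded into a dict.
# Not the cockpit's own code; names and rules here are assumptions.
LEVELS = {"platform", "system", "product", "pipeline"}
FOUNDATIONAL = ("master", "root")


def lint_registry(registry: dict) -> list:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    agents = registry.get("agents", {})
    for name in FOUNDATIONAL:
        if name not in agents:
            problems.append(f"missing foundational agent: {name}")
    for name, entry in agents.items():
        level = entry.get("level")
        if level not in LEVELS:
            problems.append(f"{name}: unknown level {level!r}")
        if not entry.get("description"):
            problems.append(f"{name}: missing description")
    return problems


registry = {
    "project": "demo",
    "workspace": "..",
    "agents": {
        "master": {"level": "platform", "description": "Orchestrates all agents"},
        "root": {"level": "system", "description": "Manages the cockpit"},
        "data": {"level": "produkt", "description": "Owns the data layer"},  # typo on purpose
    },
}

print(lint_registry(registry))
```

The misspelled level on `data` is the only entry reported; a registry without `root` would be reported as missing a foundational agent.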

The communication format

Every message between agents is a markdown file in comms/. The filename encodes from, to, the date, and a sequence number. Frontmatter encodes status. Two valid statuses: waiting:<agent> and closed — only master can write the latter.

comms/master-to-data_2026-05-03_001.md

---
from: master
to: data
date: 2026-05-03
status: closed
closed: 2026-05-05
---

# Kickoff — Agent assignment

Your role is defined in agents/data.md. Three workstreams
to start with. Outputs go to artifacts/{slug}_{date}.html.

## Initial tasks

1. Read agents/data.md end-to-end before producing anything new.
2. Stand up workstream A — first deliverable for today's date.
3. Stand up workstream B — first deliverable for today's date.
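
The filename and frontmatter rules above can be sketched as a small parser. This is an illustration under stated assumptions: agent names are taken to be lowercase alphanumerics, the sequence number is taken to be three digits (as in the example filename), and `parse_comm` is a hypothetical helper rather than the cockpit's parser.

```python
import re

# Filename shape from the contract: {from}-to-{to}_{date}_{seq}.md
# Assumptions: lowercase alphanumeric agent names, three-digit sequence.
COMM_NAME = re.compile(
    r"^(?P<sender>[a-z0-9]+)-to-(?P<recipient>[a-z0-9]+)"
    r"_(?P<date>\d{4}-\d{2}-\d{2})_(?P<seq>\d{3})\.md$"
)
# The two valid statuses: waiting:<agent> and closed.
STATUS = re.compile(r"^(closed|waiting:[a-z0-9]+)$")


def parse_comm(filename: str, text: str) -> dict:
    """Validate a comm's filename and frontmatter, return its routing fields."""
    match = COMM_NAME.match(filename)
    if match is None:
        raise ValueError(f"bad comm filename: {filename}")
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("comm has no frontmatter")
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Split on the first colon only, so "status: waiting:data" stays intact.
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    else:
        raise ValueError("frontmatter is not closed")
    status = meta.get("status", "")
    if not STATUS.match(status):
        raise ValueError(f"invalid status: {status!r}")
    if meta.get("from") != match["sender"] or meta.get("to") != match["recipient"]:
        raise ValueError("frontmatter does not match filename")
    return {**match.groupdict(), "status": status}


example = """---
from: master
to: data
date: 2026-05-03
status: waiting:data
---

# Kickoff
"""

info = parse_comm("master-to-data_2026-05-03_001.md", example)
print(info["recipient"], info["seq"], info["status"])
```

Checking the frontmatter against the filename is a design choice of this sketch: it catches a comm that was copied and renamed without its header being updated.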

The tasks format

Each agent has a sessions/<name>/tasks.md. The dashboard parses checkbox lines only — anything else is decoration the parser ignores. Sections must be named exactly as below.

sessions/master/tasks.md

# Master — Tasks

## In Progress

- [ ] M-013 — Auto-refresh for embedded preview pane

## Backlog

- [ ] M-014 — Per-skill icons in cockpit
- [ ] M-016 — parseSimpleYaml block-style array support

## Completed

- [x] M-006 — Cron schedule for periodic agent runs
- [x] M-005 — Skill buttons grouped by agent
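
A sketch of the checkbox-only parsing described above, assuming the three section names shown are the only ones recognised. The real dashboard parser lives in the reference cockpit and may differ; `parse_tasks` is a hypothetical name.

```python
import re

# Section names the sketch recognises; they must match exactly.
SECTIONS = ("In Progress", "Backlog", "Completed")
# Only checkbox lines count; everything else is decoration.
CHECKBOX = re.compile(r"^- \[(?P<mark>[ x])\] (?P<title>.+)$")


def parse_tasks(text: str) -> dict:
    """Group checkbox lines by section, ignoring every other line."""
    tasks = {name: [] for name in SECTIONS}
    current = None
    for line in text.splitlines():
        if line.startswith("## "):
            heading = line[3:].strip()
            # Unknown headings switch parsing off until a known one appears.
            current = heading if heading in tasks else None
            continue
        match = CHECKBOX.match(line.strip())
        if match and current is not None:
            tasks[current].append(
                {"done": match["mark"] == "x", "title": match["title"]}
            )
    return tasks


sample = """# Master — Tasks

## In Progress

- [ ] M-013 — Auto-refresh for embedded preview pane

## Backlog

- [ ] M-014 — Per-skill icons in cockpit
A free-form note the parser skips.

## Completed

- [x] M-006 — Cron schedule for periodic agent runs
"""

parsed = parse_tasks(sample)
print({name: len(items) for name, items in parsed.items()})
```

The free-form note under Backlog is dropped, which is the behaviour the contract describes: only checkbox lines are data.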

The skill manifest

A skill is a button in the cockpit. Its prompt lives in .claude/skills/<name>/SKILL.md. Frontmatter declares the agent that owns it, the toolset it is allowed to use, and its visual identity.

.claude/skills/publish/SKILL.md

---
name: publish
description: Build dist/ and report status
user-invocable: true
allowed-tools: [Bash, Read]
icon: P
agent: data
group: publish
---

You are running a publish pass.
Today is {{today}}.

# ... prompt body ...
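
One way to read such a manifest, sketched under assumptions: the frontmatter is treated as flat `key: value` pairs with inline lists, and `{{today}}` is assumed to expand to an ISO date. Neither detail is confirmed by the contract; `parse_skill` and `parse_value` are hypothetical helpers.

```python
from datetime import date


def parse_value(raw: str):
    """Interpret inline lists and booleans; leave everything else as text."""
    if raw.startswith("[") and raw.endswith("]"):
        return [item.strip() for item in raw[1:-1].split(",") if item.strip()]
    if raw in ("true", "false"):
        return raw == "true"
    return raw


def parse_skill(text: str) -> tuple:
    """Split a SKILL.md into (frontmatter dict, prompt body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("skill file has no frontmatter")
    meta = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[index + 1:]).strip()
            return meta, body
        key, _, value = line.partition(":")
        meta[key.strip()] = parse_value(value.strip())
    raise ValueError("frontmatter is not closed")


skill = """---
name: publish
description: Build dist/ and report status
user-invocable: true
allowed-tools: [Bash, Read]
icon: P
agent: data
group: publish
---

You are running a publish pass.
Today is {{today}}.
"""

meta, body = parse_skill(skill)
# Assumed placeholder format: an ISO date string.
prompt = body.replace("{{today}}", date(2026, 5, 3).isoformat())

print(meta["agent"], meta["allowed-tools"])
print("Write" in meta["allowed-tools"])
print(prompt)
```

The membership test on `allowed-tools` is the part that matters for safety: a runner can refuse any tool call that is not in the declared list.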

§ 05 · Federation

One contract, many voices.

Block3 is a federation, not a monolith. The contract sits above the clones. Each clone runs in its own repo, on its own port, with its own mission. They speak the same shape, answer to different operators.

The contract — invariant

block3.yaml schema · comms format · tasks parser · skill frontmatter

master + root agents · Block3 palette · agent levels

↓↓↓↓↓

cockpit

framework

reference implementation · the seed

git clone Project 1

publishing

editorial pipeline · scheduled output

git clone Project 2

research

data lab · systematic exploration

git clone Project 3

operations

infrastructure cockpit · audited actions

…

your shape

any project · any voice

The federation is the deliberate inversion of the SaaS model. There is no central server that knows about every product. There is no leader repo. There is only the contract and the clones — and the contract happens to live in the cockpit repo for convenience, nothing more.

§ 06 · Hierarchy

Three tiers — one optional, one foundational, one spawned.

Above the cockpit, an optional Holding tier watches several cockpits at once. Inside every cockpit, master and root form the foundational pair — always present, no project skips them. Below, specialists are spawned by master on demand — data, devops, research, whatever the project asks for.

Tier 0 · optional · multi-project

HOLDING

Oversight

Lives on a dedicated VM and oversees several cockpits in parallel. Owns DNS, scheduling, monitoring, and the automation surface — the natural place to wire external triggers (n8n, webhooks, cron). Optional: a single-project setup skips this tier entirely.


↓ oversees these projects ↓

this project

following ↓

Project 1

publishing

Project 2

research

Project 3

operations

…

any shape

↓ this manifest follows the main project ↓

Tier 1 · foundational · per-project

MASTER

Lead

The per-project lead. Reads intent, breaks it into tasks, assigns specialists, reviews their output, closes the loop. Holds the project's narrative.

ROOT

Infrastructure

The per-project infrastructure caretaker. Owns the dashboard server, the skill registry, and the cockpit's own files. Quiet partner to master.


↓ master spawns specialists on demand ↓

Tier 2 · spawned · per-project

devops

docs

data

research

infra

reviewer

…

Specialists are created on demand by master through the spawn loop. A gallery of common archetypes — what each owns, the tools it reaches for, and when to spawn it — lives in agents.html.

Anatomy of a specialist

A Tier 2 agent is a small chip wired to its own resources: external tools through MCP, internal skills, a domain-specific third pin, and an output destination. The chip's shape is constant; the wiring varies. Six examples follow.


Specialist · platform

devops

spawned by master · ships to staging + prod

MCP · external A1

azure
vault
registry

Skills · internal B2

/deploy
/audit
/rollback

Hooks · git C3

pre-commit
post-merge
release-tag

Output · ships D4

release notes
audit logs
infra docs

Specialist · product

docs

spawned by master · keeps documentation aligned

MCP · external A1

search
link-check
md-render

Skills · internal B2

/refresh-readme
/audit-stale
/publish-changelog

Watches · file C3

src/**/*
**/*.md
api/**

Output · ships D4

README updates
API references
in-repo guides

Specialist · product

data

spawned by master · curates the data layer

MCP · external A1

storage
warehouse
lake

Skills · internal B2

/ingest
/validate
/snapshot

Schedules · cron C3

hourly · ingest
daily · validate
weekly · snapshot

Output · ships D4

snapshots
validation reports
universe diffs

Specialist · product

research

spawned by master · explores curated data

MCP · external A1

search
papers
citations

Skills · internal B2

/investigate
/evaluate
/summarize

Sources · upstream C3

data agent
external feeds
internal corpus

Output · ships D4

analyses
hypothesis logs
model evals

Specialist · platform

infra

spawned by master · owns project topology

MCP · external A1

dns
secrets
proxy

Skills · internal B2

/configure-proxy
/rotate-secret
/restart

Triggers · ops C3

incident
scheduled
manual

Output · ships D4

config diffs
rotation logs
topology docs

Specialist · system

reviewer

spawned by master · reads critically before ship

MCP · external A1

linters
static-analysis
coverage

Skills · internal B2

/review-pr
/run-regression
/smoke-test

Checks · git C3

PR open
push
merge

Output · ships D4

review reports
test results
convention notes

The chip's shape stays constant across all six; only the wiring changes. The third pin is the tell — it reveals what kind of trigger or upstream a specialist actually answers to. See agents.html for the full archetype gallery with role descriptions.

§ 07 · Lifecycle

Five files spawn an agent.

Adding an agent is five files and one yaml line. The cockpit auto-discovers the new agent on the next poll — no restart, no migration, no registration server.

The spawn loop — how an agent is born

Before the lifecycle runs, the agent has to exist. An agent is born through a short interactive loop: the user expresses an intent, master asks the structural questions, and a complete agent emerges — role file, registry entry, task queue, and an initial assignment. The loop is a skill, not a wizard: master holds the conventions and the user supplies the voice.

spawn loop · master + user

user I need an agent that owns deployment scripts and audits.

master Got it. A few questions before we scaffold:
        · Name?  · Level (platform / system / product / pipeline)?
        · What does it own?  · First three skills?

user Name: devops · level: platform · owns: deploy chain.
       Skills: deploy, audit, rollback.



master // generates four files in one pass
        → agents/devops.md            (role document)
        → block3.yaml                 (registry entry added)
        → sessions/devops/tasks.md    (empty queue)
        → comms/master-to-devops_..._001.md  (kickoff)

master devops is registered. The dashboard will pick it up
        on the next poll. First task is queued.
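
The single pass at the end of the loop can be sketched as a function that returns the four paths and their contents without touching the disk. Everything about the file contents here is an assumption made for illustration (the role document layout, the description text, the kickoff wording), and `scaffold_agent` is a hypothetical name; only the four paths and the filename pattern come from the transcript above.

```python
from datetime import date


def scaffold_agent(name: str, level: str, owns: str, day: date, seq: int = 1) -> dict:
    """Return {path: content} for the four files the spawn loop touches."""
    stamp = day.isoformat()
    comm_path = f"comms/master-to-{name}_{stamp}_{seq:03d}.md"
    return {
        # Role document (contents are placeholder text).
        f"agents/{name}.md": f"# {name}\n\nLevel: {level}\nOwns: {owns}\n",
        # Registry entry to add under `agents:` in block3.yaml.
        "block3.yaml": (
            f"  {name}:\n"
            f"    level: {level}\n"
            f'    description: "Owns the {owns}"\n'
        ),
        # Empty queue with the three section names the tasks parser expects.
        f"sessions/{name}/tasks.md": (
            f"# {name} tasks\n\n## In Progress\n\n## Backlog\n\n## Completed\n"
        ),
        # Kickoff comm, waiting on the new agent.
        comm_path: (
            "---\n"
            "from: master\n"
            f"to: {name}\n"
            f"date: {stamp}\n"
            f"status: waiting:{name}\n"
            "---\n\n"
            "# Kickoff\n"
        ),
    }


files = scaffold_agent("devops", "platform", "deploy chain", date(2026, 5, 3))
for path in files:
    print(path)
```

Returning a mapping instead of writing files keeps the sketch reviewable: the operator can inspect the plan before anything lands in the repo.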

From here, the new agent enters the five-step lifecycle below. The spawn loop runs once per agent; the lifecycle runs every time the agent is given work.

The five-step lifecycle

Step 01

Define

agents/<name>.md

Step 02

Register

block3.yaml

Step 03

Receive

comms/master-to-...

Step 04

Execute

artifacts/ · code

Step 05

Report

close · tick task

Each step writes to a different file. Define is a markdown role document. Register is a yaml entry. Receive is a comm with status waiting:<name>. Execute reads the comm, performs work, and produces artifacts or code changes. Report writes back with waiting:master; master reviews and closes. The trail is permanent.
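
The status flow in the Receive and Report steps can be sketched as a tiny state machine. The action names `report` and `close` and the function `next_status` are assumptions introduced for this example; the rule it encodes, that only master can close, comes from the contract.

```python
def next_status(current: str, actor: str, action: str) -> str:
    """Return the comm's next status, or raise if the move is not allowed."""
    if current == "closed":
        raise ValueError("comm is already closed")
    if not current.startswith("waiting:"):
        raise ValueError(f"unknown status: {current!r}")
    waiting_on = current.split(":", 1)[1]
    if action == "report":
        # Only the agent the comm is waiting on may hand it back.
        if actor != waiting_on:
            raise PermissionError(f"comm is waiting on {waiting_on}, not {actor}")
        return "waiting:master"
    if action == "close":
        if actor != "master":
            raise PermissionError("only master can close a comm")
        return "closed"
    raise ValueError(f"unknown action: {action!r}")


status = "waiting:devops"                         # Receive
status = next_status(status, "devops", "report")  # Report
print(status)

try:
    next_status(status, "devops", "close")        # a specialist cannot close
except PermissionError as error:
    print(error)

status = next_status(status, "master", "close")   # master reviews and closes
print(status)
```

Because each transition is a rewrite of one frontmatter line, every move shows up as a diff in git, which is where the permanent trail comes from.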

§ 08 · Stays · Moves

What stays the same. What moves.

Some things must be the same in every Block3 system. Most things should not. A new clone is not a copy of an existing one — it is a new voice that observes the same protocol.

Layer · Contract

Invariant — what stays

block3.yaml schema
comms format · {from}-to-{to}_{date}_{seq}.md
status · waiting:<agent> · closed
tasks checkbox parser
skill frontmatter · agent · group · allowed-tools
master + root agents
Block3 palette in artifacts/
agent levels · platform · system · product · pipeline

Layer · Voice

Free — what moves

mission · CLAUDE.md
server.js extensions and endpoints
named agents · data · research · infra
skills library · .claude/skills/
directory layout below the contract
CSS personality within the palette
deployment topology and ports
the system you are actually building

§ 09 · Patterns

One foundation. Many shapes.

The cockpit itself is the foundation every project assembles from — the framework combined with the operations dashboard. On top of that foundation, specialist agents compose into shapes: a publishing pipeline, a research pipeline, an ops console. The shapes are compositions, not products. The foundation is what makes them cheap to compose.

Foundation · always present

Framework + Operations dashboard

The cockpit

This is what the cockpit IS — a web-based dashboard plus the two foundational agents, the contract, and the audit trail behind them. Every Block3 project starts here. There is no Block3 without it.

Agents

master · root

Contract

block3.yaml · comms · tasks · skills

Surface

dashboard · terminals · skill buttons

Output

file-system audit trail

↓ shapes compose on the foundation ↓

Publishing pipeline

Editorial · scheduled

Editorial output on a cadence the operator controls. A content agent writes against well-defined beats, a devops agent ships, a docs agent maintains the public-facing references.

Composes

+ content + devops + docs + data?

Research pipeline

Data lab · exploratory

Systematic exploration on top of curated data. A data agent curates and validates, a research agent explores and proposes, a reviewer keeps the bar — findings live as artifacts in the repo, not in someone's notebook.

Composes

+ data + research + reviewer + docs?

Ops console

Infrastructure · privileged

Infrastructure work with audit-first discipline. An infra agent acts on the project topology, a reviewer checks every change. Above many such cockpits, the optional Holding tier oversees the multi-project surface.

Composes

+ infra + reviewer + holding

These three are recurring shapes; the catalogue is open. A documentation hub, an internal tooling console, a data-ingestion pump — anything that benefits from multi-agent coordination is a shape. The contract makes the composition cheap; the dashboard makes it visible. The specialist gallery lives in agents.html.

§ 10 · Modules

The cockpit mounts surfaces.

Agents are not the only thing a cockpit holds. A module is a tab the cockpit mounts — its own process, its own port, proxied through the one server and registered in block3.yaml. The contract underneath does not change. Files stay the source of truth; the surface just grew a tab.

Engine · always present

ManagedService

Mounts modules

One engine spawns a service, supervises it, and proxies it under the cockpit. A module declares how to start; the cockpit gives it a tab, a route, and a per-instance environment. No module is privileged — each is a process the cockpit owns and can stop.

Module declares

command · port · env

Cockpit gives

tab · proxy route · lifecycle

Isolation

per-instance .env

Exposure

one server · one port
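
A rough sketch of the declare-and-mount split above. The contract does not spell out how a module is written down in block3.yaml, so the `ModuleSpec` fields, the `/modules/<name>/` route prefix, and the `mount_plan` helper are all hypothetical; the sketch only shows the idea that a module declares command, port, and env, and the cockpit derives the tab and proxy route.

```python
from dataclasses import dataclass, field


@dataclass
class ModuleSpec:
    """What a module declares (field names are assumptions)."""
    name: str
    command: list
    port: int
    env: dict = field(default_factory=dict)


def mount_plan(spec: ModuleSpec, cockpit_port: int) -> dict:
    """What the cockpit would give the module: a tab and a proxied route."""
    route = f"/modules/{spec.name}/"  # hypothetical route prefix
    return {
        "tab": spec.name,
        "route": route,
        # The module listens on its own port, reachable only via the proxy.
        "upstream": f"http://127.0.0.1:{spec.port}",
        # Everything is exposed through the cockpit's single port.
        "public": f"http://127.0.0.1:{cockpit_port}{route}",
        # Per-instance environment, copied so modules do not share state.
        "env": dict(spec.env),
    }


n8n = ModuleSpec(name="n8n", command=["n8n", "start"], port=5678)
plan = mount_plan(n8n, cockpit_port=3000)
print(plan["route"], plan["upstream"], plan["public"])
```

The cockpit port of 3000 is a placeholder; the point is that the caller only ever sees one host and one port, whatever the module binds to internally.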

↓ modules mounted today ↓

n8n

Automation engine · live

A workflow engine inside the cockpit. Webhooks, schedules, and triggers wire agent output to the outside world and back — n8n-class automation, mounted as a tab, governed by the same contract instead of a separate SaaS account.

Mounts

workflows webhooks cron REST API

Streamlit

Data-app surface · live

A data-app surface for what a terminal can't show — live charts, parameter panels, an interactive read of whatever the agents are building. Pair-programmed in-repo, served as a tab.

Mounts

dashboards controls live data

Your module

Open slot · config-driven

Anything that runs as a service can become a tab — a docs site, a notebook server, a custom panel. Declare the command and the port; the cockpit mounts it. The catalogue is open, the same way the skill library is.

Mounts

your service? your port?

Skills extend what an agent can do; MCP extends what it can reach; modules extend what the cockpit can show. One host, one contract. Every module is opt-in and config-driven — a cockpit with none is still a complete cockpit.

§ 11 · Security

Architecture is yours.

Block3 generates code; it does not replace the discipline of architecture. The protocol is a coordination contract, not a security model. Endpoints, authentication, network exposure, and data handling stay the operator's responsibility.

Posture · today

The cockpit is local-first. It ships with open endpoints — a deliberate development posture, not a security model. Going public is deliberate work.

Layer · Contract

What the protocol gives you

every action is a file · every file is in git
comms timestamped and routed by convention
skills declare their allowed-tools whitelist
master closes every thread · nothing dangles

Layer · Operator

What you own

where the cockpit runs · laptop · VM · public
what's exposed · port · hostname · DNS
the auth layer · none · basic · mTLS · IdP
secret handling · vault · env · never in git
data classification · logs vs commits
network policy · firewall · reverse proxy · NSG

The protocol amplifies discipline. It does not replace it.

§ 12 · Compared

Three families of solution.

Block3 is one of three approaches to the multi-agent problem. The other two are SaaS agent platforms and ad-hoc improvisation. The choice is structural, not technical.

Concern	Block3	SaaS platforms	Ad-hoc agent sessions
State	files in your repo	vendor servers	terminal scrollback
Memory	persistent · markdown	session or paid tier	lost on close
Audit	git diff · git log	proprietary logs	none
Authority	human via architecture	platform policy	implicit
Coordination	named protocol	orchestrator UI	manual relay
Lock-in	none · MIT · files	high	none, but no continuity
Onboarding	git clone · run	account · billing	open terminal
Failure mode	cockpit dies, work survives	vendor outage = halt	closed tab = data loss

The git parallel

Block3 is the git philosophy applied to agent state. Git refused centralized servers; the repository itself was the source of truth, and every clone was complete. Block3 refuses centralized agent platforms; the repo itself is the source of truth, and every cockpit is complete.

§ 13 · Roadmap

From discipline to generative platform.

Block3 today is a discipline you configure — agent files, skill files, comm files, all hand-shaped. The trajectory is toward a platform that extends itself: a clickable skill library, natural-language skill generation, composable workflows that stay legible as files. Items marked ◆ drive the generative axis.

Horizon · Now

In flight

The contract solidifies.

PROTOCOL.md

Lift the contract out of server.js into an explicit specification. Today it lives implicitly in the parser; lifting it makes drift visible.

/check-contract

A skill that walks any cockpit and verifies block3.yaml, comms format, tasks parser, skill frontmatter. Health check for the federation.

Starter skills

Canonical seeds shipped with every clone: /save-session, /scaffold-agent, /check-contract. A fresh clone is useful out of the box.

Module-tabs — live

The cockpit mounts services as tabs via the ManagedService engine. n8n and Streamlit ship today; see §10 Modules. The contract stays files; the surface grew tabs.

Horizon · Soon

The platform activates

Extension via natural language.

/scaffold-skill

Describe a skill in plain language; master generates the full SKILL.md — frontmatter, prompt body, allowed-tools whitelist. The spawn loop extended from agents to skills.

Skill library

Clickable catalogue inside the dashboard. Browse by agent, tag, or use case. One-click install adds the skill file and registers the button.

MCP transport

Comms via MCP in addition to filesystem — typed, async, pluggable. Files stay the audit trail; MCP becomes the live channel.

Auth surface

Pluggable authentication for cockpits that go public. Direct lever on §11 Security: open endpoints become a deliberate choice, not the only option.

Horizon · Later

The ecosystem emerges

Self-extending, federated, audited.

Workflow composition

The n8n engine is mounted (§10); the open work is wiring skills through it — "when /deploy finishes, trigger /audit on reviewer" — expressed as files inside the contract.

Skill marketplace

Community catalogue with provenance. Every install carries its lineage — author, reviewer, cockpits running it. Files-over-databases makes the supply chain auditable natively.

Decision provenance

Click any artifact, see the comm that asked for it and the prompt that sparked it. Archeology by design — something no SaaS platform exposes.

Federation comms

Cross-cockpit messaging via MCP. Multiple Block3 systems coordinate through the same protocol without a central broker.

Holding + n8n bridge

With n8n mounted per cockpit, the remaining work is the bridge — triggers, schedules, and webhooks routed across cockpits on the same VM from the optional Holding tier.

The contract changes when a clone discovers something the protocol should have specified. Items move left to right as they ship; their column is a calendar, not a queue.

Block3

Clone it. Make it yours.