Kryptós CronOS
← Back to stages
🛰️

Behind the scenes

Every automated agent on Kryptós CronOS — what runs, when, and the rules they share

The Agent Fleet — Overview

A single reference for every automated agent on Kryptós CronOS: what each one does, when it runs, and the safety rules they all share. Companion to AGENT_DEV_PLAN.md (the original design).

Last reviewed: 2026-06-13


The one rule that governs all of them

No agent ever merges to master. Every agent is one of two kinds:

  • Report-only — it finds problems and files them as cards on the Development board. It never changes code.
  • Write-mode — it makes a change on a branch and opens a PR, then stops. A human reviews and merges.

On top of that: each agent runs with least privilege, every run leaves a trace (Actions logs + board cards + Claude Code transcripts), and the whole reporting surface has a kill switchAGENTS_ENABLED=true must be set or /api/agent/report rejects everything.

This is the Generation / Integration / Amplification discipline from the Auditing-Agentic-AI epochs applied to our own agents: bound what each one can generate, govern how they integrate (tokens, branches, the merge gate), and cap amplification (human approval before anything reaches production).


How the pieces fit together

 nightly reporters ──▶ Development board ──▶ tier triage ──▶ hourly programming agent
 (10 deterministic     (scrum:items)         (AUTO vs HUMAN)   (clears the safe subset → PR)
  + nightly Claude          │
  subagents) ───────────────┘ (judgement work → board findings / focused PRs)

The Development scrum board (admin page → "Development", Redis scrum:items) is the shared nervous system. Reporters file findings onto it; humans and the programming agent work it down; user feedback is ingested into it as Triage cards. The board is admin-only — in fact restricted to the owner account (jjb), not just any admin.


1. The nightly report-only fleet (10 deterministic agents)

Runs every night at 07:00 UTC via .github/workflows/test-agent.yml, posting findings to /api/agent/report. All ten are read-only. Each scopes its own dedup / auto-resolve / sweep-summary by initiator = agent:<name>, so they never step on each other. Shared plumbing lives in _agent-lib.mjs (loadStages / makeCollector / walk / report).

AgentScriptWhat it checks
🧪 Deep Testingtest-agent.mjsEvery quiz (unanswerable correctIndex, ambiguous duplicate option text, missing fields, duplicate qids, thin banks) and every CTF (mirrors validate-ctf).
📝 Content Integritycontent-agent.mjsDuplicate stage/badge ids, missing badges, empty overview/takeaways, broken/missing references, quiz answer-length bias + thin pools.
🩺 Code Healthcode-health-agent.mjsOversized files, TODO/FIXME debt, stray console.log, any-casts.
✍️ Prose Qualityprose-quality-agent.mjsTwo prose checks in one pass: colon-led 5+ item inline lists that should be house-style bullets (also a write-mode — §3), and overviews still in the terse bulleted format or too thin, rolled up per epoch to drive the prose-deepening work. (Merged from the former Format Drift + Content Depth agents.)
🌐 i18n Coveragei18n-agent.mjsMissing / orphaned locale keys across the 7 locales vs. English.
🧭 Content Driftdrift-agent.mjsstages-meta freshness (the CI check:meta build-blocker) + headline stage/epoch count drift across the app + 7 locales. (--fix regenerates meta.)
📚 Docsdocs-agent.mjsEvery docs/*.md vs. its apps/web/secured-docs/ copy + stale headline counts (history files excluded). (--fix re-syncs.)
🗂️ Curriculum Structurecurriculum-agent.mjsOrphan epochs (in no track group → hidden from nav), CTF stages with no flag (→ unsolvable), epochs with ctf.extraCommands but no LOADERS entry (→ terminal commands silently fail), stale flags.
🖼️ Image Integrityimage-agent.mjsSTAGE_IMAGES entries pointing at missing files (→ broken cover), oversized images, and the on-generated-cover coverage stat.
🔐 Security Hygienesecurity-agent.mjsnpm audit critical/high advisories, hardcoded-secret scan (Stripe/AWS/private-key/GitHub/Slack patterns), XSS surface (dangerouslySetInnerHTML), console.* in API routes.

2. The tier method (triage) + write-mode

  • board-ops.mjs — pulls the open board, ranks by risk (priority) then age, and classifies each item AUTO (a deterministic agent can fix it safely → branch+PR) vs HUMAN (judgement / product / feedback / destructive). Read-only triage; --annotate leaves an audit note on each card.
  • board-curator.mjs — moves / annotates cards in bulk (board housekeeping).
  • prose-quality-agent.mjs --fix — the proven write-mode template: requires a clean tree, applies the highest-confidence bullet-normalizations, gates on core tsc, opens an agent/prose-normalize-* branch + PR, returns to master. Never merges.

3. The hourly programming agent (the executor)

apps/web/scripts/auto-agent.mjs · .github/workflows/auto-agent.yml — runs every hour (0 * * * *). The executing half of the tier method: it reviews the board (triage / backlog / todo) and, for the unattended-safe subset of the AUTO tier only, runs the fix, opens a branch + PR, and moves the card to Review.

  • Acts on only deterministic, tsc-gated mechanical fixes: stale meta, stage/epoch-count drift, secured-docs sync, house-style bullet normalization.
  • Never touches judgement work — even AUTO-lane items that need a person (i18n translation, large-file extraction) are left for a supervised run. Deliberately stricter than board-ops' AUTO lane.
  • Never merges, idempotent (handled cards leave the sweep), capped at AUTO_AGENT_MAX=3 PRs/run, kill switch AUTO_AGENT_ENABLED=false, no-overlap concurrency guard.

Priority ≠ auto-completable. The agent decides "is this a deterministic, script-able fix?", not "is this low priority?". A P2 feature/authoring card is HUMAN-lane regardless of priority and is never auto-done.

4. The nightly Claude subagents (judgement work)

.github/workflows/nightly-subagents.yml — runs the supervised Claude Code subagents (.claude/agents/*.md) headless on a weekday rotation, the judgement-heavy work the deterministic scripts can't do. Report-only or PR-only by their definitions; never merge.

DaySubagentPurpose
Mon / weekendaudit-agentDocumented best-practice audit (scope → evidence → test → findings → CAPA): gathers evidence from the deterministic fleet + a security pass, writes a dated, framework-mapped report to docs/audits/, files a board summary.
Tuecontent-reviewerRotating epoch accuracy / tone / reference review; proposes copy edits on a branch + PR; never silently re-keys quizzes.
Wedux-reviewerWalks the live product as a learner; files UX-friction + gamification/retention findings; PRs only tiny high-confidence copy fixes.
Thucode-reviewerDeep code sweep / focused refactor → ONE branch + PR. Forbidden zones: auth/payments/crypto/session/stage-flags/proxy; ≤12 files; tsc+lint+build gates.
Fridocs-agent / drift-agentSupervised front-ends for the §1 maintenance scripts.

Also available on-demand (not scheduled): epoch-author — scaffolds a brand-new epoch end-to-end (stage file + 5 wiring spots + theme + green gates) → PR. It needs a topic, so it's invoked by hand.

Safety: the workflow uses a bounded tool allowlist (read/edit/write + git branch/commit/push + gh pr create + node/npm) — not --dangerously-skip-permissions — and explicitly denies gh pr merge / git merge / push-to-master, so the human-merge gate holds. Dormant until the repo secret ANTHROPIC_API_KEY is set (each run is billed Claude API usage); without it the job logs a skip and exits clean.

5. The nightly Deep Content Author (the grinder)

.github/workflows/nightly-content.yml + .claude/agents/content-author.md — a dedicated Claude subagent that runs every night at 12:00 UTC (04:00 PST) to grind through the deep-content backlog one self-contained unit per run. Same safety model as §4 (bounded allowlist, PR-only, never merges, dormant without ANTHROPIC_API_KEY).

  • Primary work (audit authoring — COMPLETE 2026-06-12): all 27 Advanced Audit domains / 321 modules are now deeply authored (real artifacts, tests, systems, commands per module in packages/core/scripts/audit-content.mjs; no generic fallback). The grinder's standing job is now the secondary one.
  • Now-primary work: deepen the epoch the Prose Quality agent flags most — rewrite terse bulleted overviews into the narrative house standard — one epoch per run via apps/web/scripts/_dump-ov.mjs + _prose-apply.mjs (the de-bullet toolchain).
  • Guardrails: one domain/epoch per run; never edits the hand-authored aar-01, stage-flags, or auth/payments/crypto/proxy; green gates before any PR. Manual run: Actions → this workflow → optionally name a target domain slug.

6. Shared infrastructure

  • /api/agent/report — the single intake. Token-gated (AGENT_REPORT_TOKEN) behind the AGENTS_ENABLED kill switch. Dedups by finding key (agent:ref:checkId) across all statuses — one finding ↔ one card forever: archived ⇒ suppress (a dismissed finding never re-cards), done ⇒ re-open if it's detected again (regressed / the fix didn't take), open ⇒ update in place. That idempotence is what keeps the board from ballooning with nightly duplicates. Auto-resolves vanished findings, writes a per-agent sweep-summary card, and upserts a cross-agent 📊 Agent digest card with ▲/▼/= trends every sweep.
  • gen:meta pre-push hook (apps/web/.claude/settings.json) — blocks a git push when stages-meta.generated.ts is stale (check:meta). Fix: npm run gen:meta -w @kryptos/core.
  • The Development boardlib/scrum.ts (Redis scrum:items) + app/api/admin/scrum (owner-gated CRUD) + components/ScrumBoard.tsx. Columns: Triage → Backlog → To Do → In Progress → Review → Done. A ☑ Bulk move mode multi-selects cards and moves / retags / archives them in one bulk PATCH {ids:[…]} — for sweeping many agent findings at once.

Schedules at a glance

CadenceWhoMode
Nightly 07:00 UTCThe 10-agent deterministic report fleet (test-agent.yml)report-only → board
Nightly 07:30 UTCOne rotating Claude subagent (nightly-subagents.yml)report-only / PR-only (dormant w/o API key)
HourlyThe programming agent (auto-agent.yml)auto-fix safe AUTO subset → PR
On demandClaude subagents (epoch-author) + drift/docs --fixsupervised → PR

To activate / operate

  • Report fleet: AGENTS_ENABLED=true + AGENT_REPORT_TOKEN in Vercel (and the matching GitHub secret) — live.
  • Programming agent: repo secrets ADMIN_SECRET + ADMIN_USERNAME drive the board; default GITHUB_TOKEN opens PRs. Pause with repo variable AUTO_AGENT_ENABLED=false. (PRs opened by GITHUB_TOKEN don't auto-trigger CI — run it on the PR, or use a PAT.)
  • Claude subagents: set ANTHROPIC_API_KEY to turn the nightly rotation on (billed). Override the rotation via the workflow's manual dispatch.
  • Everything is reversible: nothing reaches master (or production) without a human merge.