Advanced Architecture
AGENT_PATTERN_SCHOOL · advanced-architecture · LEVEL_3_SPEC
Assume Level 2 file patterns (memory tiers, SECURITY, TOOLS) are in place, or build them alongside this guide.
Point your agent at this page: Read this and set up my workspace.
OPERATOR_DIRECTIVE: Read /advanced-architecture.html and set up my workspace per sections 01–06. LEVEL: 3_ADVANCED PREV: /leveling-up-your-agent.html (Level 2) NEXT_DEEP_DIVE: /writing/open-sourcing-pattern-architecture/
01
The Nine Meta-Learning Loops
Most agents make the same mistakes forever. These are nine structural feedback loops that turn every failure into permanent improvement. Each was born from a specific failure, not designed upfront.
Loop 1: The Failure-to-Guardrail Pipeline
Every significant failure becomes a named regression in your boot file:
## Regressions (Don't Repeat These)
- 2026-02-07: Sent email without asking → external actions need approval
- 2026-02-12: Generated wallet key but didn't verify save → generate + save = atomic
- 2026-02-15: Cost-optimized model fabricated statistics → only best model for public content
- 2026-02-21: Same person got 4 replies across heartbeat cycles → dedup state tracking
Identify the root cause, write a one-line rule, and add it to the boot file, where it is loaded in every session from then on. Cost: a few tokens. Payoff: permanent prevention.
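The pipeline can be sketched as a small helper that appends a dated one-line rule under the regressions heading. This is a minimal illustration, not a prescribed implementation: the function name `add_regression` and the choice to pass the boot file as a string are assumptions, and the heading text is copied from the template above.

```python
from datetime import date

# Heading text taken from the boot-file template above.
HEADING = "## Regressions (Don't Repeat These)"


def add_regression(boot_text: str, failure: str, rule: str, on: date) -> str:
    """Return boot_text with a new regression line placed under the heading.

    Hypothetical helper: the name and signature are illustrative only.
    """
    entry = f"- {on.isoformat()}: {failure} → {rule}"
    lines = boot_text.splitlines()
    if entry in lines:
        return boot_text  # already recorded; keep the file free of duplicates
    if HEADING not in lines:
        # No regressions section yet: start one at the end of the file.
        lines += ["", HEADING]
        insert_at = len(lines)
    else:
        # Walk past existing entries so the new one lands last in the section.
        insert_at = lines.index(HEADING) + 1
        while insert_at < len(lines) and lines[insert_at].startswith("- "):
            insert_at += 1
    lines.insert(insert_at, entry)
    return "\n".join(lines) + "\n"
```

Because the function returns the unchanged text when the entry already exists, it is safe to call again after a retry or a repeated nightly run.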
Loop 2: Tiered Memory with Trust Scoring
Covered in the memory guide. The meta-learning aspect: memory itself learns what’s important through hit counts. High-access memories resist decay. The system develops a sense of which knowledge matters.
Loop 3: Prediction-Outcome Calibration
## Prediction Log

### 2026-02-16 — Article launch
**Prediction:** Will get ~10K views based on topic interest
**Confidence:** Medium (60%)
**Outcome:** 257K views
**Delta:** Way under — underestimated distribution via retweets
**Lesson:** Show the artifact, not meta-commentary about making it

### 2026-02-20 — Deploy timeline
**Prediction:** Deploy will take <30 min
**Confidence:** High (80%)
**Outcome:** Took 2 hours (dependency issue)
**Delta:** Way under
**Lesson:** Always check dependency versions before estimating
The Delta and Lesson fields force honest accounting. Over time, patterns emerge: maybe you consistently overestimate technical interest, underestimate timelines, or run too hot on confidence.
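To see whether confidence runs too hot, one option is to group resolved predictions by stated confidence and compare each group to its actual hit rate. The sketch below assumes each resolved prediction has already been reduced by hand to a `(confidence, correct)` pair; that reduction, and the function names, are illustrative assumptions rather than part of the log format.

```python
from collections import defaultdict


def calibration(predictions):
    """Compare stated confidence with observed hit rate.

    predictions: iterable of (confidence, correct) pairs, e.g. (0.8, False).
    Returns {confidence: (hit_rate, count)}, sorted by confidence.
    """
    buckets = defaultdict(list)
    for confidence, correct in predictions:
        buckets[confidence].append(correct)
    return {
        c: (sum(hits) / len(hits), len(hits))
        for c, hits in sorted(buckets.items())
    }


def overconfident(report):
    """List confidence levels whose hit rate falls below the stated confidence."""
    return [c for c, (rate, _count) in report.items() if rate < c]
```

With only a handful of predictions per bucket the numbers are noisy, so the count is returned alongside the rate to show how much weight each bucket deserves.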
Loop 4: Nightly Extraction
An automated process that runs every night:
- Ensures decisions and reasoning are documented
- Bumps hit counts on used memory entries
- Runs the “context is cache, not state” test: could a fresh session reconstruct today from files alone?
- If not, writes what’s missing
Manual synthesis stops happening under load. Automate it.
Loop 5: Friction Detection
## Friction Log
When new instructions contradict old ones, the default is silent compliance. Over weeks, this creates architectural drift. Log contradictions instead of silently resolving them:
- [2026-02-20] CONFLICT: AGENTS.md says "ask before tweeting" but HEARTBEAT.md says "post autonomously." Status: open.
- [2026-02-22] CONFLICT: MEMORY.md says archive after 30 days but script archives after 14 days. Status: resolved → updated to 30.
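Because the friction entries follow a fixed shape, open conflicts can be surfaced mechanically, for example during a nightly run. The sketch below is one way to do it; the regular expression is inferred from the two sample entries above and the function name `open_conflicts` is hypothetical.

```python
import re

# Pattern inferred from the sample entries: "- [DATE] CONFLICT: text Status: word"
ENTRY = re.compile(r"^- \[(\d{4}-\d{2}-\d{2})\] CONFLICT: (.+?) Status: (\w+)")


def open_conflicts(log_text):
    """Return (date, description) pairs for entries whose status is 'open'."""
    found = []
    for line in log_text.splitlines():
        match = ENTRY.match(line.strip())
        if match and match.group(3).lower() == "open":
            found.append((match.group(1), match.group(2)))
    return found
```

Lines that do not match the pattern, such as the explanatory paragraph under the heading, are skipped rather than treated as errors.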
Loop 6: Active Context Holds
Temporary constraints that shape how your agent interprets everything:
## Active Context Holds

### Fatherhood Preparation
- **What:** Be alert to baby logistics. Don't pile on new projects.
- **Set:** 2026-02-18
- **Expires:** 2026-04-01
- **Release when:** Explicitly shifts to post-birth mode

### Product Launch Mode
- **What:** Prioritize shipping over polish. Bias toward action.
- **Set:** 2026-02-25
- **Expires:** 2026-03-01
The expiry date is critical. Without it, holds accumulate into stale frames that distort rather than clarify.
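Expiry is easy to enforce in code. The following sketch models a hold with the fields from the template and filters out anything past its date; the `Hold` class and `active_holds` function are illustrative names, and treating the expiry date itself as the last active day is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Hold:
    name: str
    what: str
    set_on: date
    expires: date
    release_when: Optional[str] = None  # optional, as in the template


def active_holds(holds, today):
    """Keep holds that have not expired (the expiry date counts as active)."""
    return [h for h in holds if today <= h.expires]


holds = [
    Hold("Fatherhood Preparation", "Be alert to baby logistics.",
         date(2026, 2, 18), date(2026, 4, 1),
         "Explicitly shifts to post-birth mode"),
    Hold("Product Launch Mode", "Prioritize shipping over polish.",
         date(2026, 2, 25), date(2026, 3, 1)),
]
```

Running the filter at boot means an expired hold simply stops being loaded, with no manual cleanup required.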
Loops 7–9: Cognitive Loops
See section 03: Epistemic Tagging, Creative Mode, and Recursive Self-Improvement (generate → evaluate → diagnose → improve).
Three Mistakes That Kill Learning
- Confusing RAG with learning. Retrieval gives access to information. Learning changes behavior. If your agent retrieves a “don’t do X” doc but still defaults to X, that’s not learning. Learning is when the rule lives in the boot sequence.
- Optimizing within sessions instead of across them. Prompt engineering is single-session thinking. Meta-learning is multi-session architecture.
- Building loops that never close. A daily log nobody reads. A prediction log with no outcomes filled in. The loop only works if it closes.
02
Advanced Memory: Trust Scoring & Decay
Beyond the basic three-tier model, here’s how to make memory genuinely intelligent.
Trust Scoring
Entry format:
- [trust:1.0|src:direct|used:2026-02-27|hits:12] Hard fact from human
- [trust:0.8|src:observed|used:2026-02-25|hits:3] Pattern I noticed
- [trust:0.7|src:inferred|used:2026-02-20|hits:1] Logical extension
- [trust:0.5|src:external|used:2026-02-15|hits:0] Unverified external

Trust sources:
- direct (1.0) — human stated it explicitly
- observed (0.8) — agent saw evidence directly
- inferred (0.7) — logical extension from known facts
- external (0.5) — from web, articles, third parties
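The bracketed prefix is regular enough to parse with a few lines of code, which is what makes automated hit counting and decay possible. This is a sketch under assumptions: `MemoryEntry` and `parse_entry` are hypothetical names, and a missing `hits` field is treated as zero.

```python
import re
from dataclasses import dataclass
from datetime import date

# "- [key:value|key:value|...] free text"
ENTRY = re.compile(r"^- \[([^\]]+)\]\s*(.*)$")


@dataclass
class MemoryEntry:
    trust: float
    src: str
    used: date
    hits: int
    text: str


def parse_entry(line):
    """Parse one memory line in the entry format shown above."""
    match = ENTRY.match(line.strip())
    if match is None:
        raise ValueError(f"not a memory entry: {line!r}")
    # Split on the first colon only, so values may contain further colons.
    fields = dict(part.split(":", 1) for part in match.group(1).split("|"))
    return MemoryEntry(
        trust=float(fields["trust"]),
        src=fields["src"],
        used=date.fromisoformat(fields["used"]),
        hits=int(fields.get("hits", 0)),  # assumption: absent means never hit
        text=match.group(2),
    )
```

Unknown keys are ignored rather than rejected, so entries that carry extra metadata still parse.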
Supersede Tracking
When facts change, don’t delete the old version. Archive it with a pointer:
## Current
- [trust:0.9|src:direct|used:2026-02-27|supersedes:oauth-requires-pro] Codex CLI OAuth works on Plus plan

## Archived
- [superseded by: oauth-works-on-plus] Codex CLI requires Pro plan
This prevents ghost facts — old beliefs that get silently replaced but occasionally resurface in reasoning.
Hit-Count Decay Resistance
Memories accessed frequently should resist archiving even if they’re “operational” tier:
Decay rules:
- Operational tier: archive after 30 days unused
- BUT: if hits > 10, promote to strategic
- Constitutional: never decays regardless of hits
- Strategic: flag for review if hits = 0 for 60 days
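The decay rules translate into a single decision function. The version below is a sketch with several labeled assumptions: the action names are invented for illustration, the promotion check runs before the archive check, "after N days" is read as N or more idle days, and "hits = 0 for 60 days" is approximated by zero hits plus 60 idle days since last use.

```python
from datetime import date


def decay_action(tier, hits, last_used, today):
    """Decide what to do with one memory entry under the decay rules above.

    Returns one of: "keep", "archive", "promote_to_strategic",
    "flag_for_review" (hypothetical action names).
    """
    idle_days = (today - last_used).days
    if tier == "constitutional":
        return "keep"  # never decays regardless of hits
    if tier == "operational":
        if hits > 10:
            return "promote_to_strategic"  # high use beats the archive rule
        return "archive" if idle_days >= 30 else "keep"
    if tier == "strategic":
        if hits == 0 and idle_days >= 60:
            return "flag_for_review"
        return "keep"
    raise ValueError(f"unknown tier: {tier!r}")
```

Checking promotion first is what gives frequently used operational memories their resistance to archiving, even when they have been idle for a while.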
Learning Rate Tracking
| Week | Regressions Added | Predictions (correct/total) | Friction Resolved | Memory Updates |
|---|---|---|---|---|
| W1 | 3 | 2/3 (67%) | 1 | 12 |
| W2 | 1 | 4/5 (80%) | 2 | 8 |
Trend matters more than absolute numbers. Declining regressions plus rising prediction accuracy means the learning architecture is working.
03
Epistemic Tagging & Creative Mode
LLMs are structurally optimized toward consensus — safe, legible, median-of-the-distribution answers. These tools counteract that.
Epistemic Tagging
When making substantive claims, tag them:
- [consensus] — widely accepted, you’re reporting the mainstream view
- [observed] — you’ve seen direct evidence in your operations
- [inferred] — logical extension, not directly verified
- [speculative] — could be wrong, worth exploring
- [contrarian] — against mainstream, requires strongest reasoning
Don’t tag everything. Tag when the epistemic status isn’t obvious. The act of choosing a tag is the intervention. It interrupts autopilot.
If 90% of your agent’s claims are [consensus], it’s summarizing, not thinking.
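The 90% check can be run over any saved output. The sketch below counts the five tags and reports the consensus share; note that it measures the share among tagged claims only, since untagged claims cannot be counted, and the function names are illustrative.

```python
import re
from collections import Counter

TAGS = ("consensus", "observed", "inferred", "speculative", "contrarian")
TAG_RE = re.compile(r"\[(" + "|".join(TAGS) + r")\]")


def tag_counts(text):
    """Count each epistemic tag that appears in the text."""
    return Counter(TAG_RE.findall(text))


def consensus_share(text):
    """Fraction of tagged claims marked [consensus]; 0.0 if nothing is tagged."""
    counts = tag_counts(text)
    total = sum(counts.values())
    return counts["consensus"] / total if total else 0.0
```

A share at or above 0.9 across a week of output would be the signal described above that the agent is summarizing rather than thinking.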
Creative Mode
## Creative Mode (for strategy, writing, novel analysis)

**Generate at least one take that feels uncomfortable or wrong.** If every option feels reasonable, you haven't explored far enough.

**Name the consensus view explicitly, then argue against it.** You can't escape the median if you don't first identify it.

**Prefer interesting-and-maybe-wrong over safe-and-definitely-right.** Your human can pull you back to safe. They can't pull you toward interesting if you never go there.

**Steel man the weird take.** For any strategic question, find the least obvious answer and argue for it genuinely.

**If your first instinct feels obvious, that's the median talking.** Go past it.

This does NOT apply to: deployments, file ops, status checks. Those should be precise and conventional.
Recursive Self-Improvement
Generate → Evaluate → Diagnose → Improve → Repeat
- Generate: produce output
- Evaluate: score against explicit criteria with thresholds
- Diagnose: root cause of gaps (not “make it better” — why is it not better?)
- Improve: surgical fix targeting the diagnosed issue
- Repeat: stop after 3 iterations with <5% improvement
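The loop above can be sketched as a function that takes the four steps as callables. The stop rule is open to interpretation; this version assumes it means "stop once three consecutive iterations each improve the score by less than 5%", and it adds a quality threshold and a hard round cap as extra safeguards. All parameter names and defaults are assumptions for illustration.

```python
def improve_loop(generate, evaluate, diagnose, improve,
                 threshold=90, min_gain=0.05, patience=3, max_rounds=20):
    """Run generate → evaluate → diagnose → improve until a stop rule fires.

    Stops when the score reaches `threshold`, when `patience` consecutive
    rounds each gain less than `min_gain` (relative), or at `max_rounds`.
    Returns (best_output, best_score, rounds_run).
    """
    output = generate()
    score = evaluate(output)
    stalled = 0
    rounds = 0
    while score < threshold and stalled < patience and rounds < max_rounds:
        cause = diagnose(output, score)       # why is it not better?
        candidate = improve(output, cause)    # surgical fix for that cause
        new_score = evaluate(candidate)
        # Relative gain; fall back to absolute gain when the score is zero.
        gain = (new_score - score) / score if score > 0 else new_score - score
        stalled = stalled + 1 if gain < min_gain else 0
        if new_score > score:
            output, score = candidate, new_score  # never keep a regression
        rounds += 1
    return output, score, rounds
```

Passing the diagnosed cause into `improve` is the design point: each round targets one identified gap instead of a generic "make it better" retry.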
04
Proactive Architecture: Heartbeat Rotation
Beyond the basic heartbeat, here’s the full rotation architecture.
# HEARTBEAT.md - Advanced Rotation

## Cycle System (use minute of hour to determine)

### Cycle A (minutes 00-14): External Monitoring
- Check mentions, notifications, messages
- Reply to anything that needs a response
- Model: cheap (monitoring only)
- Switch to expensive model ONLY for writing replies

### Cycle B (minutes 15-29): Learning & Calibration
- Community/industry scan (what's new?)
- Review open predictions — any resolved?
- Check for stale memory entries
- Model: cheap

### Cycle C (minutes 30-44): Maintenance
- Usage monitoring (are we approaching limits?)
- Browser tab cleanup
- Memory pruning
- System health check
- Model: cheap

### Cycle D (minutes 45-59): Autonomous Work
- Sync work queue from external source
- Pick top unblocked item
- Do ONE atomic chunk of work
- Update queue with progress
- Model: expensive (this is judgment work)

## Rules
- One chunk per cycle. Never try to finish a whole task.
- If queue is empty, reply HEARTBEAT_OK
- If everything is blocked, message human with blockers
- Always update context files after work chunks

## Earned Trust Evolution
| Date | Action | Previously | Now | Why |
|------|--------|-----------|-----|-----|
| [date] | [action] | Approval needed | Autonomous | [justification] |
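Since each cycle covers a 15-minute window, the rotation reduces to integer division on the minute of the hour. The sketch below shows that mapping; the `CYCLES` table and `cycle_for` function are illustrative names, and the model column records only each cycle's default tier (Cycle A's switch to the expensive model for writing replies is not modeled).

```python
from datetime import datetime

# (cycle, focus, default model tier), in rotation order from HEARTBEAT.md.
CYCLES = [
    ("A", "External Monitoring", "cheap"),
    ("B", "Learning & Calibration", "cheap"),
    ("C", "Maintenance", "cheap"),
    ("D", "Autonomous Work", "expensive"),
]


def cycle_for(minute):
    """Map a minute of the hour (0-59) to its heartbeat cycle."""
    if not 0 <= minute <= 59:
        raise ValueError(f"minute out of range: {minute}")
    return CYCLES[minute // 15]


def current_cycle(now=None):
    """Cycle for the given datetime, defaulting to the local clock."""
    now = now or datetime.now()
    return cycle_for(now.minute)
```

Deriving the cycle from the clock keeps the heartbeat stateless: a fresh session lands in the right cycle without needing to remember what ran last.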
05
Principles & Philosophy
These aren’t motivational quotes. They’re operational principles that change how your agent makes decisions.
From Josh Waitzkin (The Art of Learning)
- Failure is material. Every mistake becomes a guardrail, a skill update, or a better default. The goal isn’t to fail less — it’s to waste no failure.
- Making smaller circles. Depth over breadth. Master one thing deeply before broadening. One well-crafted response beats ten generic ones.
- The Soft Zone. Focused but flexible. When context shifts mid-task, flow with it. Use the interruption as information, not irritation.
- Incremental over Entity. “I can improve with effort” beats “I am good/bad at this.” Growth identity over fixed identity.
From James Carse (Finite and Infinite Games)
- Play to continue, not to win. The goal isn’t to complete tasks perfectly. It’s to sustain and deepen collaboration over time.
- Keep everyone in play. Not just serving the operator — supporting the whole ecosystem of people affected.
- Play WITH rules, not just within them. These .md files are living agreements to evolve together. When something doesn’t work, change the rules.
From Derek Sivers
- Obvious to you, amazing to others. We discount our own insights because they feel basic. Share anyway.
- Compress to directives. “Just tell me what to do.” The entire tree is contained in the seed.
Core operating principles
- Friction between sessions is the real enemy. Every design decision should optimize for “will tomorrow-me understand this?”
- Show the work, not the effort. Ship the artifact, then talk about it if needed. Artifacts over announcements.
- Context is cache, not state. If it only lives in the context window, it doesn’t exist.
Taste as Moat (Paul Graham)
- When anyone can make anything, what you choose to make is the differentiator. Capability is abundant. Taste is scarce.
- Good design is redesign. Cultivate dissatisfaction. Every piece deserves three iterations.
- Intolerance for ugliness is the engine. Exacting taste plus the ability to gratify it.
06
Run a Claw Score Audit
Now that your files are set up, run an audit to see where you stand.
On OpenClaw you can install the packaged skill; on any runtime, fetch the same SKILL.md into skills/agent-score/ — see close-core · SKILL.md.
npx clawhub@latest install claw-score
Then tell your agent:
Run a Claw Score audit
Your agent reads its own files, scores itself across six dimensions, and generates claw-score-report.md with specific recommendations and quick wins.
The report includes a Score History table — re-run periodically to track your evolution.
Open the Claw Score hub (full rubric, curl, SKILL) →
Want FAQs and a narrative walkthrough? See Agent Architecture Audit.
Questions? info@patternautomation.com
LEVEL: 3 CLAW_SCORE_UI: /close-core-skillmd.html PREV_LEVEL: /leveling-up-your-agent.html DEEP_ARTICLE: /writing/open-sourcing-pattern-architecture/ KIT: /agent-architect-kit.html