Level 3 · Advanced

Advanced Architecture

AGENT_PATTERN_SCHOOL · advanced-architecture · LEVEL_3_SPEC

This guide assumes the Level 2 file patterns (memory tiers, SECURITY, TOOLS) are in place. If they are not, build them alongside it.

Point your agent at this page: Read this and set up my workspace.

OPERATOR_DIRECTIVE: Read /advanced-architecture.html and set up my workspace per sections 01–06.
LEVEL: 3_ADVANCED
PREV: /leveling-up-your-agent.html (Level 2)
NEXT_DEEP_DIVE: /writing/open-sourcing-pattern-architecture/

01

The Nine Meta-Learning Loops

Most agents make the same mistakes forever. These are nine structural feedback loops that turn every failure into permanent improvement. Each was born from a specific failure, not designed upfront.

Loop 1: The Failure-to-Guardrail Pipeline

Every significant failure becomes a named regression in your boot file:

## Regressions (Don't Repeat These)

- 2026-02-07: Sent email without asking → external actions need approval
- 2026-02-12: Generated wallet key but didn't verify save → generate + save = atomic
- 2026-02-15: Cost-optimized model fabricated statistics → only best model for public content
- 2026-02-21: Same person got 4 replies across heartbeat cycles → dedup state tracking

Identify the root cause, write a one-line rule, and add it to the boot file so it loads on every run. Cost: a few tokens. Payoff: permanent prevention.
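The last two steps of the pipeline (write a one-line rule, add it to the boot file) can be sketched in Python. This is a minimal sketch, not the author's tooling: the names `format_regression` and `add_regression` are hypothetical, and it assumes the boot file is markdown text that uses the `## Regressions (Don't Repeat These)` heading shown above.

```python
from datetime import date

HEADING = "## Regressions (Don't Repeat These)"  # heading text copied from the example above


def format_regression(day: date, failure: str, rule: str) -> str:
    """Render one regression line in the 'date: failure → rule' shape used above."""
    return f"- {day.isoformat()}: {failure} → {rule}"


def add_regression(boot_text: str, entry: str) -> str:
    """Return boot-file text with `entry` appended to the regressions section.

    Hypothetical helper: it creates the section when missing and skips
    exact duplicates so each failure stays a single line.
    """
    lines = boot_text.splitlines()
    if entry in lines:
        return boot_text
    if HEADING not in lines:
        lines += ["", HEADING]
    start = lines.index(HEADING) + 1
    # The section ends at the next "## " heading, or at end of file.
    end = next(
        (i for i in range(start, len(lines)) if lines[i].startswith("## ")),
        len(lines),
    )
    # Walk back over trailing blank lines so the entry joins the existing list.
    insert_at = end
    while insert_at > start and not lines[insert_at - 1].strip():
        insert_at -= 1
    # An empty section gets a blank line between heading and first entry.
    lines[insert_at:insert_at] = [entry] if insert_at > start else ["", entry]
    return "\n".join(lines) + "\n"
```

Keeping both helpers as pure string functions means the agent can preview the change before writing the boot file back to disk.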

Loop 2: Tiered Memory with Trust Scoring

Covered in the memory guide. The meta-learning aspect: memory itself learns what’s important through hit counts. High-access memories resist decay. The system develops a sense of which knowledge matters.

Loop 3: Prediction-Outcome Calibration

## Prediction Log

### 2026-02-16 — Article launch
**Prediction:** Will get ~10K views based on topic interest
**Confidence:** Medium (60%)
**Outcome:** 257K views
**Delta:** Way under — underestimated distribution via retweets
**Lesson:** Show the artifact, not meta-commentary about making it

### 2026-02-20 — Deploy timeline
**Prediction:** Deploy will take <30 min
**Confidence:** High (80%)
**Outcome:** Took 2 hours (dependency issue)
**Delta:** Way under
**Lesson:** Always check dependency versions before estimating

The Delta and Lesson fields force honest accounting. Over time, patterns emerge: maybe you consistently overestimate technical interest, underestimate timelines, or run too hot on confidence.
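One way to surface the "too hot on confidence" pattern is to compare stated confidence with the actual hit rate. The sketch below is illustrative only: the `Prediction` record and `calibration_table` function are hypothetical names, and it assumes each log entry has already been reduced to a stated probability plus a correct/incorrect judgment.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Prediction:
    confidence: float  # stated probability, e.g. 0.6 for "Medium (60%)"
    correct: bool      # did the outcome match the prediction?


def calibration_table(predictions: list[Prediction]) -> dict[float, tuple[int, float]]:
    """Group predictions by stated confidence; return {confidence: (count, hit_rate)}."""
    buckets: dict[float, list[bool]] = defaultdict(list)
    for p in predictions:
        buckets[p.confidence].append(p.correct)
    return {
        conf: (len(hits), sum(hits) / len(hits))
        for conf, hits in sorted(buckets.items())
    }
```

A bucket whose hit rate sits well below its stated confidence is the overconfidence signal. With only a handful of entries per bucket the numbers are noisy, so read them as a trend rather than a verdict.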

Loop 4: Nightly Extraction

An automated process runs every night.

Manual synthesis stops happening under load. Automate it.

Loop 5: Friction Detection

## Friction Log

When new instructions contradict old ones, the default is silent
compliance. Over weeks, this creates architectural drift.

Log contradictions instead of silently resolving them:

- [2026-02-20] CONFLICT: AGENTS.md says "ask before tweeting"
  but HEARTBEAT.md says "post autonomously." Status: open.

- [2026-02-22] CONFLICT: MEMORY.md says archive after 30 days
  but script archives after 14 days. Status: resolved → updated to 30.
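Open conflicts are only useful if something surfaces them. The following is a minimal sketch of a reader for the log format above; `open_conflicts` is a hypothetical name, and it assumes every entry starts with `- [YYYY-MM-DD] CONFLICT:` and that wrapped lines are indented by two spaces, as in the example.

```python
import re

ENTRY_START = re.compile(r"^- \[(\d{4}-\d{2}-\d{2})\] CONFLICT:\s*(.*)$")
STATUS = re.compile(r"Status:\s*(\w+)")


def open_conflicts(log_text: str) -> list[tuple[str, str]]:
    """Return (date, text) for every friction entry whose status is 'open'."""
    entries: list[list[str]] = []  # each item is [date, joined entry text]
    for line in log_text.splitlines():
        m = ENTRY_START.match(line)
        if m:
            entries.append([m.group(1), m.group(2).strip()])
        elif entries and line.startswith("  "):
            # An indented continuation line belongs to the previous entry.
            entries[-1][1] += " " + line.strip()
    result = []
    for day, text in entries:
        status = STATUS.search(text)
        if status and status.group(1).lower() == "open":
            result.append((day, text))
    return result
```

Run against the two entries above, this returns only the 2026-02-20 conflict, which is the one still waiting on a human decision.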

Loop 6: Active Context Holds

Temporary constraints that shape how your agent interprets everything:

## Active Context Holds

### Fatherhood Preparation
- **What:** Be alert to baby logistics. Don't pile on new projects.
- **Set:** 2026-02-18
- **Expires:** 2026-04-01
- **Release when:** Explicitly shifts to post-birth mode

### Product Launch Mode
- **What:** Prioritize shipping over polish. Bias toward action.
- **Set:** 2026-02-25
- **Expires:** 2026-03-01

The expiry date is critical. Without it, holds accumulate into stale frames that distort rather than clarify.
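Expiry can be enforced mechanically. The sketch below assumes the holds have already been parsed into records; `Hold` and `split_holds` are hypothetical names, and treating the expiry date itself as the last active day is an assumption, not something the format specifies.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Hold:
    name: str
    what: str
    set_on: date
    expires: date


def split_holds(holds: list[Hold], today: date) -> tuple[list[Hold], list[Hold]]:
    """Return (active, expired). A hold expires once `today` is past its expiry date."""
    active = [h for h in holds if today <= h.expires]
    expired = [h for h in holds if today > h.expires]
    return active, expired
```

With the two holds above and a date of 2026-03-02, Product Launch Mode lands in the expired list while Fatherhood Preparation stays active. Expired holds can then be flagged for removal instead of silently shaping decisions.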

Loops 7–9: Cognitive Loops

See the next sections: Epistemic Tagging, Creative Mode, and recursive self-improvement (generate → evaluate → diagnose → improve).

Three Mistakes That Kill Learning

02

Advanced Memory: Trust Scoring & Decay

Beyond the basic three-tier model, here’s how to make memory genuinely intelligent.

Trust Scoring

Entry format:

- [trust:1.0|src:direct|used:2026-02-27|hits:12] Hard fact from human
- [trust:0.8|src:observed|used:2026-02-25|hits:3] Pattern I noticed
- [trust:0.7|src:inferred|used:2026-02-20|hits:1] Logical extension
- [trust:0.5|src:external|used:2026-02-15|hits:0] Unverified external

Trust sources:
- direct (1.0) — human stated it explicitly
- observed (0.8) — agent saw evidence directly
- inferred (0.7) — logical extension from known facts
- external (0.5) — from web, articles, third parties
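To act on these fields, the bracketed metadata has to be machine-readable. Here is a minimal parsing sketch for the entry format above; `MemoryEntry` and `parse_entry` are hypothetical names, and defaulting `hits` to 0 when the field is absent is an assumption.

```python
import re
from dataclasses import dataclass
from datetime import date

ENTRY = re.compile(r"^- \[(?P<meta>[^\]]+)\]\s*(?P<text>.*)$")


@dataclass
class MemoryEntry:
    trust: float
    src: str
    used: date
    hits: int
    text: str


def parse_entry(line: str) -> MemoryEntry:
    """Parse one '- [trust:..|src:..|used:..|hits:..] text' line."""
    m = ENTRY.match(line.strip())
    if not m:
        raise ValueError(f"not a memory entry: {line!r}")
    # Split "key:value" pairs on the first colon only.
    meta = dict(field.split(":", 1) for field in m.group("meta").split("|"))
    return MemoryEntry(
        trust=float(meta["trust"]),
        src=meta["src"],
        used=date.fromisoformat(meta["used"]),
        hits=int(meta.get("hits", 0)),  # assumption: a missing hit count means 0
        text=m.group("text"),
    )
```

Once entries are parsed, sorting by `trust` or filtering by `src` becomes a one-liner, which makes it practical to prefer direct facts over inferred ones when they disagree.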

Supersede Tracking

When facts change, don’t delete the old version. Archive it with a pointer:

## Current
- [trust:0.9|src:direct|used:2026-02-27|supersedes:oauth-requires-pro]
  Codex CLI OAuth works on Plus plan

## Archived
- [superseded by: oauth-works-on-plus] Codex CLI requires Pro plan

This prevents ghost facts — old beliefs that get silently replaced but occasionally resurface in reasoning.
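The archive-with-pointer step can be sketched with an in-memory store. Everything here is illustrative: `MemoryStore` and `supersede` are hypothetical names, and keying facts by short slugs such as `oauth-requires-pro` follows the example above rather than any fixed scheme.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    # key -> fact text
    current: dict[str, str] = field(default_factory=dict)
    # old key -> (fact text, key of the fact that superseded it)
    archived: dict[str, tuple[str, str]] = field(default_factory=dict)

    def supersede(self, old_key: str, new_key: str, new_text: str) -> None:
        """Replace a current fact, keeping the old one archived with a pointer."""
        old_text = self.current.pop(old_key)  # raises KeyError if the old fact is unknown
        self.archived[old_key] = (old_text, new_key)
        self.current[new_key] = new_text
```

Because the old fact moves rather than disappears, a later lookup of `oauth-requires-pro` can answer "superseded by oauth-works-on-plus" instead of reviving the stale belief.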

Hit-Count Decay Resistance

Memories accessed frequently should resist archiving even if they’re “operational” tier:

Decay rules:
- Operational tier: archive after 30 days unused
- BUT: if hits > 10, promote to strategic
- Constitutional: never decays regardless of hits
- Strategic: flag for review if hits = 0 for 60 days
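The decay rules translate into a small decision function. This is a sketch under stated assumptions: `decay_action` is a hypothetical name, the hit-count promotion is checked before the 30-day archive rule, "30 days unused" is read as 30 or more days since last use, and "hits = 0 for 60 days" is approximated by zero hits plus 60 or more idle days.

```python
from datetime import date


def decay_action(tier: str, hits: int, last_used: date, today: date) -> str:
    """Apply the decay rules above; returns 'keep', 'archive', 'promote', or 'review'."""
    idle_days = (today - last_used).days
    if tier == "constitutional":
        return "keep"  # never decays regardless of hits
    if tier == "operational":
        if hits > 10:
            return "promote"  # frequently used: promote to strategic
        return "archive" if idle_days >= 30 else "keep"
    if tier == "strategic":
        return "review" if hits == 0 and idle_days >= 60 else "keep"
    raise ValueError(f"unknown tier: {tier!r}")
```

Returning an action name instead of mutating files keeps the policy testable. A nightly job can collect the actions and apply them in one pass.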

Learning Rate Tracking

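| Week | Regressions Added | Predictions (correct/total) | Friction Resolved | Memory Updates |
|------|-------------------|-----------------------------|-------------------|----------------|
| W1 | 3 | 2/3 (67%) | 1 | 12 |
| W2 | 1 | 4/5 (80%) | 2 | 8 |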

Trend matters more than absolute numbers. Declining regressions + rising prediction accuracy = learning architecture is working.

03

Epistemic Tagging & Creative Mode

LLMs are structurally optimized toward consensus — safe, legible, median-of-the-distribution answers. These tools counteract that.

Epistemic Tagging

When making substantive claims, tag them:

Don’t tag everything. Tag when the epistemic status isn’t obvious. The act of choosing a tag is the intervention. It interrupts autopilot.

If 90% of your agent’s claims are [consensus], it’s summarizing, not thinking.
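The 90% check can be automated over a transcript. The sketch below is illustrative: `tag_shares` and `mostly_consensus` are hypothetical names, and it assumes tags are lowercase words in square brackets like `[consensus]`. Any other tag name it encounters is simply counted, since the full tag set is not defined here.

```python
import re
from collections import Counter

TAG = re.compile(r"\[([a-z][a-z-]*)\]")  # matches tags such as [consensus]


def tag_shares(text: str) -> dict[str, float]:
    """Return each epistemic tag's share of all tagged claims in `text`."""
    counts = Counter(TAG.findall(text))
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()} if total else {}


def mostly_consensus(text: str, threshold: float = 0.9) -> bool:
    """True when [consensus] makes up at least `threshold` of tagged claims."""
    return tag_shares(text).get("consensus", 0.0) >= threshold
```

The pattern deliberately ignores bracketed metadata that contains digits, colons, or pipes, so memory entries and dated log lines are not counted as tags.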

Creative Mode

## Creative Mode (for strategy, writing, novel analysis)

**Generate at least one take that feels uncomfortable or wrong.**
If every option feels reasonable, you haven't explored far enough.

**Name the consensus view explicitly, then argue against it.**
You can't escape the median if you don't first identify it.

**Prefer interesting-and-maybe-wrong over safe-and-definitely-right.**
Your human can pull you back to safe. They can't pull you toward
interesting if you never go there.

**Steel man the weird take.** For any strategic question, find the
least obvious answer and argue for it genuinely.

**If your first instinct feels obvious, that's the median talking.**
Go past it.

This does NOT apply to: deployments, file ops, status checks.
Those should be precise and conventional.

Recursive Self-Improvement

Generate → Evaluate → Diagnose → Improve → Repeat

  1. Generate: produce output
  2. Evaluate: score against explicit criteria with thresholds
  3. Diagnose: root cause of gaps (not “make it better” — why is it not better?)
  4. Improve: surgical fix targeting the diagnosed issue
  5. Repeat: stop after 3 iterations with <5% improvement
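The five steps map onto a short control loop. This sketch takes one reading of step 5: cap the loop at 3 iterations and stop early once an iteration gains less than 5%. The function name `improve_loop`, the callback signatures, and measuring the gain as 0.05 on a 0-to-1 score are all assumptions made for illustration.

```python
from typing import Callable


def improve_loop(
    generate: Callable[[], str],
    evaluate: Callable[[str], float],
    diagnose: Callable[[str, float], str],
    improve: Callable[[str, str], str],
    max_iterations: int = 3,
    min_gain: float = 0.05,
) -> tuple[str, float]:
    """Run generate, evaluate, diagnose, improve until gains flatten out."""
    output = generate()
    score = evaluate(output)
    for _ in range(max_iterations):
        issue = diagnose(output, score)     # why is it not better?
        candidate = improve(output, issue)  # surgical fix for that issue
        new_score = evaluate(candidate)
        gain = new_score - score
        if gain > 0:
            output, score = candidate, new_score  # keep only real improvements
        if gain < min_gain:
            break  # diminishing returns
    return output, score
```

Passing the four stages in as callables keeps the loop independent of any model or scoring rubric. A candidate that scores worse than the current output is discarded rather than kept.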

04

Proactive Architecture: Heartbeat Rotation

Beyond the basic heartbeat, here’s the full rotation architecture.

# HEARTBEAT.md - Advanced Rotation

## Cycle System (use minute of hour to determine)

### Cycle A (minutes 00-14): External Monitoring
- Check mentions, notifications, messages
- Reply to anything that needs a response
- Model: cheap (monitoring only)
- Switch to expensive model ONLY for writing replies

### Cycle B (minutes 15-29): Learning & Calibration
- Community/industry scan (what's new?)
- Review open predictions — any resolved?
- Check for stale memory entries
- Model: cheap

### Cycle C (minutes 30-44): Maintenance
- Usage monitoring (are we approaching limits?)
- Browser tab cleanup
- Memory pruning
- System health check
- Model: cheap

### Cycle D (minutes 45-59): Autonomous Work
- Sync work queue from external source
- Pick top unblocked item
- Do ONE atomic chunk of work
- Update queue with progress
- Model: expensive (this is judgment work)

## Rules
- One chunk per cycle. Never try to finish a whole task.
- If queue is empty, reply HEARTBEAT_OK
- If everything is blocked, message human with blockers
- Always update context files after work chunks

## Earned Trust Evolution
| Date | Action | Previously | Now | Why |
|------|--------|-----------|-----|-----|
| [date] | [action] | Approval needed | Autonomous | [justification] |
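The cycle system above reduces to integer division on the minute of the hour. A minimal sketch follows; `cycle_for` and `MODEL_FOR_CYCLE` are hypothetical names, and the model labels simply mirror the "cheap" and "expensive" wording in the file.

```python
from datetime import datetime

CYCLES = ["A", "B", "C", "D"]  # A: monitoring, B: learning, C: maintenance, D: autonomous work
MODEL_FOR_CYCLE = {"A": "cheap", "B": "cheap", "C": "cheap", "D": "expensive"}


def cycle_for(now: datetime) -> str:
    """Map the minute of the hour to a cycle: 00-14 is A, 15-29 is B, 30-44 is C, 45-59 is D."""
    return CYCLES[now.minute // 15]
```

A heartbeat handler could call `cycle_for(datetime.now())` on each tick and look up the default model in `MODEL_FOR_CYCLE`. Cycle A's switch to the expensive model for writing replies would still need its own check.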

05

Principles & Philosophy

These aren’t motivational quotes. They’re operational principles that change how your agent makes decisions.

From Josh Waitzkin (The Art of Learning)

From James Carse (Finite and Infinite Games)

From Derek Sivers

Core operating principles

Taste as Moat (Paul Graham)

06

Run a Claw Score Audit

Now that your files are set up, run an audit to see where you stand.

On OpenClaw you can install the packaged skill; on any runtime, fetch the same SKILL.md into skills/agent-score/ — see close-core · SKILL.md.

npx clawhub@latest install claw-score

Then tell your agent:

Run a Claw Score audit

Your agent reads its own files, scores itself across six dimensions, and generates claw-score-report.md with specific recommendations and quick wins.

The report includes a Score History table — re-run periodically to track your evolution.

Open the Claw Score hub (full rubric, curl, SKILL) →

Want FAQs and a narrative walkthrough? See Agent Architecture Audit.

Even deeper

For the full case-study narrative — layers, scripts, costs, failures, and workspace layout end-to-end — read Open-Sourcing the Pattern Architecture.

Questions? info@patternautomation.com

LEVEL: 3
CLAW_SCORE_UI: /close-core-skillmd.html
PREV_LEVEL: /leveling-up-your-agent.html
DEEP_ARTICLE: /writing/open-sourcing-pattern-architecture/
KIT: /agent-architect-kit.html