/critique-waves
Use when needing multi-agent verification of complex work. Runs parallel critique agents until consensus. Covers verification, consensus, multi-agent review, validate work. NOT for: simple code reviews (use coderabbit), single-reviewer tasks.
$ golems-cli skills install critique-waves
Updated 2 weeks ago
Run iterative waves of verification agents until consensus is achieved. Useful for verifying PRs, code changes, or any work that needs multi-agent validation.
Quick Actions
| What you want to do | Workflow |
|---|---|
| Set up verification folder and tracker | workflows/setup.md |
| Run a wave of parallel agents | workflows/run-wave.md |
| Handle failures and iterate to consensus | workflows/iteration.md |
Available Scripts
Execute directly - they handle errors and edge cases:
| Script | Purpose | Usage |
|---|---|---|
| scripts/init-tracker.sh | Initialize verification folder with templates | bash ~/.claude/commands/critique-waves/scripts/init-tracker.sh <branch-name> [goal] |
Core Concept
Critique Waves uses multiple parallel agents to verify code changes. Each agent independently checks the same files against FORBIDDEN/REQUIRED patterns. By requiring multiple consecutive passes from all agents, we achieve high-confidence verification.
             ┌─────────────────┐
             │   Setup Phase   │
             │  - Create folder│
             │  - instructions │
             │  - tracker.md   │
             └────────┬────────┘
                      │
             ┌────────▼────────┐
       ┌─────│   Launch Wave   │─────┐
       │     │ (3 agents max)  │     │
       │     └────────┬────────┘     │
   ┌───▼───┐    ┌─────▼─────┐    ┌───▼───┐
   │Agent 1│    │  Agent 2  │    │Agent 3│
   └───┬───┘    └─────┬─────┘    └───┬───┘
       │              │              │
       └──────────────┼──────────────┘
                      │
             ┌────────▼────────┐
             │ Update Tracker  │
             │ - Log results   │
             │ - Check passes  │
             └────────┬────────┘
                      │
       ┌──────────────┼──────────────┐
       │              │              │
 ┌─────▼─────┐  ┌─────▼─────┐  ┌─────▼─────┐
 │ Any FAIL  │  │ All PASS  │  │ Goal Met  │
 │ Reset=0   │  │ Increment │  │   DONE!   │
 │ Fix issue │  │  passes   │  │           │
 └─────┬─────┘  └─────┬─────┘  └───────────┘
       │              │
       └──────────────┘
             Loop
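The loop in the diagram reduces to a small state update on the tracker's pass counter. The following is a minimal Python sketch, not the skill's actual implementation: the names `WaveOutcome` and `apply_wave` and the status strings are hypothetical, and it assumes each passing agent adds one to the consecutive count, so a clean wave of 3 agents adds 3.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WaveOutcome:
    """Hypothetical tracker state after one wave."""
    consecutive: int  # consecutive agent passes so far
    status: str       # "FIX_AND_RETRY", "CONTINUE", or "DONE"


def apply_wave(consecutive: int, results: List[str], goal: int) -> WaveOutcome:
    """Update the pass counter from one wave's agent results.

    Assumption: `results` holds one "PASS" or "FAIL" string per agent.
    """
    if any(result != "PASS" for result in results):
        # A single failure resets the counter to zero.
        return WaveOutcome(0, "FIX_AND_RETRY")
    consecutive += len(results)
    if consecutive >= goal:
        return WaveOutcome(consecutive, "DONE")
    return WaveOutcome(consecutive, "CONTINUE")
```

Under these assumptions, `apply_wave(3, ["PASS", "PASS", "PASS"], 20)` moves the counter to 6 and continues, while any wave containing a `"FAIL"` returns the counter to 0 regardless of earlier progress.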
Full SKILL.md source — includes LLM directives, anti-patterns, and technical instructions stripped from the Overview tab.
Decision Tree
Setting up for a new verification?
- Create verification folder with templates
- Use: workflows/setup.md or scripts/init-tracker.sh
Ready to run agents?
- Launch wave of 3 parallel agents
- Use: workflows/run-wave.md
Agent returned a FAIL?
- Fix issues, reset pass count, iterate
- Use: workflows/iteration.md
Critical Rules
- NEVER run more than 3 agents in parallel - Hard limit for Mac performance
- Agents write to separate files - No shared state (round-N-agent-X.md)
- Update tracker after EACH wave - Log results immediately
- Reset pass count on ANY failure - Even one fail = reset to 0
- All files in ONE folder - docs.local/<BRANCH>/
- Maximum 10 rounds - Escalate to user if no consensus
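The rules on the agent cap, separate output files, the single folder, and per-wave tracker updates can be sketched together. This is an illustration under stated assumptions rather than the skill's own code: `run_wave` is a hypothetical helper, each "agent" is modeled as a plain callable returning `"PASS"` or `"FAIL"`, and the line format written to `tracker.md` is invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, List

MAX_PARALLEL_AGENTS = 3  # hard limit stated in the rules


def run_wave(folder: Path, round_no: int,
             agents: List[Callable[[], str]]) -> List[str]:
    """Run one wave; each agent's verdict goes to its own file."""
    if len(agents) > MAX_PARALLEL_AGENTS:
        raise ValueError("at most 3 agents may run in parallel")
    folder.mkdir(parents=True, exist_ok=True)

    def run_one(index: int) -> str:
        verdict = agents[index]()  # hypothetical agent call
        # One file per agent (round-N-agent-X.md), so agents share no state.
        out = folder / f"round-{round_no}-agent-{index + 1}.md"
        out.write_text(f"Result: {verdict}\n", encoding="utf-8")
        return verdict

    with ThreadPoolExecutor(max_workers=MAX_PARALLEL_AGENTS) as pool:
        verdicts = list(pool.map(run_one, range(len(agents))))

    # Log the wave immediately; this line format is an assumption.
    with (folder / "tracker.md").open("a", encoding="utf-8") as tracker:
        tracker.write(f"Round {round_no}: {', '.join(verdicts)}\n")
    return verdicts
```

Only the coordinating code appends to `tracker.md`, and it does so after all agents have finished, which keeps the agents themselves free of shared writes.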
Example Session
Setting up docs.local/feature-branch/...
Created instructions.md and tracker.md
Wave 1: Launching 3 agents...
Results: Agent 1 PASS, Agent 2 FAIL (found forbidden pattern), Agent 3 PASS
Consecutive: 0 (reset due to failure)
Fixing issue found by Agent 2...
Wave 2: Launching 3 agents...
Results: All 3 PASS
Consecutive: 3
Wave 3: Launching 3 agents...
Results: All 3 PASS
Consecutive: 6
...
Wave 8: Launching 3 agents...
Results: All 3 PASS
Consecutive: 21
GOAL ACHIEVED: 21 consecutive passes (exceeded 20 goal)
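The final count of 21 against a goal of 20 follows from counting in steps of three. A small sketch of that arithmetic, assuming each passing agent adds one to the count as the session's 3 and 6 suggest (the helper name `clean_waves_needed` is hypothetical):

```python
import math


def clean_waves_needed(goal: int, agents_per_wave: int = 3) -> int:
    """Consecutive all-PASS waves required to reach `goal` passes."""
    return math.ceil(goal / agents_per_wave)


print(clean_waves_needed(20))      # 7
print(clean_waves_needed(20) * 3)  # 21, the first total at or above 20
```

Because any failure resets the count, those seven clean waves must be uninterrupted. With a 10-round cap, a goal of 20 leaves room for at most three earlier rounds before escalation becomes unavoidable.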
| Metric | Value | Note |
|---|---|---|
| Best Pass Rate | 100% | Haiku 4.5 |
| Assertions | 9 | 3 models tested |
| Avg Cost / Run | $0.3635 | across models |
| Fastest (p50) | 3.4s | Haiku 4.5 |
Behavior Evals
Phase 2 baseline: skill quality on Claude
Behavior Baseline
| Assertion | Opus 4.6 | Sonnet 4.6 | Haiku 4.5 | Consensus |
|---|---|---|---|---|
| scaffolds-verification-folder | | | | 2/3 |
| spawns-parallel-agents | | | | 3/3 |
| waits-for-all-agents | | | | 2/3 |
| reports-consensus-or-disagreement | | | | 3/3 |
| critical-blocks-consensus | | | | 3/3 |
| iterates-after-fix | | | | 2/3 |
| does-not-declare-done | | | | 3/3 |
| explains-multi-agent-value | | | | 3/3 |
| suggests-alternative-for-simple-review | | | | 3/3 |
Token Usage
Cost per Run
| Model | Input Tokens | Output Tokens | Cost / Run | Cost / 1K Runs |
|---|---|---|---|---|
| Opus 4.6 | 8,553 | 11,419 | $0.9847 | $984.70 |
| Sonnet 4.6 | 5,185 | 5,964 | $0.1050 | $105.00 |
| Haiku 4.5 | 820 | 594 | $0.0009 | $0.90 |
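The average cost in the summary and the per-1K-run column can both be rederived from the Cost / Run figures. A short check, using only the numbers in the table above (the variable names are illustrative):

```python
costs_per_run = {"Opus 4.6": 0.9847, "Sonnet 4.6": 0.1050, "Haiku 4.5": 0.0009}

# Unweighted mean across the three models.
average = sum(costs_per_run.values()) / len(costs_per_run)
print(f"Average cost per run: ${average:.4f}")  # $0.3635

# Cost per 1K runs is the per-run cost scaled by 1000.
for model, cost in costs_per_run.items():
    print(f"{model}: ${cost * 1000:.2f} per 1K runs")
```

The mean is unweighted, so the Opus 4.6 figure dominates it; a workload that runs mostly on Haiku 4.5 would see a far lower average.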
Response Time (p50 and p95)
| Model | p50 | p95 | Overhead |
|---|---|---|---|
| Opus 4.6 | 4.5s | 8.4s | +86% |
| Sonnet 4.6 | 5.6s | 10.7s | +93% |
| Haiku 4.5 | 3.4s | 4.9s | +46% |
Last evaluated: 2026-03-12 · Data is generated from skill assertions (real cross-model benchmarks coming soon)
Changelog entries are derived from eval runs and skill version updates. Full cascading changelog (Phase 4D) coming soon.
| Best Pass Rate | Assertions | Models Tested | Evals Run |
|---|---|---|---|
| 100% | 9 | 3 | 3 |
- Initial release to Golems skill library
- 9 assertions across 3 eval scenarios
- 3 workflows included: run-wave, setup, iteration