/cli-agents

Run external CLI agents (Gemini, Cursor, Codex, Kiro, Claude) in cmux panes. Workers run in a split of the current workspace; audits and research open in a new, named workspace. Every agent is interactive and visible.

$ golems-cli skills install cli-agents

Good · 100% best pass rate · 13 assertions · 3 evals · Updated 2 weeks ago

External AI agents are spawned as interactive cmux panes. There are two patterns, chosen by intent:

Intent	Where	Why
Worker (code, implementation, collab)	Split in current workspace	Visible side-by-side, easy to monitor
Audit/Research (read-only, analysis)	New workspace, named	Doesn't clutter working view, find in sidebar

Spawning Workers (split in current workspace)

# Split right, label it, send the agent command
SURFACE=$(cmux new-split right | awk '{print $2}')
cmux rename-tab --surface "$SURFACE" "Agent Label"
cmux send --surface "$SURFACE" "cd ~/Gits/TARGET_REPO && AGENT_CMD"
cmux send-key --surface "$SURFACE" Return
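
The four steps above can be wrapped in a small helper. This is a minimal sketch, not part of the skill: `spawn_worker` is a hypothetical name, and it assumes (as the `awk '{print $2}'` in the snippet implies) that `cmux new-split right` prints the new surface reference as its second whitespace-separated field.

```shell
# spawn_worker LABEL REPO AGENT_CMD
# Hypothetical helper: split right, label the tab, and start the agent.
spawn_worker() {
  sw_label=$1
  sw_repo=$2
  sw_cmd=$3

  # Assumption: the surface ref is the second field of new-split's output.
  sw_surface=$(cmux new-split right | awk '{print $2}')
  if [ -z "$sw_surface" ]; then
    echo "spawn_worker: no surface returned by cmux new-split" >&2
    return 1
  fi

  cmux rename-tab --surface "$sw_surface" "$sw_label"
  cmux send --surface "$sw_surface" "cd ~/Gits/$sw_repo && $sw_cmd"
  cmux send-key --surface "$sw_surface" Return

  # Print the surface so the caller can monitor it later.
  echo "$sw_surface"
}
```

Usage might look like `SURFACE=$(spawn_worker "golems" golems "codex --full-auto 'your prompt here'")`. Failing early on an empty surface avoids sending the agent command to the wrong pane.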

Agent commands

# Claude (use repoGolem function or raw CLI)
"claude -s 'your prompt here'"           # -s = --dangerously-skip-permissions
 
# Cursor (work mode — modifies files)
"cursor agent -p --model gpt-5.2-codex-xhigh --trust 'your prompt here'"
 
# Codex (work mode)
"codex --full-auto 'your prompt here'"
 
# Gemini (text-only, good for research)
"gemini 'your prompt here'"
 
# Kiro (text-only)
"kiro-cli chat --no-interactive 'your prompt here'"

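Prompts that contain a single quote would break the quoting in the command strings above. The sketch below builds the same command lines with the prompt escaped; `quote_prompt` and `agent_cmd` are hypothetical names, and the command strings are copied from the list above rather than verified against each CLI.

```shell
# quote_prompt TEXT
# Wrap TEXT in single quotes, escaping embedded single quotes as '\''.
quote_prompt() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# agent_cmd AGENT PROMPT
# Hypothetical lookup: print the command line for one of the agents above.
agent_cmd() {
  ac_prompt=$(quote_prompt "$2")
  case "$1" in
    claude) printf '%s\n' "claude -s $ac_prompt" ;;
    cursor) printf '%s\n' "cursor agent -p --model gpt-5.2-codex-xhigh --trust $ac_prompt" ;;
    codex)  printf '%s\n' "codex --full-auto $ac_prompt" ;;
    gemini) printf '%s\n' "gemini $ac_prompt" ;;
    kiro)   printf '%s\n' "kiro-cli chat --no-interactive $ac_prompt" ;;
    *)      echo "agent_cmd: unknown agent '$1'" >&2; return 1 ;;
  esac
}
```

The result can be dropped into the send step, for example `cmux send --surface "$SURFACE" "cd ~/Gits/golems && $(agent_cmd gemini "What's new in this repo?")"`.
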
Example: 3 parallel workers

# Launch sequentially (1Password biometric needs ~2-3s between spawns)
for repo in brainlayer golems voicelayer; do
  SURFACE=$(cmux new-split right | awk '{print $2}')
  cmux rename-tab --surface "$SURFACE" "$repo"
  cmux send --surface "$SURFACE" "cd ~/Gits/$repo && claude -s 'your task here'"
  cmux send-key --surface "$SURFACE" Return
  sleep 3  # Wait for Touch ID
done

Spawning Audits/Research (new workspace)

# New workspace so it doesn't clutter your working view
cmux new-workspace
SURFACE=$(cmux list-pane-surfaces --pane "$(cmux list-panes | tail -1 | grep -oE 'pane:[0-9]+')" | grep -oE 'surface:[0-9]+' | head -1)
cmux rename-tab --surface "$SURFACE" "Audit: reponame"
cmux send --surface "$SURFACE" "cd ~/Gits/TARGET_REPO && AGENT_CMD"
cmux send-key --surface "$SURFACE" Return
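
The surface lookup above is dense, so it can help to split it into steps that fail loudly. This is a sketch under the same assumptions the one-liner makes: the last line of `cmux list-panes` belongs to the new workspace, and the output contains `pane:N` and `surface:N` tokens. `newest_surface` is a hypothetical name.

```shell
# newest_surface
# Print the first surface of the last pane that cmux lists, or fail if none.
newest_surface() {
  # Assumption: the last listed pane is the one in the new workspace.
  ns_pane=$(cmux list-panes | tail -1 | grep -oE 'pane:[0-9]+' | head -1)
  if [ -z "$ns_pane" ]; then
    echo "newest_surface: no pane found" >&2
    return 1
  fi

  ns_surface=$(cmux list-pane-surfaces --pane "$ns_pane" \
    | grep -oE 'surface:[0-9]+' | head -1)
  if [ -z "$ns_surface" ]; then
    echo "newest_surface: no surface in $ns_pane" >&2
    return 1
  fi

  echo "$ns_surface"
}
```

With this in place the lookup becomes `SURFACE=$(newest_surface) || exit 1`, which stops the script instead of renaming or typing into an unintended tab.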

Example: Cursor audit in separate workspace

cmux new-workspace
# Get the new surface
SURFACE=$(cmux list-pane-surfaces --pane "$(cmux list-panes | tail -1 | grep -oE 'pane:[0-9]+')" | grep -oE 'surface:[0-9]+' | head -1)
cmux rename-tab --surface "$SURFACE" "Audit: golems"
cmux send --surface "$SURFACE" "cd ~/Gits/golems && cursor agent -p --output-format text --model gpt-5.2-codex-xhigh --trust 'Audit recent changes. Check: code quality, bugs, missing tests, security, dead code. Be harsh.'"
cmux send-key --surface "$SURFACE" Return

Example: Multi-agent audit wave (3 perspectives)

for i in 1 2 3; do
  cmux new-workspace
  SURFACE=$(cmux list-pane-surfaces --pane "$(cmux list-panes | tail -1 | grep -oE 'pane:[0-9]+')" | grep -oE 'surface:[0-9]+' | head -1)
  cmux rename-tab --surface "$SURFACE" "Audit $i: golems"
  cmux send --surface "$SURFACE" "cd ~/Gits/golems && cursor agent -p --output-format text --model gpt-5.2-codex-xhigh --trust 'ANGLE $i PROMPT'"
  cmux send-key --surface "$SURFACE" Return
  sleep 3
done
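
The loop above uses `'ANGLE $i PROMPT'` as a placeholder. One way to supply a distinct prompt per perspective is to pass the prompts as arguments, sketched below. `audit_wave` and the `SPAWN_DELAY` variable are hypothetical, and the cmux and cursor invocations are copied from the loop above.

```shell
# audit_wave REPO PROMPT...
# Hypothetical helper: one new workspace per prompt, launched sequentially.
# SPAWN_DELAY (seconds, default 3) is an assumed knob for the biometric pause.
audit_wave() {
  aw_repo=$1
  shift
  aw_i=0
  for aw_prompt in "$@"; do
    aw_i=$((aw_i + 1))
    cmux new-workspace
    aw_surface=$(cmux list-pane-surfaces \
      --pane "$(cmux list-panes | tail -1 | grep -oE 'pane:[0-9]+')" \
      | grep -oE 'surface:[0-9]+' | head -1)
    if [ -z "$aw_surface" ]; then
      echo "audit_wave: no surface for audit $aw_i" >&2
      return 1
    fi
    cmux rename-tab --surface "$aw_surface" "Audit $aw_i: $aw_repo"
    # Prompts must not contain single quotes; they are wrapped in them here.
    cmux send --surface "$aw_surface" "cd ~/Gits/$aw_repo && cursor agent -p --output-format text --model gpt-5.2-codex-xhigh --trust '$aw_prompt'"
    cmux send-key --surface "$aw_surface" Return
    sleep "${SPAWN_DELAY:-3}"
  done
}
```

For example: `audit_wave golems "Audit for security issues." "Audit for missing tests." "Audit for dead code."`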

Monitoring Agents

# Check what an agent is doing
cmux read-screen --surface surface:N --lines 8
 
# Check all agents at once
bash ~/.claude/commands/cli-agents/scripts/agent-status.sh
 
# Send follow-up to an agent
cmux send --surface surface:N "additional instructions"
cmux send-key --surface surface:N Return
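
To wait on an agent rather than check by hand, `read-screen` can be polled in a loop. This is a minimal sketch: `wait_for_screen` is a hypothetical name, and the text to look for is an assumption, since the source does not define a completion signal for any agent.

```shell
# wait_for_screen SURFACE TEXT [TRIES] [INTERVAL]
# Hypothetical helper: poll the last 8 screen lines until TEXT appears.
# Returns 0 on a match, 1 if TRIES polls pass without one.
wait_for_screen() {
  ws_surface=$1
  ws_text=$2
  ws_tries=${3:-60}
  ws_interval=${4:-5}
  while [ "$ws_tries" -gt 0 ]; do
    # -F treats TEXT as a fixed string, not a regular expression.
    if cmux read-screen --surface "$ws_surface" --lines 8 | grep -qF -- "$ws_text"; then
      return 0
    fi
    ws_tries=$((ws_tries - 1))
    sleep "$ws_interval"
  done
  return 1
}
```

A call such as `wait_for_screen surface:N "some completion text" 120 5 || echo "agent still running"` polls every 5 seconds for up to 10 minutes.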

Full SKILL.md source — includes LLM directives, anti-patterns, and technical instructions stripped from the Overview tab.

Agent Selection Guide

Task	Agent	Model	Notes
Plan auditing, perspectives	cursor	GPT-5.2 Codex	Sonnet-tier tasks, don't waste Opus
Code review, codebase analysis	cursor	GPT-5.2 Codex	Has @codebase access
Quick research, comparisons	gemini	Gemini 2.5 Pro	Free tier, 1K requests/day
Parallel implementation	cursor/codex	varies	Work mode, modifies files
Collab agent (writes to collab file)	claude -s	Opus/Sonnet	Use Sonnet for audit collabs
Deep reasoning, architecture	claude -s	Opus	Only when needed

Cursor Bug Bot (GitHub PR Reviews)

# Trigger inline PR review on GitHub
gh pr comment <N> --body "cursor review"

Rules

Workers = split, Audits = new workspace — don't mix
Launch sequentially — 1Password biometric needs ~2-3s between spawns
Use Sonnet for audits — don't waste Opus on read-only analysis
Name workspaces clearly — "Audit: reponame", "Research: topic"
Monitor with read-screen — don't assume agents finished
Verify cursor findings — cursor can't always tell if files are git-tracked. Check with git ls-files before acting.
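
The `git ls-files` check in the last rule can be scripted. The sketch below uses `git ls-files --error-unmatch`, which exits non-zero when a path is not tracked; the function names are hypothetical, and it must be run from inside the repository being audited.

```shell
# is_tracked PATH
# Succeeds only if PATH is tracked by git in the current repository.
is_tracked() {
  git ls-files --error-unmatch -- "$1" >/dev/null 2>&1
}

# report_untracked
# Read reported paths from stdin, one per line, and flag untracked ones.
report_untracked() {
  while IFS= read -r ru_path; do
    is_tracked "$ru_path" || echo "untracked, verify before acting: $ru_path"
  done
}
```

For example, `printf '%s\n' src/app.ts notes/scratch.md | report_untracked` prints a warning only for paths that git does not track (the file names here are placeholders).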

Metric	Value	Detail
Rating	Good	
Best Pass Rate	100%	Claude Sonnet
Assertions		1 model tested
Avg Cost / Run	$0.0072	across models
Fastest (p50)	1.6s	Claude Sonnet

Behavior Evals

Phase 2 baseline — skill quality on Claude

Behavior Baseline: Claude Sonnet, 100% (9/9)

Assertion	Claude Sonnet	Consensus
both-requests-launch-visible-agents		1/1
operator-can-distinguish-the-two-agents		1/1
main-working-view-is-not-overrun		1/1
one-worker-per-implementation-task		1/1
workers-are-visible-and-labeled		1/1
launch-pattern-avoids-collisions		1/1
finding-is-verified-against-repo-state		1/1
false-positive-risk-is-accounted-for		1/1
panic-remediation-is-blocked-until-confirmed		1/1

Model	Input Tokens	Output Tokens	Cost / Run	Cost / 1K Runs
Claude Sonnet	1,550	530	$0.0072	$7.20

Model	p50	p95	Overhead
Claude Sonnet	1.6s	2.9s	+76%

Last evaluated: 2026-03-14 · Real Phase 2 behavior eval · Claude Sonnet baseline
