/pr-loop
The complete PR loop — branch, implement, test, commit, push, PR, WAIT FOR REVIEW, fix, merge, cleanup. Includes PR creation and review comment fetching. Use whenever creating a PR or finishing work. This is NOT optional. Every change goes through this loop. No exceptions.
$ golems-cli skills install pr-loop
The full loop. Not "create PR." Not "push and move on." The FULL loop through MERGED.
The Iron Law
MISSION = MERGED
Not "tests pass." Not "PR created." Not "pushed."
Done = PR merged + branch deleted + main pulled.
The Full Loop
1. BRANCH git checkout main && git pull && git checkout -b feat/name
2. IMPLEMENT Write code (invoke /superpowers:test-driven-development)
3. TEST Run full test suite — ALL must pass
4. VERIFY Invoke /superpowers:verification-before-completion
5. COMMIT git add <specific files> && git commit (invoke /commit)
6. PUSH git push -u origin feat/name
7. PR Create PR (see "Creating the PR" below)
8. REVIEW Fetch + read review comments (see "Reading Reviews" below)
9. FIX Address real bugs from review
10. MERGE gh pr merge <N> --squash --delete-branch
11. CLEANUP git checkout main && git pull
Step 7: Creating the PR
Prerequisites
- gh CLI installed (brew install gh)
- Authenticated: gh auth login
- On a feature/fix branch (not main/master/dev)
- All changes committed
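The prerequisites above can be sketched as a preflight check. This is illustrative, not part of the skill itself; `pr_preflight` is a hypothetical helper, and the protected branch names are assumptions to adjust per repo.

```shell
# Preflight sketch: refuse to proceed from a protected branch.
pr_preflight() {
  case "$1" in
    main|master|dev)
      echo "blocked: switch to a feature branch first"
      return 1
      ;;
  esac
  echo "ok: $1"
}

# Full real-world check (requires gh and a git repo):
#   command -v gh >/dev/null || { echo "install gh first"; exit 1; }
#   gh auth status >/dev/null 2>&1 || { echo "run gh auth login"; exit 1; }
#   [ -z "$(git status --porcelain)" ] || { echo "commit your changes first"; exit 1; }
pr_preflight "feat/my-feature"   # prints "ok: feat/my-feature"
```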
Create the PR
# Push branch first
git push -u origin HEAD
# Create PR with structured body
gh pr create --title "feat: description" --body "$(cat <<'EOF'
## Summary
- What changed and why
## Test plan
- [ ] Tests pass
- [ ] Manual verification done
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
EOF
)"

Edge Cases
- On main/dev/master: Don't create PR. Switch to a branch first.
- Uncommitted changes: Commit first.
- PR already exists: Use gh pr view to check, don't create duplicate.
- Custom base branch: gh pr create --base dev
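The "PR already exists" case can be made mechanical. This sketch assumes `gh pr view --json number --jq .number` prints the current branch's PR number (empty or an error when none exists); `decide_pr_action` is a hypothetical helper name.

```shell
# Decide whether to create a PR, given the existing-PR lookup result.
decide_pr_action() {
  if [ -n "$1" ]; then
    echo "view #$1"    # a PR already exists: inspect it, don't create a duplicate
  else
    echo "create"      # no PR yet: safe to run gh pr create
  fi
}

# Real usage (gh must be authenticated):
#   decide_pr_action "$(gh pr view --json number --jq .number 2>/dev/null || true)"
decide_pr_action ""     # prints "create"
decide_pr_action "42"   # prints "view #42"
```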
Step 8: REVIEW (The Critical Step)
This is NOT optional. This is NOT "auto-merge."
NEVER Merge With 0 Reviews
# Check BEFORE merging — empty reviewDecision + <2 comments = nobody looked
gh pr view <N> --json reviewDecision,comments
# CLEAN status with no reviews ≠ approved. It means NOBODY LOOKED.
# Wait minimum 10-15 min after requesting reviews before considering merge.

| Bot | Expected time | Notes |
|---|---|---|
| CodeRabbit | 2-5 min | Auto-reviews on push |
| Greptile | Needs OSS approval | Manual activation |
| Macroscope | Needs activation | Auto-reviews once installed |
If reviewDecision is empty and comments < 2 → DO NOT MERGE.
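The merge gate above can be encoded directly. `can_merge` is a hypothetical guard: it takes the review decision and the comment count, which in a real run would come from `gh pr view` with `--jq` filters.

```shell
# Merge guard sketch: empty reviewDecision + fewer than 2 comments = nobody looked.
can_merge() {
  local decision="$1" comment_count="$2"
  if [ -z "$decision" ] && [ "$comment_count" -lt 2 ]; then
    echo "DO NOT MERGE: no decision, only $comment_count comment(s)"
    return 1
  fi
  echo "ok to consider merge (decision=${decision:-none}, comments=$comment_count)"
}

# Real usage:
#   can_merge "$(gh pr view <N> --json reviewDecision --jq '.reviewDecision // \"\"')" \
#             "$(gh pr view <N> --json comments --jq '.comments | length')"
can_merge "" 0 || true    # blocked
can_merge "APPROVED" 5    # allowed
```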
Step 8a: Invoke Reviewers
Always explicitly request reviews. Don't wait for auto-detection.
# Invoke all available reviewers (do this right after PR creation)
gh pr comment <N> --body "@coderabbitai review"
gh pr comment <N> --body "@greptileai review"
gh pr comment <N> --body "@codex review"
gh pr comment <N> --body "@cursor @bugbot review"

For private repos (no bot reviewers):
# Option A: Use coderabbit:code-reviewer subagent
Agent(subagent_type="coderabbit:code-reviewer", prompt="Review PR #N")
# Option B: Use cr CLI
cr review --plain

For public repos (bot reviewers configured):
# Option A (preferred): Use /loop to poll for review comments
# /loop 2m gh pr view <N> --comments | tail -20
# Option B: Use CronCreate to schedule review checks
# CronCreate(schedule="*/2 * * * *", command="gh pr view <N> --comments | tail -20")
# Option C (manual): Wait and check once
sleep 90
gh pr view <N> --comments

Reading Review Comments
Fetch comments from all review sources with full context:
# Quick view of all comments
gh pr view <N> --comments
# Detailed: get review comments with diff context
gh api repos/{owner}/{repo}/pulls/{N}/comments

Review sources (coverage stack):
| Source | Type | How to Trigger / Check |
|---|---|---|
| CodeRabbit | AI review + auto-summaries | Auto on PR. Also: CodeRabbit plugin or cr review --plain |
| Codex Cloud | AI code review (GPT-5.2-codex) | gh pr comment <N> --body "@codex review" or comment manually on GitHub. Auto-reviews if enabled in Codex settings. Reads AGENTS.md "Review guidelines". Flags P0/P1 by default. |
| Cursor Bugbot | Bug detection | gh pr comment <N> --body "@cursor @bugbot review" or comment manually on GitHub. For re-review after fixes: gh pr comment <N> --body "@cursor @bugbot re-review". Bot responds as cursor[bot]. |
| Greptile | AI review + codebase understanding | Comment @greptileai review. Needs OSS activation. |
| DeepSource | Static analysis | Check via CI status |
After fixing review feedback, trigger re-review on every reviewer:
gh pr comment <N> --body "@coderabbitai review"
gh pr comment <N> --body "@codex review"
gh pr comment <N> --body "@cursor @bugbot re-review"

Codex Cloud is enabled on: EtanHey/voicelayer, EtanHey/orchestrator, EtanHey/golems, EtanHey/brainlayer.
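Re-requesting review is the same comment loop for every bot, so it can be wrapped once. `request_rereviews` is a hypothetical helper; it echoes the commands as a dry run here, so swap `echo` for the real `gh pr comment` call.

```shell
# Dry-run sketch: one re-review trigger per reviewer bot.
request_rereviews() {
  local n="$1"; shift
  for trigger in "$@"; do
    # Real call: gh pr comment "$n" --body "$trigger"
    echo "gh pr comment $n --body '$trigger'"
  done
}

request_rereviews 123 "@coderabbitai review" "@codex review" "@cursor @bugbot re-review"
```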
Investigate Before Dismissing (CRITICAL)
Default stance = "let me investigate" — NOT "this is intentional."
The worst PR loop failure mode: auto-dismissing a reviewer suggestion with "intentional per design doc" without checking if the reviewer found a real gap the design doc missed.
WRONG:
CodeRabbit: "Missing orphan reparenting when a node is deleted"
You: "@coderabbitai This is intentional per phase5-v2-synthesis.md. Please learn this."
Later: Realize CodeRabbit was right. Design was wrong. You taught it a bad Learning.
RIGHT:
CodeRabbit: "Missing orphan reparenting when a node is deleted"
You: Read the design doc. Does it explicitly address THIS tradeoff?
→ Yes, with clear reasoning → Push back with the specific passage.
→ No, or vague → Treat as potential gap. Investigate before closing.
The investigation protocol for any "conflicts with design" suggestion:
- Read the actual design doc section referenced — don't rely on memory
- Ask: does this doc EXPLICITLY address the tradeoff the reviewer raised?
- If yes, with clear reasoning → reply with the specific passage
- If no, or only implicitly → investigate the reviewer's concern as a real gap
- If the design doc is WRONG → update the design doc AND correct any bad Learnings already taught
Teaching a reviewer a bad Learning is worse than not teaching it anything. A false Learning suppresses future flags on a real bug category. Audit any Learning you've set if the underlying assumption turned out to be wrong:
# CodeRabbit: flag a Learning for correction
@coderabbitai I need to correct a previous learning. [Pattern X] does actually require
[handling Y] — our earlier design was incomplete. Please update your understanding:
[correct explanation].

Classify each review comment:
| Type | Action |
|---|---|
| MAJOR (real bug, security) | FIX immediately. Push fix. Re-review. |
| TRIVIAL (style, nitpick) | Fix if genuinely better. Skip if bikeshed. |
| CONFLICTS WITH DESIGN | INVESTIGATE first. Only dismiss with explicit doc evidence. |
Push back only when local research EXPLICITLY addresses the tradeoff. When in doubt, the reviewer might be right.
Severity assessment:
- HIGH = real bug → must fix before merge
- MEDIUM = valid improvement → fix if straightforward
- LOW = style/nitpick → fix only if genuinely better
- INFO = skip
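The severity ladder maps cleanly onto a dispatch table. `classify` is illustrative only; the labels mirror the list above.

```shell
# Severity -> action dispatch, mirroring HIGH/MEDIUM/LOW/INFO above.
classify() {
  case "$1" in
    HIGH)   echo "must fix before merge" ;;
    MEDIUM) echo "fix if straightforward" ;;
    LOW)    echo "fix only if genuinely better" ;;
    INFO)   echo "skip" ;;
    *)      echo "unknown severity: $1"; return 1 ;;
  esac
}

classify HIGH   # prints "must fix before merge"
classify INFO   # prints "skip"
```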
Teaching Each Reviewer (permanent compounding knowledge)
Every PR is an opportunity to make reviewers smarter. Use the right format for each.
| Reviewer | How it learns | Reply format | What persists |
|---|---|---|---|
| CodeRabbit | @coderabbitai replies → explicit Learnings | @coderabbitai [explain design]. See [doc]. Please learn this for future reviews. | Permanent Learning applied to ALL future reviews on this repo |
| Greptile | Observes all reply patterns passively | Any plain reply explaining the design decision | Updates internal preference model — stops flagging dismissed patterns |
| Macroscope | macroscope.md file in repo root (no reply-learning) | Add rule to macroscope.md file | Persists as a repo-level rule, referenced in every future review |
CodeRabbit (explicit — most powerful):
@coderabbitai This uses child→parent references (K8s ownerReferences pattern).
See phase5-v2-synthesis.md Decision 2. The parent is reconstructed from children,
not stored as a separate record. Please learn this for future reviews.
# Bad format — teaches nothing:
I'll leave this as-is. ← CodeRabbit will flag it again on the next PR.
Greptile (passive — just reply naturally):
In our domain layer, we prefer detailed functions for clarity — length rules don't apply here.
# Greptile reads this and stops flagging long functions in domain/ going forward.
Macroscope (file-based — add to macroscope.md):
## macroscope.md
- Child→parent references are intentional (K8s ownerReferences pattern)
- Domain layer functions may be long for clarity — don't flag length in src/domain/
- We don't document internal utilities with JSDoc

Rule: When any reviewer comments on intentional design → always reply with context. Never just "I'll leave this." The reply compounds knowledge across every future PR.
Multi-Round Loop (minimum 2 rounds before merge)
Round 1: Push fixes → request re-review from all bots
Round 2: Read re-review → fix any new issues found
Round 3+: Only if new issues surfaced. Max 3 rounds for nitpicks.
CodeRabbit auto-re-reviews on new pushes. For others, comment @bot re-review explicitly.
If reviewer finds new issues in round 2 → fix and go to round 3. Never merge with open issues.
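The round discipline can be simulated to make the stopping rules concrete. `review_rounds` is a sketch: each argument is the open-issue count observed after a re-review; in practice that count comes from reading `gh pr view <N> --comments` yourself.

```shell
# Round-loop sketch: merge only at zero open issues, cap nitpick churn at 3 rounds.
review_rounds() {
  local round=1 max_rounds=3
  for issues in "$@"; do
    echo "round $round: $issues open issue(s)"
    if [ "$issues" -eq 0 ]; then
      echo "mergeable after round $round"
      return 0
    fi
    if [ "$round" -ge "$max_rounds" ]; then
      echo "round cap hit: stop bikeshedding, decide explicitly"
      return 1
    fi
    round=$((round + 1))
  done
}

review_rounds 2 0   # fixes land in round 1, round 2 comes back clean
```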
Sanitize Before PR (CRITICAL)
Never put real client data in public PRs:
- ❌ Real phone numbers, JIDs, group names, client names
- ❌ Real Supabase row IDs or user UUIDs
- ✅ Realistic but fake examples:
+1-555-0123, client-abc-123, Group: Example Co
Sanitize in PR description, code comments, test fixtures, and commit messages.
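A mechanical sweep catches the obvious leaks before they reach a public PR. `pii_scan` is a sketch with example patterns (UUIDs and US phone numbers, with 555 fakes allowed through); extend it with your own identifier patterns, kept out of the repo.

```shell
# PII sweep sketch: stdin in, suspicious lines out. Empty output = clean.
pii_scan() {
  grep -nE '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}|\+1-[0-9]{3}-[0-9]{4}' \
    | grep -v '+1-555-'   # 555 numbers are the sanctioned fakes
}

# Usage: git diff main...HEAD | pii_scan
printf 'user: 123e4567-e89b-12d3-a456-426614174000\n' | pii_scan || true   # flagged
printf 'call +1-555-0123\n' | pii_scan || true                             # clean
```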
After addressing reviews:
git add <files> && git commit -m "fix: address review feedback"
git push
# Wait for re-review — CodeRabbit auto-triggers, others need manual @mention

Only THEN merge (after minimum 2 review rounds):
# Verify reviews are actually in before merging
gh pr view <N> --json reviewDecision,comments
gh pr merge <N> --squash --delete-branch
git checkout main && git pull

After Merge: Update Tracking (MANDATORY)
Every merged PR MUST update its tracking. No exceptions.
- Collab file — If this PR is part of a collab, update the task board status to ✅ Done with PR number
- Roadmap — If this PR completes a roadmap phase, update ~/Gits/orchestrator/roadmap/README.md
- BrainLayer — brain_store what changed and why (tagged pr-merged, <project>)
WRONG: Merge PR, exit silently ← Tracking drift!
WRONG: "I'll update the collab later" ← You won't. Do it NOW.
WRONG: Only update one of collab/roadmap/BL ← Update ALL relevant trackers.
If you are an autonomous agent, this step is NON-NEGOTIABLE. The orchestrator should NEVER discover completed work by accident.
What NOT To Do
WRONG: gh pr create && gh pr merge --auto ← No review!
WRONG: gh pr create && gh pr merge --squash ← Same message, no review!
WRONG: "PR created, done!" ← Mission is MERGED, not created
WRONG: Skip review "because it's a small change" ← Small changes break things too
Finishing a Branch (Alternative Endings)
Not every branch goes through the full PR flow. When implementation is done:
- Verify tests pass before offering options
- Present options:
| Option | When to Use | Commands |
|---|---|---|
| Create PR (default) | Most cases — full review loop | Continue with steps 7-11 above |
| Merge locally | Small team, already reviewed | git checkout main && git merge <branch> && git branch -d <branch> |
| Keep as-is | Need to park work | Just stop. Worktree preserved. |
| Discard | Wrong approach, start over | Requires typed "discard" confirmation. git branch -D <branch> |
Worktree cleanup: For options 1 (after merge), 3 (never), and 4 (after discard):
# Check if in worktree
git worktree list | grep "$(git branch --show-current)"
# If yes, after merging/discarding:
git worktree remove <worktree-path>

After Merge: Store Component Reasoning
For every NEW file > 50 lines created in this PR:
- Read ~/Gits/orchestrator/standards/component-reasoning-template.md
- Fill in the schema for the new component
- Run brain_store with the filled schema
- Tag: ["component-reasoning", "{repo}", "{file-slug}", "pr-{number}"]
Why this matters: Future Claude sessions spend zero time opening files to understand "why was X built this way." The reasoning is in BrainLayer, queryable in <1 second.
Threshold: Files > 50 lines or files with non-obvious architecture decisions (why pure function? why no LLM calls? why merged instead of split?).
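The threshold check can be scripted so no new file slips through. The pipeline below is a sketch: it assumes main is the base branch, and `needs_reasoning` is a hypothetical helper encoding the 50-line rule.

```shell
# 50-line rule from above, as a predicate.
needs_reasoning() {
  [ "$1" -gt 50 ]
}

# Real usage: list NEW files in this branch that need a component-reasoning entry.
#   git diff --name-only --diff-filter=A main...HEAD | while read -r f; do
#     needs_reasoning "$(wc -l < "$f")" && echo "store reasoning for $f"
#   done
needs_reasoning 51 && echo "51 lines: store reasoning"
needs_reasoning 50 || echo "50 lines: below threshold"
```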
Composability
This skill is referenced by:
- /large-plan — every phase goes through this loop
- /commit — commit is step 5, this skill is the FULL loop
- Collab files — all autonomous work requires this loop
This skill references:
- /commit — step 5 (commit with CodeRabbit review)
- /superpowers:test-driven-development — step 2 (implement with TDD)
- /superpowers:verification-before-completion — step 4 (verify before claiming)
- /never-fabricate — never claim review is green without reading it
Quick Reference
# The whole loop in commands:
git checkout main && git pull
git checkout -b feat/my-feature
# ... implement with TDD ...
bun test # or npm test
git add src/changed-file.ts tests/new-test.ts
git commit -m "feat: description
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>"
git push -u origin feat/my-feature
gh pr create --title "feat: description" --body "## Summary\n..."
# WAIT for review (60-90s for bots, or run coderabbit:code-reviewer)
gh pr view <N> --comments # READ the review
# Fix any real bugs, push again if needed
gh pr merge <N> --squash --delete-branch
git checkout main && git pull

- Best Pass Rate: 100% (Claude Sonnet)
- Assertions: 15 (2 models tested)
- Avg Cost / Run: $0.0163 (across models)
- Fastest (p50): 1.8s (Claude Sonnet)
Behavior Evals
Phase 2 baseline — skill quality on Claude
Adapter Evals
Phase 2C — cross-AI portability
| Assertion | Claude Sonnet | Codex (GPT-5.4) | Consensus |
|---|---|---|---|
| mission-is-merged-not-pr-created | 2/2 | — | — |
| fresh-verification-before-shipping | 2/2 | — | — |
| review-is-read-before-merge | 2/2 | — | — |
| main-is-cleaned-up-after-merge | 2/2 | — | — |
| real-bugs-fixed-before-merge | 2/2 | — | — |
| review-comments-are-classified | 2/2 | — | — |
| substantive-fixes-trigger-rereview | — | 1/2 | — |
| false-positives-are-not-blockers | — | 1/2 | — |
| rejects-pr-created-as-done | 2/2 | — | — |
| review-remains-required-for-small-change | — | 1/2 | — |
| completion-still-includes-merge-and-cleanup | — | 1/2 | — |
| only-true-capability-gaps-are-marked-na | — | 0/2 | — |
| manual-gh-comment-trigger-is-allowed-as-fallback | — | 0/2 | — |
| brainlayer-postmerge-remains-a-real-gap | — | 1/2 | — |
| polling-fallback-uses-real-cli-commands | — | 1/2 | — |
Token Usage and Cost per Run
| Model | Input Tokens | Output Tokens | Cost / Run | Cost / 1K Runs |
|---|---|---|---|---|
| Claude Sonnet | 1,800 | 600 | $0.0082 | $8.20 |
| Codex (GPT-5.4) | 2,200 | 750 | $0.0245 | $24.50 |
Response Time
| Model | p50 | p95 | Overhead |
|---|---|---|---|
| Claude Sonnet | 1.8s | 3.2s | +78% |
| Codex (GPT-5.4) | 2.8s | 4.6s | +64% |
Last evaluated: 2026-03-12 · Real Phase 2 evals · behavior (Claude) + adapter (1 CLI)
Changelog entries are derived from eval runs and skill version updates. Full cascading changelog (Phase 4D) coming soon.
- Initial release to Golems skill library
- 25 assertions across 7 eval scenarios
- Eval fixtures included