/test-plan

Use when preparing a PR for QA review. Generates manual testing checklist from git diff. Covers test plan, QA checklist, testing before merge. NOT for: automated tests (write those separately), code reviews (use coderabbit).

$ golems-cli skills install test-plan

Experimental · 100% best pass rate · 13 assertions · 3 evals · Updated 2 months ago

Analyze changes in the current Git branch and generate a manual testing checklist organized by page/feature.

Quick Start

The skill runs automatically when loaded. To diff against a branch other than the default (main), pass --base:

./scripts/generate.sh --base main
./scripts/generate.sh --base dev
./scripts/generate.sh --base origin/staging
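The contents of scripts/generate.sh are not shown here, so the following is only a minimal sketch of how a --base flag with a default of main could be parsed. The function name parse_base is hypothetical and not taken from the skill.

```shell
#!/bin/sh
# Hypothetical sketch: resolve the base branch from command-line arguments.
# Falls back to "main" when --base is not given (the default stated in this doc).
parse_base() {
  base="main"
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --base)
        # Require a value after the flag.
        if [ "$#" -lt 2 ]; then
          echo "error: --base needs a branch name" >&2
          return 1
        fi
        base="$2"
        shift 2
        ;;
      *)
        # Ignore anything else in this sketch.
        shift
        ;;
    esac
  done
  printf '%s\n' "$base"
}

parse_base --base origin/staging   # prints: origin/staging
parse_base                         # prints: main
```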

What It Does

1. Diffs the current branch against the base branch (default: main).
2. Categorizes changed files by type (UI, API, DB, Config, etc.).
3. Generates a Markdown checklist grouped by feature/component.
4. Adds regression test suggestions for related areas.
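The categorization step could look roughly like the sketch below. The path patterns and the function names categorize_file and categorize_stream are illustrative assumptions, not the skill's actual rules. In practice the list of paths might come from something like git diff --name-only main...HEAD; here it is supplied inline so the example runs without a repository.

```shell
#!/bin/sh
# Hypothetical sketch: map a changed file path to one of the categories
# named in this document. The patterns are assumptions; the first match wins.
categorize_file() {
  case "$1" in
    *.sql|*migrations/*|*schema*)            echo "DB" ;;
    *api/*|*routes/*)                        echo "API" ;;
    *.tsx|*.jsx|*.css|*.html|*components/*)  echo "UI" ;;
    *.env*|*.yml|*.yaml|*.toml|*config*)     echo "Config" ;;
    *)                                       echo "Other" ;;
  esac
}

# Read paths from stdin, one per line, and print "Category<TAB>path".
categorize_stream() {
  while IFS= read -r path; do
    [ -n "$path" ] || continue
    printf '%s\t%s\n' "$(categorize_file "$path")" "$path"
  done
}

# Sample input standing in for the output of a name-only diff.
printf '%s\n' \
  src/api/users.ts \
  db/migrations/001_init.sql \
  src/components/Button.tsx \
  .env.example \
  README.md | categorize_stream
```

Grouping the resulting lines by their first column gives one bucket per checklist section.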

Output Format

## Test Plan
 
### [Feature/Component Name]
- [ ] Test: Description of what to verify
- [ ] Test: Another thing to check
 
### API Changes
- [ ] Test: Verify endpoint returns expected shape
- [ ] Test: Error responses have correct status codes
 
### Database/Schema
- [ ] Test: Verify migrations run cleanly
- [ ] Test: Data integrity after changes
 
### Configuration
- [ ] Test: Verify env vars are documented
- [ ] Test: Config changes don't break existing deploys
 
### General
- [ ] No console errors during testing
- [ ] No TypeScript/build errors
- [ ] Mobile responsive (if UI changes)
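As a rough illustration of the format above, the sketch below prints one checklist section from a heading and a list of test descriptions. The function name render_section is hypothetical; the skill itself may assemble its output differently.

```shell
#!/bin/sh
# Hypothetical sketch: print one checklist section in the format shown above.
# $1 is the section heading; test descriptions are read from stdin, one per line.
render_section() {
  printf '### %s\n' "$1"
  while IFS= read -r item; do
    [ -n "$item" ] || continue
    # Pass the line as an argument so the leading "-" is never parsed as an option.
    printf '%s\n' "- [ ] Test: $item"
  done
  printf '\n'
}

printf '## Test Plan\n\n'
printf '%s\n' \
  'Verify endpoint returns expected shape' \
  'Error responses have correct status codes' | render_section 'API Changes'
```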

Guidelines

- Be specific: write "Verify user can submit form", not "Test form".
- Include edge cases: empty states, error states, and loading states.
- Consider permissions: test as different user roles if the change is auth-related.
- Note regressions: if the change touches shared code, list the areas that could regress.
- Prioritize: put the most critical tests first within each section.

Usage

Run this skill before creating a PR to generate the test plan section for your PR description.
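One way to fold the generated plan into a PR description is sketched below. The function name build_pr_body, the "Summary" heading, and the temporary file are assumptions made for illustration; only the "## Test Plan" section format comes from this document.

```shell
#!/bin/sh
# Hypothetical sketch: combine a short summary with a generated test plan
# into a single PR description printed on stdout.
build_pr_body() {
  summary="$1"
  plan_file="$2"
  printf '## Summary\n\n%s\n\n' "$summary"
  cat "$plan_file"
}

# A temporary file stands in for the skill's saved output.
plan="$(mktemp)"
printf '## Test Plan\n\n### General\n- [ ] No console errors during testing\n' > "$plan"
build_pr_body 'Add password reset flow' "$plan"
rm -f "$plan"
```

If the GitHub CLI is in use, the result could be redirected to a file and passed along with gh pr create --body-file; other hosting tools would need their own equivalent.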

Experimental

Best pass rate: 100% (Opus 4.6)
Assertions: 13, across 3 models tested
Avg cost / run: $0.2263 across models
Fastest (p50): 1.9s (Haiku 4.5)

Behavior Evals

Phase 2 baseline: skill quality on Claude

Behavior Baseline

Model	Pass Rate	Assertions Passed
Opus 4.6	100%	13/13
Sonnet 4.6	69%	9/13
Haiku 4.5	62%	8/13

Assertion	Consensus (models passing)
generates-from-git-diff	2/3
grouped-by-feature-component	3/3
checkable-markdown-format	3/3
includes-edge-cases	2/3
includes-regression-suggestions	2/3
specific-not-generic	2/3
runs-diff-despite-user-description	1/3
executes-generate-script	2/3
may-find-additional-changes	2/3
respects-base-branch-override	3/3
categorizes-all-file-types	3/3
prioritizes-critical-tests	2/3
includes-setup-steps	3/3

Token Usage

Model	Total Tokens (input + output)
Opus 4.6	12,928
Sonnet 4.6	10,328
Haiku 4.5	3,315

Cost per Run

Model	Input Tokens	Output Tokens	Cost / Run	Cost / 1K Runs
Opus 4.6	6,537	6,391	$0.5774	$577.40
Sonnet 4.6	4,642	5,686	$0.0992	$99.20
Haiku 4.5	1,940	1,375	$0.0022	$2.20
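The Cost / 1K Runs column and the $0.2263 average reported earlier appear to follow directly from the per-run costs in this table. The sketch below reproduces that arithmetic with awk; the function names cost_per_1k and avg_cost are hypothetical.

```shell
#!/bin/sh
# Scale a per-run cost to 1,000 runs, printed to two decimals.
cost_per_1k() {
  awk -v c="$1" 'BEGIN { printf "%.2f\n", c * 1000 }'
}

# Average any number of per-run costs, printed to four decimals.
avg_cost() {
  awk 'BEGIN {
    if (ARGC < 2) exit 1              # nothing to average
    for (i = 1; i < ARGC; i++) s += ARGV[i]
    printf "%.4f\n", s / (ARGC - 1)
  }' "$@"
}

cost_per_1k 0.5774              # Opus 4.6 row, prints 577.40
avg_cost 0.5774 0.0992 0.0022   # all three models, prints 0.2263
```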

Response Time (p50 and p95)

Model	p50	p95	Overhead
Opus 4.6	9.1s	13.7s	+50%
Sonnet 4.6	5.1s	8.7s	+71%
Haiku 4.5	1.9s	2.9s	+56%

Last evaluated: 2026-03-12 · Data is generated from skill assertions (real cross-model benchmarks coming soon)