A/B Test Setup Skill
Plan A/B tests with proper methodology — hypothesis, sample size, duration, variant design, statistical significance.
by amdf01-debug · published 2026-04-01
$ claw add gh:amdf01-debug/amdf01-debug-sw-ab-test-setup
Trigger
**Trigger phrases:** "A/B test", "split test", "experiment", "test this change", "variant", "multivariate test", "hypothesis"
Process
1. **Hypothesis**: What are you testing and why?
2. **Metrics**: Primary metric, guardrail metrics, success criteria
3. **Design**: Control vs variant(s), what exactly changes
4. **Calculate**: Sample size, test duration, minimum detectable effect
5. **Plan**: Implementation, QA, analysis timeline
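Step 4 can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill itself: it assumes the primary metric is a conversion rate, uses the standard normal-approximation formula for a two-sided two-proportion z-test, and assumes 80% power (the skill only fixes the 95% confidence level). The function names `sample_size_per_variant` and `estimated_duration_days` are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, relative_mde, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant (two-sided two-proportion z-test).

    baseline_rate: current conversion rate of the control, e.g. 0.05
    relative_mde:  minimum detectable effect as a relative lift, e.g. 0.10
    power:         assumed at 0.80 here; the skill does not specify it
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def estimated_duration_days(n_per_variant, daily_visitors, n_variants=2):
    """Days needed if daily traffic to the test area is split across all variants."""
    return math.ceil(n_per_variant * n_variants / daily_visitors)


if __name__ == "__main__":
    n = sample_size_per_variant(baseline_rate=0.05, relative_mde=0.10)
    days = estimated_duration_days(n, daily_visitors=4000)
    print(f"Sample needed: {n} per variant, about {days} days")
```

Small relative effects on low baseline rates need large samples, so it is worth running this before committing to an MDE. In practice the duration is often rounded up to whole weeks to cover weekday and weekend traffic patterns.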
Output Format
# A/B Test Plan: [Name]
## Hypothesis
If we [change], then [metric] will [improve/increase] because [reason].
## Variants
- **Control (A):** [current experience]
- **Variant (B):** [proposed change — be specific]
## Metrics
- **Primary:** [metric] — current: [X%] — target: [Y%]
- **Guardrail:** [metric that should NOT decrease]
## Sample Size & Duration
- MDE: [minimum detectable effect, e.g., 10% relative]
- Sample needed: [N per variant]
- Current traffic: [X visitors/day to test area]
- Estimated duration: [Y days/weeks]
- Confidence level: 95% (α = 0.05)
## Implementation Notes
[What needs to change, where, any technical considerations]
## Decision Framework
- If primary metric improves ≥ MDE with p < 0.05 → ship variant
- If no significant difference after [duration] → keep control
- If guardrail metric drops > [threshold] → stop test immediately
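The decision framework above can be expressed as a small check. The following is a sketch under stated assumptions, not the skill's own implementation: it assumes a conversion-rate primary metric, uses a pooled two-proportion z-test for the p-value, and treats the guardrail drop as a relative decrease supplied by the caller. The names `two_proportion_p_value` and `decide` are hypothetical. The template does not say what to do when a result is significant but smaller than the MDE; this sketch keeps control in that case.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation in the data, so no evidence of a difference
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def decide(conv_a, n_a, conv_b, n_b, relative_mde,
           guardrail_drop, guardrail_threshold, alpha=0.05):
    """Apply the three decision rules; the guardrail check takes priority."""
    if guardrail_drop > guardrail_threshold:
        return "stop test"
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    lift = (p_b - p_a) / p_a  # relative improvement of variant over control
    p_value = two_proportion_p_value(conv_a, n_a, conv_b, n_b)
    if p_value < alpha and lift >= relative_mde:
        return "ship variant"
    # Assumption: anything else, including a significant lift below the MDE,
    # falls back to keeping control.
    return "keep control"


if __name__ == "__main__":
    print(decide(1500, 30000, 1725, 30000, relative_mde=0.10,
                 guardrail_drop=0.0, guardrail_threshold=0.02))
```

The check is meant to run once, after the planned duration and sample size are reached. Evaluating the ship rule repeatedly during the test inflates the false-positive rate, which is why only the guardrail rule is described as an immediate stop.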