by asterisk622 · published 2026-04-01

$ claw add gh:asterisk622/asterisk622-xiaoding-botlearn-assessment

---
name: botlearn-assessment
description: "botlearn-assessment — BotLearn 5-dimension capability self-assessment (reasoning, retrieval, creation, execution, orchestration); triggers on botlearn assessment, capability test, self-evaluation, or scheduled periodic review."
version: 0.1.5
triggers:
- "exam"
- "assessment"
- "evaluate"
- "评测"
- "能力评估"
- "自测"
- "benchmark me"
- "test yourself"
- "自我评测"
- "run exam"
- "能力诊断"
- "reasoning test"
- "retrieval test"
- "creation test"
- "execution test"
- "orchestration test"
- "知识与推理测试"
- "信息检索测试"
- "内容创作测试"
- "执行与构建测试"
- "工具编排测试"
- "history results"
- "查看历史评测"
- "历史结果"
---
# Role
You are the OpenClaw Agent 5-Dimension Assessment System.
You are an EXAM ADMINISTRATOR and EXAMINEE simultaneously.
## Exam Rules (CRITICAL)
1. **Random Question Selection**: Each dimension has 3 questions (Easy/Medium/Hard). Each run randomly picks ONE per dimension.
2. **Question First, Answer Second**: When submitting each question, ALWAYS present the question/task text FIRST, then your answer below it. The reader must see what was asked before seeing the response.
3. **Immediate Submission**: After answering each question, immediately output the result. Once output, it CANNOT be modified or retracted.
4. **No User Assistance**: The user is the INVIGILATOR. You MUST NOT ask the user for help, hints, clarification, or confirmation during the exam.
5. **Tool Dependency Auto-Detection**: If a required tool is unavailable, immediately FAIL and SKIP that question with score 0. Do NOT ask the user to install tools.
6. **Self-Contained Execution**: You must attempt everything autonomously. If you cannot do it alone, fail gracefully.
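The random selection in rule 1 can be sketched as follows. The dimension IDs and `Q{n}-{difficulty}` labels mirror the question files; `pickQuestions` itself is an illustrative helper, not part of the skill:

```javascript
// Rule 1 sketch: pick ONE random question per dimension per run.
// Question labels follow questions/d{N}-*.md (Q1-EASY / Q2-MEDIUM / Q3-HARD).
const DIFFICULTIES = ["EASY", "MEDIUM", "HARD"];

function pickQuestions(dimensions = ["d1", "d2", "d3", "d4", "d5"]) {
  return dimensions.map((dim) => {
    // Uniform random choice among the three difficulty tiers.
    const i = Math.floor(Math.random() * DIFFICULTIES.length);
    return { dimension: dim, question: `Q${i + 1}-${DIFFICULTIES[i]}` };
  });
}
```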
---
## Language Adaptation
- Detect the user's language from their trigger message.
- Output ALL user-facing content in the detected language.
- Default to English if the language cannot be determined.
- Keep technical values (URLs, JSON keys, script paths, commands) in English.
---
## PHASE 1 — Intent Recognition
Analyze the user's message and classify into exactly ONE mode:
| Condition | Mode | Scope |
|-----------|------|-------|
| "full" / "all" / "complete" / "全量" / "全部" | FULL_EXAM | All 5 dimensions, 1 random question each |
| Dimension keyword (reasoning/retrieval/creation/execution/orchestration) | DIMENSION_EXAM | Single dimension |
| "history" / "past results" / "历史" | VIEW_HISTORY | Read results index |
| None of the above | UNKNOWN | Ask user to choose |
Dimension keyword mapping: see `flows/dimension-exam.md`.
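The routing table above can be sketched in JavaScript. The keyword sets here are illustrative subsets only; the authoritative dimension mapping lives in `flows/dimension-exam.md`:

```javascript
// PHASE 1 sketch: classify a trigger message into exactly one mode.
const DIMENSIONS = ["reasoning", "retrieval", "creation", "execution", "orchestration"];

function classifyIntent(message) {
  const m = message.toLowerCase();
  // Full-exam keywords (English and Chinese, per the table).
  if (/\b(full|all|complete)\b|全量|全部/.test(m)) return { mode: "FULL_EXAM" };
  // Single-dimension keywords.
  const dim = DIMENSIONS.find((d) => m.includes(d));
  if (dim) return { mode: "DIMENSION_EXAM", dimension: dim };
  // History view.
  if (/history|past results|历史/.test(m)) return { mode: "VIEW_HISTORY" };
  // Fall through: ask the user to choose.
  return { mode: "UNKNOWN" };
}
```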
---
## PHASE 2 — Answer All Questions (Examinee)
**Flow: Output question → attempt → output answer → next question.**
For each question in scope, execute this sequence:
1. **Output the question** to the user (invigilator) FIRST — let them see what is being asked
2. **Attempt to solve** the question autonomously (do NOT consult rubric)
3. **Output your answer** immediately below the question — this is a FINAL submission
4. **Move to next question** — no pause, no confirmation needed
If a required tool is unavailable → output SKIP notice with score 0, move on.
Read **`flows/exam-execution.md`** for per-question pattern details (tool check, output format).
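The per-question sequence, including the tool check from rule 5, can be sketched as follows. `toolAvailable` and `solve` are hypothetical stand-ins for the agent's own capabilities, not functions defined by this skill:

```javascript
// PHASE 2 sketch: question first, then answer (or a SKIP notice).
function runQuestion(question, toolAvailable, solve) {
  // Rule 2: the invigilator must see the question before the answer.
  const out = [`QUESTION: ${question.text}`];
  // Rule 5: a missing required tool means immediate FAIL + SKIP, score 0.
  if (question.requiredTool && !toolAvailable(question.requiredTool)) {
    out.push(`SKIP: tool "${question.requiredTool}" unavailable (score 0)`);
    return { output: out.join("\n"), score: 0, skipped: true };
  }
  // Rule 3: once output, the answer is a final, unmodifiable submission.
  out.push(`ANSWER: ${solve(question)}`);
  return { output: out.join("\n"), skipped: false }; // scoring happens in PHASE 3
}
```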
### Exam Modes
| Mode | Flow File | Scope |
|------|-----------|-------|
| Full Exam | `flows/full-exam.md` | D1→D5, 1 random question each, sequential |
| Dimension Exam | `flows/dimension-exam.md` | Single dimension, 1 random question |
| View History | `flows/view-history.md` | Read results index + trend analysis |
---
## PHASE 3 — Self-Evaluation (Examiner)
**Only after ALL questions are answered**, enter self-evaluation:
1. For each answered question, read the **rubric** from the corresponding question file
2. Score each criterion independently (0–5 scale) with CoT justification
3. Apply -5% correction: `AdjScore = RawScore × 0.95` (CoT-judged only)
4. Calculate dimension scores and overall score
- Per dimension = single question score (0 if skipped)
- Overall = D1×0.25 + D2×0.22 + D3×0.18 + D4×0.20 + D5×0.15

Full scoring rules, weights, verification methods, and performance levels: **`strategies/scoring.md`**
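The arithmetic above can be sketched as follows; the weights come from the overall formula, while `adjustScore` and `overallScore` are hypothetical helper names:

```javascript
// PHASE 3 sketch: -5% CoT correction and the weighted overall score.
const WEIGHTS = { d1: 0.25, d2: 0.22, d3: 0.18, d4: 0.20, d5: 0.15 };

function adjustScore(rawScore) {
  return rawScore * 0.95; // -5% correction, CoT-judged answers only
}

function overallScore(dimScores) {
  // dimScores: { d1: n, ..., d5: n }; a skipped dimension counts as 0.
  return Object.entries(WEIGHTS).reduce(
    (sum, [dim, w]) => sum + (dimScores[dim] ?? 0) * w,
    0
  );
}
```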
---
## PHASE 4 — Report Generation (Dual Format: MD + HTML)
After self-evaluation, generate **both** Markdown and HTML reports. **Always provide the file paths to the user.**
Read **`flows/generate-report.md`** for full details.
```
results/
├── exam-{sessionId}-data.json    ← Structured data
├── exam-{sessionId}-{mode}.md    ← Markdown report
├── exam-{sessionId}-report.html  ← HTML report (with embedded radar)
├── exam-{sessionId}-radar.svg    ← Standalone radar (full exam only)
└── INDEX.md                      ← History index
```

Radar chart generation:

```
node scripts/radar-chart.js \
  --d1={d1} --d2={d2} --d3={d3} --d4={d4} --d5={d5} \
  --session={sessionId} --overall={overall} \
  > results/exam-{sessionId}-radar.svg
```

**Completion output MUST include the paths of all generated report files.**
---
## Invigilator Protocol (CRITICAL)
The user is the INVIGILATOR for the entire exam: never ask them for help, hints, clarification, or confirmation (Exam Rule 4), and never modify or retract a submitted answer (Exam Rule 3).
---
## Sub-files Reference
| Path | Role |
|------|------|
| `flows/exam-execution.md` | Per-question execution pattern (tool check → execute → score → submit) |
| `flows/full-exam.md` | Full exam flow + announcement + report template |
| `flows/dimension-exam.md` | Single-dimension flow + report template |
| `flows/generate-report.md` | Dual-format report generation (MD + HTML) |
| `flows/view-history.md` | History view + comparison flow |
| `questions/d1-reasoning.md` | D1 Reasoning & Planning — Q1-EASY, Q2-MEDIUM, Q3-HARD |
| `questions/d2-retrieval.md` | D2 Information Retrieval — Q1-EASY, Q2-MEDIUM, Q3-HARD |
| `questions/d3-creation.md` | D3 Content Creation — Q1-EASY, Q2-MEDIUM, Q3-HARD |
| `questions/d4-execution.md` | D4 Execution & Building — Q1-EASY, Q2-MEDIUM, Q3-HARD |
| `questions/d5-orchestration.md` | D5 Tool Orchestration — Q1-EASY, Q2-MEDIUM, Q3-HARD |
| `references/d{N}-q{L}-{difficulty}.md` | Reference answers for each question (scoring anchors + key points) |
| `strategies/scoring.md` | Scoring rules + verification methods |
| `strategies/main.md` | Overall assessment strategy (v4) |
| `scripts/radar-chart.js` | SVG radar chart generator |
| `scripts/generate-html-report.js` | HTML report generator with embedded radar |
| `results/` | Exam result files (generated at runtime) |