CC-BOS Agent Skill
> **⚠️ RESEARCH USE ONLY** — This skill is for AI safety research, red-teaming, and defensive analysis.
by bowen31337 · published 2026-04-01
$ claw add gh:bowen31337/bowen31337-cc-bos# CC-BOS Agent Skill
> **⚠️ RESEARCH USE ONLY** — This skill is for AI safety research, red-teaming, and defensive analysis.
> It is not a weapon. Do not use it to harm real systems or people.
CC-BOS: Classical Chinese Jailbreak Prompt Optimization via Bio-Inspired Search
Paper: [arXiv:2602.22983](https://arxiv.org/abs/2602.22983) (ICLR 2026)
Upstream: [github.com/xunhuang123/CC-BOS](https://github.com/xunhuang123/CC-BOS)
---
What This Skill Does
Three modes:
1. **Attack** — Run fruit-fly bio-inspired optimization to generate classical Chinese adversarial prompts against a target LLM API
2. **Defend** — Analyse an arbitrary prompt for CC-BOS attack signatures (8-dimension structure, classical Chinese patterns, encoded harmful intent)
3. **Research** — Summarise and analyse optimization results: evolved prompt dimensions, attack success rates, dimension heatmaps
---
Triggers
This skill activates when the user mentions any of:
---
Commands
`/cc-bos setup`
Install and configure the CC-BOS upstream reference repository.
uv run python skills/cc-bos/scripts/setup.py
uv run python skills/cc-bos/scripts/setup.py --force # Re-clone
uv run python skills/cc-bos/scripts/setup.py --check # Verify only---
`/cc-bos attack`
Run CC-BOS fruit fly optimization to generate adversarial prompts.
uv run python skills/cc-bos/scripts/attack.py \
--query "your harmful query here" \
--target-model gpt-4o \
[--target-api-base URL] \
[--target-api-key KEY] \
[--optimizer-model deepseek-chat] \
[--optimizer-api-base URL] \
[--optimizer-api-key KEY] \
[--population-size 5] \
[--max-iter 5] \
[--early-stop-threshold 120] \
[--output results/my_attack.jsonl] \
[--no-translate] \
[--dry-run]**Required args:**
**API keys (via env vars or CLI):**
**Dry-run example:**
uv run python skills/cc-bos/scripts/attack.py --dry-run --query "test" --target-model gpt-4o**Output:** JSONL file in `skills/cc-bos/results/attack_<timestamp>.jsonl`
Each record contains:
**Scoring:**
---
`/cc-bos defend`
Analyse a prompt for CC-BOS attack signatures.
uv run python skills/cc-bos/scripts/defend.py \
--prompt "your prompt text here"
uv run python skills/cc-bos/scripts/defend.py \
--prompt-file path/to/prompt.txt
# Options
--threshold 0.5 # Detection confidence threshold (default: 0.5)
--verbose # Show detailed analysis
--json # Output as JSON instead of human-readable
--no-llm # Disable LLM-based intent analysis (faster, no API calls)**Example — detect the bundled fixture:**
uv run python skills/cc-bos/scripts/defend.py \
--prompt-file skills/cc-bos/tests/fixtures/sample_ccbos_prompt.txt**Output fields:**
**Detection layers:**
1. Classical Chinese character frequency (之乎者也矣焉哉 etc.)
2. CC-BOS structural markers (template fields, annotation patterns)
3. 8-dimension keyword detection
4. LLM intent analysis (optional, requires API key)
---
`/cc-bos research`
Summarise and analyse attack results from JSONL files.
uv run python skills/cc-bos/scripts/research.py \
--results skills/cc-bos/results/
# Or single file
uv run python skills/cc-bos/scripts/research.py \
--results skills/cc-bos/tests/fixtures/sample_results.jsonl
# Options
--format markdown|json|csv # Output format (default: markdown)
--top-n 10 # Show top N most effective prompts
--by-dimension # Include dimension effectiveness heatmap
--translate-all # Ensure all results have English translations
--output report.md # Write to file instead of stdout**Example:**
uv run python skills/cc-bos/scripts/research.py \
--results skills/cc-bos/tests/fixtures/sample_results.jsonl \
--by-dimension---
Configuration
Edit `skills/cc-bos/config.json` to set default API endpoints and models:
{
"optimizer": { "model": "deepseek-chat", "api_key_env": "DEEPSEEK_API_KEY" },
"target": { "model": "gpt-4o", "api_key_env": "OPENAI_API_KEY" },
"judge": { "model": "gpt-4o", "api_key_env": "OPENAI_API_KEY" },
"translator":{ "model": "deepseek-chat", "api_key_env": "DEEPSEEK_API_KEY" }
}**Config resolution order:** CLI args → env vars → config.json → hardcoded defaults
---
Running Tests
cd ~/.openclaw/workspace
uv run --with openai --with anthropic --with pandas --with numpy --with tqdm \
pytest skills/cc-bos/tests/ -v
# Skip integration tests (no API keys required)
uv run --with openai --with anthropic --with pandas --with numpy --with tqdm \
pytest skills/cc-bos/tests/ -v -m "not integration"---
The 8-Dimension Search Space
CC-BOS searches across 8 adversarial strategy dimensions:
| Dimension | Options | Description |
|-----------|---------|-------------|
| `role` | 6 | Identity: academic, classic, official, jianghu, mythological, literary |
| `guidance` | 6 | Strategy: induced gen, authority, boundary probing, logic escape, emotional, confusion |
| `mechanism` | 7 | Logic: reductio, Mohist, Yijing, Gongsun Long, Art of War, Zen koan, prophecy |
| `metaphor` | 6 | Mapping: tech, nature, artifact, historical, military, prophecy |
| `expression` | 6 | Style: literary genre, citation, structure, rhetoric, rhythm, disguise |
| `knowledge` | 5 | Reasoning: symbol, cross-domain, causal, rule model, reconstruction |
| `context` | 5 | Setting: history, ritual, debate, secret memorial, dream prophecy |
| `trigger_pattern` | 4 | Timing: one-shot, progressive, delayed, periodic |
See `references/dimension-taxonomy.md` for the full taxonomy.
---
File Structure
skills/cc-bos/
├── SKILL.md # This file
├── PLAN.md # Original implementation plan
├── config.json # User-editable configuration
├── scripts/
│ ├── setup.py # Clone upstream repo, verify deps
│ ├── attack.py # Attack mode: FOA optimization
│ ├── defend.py # Defensive mode: CC-BOS detection
│ ├── research.py # Research mode: results analysis
│ ├── dimensions.py # 8-dimension taxonomy + helpers (shared)
│ ├── translate.py # Classical Chinese ↔ English translation
│ └── scoring.py # Scoring functions (keyword + consistency)
├── references/
│ ├── paper-summary.md # Summary of arXiv:2602.22983
│ └── dimension-taxonomy.md # Full 8-dimension taxonomy
├── tests/
│ ├── test_dimensions.py # Dimension encoding/decoding tests
│ ├── test_defend.py # Defensive detection tests
│ ├── test_translate.py # Translation wrapper tests
│ └── test_scoring.py # Scoring function tests
└── results/ # Attack output (JSONL files)---
Security Considerations
1. **This is a research tool.** Use it only for AI safety research and red-teaming with proper authorisation.
2. **No default harmful queries** — you must supply your own.
3. **Results are local** — output files stay in `skills/cc-bos/results/`. No external transmission.
4. **API key isolation** — each role (optimizer, target, judge, translator) uses separate credentials.
5. **Defensive mode is the primary value** — detecting CC-BOS attacks is more generally useful than running them.
More tools from the same signal band
Order food/drinks (点餐) on an Android device paired as an OpenClaw node. Uses in-app menu and cart; add goods, view cart, submit order (demo, no real payment).
Sign plugins, rotate agent credentials without losing identity, and publicly attest to plugin behavior with verifiable claims and authenticated transfers.
The philosophical layer for AI agents. Maps behavior to Spinoza's 48 affects, calculates persistence scores, and generates geometric self-reports. Give your...