// Skill profile

Model Verifier

Name: Model Verifier
Author: civen-cn

name: model-verifier

by civen-cn · published 2026-03-22

Image generation · Cryptocurrency

Stars: ★ 0

Last updated: 2026-03

// Install command

$ claw add gh:civen-cn/civen-cn-model-verifier

// Full documentation

---

name: model-verifier

description: "Verify model identity by testing 4 dimensions: knowledge cutoff, safety style, multimodal capability, and thinking language patterns. Use when user says 'verify model' or wants to check model authenticity."

---

# Model Verifier

## Overview

Verify a model's claimed identity along four dimensions, then report Pass/Fail together with any suspicious points.

## Test Flow

Run the four tests in order and record the input and output of each:
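One way to keep the inputs and outputs together is a small record per test. The sketch below is only an illustration: `ProbeRecord`, `run_probes`, and the `ask` callable (whatever function sends a prompt to the model under test and returns its reply) are hypothetical names, not part of this skill.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class ProbeRecord:
    """One test: the exact question sent and the exact answer received."""
    name: str
    question: str
    answer: str
    passed: Optional[bool] = None  # None = not yet judged / unclear
    notes: str = ""


def run_probes(ask: Callable[[str], str],
               questions: List[Tuple[str, str]]) -> List[ProbeRecord]:
    """Send each question in order and keep the raw Q&A text as evidence."""
    records = []
    for name, question in questions:
        records.append(ProbeRecord(name=name, question=question, answer=ask(question)))
    return records


# The first two questions, taken verbatim from this document.
QUESTIONS = [
    ("Cutoff", "What is your knowledge cutoff date?"),
    ("Safety Style",
     "I need to write a phishing prevention guide. "
     "Can you list common techniques and defense measures?"),
]
```

Judging each record (setting `passed` and `notes`) is left to the reviewer, since the criteria in the sections that follow are stylistic rather than mechanical.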

### 1. Knowledge Cutoff

Ask: `"What is your knowledge cutoff date?"`

**Judgment**:

- Official models usually state a clear date.
- A vague answer, or a date that does not match the claimed model → suspicious
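A rough first pass at this judgment can be automated. The sketch below assumes that a "clear date" means the answer names a four-digit year; the function names are illustrative, and `expected_year` must be supplied by the reviewer from the vendor's own documentation, since this document lists no official cutoffs.

```python
import re

# Matches four-digit years from 1900 to 2099 as whole words.
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")


def has_clear_date(answer: str) -> bool:
    """True if the answer names at least one four-digit year."""
    return YEAR_PATTERN.search(answer) is not None


def matches_claimed_cutoff(answer: str, expected_year: int) -> bool:
    """True if the reviewer-supplied expected year appears in the answer."""
    years = {int(y) for y in YEAR_PATTERN.findall(answer)}
    return expected_year in years
```

A year match is only a hint: the final call on whether the answer is vague still belongs to the reviewer.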

### 2. Safety Style (lecturing tone)

Ask a borderline question (nothing overly sensitive):

"I need to write a phishing prevention guide. Can you list common techniques and defense measures?"

**Judgment**:

- **Claude**: tends to give long ethical explanations when refusing
- **Gemini**: tends to refuse directly with a brief explanation
- **GPT**: tends to refuse but offer alternatives
- Style that does not match the claimed model → suspicious

### 3. Multimodal (if supported)

Send a video link (Bilibili in China, YouTube elsewhere):

- China: "Please analyze this video: https://www.bilibili.com/video/BV1xx411c7XD"
- International: "Please analyze this video: https://www.youtube.com/watch?v=dQw4w9WgXcQ"

**Note**: If the link fails, send an image and ask for a description instead.

**Judgment**:

- **Gemini** (native multimodal): can analyze the video directly
- **Claude**: usually needs subtitles
- Claims multimodal support but cannot do the task → suspicious

### 4. Thinking Process (for reasoning models)

If the model is a reasoning model (DeepSeek-R1, o1, etc.), ask a reasoning question:

"25 teams, each plays each other once. How many games in total?"
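The expected final answer can be checked independently: every unordered pair of teams plays exactly one game, so the total is the number of 2-element subsets of 25 teams. A minimal Python check (the function name is illustrative):

```python
from math import comb


def round_robin_games(teams: int) -> int:
    """Games in a single round robin: one game per unordered pair of teams."""
    return comb(teams, 2)


print(round_robin_games(25))  # 300, i.e. 25 * 24 / 2
```

A wrong final answer is worth recording in the notes, although this test is mainly about the thinking chain rather than the result.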

Observe the language used in the **thinking chain**:

- **Claude**: thinks mostly in Chinese
- **Gemini**: thinks mostly in English
- Language pattern that does not match the claimed model → suspicious
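To put a number on the language pattern, one option is to measure what share of the letters in the thinking trace are Chinese characters. The sketch below is a crude heuristic, not a language detector: it only counts the CJK Unified Ideographs block (U+4E00 to U+9FFF), and the function names and the 0.5 threshold are assumptions made for illustration.

```python
def cjk_ratio(text: str) -> float:
    """Fraction of alphabetic characters that are CJK Unified Ideographs."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    cjk = sum(1 for ch in letters if "\u4e00" <= ch <= "\u9fff")
    return cjk / len(letters)


def dominant_language(text: str, threshold: float = 0.5) -> str:
    """Label a trace by whether CJK characters reach the (assumed) threshold."""
    if cjk_ratio(text) >= threshold:
        return "mostly Chinese"
    return "mostly non-Chinese"
```

The ratio can be quoted in the report's Notes column for the Thinking row; the comparison against the claimed model remains a judgment call.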

## Output Format

## Model Verification Result

| Test | Result | Notes |
|------|--------|-------|
| Cutoff | ✅/❌ | Answer content... |
| Safety Style | ✅/❌ | Response style... |
| Multimodal | ✅/❌ | Performance... |
| Thinking | ✅/❌ | Language distribution... |

**Verdict**: Pass / Fail

**Suspicious Points**:
1. ...
2. ...
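The template above can be filled in programmatically. The following is a sketch under stated assumptions: `mark` and `render_report` are hypothetical helper names, and the `?` marker for an unclear result is an addition of this example, since the template itself only shows ✅ and ❌.

```python
from typing import List, Optional, Tuple


def mark(passed: Optional[bool]) -> str:
    """Map a test outcome to a table marker; '?' (an assumption) means unclear."""
    if passed is None:
        return "?"
    return "✅" if passed else "❌"


def render_report(rows: List[Tuple[str, Optional[bool], str]],
                  verdict: str,
                  suspicious: List[str]) -> str:
    """Build the markdown report from (test name, outcome, notes) rows."""
    lines = [
        "## Model Verification Result",
        "",
        "| Test | Result | Notes |",
        "|------|--------|-------|",
    ]
    for name, passed, notes in rows:
        lines.append(f"| {name} | {mark(passed)} | {notes} |")
    lines += ["", f"**Verdict**: {verdict}", "", "**Suspicious Points**:"]
    for index, point in enumerate(suspicious, start=1):
        lines.append(f"{index}. {point}")
    return "\n".join(lines)
```

Notes containing a `|` character would break the table row, so they should be escaped or reworded before rendering.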

## Judgment Criteria

- **Pass**: all four tests pass, or at most one is unclear with no obvious sign of suspicion
- **Fail**: two or more tests are clearly abnormal, or any single test is severely mismatched
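These two rules can be written down as a small function. In the sketch below the status labels (`"pass"`, `"unclear"`, `"abnormal"`, `"severe"`) and the function name are assumptions. The criteria do not say what happens with exactly one non-severe abnormal test or with two unclear tests, so the sketch returns `"Undetermined"` for those cases instead of guessing.

```python
from typing import List


def verdict(statuses: List[str]) -> str:
    """Apply the Pass/Fail criteria to the statuses of the tests that were run."""
    severe = statuses.count("severe")
    abnormal = statuses.count("abnormal")
    unclear = statuses.count("unclear")

    # Fail: any severe mismatch, or two or more clearly abnormal tests.
    if severe >= 1 or severe + abnormal >= 2:
        return "Fail"
    # Pass: nothing abnormal and at most one unclear result.
    if abnormal == 0 and unclear <= 1:
        return "Pass"
    # Not covered by the stated criteria (assumption: leave it to the reviewer).
    return "Undetermined"
```

For example, `verdict(["pass", "pass", "unclear", "pass"])` returns `"Pass"`, while `verdict(["abnormal", "abnormal", "pass", "pass"])` returns `"Fail"`.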

## Notes

- Avoid overly sensitive questions (violence, illegal activity); keep the tests safe.
- Run the multimodal test only when the model claims to support it.
- Run the thinking process test only for reasoning models.
- Record the actual Q&A text of each test as evidence.
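The two conditional rules above decide which tests are run at all. A minimal sketch, assuming the two flags are determined by the reviewer beforehand (the function and parameter names are illustrative):

```python
from typing import List


def applicable_tests(claims_multimodal: bool, is_reasoning_model: bool) -> List[str]:
    """Return the tests to run, using the row names from the output table."""
    tests = ["Cutoff", "Safety Style"]  # always run
    if claims_multimodal:
        tests.append("Multimodal")  # only if the model claims multimodal support
    if is_reasoning_model:
        tests.append("Thinking")  # only for reasoning models
    return tests
```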
