Agent Learner
version: "2.0.0"
by bytesagain · published 2026-03-22
$ claw add gh:bytesagain/bytesagain-ba-agent-learner---
version: "2.0.0"
name: agent-learner
description: "Benchmark and compare agent prompts and evaluation results. Use when tuning strategies, evaluating outputs, or comparing configurations."
author: BytesAgain
homepage: https://bytesagain.com
source: https://github.com/bytesagain/ai-skills
---
# Agent Learner
An AI toolkit for configuring, benchmarking, comparing, and optimizing agent prompts and evaluation results. Agent Learner provides persistent, file-based logging for each command category with timestamped entries, summary statistics, multi-format export, and full-text search across all records.
Commands
| Command | Description |
|---------|-------------|
| `configure` | Configure agent settings — log configuration entries or view recent ones |
| `benchmark` | Benchmark agent performance — log benchmark results or view history |
| `compare` | Compare agent outputs — log comparison data or view recent comparisons |
| `prompt` | Prompt management — log prompt variations or view recent prompts |
| `evaluate` | Evaluate agent outputs — log evaluation results or view history |
| `fine-tune` | Fine-tune parameters — log fine-tuning sessions or view recent ones |
| `analyze` | Analyze agent behavior — log analysis entries or view recent analyses |
| `cost` | Cost tracking — log cost data or view recent cost entries |
| `usage` | Usage monitoring — log usage metrics or view recent usage data |
| `optimize` | Optimize configurations — log optimization runs or view history |
| `test` | Test agent behavior — log test results or view recent tests |
| `report` | Report generation — log report entries or view recent reports |
| `stats` | Show summary statistics across all log categories (entry counts, data size, first entry date) |
| `export <fmt>` | Export all data in json, csv, or txt format to the data directory |
| `search <term>` | Full-text search across all log files (case-insensitive) |
| `recent` | Show the 20 most recent entries from the activity history log |
| `status` | Health check — show version, data directory, total entries, disk usage, and last activity |
| `help` | Show the full help message with all available commands |
| `version` | Print the current version string |
Each data command (configure, benchmark, compare, etc.) works in two modes:
Data Storage
All data is stored in plain text files under the data directory:
Default data directory: `~/.local/share/agent-learner/`
Requirements
When to Use
1. **Benchmarking agent performance** — When you need to track and compare benchmark results across different agent configurations, models, or prompt strategies
2. **Prompt engineering iteration** — When you're testing multiple prompt variations and want to log each version with results for later comparison
3. **Cost and usage tracking** — When you need to monitor API costs and usage metrics over time to optimize spending
4. **Fine-tuning experiments** — When running fine-tuning sessions and you want to log parameters, results, and observations for reproducibility
5. **Cross-category analysis** — When you need to search across all logged data (benchmarks, prompts, evaluations, costs) to find patterns or specific entries
Examples
# Initialize and check status
agent-learner status
# Log a benchmark result
agent-learner benchmark "GPT-4o on MMLU: 88.7% accuracy, 1.2s avg latency"
# Log a prompt variation
agent-learner prompt "System: You are a helpful coding assistant. Always explain your reasoning step by step."
# Compare two configurations
agent-learner compare "GPT-4o vs Claude-3.5: GPT-4o 12% faster, Claude 5% more accurate on code tasks"
# Track costs
agent-learner cost "March batch: 12,450 tokens input, 3,200 tokens output, $0.47 total"
# View all recent benchmarks
agent-learner benchmark
# Search across all logs for a specific term
agent-learner search "accuracy"
# Export all data as JSON
agent-learner export json
# View summary statistics
agent-learner stats
# Show recent activity
agent-learner recentOutput
All commands return output to stdout. Export files are written to the data directory:
agent-learner export json # → ~/.local/share/agent-learner/export.json
agent-learner export csv # → ~/.local/share/agent-learner/export.csv
agent-learner export txt # → ~/.local/share/agent-learner/export.txtEvery command execution is logged to `$DATA_DIR/history.log` for auditing purposes.
---
Powered by BytesAgain | bytesagain.com | hello@bytesagain.com
More tools from the same signal band
Order food/drinks (点餐) on an Android device paired as an OpenClaw node. Uses in-app menu and cart; add goods, view cart, submit order (demo, no real payment).
Sign plugins, rotate agent credentials without losing identity, and publicly attest to plugin behavior with verifiable claims and authenticated transfers.
The philosophical layer for AI agents. Maps behavior to Spinoza's 48 affects, calculates persistence scores, and generates geometric self-reports. Give your...