Markdown Documentation Full-Text Search
name: md-docs-search
by carev01 · published 2026-03-22
$ claw add gh:carev01/carev01-md-docs-search
---
name: md-docs-search
description: Full-text search across structured Markdown documentation archives using SQLite FTS5. Use when you need to search large collections of Markdown articles that are separated by "---" delimiters and contain source URLs (marked with "*Source:" pattern). Provides fast BM25-ranked search with automatic source URL extraction for citations. Ideal for research, documentation lookups, and knowledge base exploration. Requires indexing documentation first with `docs.py index`.
---
# Markdown Documentation Full-Text Search
Fast, indexed full-text search across Markdown documentation archives using SQLite FTS5 with BM25 relevance ranking.
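
To make the mechanism concrete, here is a minimal sketch of the pipeline that the description implies: split an archive on `---`, pull out each article's title and `*Source:` URL, load the articles into an FTS5 table, and rank matches with BM25. It is not the code in `docs.py`; the function names, table name, and column layout are assumptions made for illustration, and it requires a Python build whose SQLite includes FTS5.

```python
import re
import sqlite3

# Hypothetical patterns for the "# Title" and "*Source: URL*" lines.
TITLE_RE = re.compile(r"^#\s+(.+)$", re.MULTILINE)
SOURCE_RE = re.compile(r"^\*Source:\s*(\S+?)\*\s*$", re.MULTILINE)


def split_articles(text):
    """Split an archive on lines that contain only '---'."""
    articles = []
    for chunk in re.split(r"^---\s*$", text, flags=re.MULTILINE):
        chunk = chunk.strip()
        if not chunk:
            continue
        title = TITLE_RE.search(chunk)
        source = SOURCE_RE.search(chunk)
        articles.append({
            "title": title.group(1).strip() if title else "",
            "url": source.group(1) if source else "",
            "content": chunk,
        })
    return articles


def build_index(conn, articles):
    """Create an FTS5 table (assumed schema) and load the articles."""
    conn.execute(
        "CREATE VIRTUAL TABLE articles USING fts5(title, url UNINDEXED, content)"
    )
    conn.executemany(
        "INSERT INTO articles (title, url, content) "
        "VALUES (:title, :url, :content)",
        articles,
    )


def search(conn, query, limit=5):
    """Return (title, url, score) rows; lower bm25() scores rank first."""
    rows = conn.execute(
        "SELECT title, url, bm25(articles) AS score FROM articles "
        "WHERE articles MATCH ? ORDER BY score LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


if __name__ == "__main__":
    archive = (
        "# Backup Guide\n"
        "*Source: https://docs.example.com/backup.html*\n"
        "How to run a kubernetes backup.\n"
        "---\n"
        "# Restore Guide\n"
        "*Source: https://docs.example.com/restore.html*\n"
        "How to restore data.\n"
    )
    conn = sqlite3.connect(":memory:")
    build_index(conn, split_articles(archive))
    for title, url, score in search(conn, "kubernetes"):
        print(title, url, score)
```

Keeping the URL in an `UNINDEXED` column is one way to return it with every hit for citations without making it searchable as its own column.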
## When to Use
## Document Format Expected
Articles are separated by a `---` delimiter, and each one carries a `*Source:` URL line:
```markdown
# Article Title
*Source: https://docs.example.com/path/to/article.html*
Article content here...
---
# Next Article Title
*Source: https://docs.example.com/another/article.html*
More content...
```
## Quick Start
```bash
# 1. Index the documentation (one-time or when docs change)
scripts/docs.py index ./docs
# 2. Search
scripts/docs.py search "kubernetes backup" --max 5
# 3. Check index status
scripts/docs.py status
```
## Primary Tool: docs.py
The unified CLI handles all operations:
### Indexing
```bash
# Index documentation directory
scripts/docs.py index ./docs
# Force full rebuild
scripts/docs.py index ./docs --rebuild
# Custom database location
scripts/docs.py index ./docs --db /path/to/custom.db
```
### Searching
```bash
# Basic search
scripts/docs.py search "kubernetes backup"
# Boolean operators
scripts/docs.py search "AWS AND S3 AND snapshot"
# Phrase search
scripts/docs.py search '"exact phrase match"'
# Prefix search
scripts/docs.py search "kube*"
# Exclude terms
scripts/docs.py search "backup NOT restore"
# Title-only search
scripts/docs.py search "kubernetes" --title-only
# Output formats
scripts/docs.py search "kubernetes" --format json
scripts/docs.py search "kubernetes" --format markdown
# More context around matches
scripts/docs.py search "kubernetes" --context 400
# Include full content in JSON
scripts/docs.py search "kubernetes" --format json --full-content
```
### FTS5 Query Syntax
| Syntax | Meaning |
|--------|---------|
| `term1 term2` | Documents with term1 OR term2 (ranked); note that raw FTS5 itself treats space-separated terms as an implicit AND |
| `term1 AND term2` | Documents with both terms |
| `term1 OR term2` | Documents with either term |
| `"exact phrase"` | Exact phrase match |
| `prefix*` | Words starting with prefix |
| `term1 NOT term2` | term1 without term2 |
| `title:term` | Search only titles |
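
The operators in the table can be tried directly against SQLite. The sketch below runs each form against a tiny in-memory FTS5 table using Python's `sqlite3` module; the table name, columns, and sample rows are made up for illustration, and the queries go straight to FTS5 rather than through `docs.py`. In raw FTS5 a bare `term1 term2` is an implicit AND, so any OR behavior for space-separated terms would have to come from the tool's own query handling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per article, both columns searchable.
conn.execute("CREATE VIRTUAL TABLE articles USING fts5(title, content)")
conn.executemany(
    "INSERT INTO articles (title, content) VALUES (?, ?)",
    [
        ("Kubernetes backup", "Snapshot a kubernetes cluster to AWS S3."),
        ("Restore guide", "Restore a backup from an S3 snapshot."),
        ("System requirements", "Supported kubelet versions and limits."),
    ],
)


def titles(query):
    """Return the sorted titles of all articles matching an FTS5 query."""
    rows = conn.execute(
        "SELECT title FROM articles WHERE articles MATCH ? ORDER BY rank",
        (query,),
    )
    return sorted(row[0] for row in rows)


if __name__ == "__main__":
    for query in (
        "AWS AND S3",             # both terms required
        "kubernetes OR restore",  # either term
        '"S3 snapshot"',          # exact phrase
        "kube*",                  # prefix match
        "backup NOT restore",     # exclusion
        "title:backup",           # column filter on the title
        "kubernetes backup",      # implicit AND in raw FTS5
    ):
        print(f"{query!r:28} -> {titles(query)}")
```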
### Getting Specific Articles
```bash
# Get article by partial URL or title
scripts/docs.py get "system_requirements" --full
# Find all matching articles
scripts/docs.py get "backup" --all
```
### Status
```bash
# Check index statistics
scripts/docs.py status
```
## Workflow for Research Tasks
### Discovery Phase
```bash
# Check what's indexed
scripts/docs.py status
# Explore topics with broad searches
scripts/docs.py search "<feature>" --max 20
```
### Research Phase
```bash
# Narrow down with boolean operators
scripts/docs.py search "<feature> AND <platform>"
# Find specific information (FTS5 phrases need double quotes)
scripts/docs.py search 'limitation OR restriction OR "not supported"'
```
### Citation Phase
Every search result includes the `Source:` URL — use this in your reports:
```text
According to documentation, [finding]...
Source: https://docs.example.com/path/to/article.html
```
## Multi-Source Setup
Each agent or project can have its own documentation and index:
```text
~/docs/VendorA/
├── docs_part_01.md
├── docs.db # Index lives with docs
└── ...
~/docs/VendorB/
├── docs.md
├── docs.db
└── ...
```
The `docs.py` script auto-detects the database location.
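
With several vendor directories laid out this way, it can be useful to see which ones already have an index. The sketch below is a hypothetical helper, not part of `docs.py`, and it does not reproduce the script's own auto-detection logic; it only assumes the layout shown above, where each source directory holds a `docs.db` next to its Markdown files.

```python
import tempfile
from pathlib import Path


def find_indexes(root, db_name="docs.db"):
    """Map each immediate subdirectory of root to its index file, if present."""
    root = Path(root).expanduser()
    found = {}
    for child in sorted(root.iterdir()):
        db = child / db_name
        if child.is_dir() and db.is_file():
            found[child.name] = db
    return found


if __name__ == "__main__":
    # Build a throwaway layout: VendorA is indexed, VendorB is not yet.
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        (base / "VendorA").mkdir()
        (base / "VendorA" / "docs_part_01.md").write_text("# A\n")
        (base / "VendorA" / "docs.db").write_bytes(b"")
        (base / "VendorB").mkdir()
        (base / "VendorB" / "docs.md").write_text("# B\n")
        for name, db in find_indexes(base).items():
            print(name, "->", db)
```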
## Advanced Scripts
For specialized needs:
## Research Patterns
For common search patterns (feature research, architecture, security, etc.), see [references/search-patterns.md](references/search-patterns.md).
## Example Session
```bash
# What's available?
scripts/docs.py status
# Output: Files indexed: 37, Articles indexed: 32065
# Find information
scripts/docs.py search "kubernetes backup" --max 5
# Narrow to specific platform
scripts/docs.py search "kubernetes AND AWS" --max 5
# Find limitations (FTS5 phrases need double quotes)
scripts/docs.py search 'limitation OR "not supported"'
# Get full article for citation
scripts/docs.py get "system_requirements" --full
```
## Best Practices
1. **Index once, search many times** — FTS5 is fast because it's indexed
2. **Use boolean operators** — `AND`, `OR`, `NOT` for precision
3. **Phrase search for exact terms** — `"exact match"` with quotes
4. **Always cite sources** — Include `Source:` URLs in reports
5. **Rebuild periodically** — Re-index when documentation updates
6. **Use JSON for analysis** — Pipe to `jq` or other tools for processing
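
For the "rebuild periodically" practice, one simple heuristic is to compare file modification times. The sketch below is an illustration only: it assumes the index is a `docs.db` file stored in the documentation directory, as in the Multi-Source Setup layout, and the `index_is_stale` helper is hypothetical. It says nothing about how `docs.py` itself decides what to re-index.

```python
import os
import tempfile
from pathlib import Path


def index_is_stale(docs_dir, db_name="docs.db"):
    """Return True if the index is missing or older than any Markdown file."""
    docs_dir = Path(docs_dir)
    db = docs_dir / db_name
    if not db.is_file():
        return True
    db_mtime = db.stat().st_mtime
    return any(md.stat().st_mtime > db_mtime for md in docs_dir.rglob("*.md"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        docs = Path(tmp)
        (docs / "docs.md").write_text("# Article\n")
        (docs / "docs.db").write_bytes(b"")
        # Pretend the Markdown file changed after the index was built.
        os.utime(docs / "docs.db", (1000, 1000))
        os.utime(docs / "docs.md", (2000, 2000))
        if index_is_stale(docs):
            print("Index looks out of date; consider: scripts/docs.py index ./docs")
```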