// Skill profile

Boof 🍑


by chiefsegundo · published 2026-03-22

// Install command
$ claw add gh:chiefsegundo/chiefsegundo-boof
// Full documentation

---

name: boof

description: "Convert PDFs and documents to markdown, index them locally for RAG retrieval, and analyze them token-efficiently. Use when asked to: read/analyze/summarize a PDF, process a document, boof a file, extract information from papers/decks/NOFOs, or when you need to work with large documents without filling the context window. Supports batch processing and cross-document queries."

---

# Boof 🍑

Local-first document processing: PDF → markdown → RAG index → token-efficient analysis.

Documents stay local. Only relevant chunks go to the LLM. Maximum knowledge absorption, minimum token burn.

Powered by [opendataloader-pdf](https://github.com/opendataloader-project/opendataloader-pdf), #1 in PDF parsing benchmarks (0.90 overall, 0.93 table accuracy). CPU-only, no GPU required.

## Quick Reference

```bash
# Convert + index a document
bash {SKILL_DIR}/scripts/boof.sh /path/to/document.pdf

# Convert with custom collection name
bash {SKILL_DIR}/scripts/boof.sh /path/to/document.pdf --collection my-project

# Query indexed content
qmd query "your question" -c collection-name
```

## Core Workflow

1. **Boof it:** Run `boof.sh` on a PDF. This converts it to markdown via opendataloader-pdf (local Java engine, no API, no GPU) and indexes it into QMD for semantic search.

2. **Query it:** Use `qmd query` to retrieve only the relevant chunks. Send those chunks to the LLM, not the entire document.

3. **Analyze it:** The LLM sees focused, relevant excerpts. No wasted tokens, no lost-in-the-middle problems.
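
The first two steps above can be chained in a small wrapper. This is a minimal sketch, not part of the skill: the function name `boof_then_query` is hypothetical, and it assumes `SKILL_DIR` is available as an environment variable pointing at the skill directory (the skill itself writes it as the `{SKILL_DIR}` placeholder). The `qmd` path falls back to the documented `QMD_BIN` default.

```bash
#!/usr/bin/env bash
# Sketch only: boof_then_query and SKILL_DIR-as-env-var are assumptions.

boof_then_query() {
  local pdf="$1" collection="$2" question="$3"
  local qmd="${QMD_BIN:-$HOME/.bun/bin/qmd}"   # documented default location

  if [ -z "${SKILL_DIR:-}" ]; then
    echo "SKILL_DIR is not set" >&2
    return 2
  fi

  # Step 1: convert the PDF to markdown and index it into QMD.
  bash "$SKILL_DIR/scripts/boof.sh" "$pdf" --collection "$collection" || return 1

  # Step 2: retrieve only the relevant chunks. These, not the whole
  # document, are what gets handed to the LLM in step 3.
  "$qmd" query "$question" -c "$collection"
}
```

A call might look like `boof_then_query /path/to/document.pdf my-project "your question"`, with the printed chunks then pasted or piped into the analysis prompt.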

## When to Use Each Approach

**"Analyze this specific aspect of the paper"** → Boof + query (cheapest, most focused)

**"Summarize this entire document"** → Boof, then read the markdown section by section. Summarize each section individually, then merge summaries. See [advanced-usage.md](references/advanced-usage.md).

**"Compare findings across multiple papers"** → Boof all papers into one collection, then query across them.

**"Find where the paper discusses X"** → `qmd search "X" -c collection` for exact match, `qmd query "X" -c collection` for semantic match.
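
For the cross-document case, one way to get several papers into a single collection is a plain loop over `boof.sh`. This is a sketch under assumptions: the function name `boof_all` is hypothetical, `SKILL_DIR` is assumed to be set as an environment variable, and `boof.sh` may well offer its own batch mode that makes the loop unnecessary.

```bash
#!/usr/bin/env bash
# Sketch only: boof_all and the per-file loop are assumptions, not the
# skill's own batch mechanism.

boof_all() {
  local dir="$1" collection="$2" pdf failed=0
  for pdf in "$dir"/*.pdf; do
    [ -e "$pdf" ] || continue            # the glob matched nothing
    if ! bash "$SKILL_DIR/scripts/boof.sh" "$pdf" --collection "$collection"; then
      echo "boof failed: $pdf" >&2       # keep going, report at the end
      failed=1
    fi
  done
  return "$failed"
}

# Usage (directory and collection name are placeholders):
#   boof_all ~/papers lit-review
#   qmd search "ablation" -c lit-review                        # exact match
#   qmd query "which papers report ablations?" -c lit-review   # semantic match
```

Continuing past a failed file, rather than stopping, keeps one unreadable PDF from blocking the rest of the collection; the non-zero return still signals that something needs a second look.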

## Output Location

Converted markdown files are saved to `knowledge/boofed/` by default (override with `--output-dir` or the `BOOF_OUTPUT_DIR` environment variable).
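
Because the converted output is plain markdown on disk, the section-by-section summarization described earlier can start from a simple split on headings. The sketch below is illustrative only: `split_sections` and the `section-NNN.md` naming are hypothetical, and it assumes the converter emits ATX-style headings (`#`, `##`, ...), which has not been verified here.

```bash
#!/usr/bin/env bash
# Sketch only: split_sections and its output naming are hypothetical.

split_sections() {
  local md="$1" outdir="$2"
  mkdir -p "$outdir" || return 1
  rm -f "$outdir"/section-*.md           # start clean so reruns do not append
  awk -v dir="$outdir" '
    /^```/ { in_fence = !in_fence }      # ignore "# ..." lines inside code fences
    !in_fence && /^#+ / { n++ }          # each heading starts a new section
    { out = sprintf("%s/section-%03d.md", dir, n); print >> out; close(out) }
  ' "$md"
}

# Usage (the markdown file name is a placeholder):
#   split_sections knowledge/boofed/document.md /tmp/document-sections
```

Any text before the first heading lands in `section-000.md`; each later file begins with its heading line, so the pieces can be summarized one at a time and the summaries merged afterwards.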

## Setup

If `boof.sh` reports missing dependencies, see [setup-guide.md](references/setup-guide.md) for installation instructions (Java + opendataloader-pdf + QMD).

## Environment

- `ODL_ENV`: Path to the opendataloader-pdf Python venv (default: `~/.openclaw/tools/odl-env`)
- `QMD_BIN`: Path to the qmd binary (default: `~/.bun/bin/qmd`)
- `BOOF_OUTPUT_DIR`: Default output directory (default: `~/.openclaw/workspace/knowledge/boofed`)
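
As an illustration of how these variables and their defaults fit together, the sketch below resolves each one and runs a rough dependency check before any conversion is attempted. Both function names are hypothetical, and this is not how `boof.sh` itself reports missing dependencies; it only mirrors the defaults listed above plus the Java requirement from the setup notes.

```bash
#!/usr/bin/env bash
# Sketch only: boof_resolve_env and boof_preflight are hypothetical helpers.

boof_resolve_env() {
  # Keep any value already set; otherwise fall back to the documented default.
  : "${ODL_ENV:=$HOME/.openclaw/tools/odl-env}"
  : "${QMD_BIN:=$HOME/.bun/bin/qmd}"
  : "${BOOF_OUTPUT_DIR:=$HOME/.openclaw/workspace/knowledge/boofed}"
  export ODL_ENV QMD_BIN BOOF_OUTPUT_DIR
}

boof_preflight() {
  local missing=0
  boof_resolve_env
  command -v java >/dev/null 2>&1 || { echo "missing: java" >&2; missing=1; }
  [ -d "$ODL_ENV" ] || { echo "missing: venv at $ODL_ENV" >&2; missing=1; }
  [ -x "$QMD_BIN" ] || { echo "missing: qmd at $QMD_BIN" >&2; missing=1; }
  return "$missing"
}
```

Running `boof_preflight && echo ready` prints `ready` only when all three checks pass; otherwise the messages on stderr name what to install or which variable to point elsewhere.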