LiteParse
by alfred-intel-handler-source · published 2026-04-01
$ claw add gh:alfred-intel-handler-source/alfred-intel-handler-source-liteparse
---
name: liteparse
description: Parse, extract text from, and screenshot PDF and document files locally using the LiteParse CLI (`lit`). Use when asked to extract text from a PDF, parse a Word/Excel/PowerPoint file, batch-process a folder of documents, or generate page screenshots for LLM vision workflows. Runs entirely offline — no cloud, no API key. Supports PDF, DOCX, XLSX, PPTX, images (jpg/png/webp), and more. Triggers on phrases like "extract text from this PDF", "parse this document", "get the text out of", "screenshot this PDF page", or any request to read/extract content from a file.
---
# LiteParse
Local document parser built on PDF.js + Tesseract.js. Zero cloud dependencies.
**Binary:** `lit` (installed globally via npm)
**Docs:** https://developers.llamaindex.ai/liteparse/
## Quick Reference

```shell
# Parse a PDF to text (stdout)
lit parse document.pdf

# Parse to file
lit parse document.pdf -o output.txt

# Parse to JSON (includes bounding boxes)
lit parse document.pdf --format json -o output.json

# Specific pages only
lit parse document.pdf --target-pages "1-5,10,15-20"

# No OCR (faster, text-layer PDFs only)
lit parse document.pdf --no-ocr

# Batch parse a directory
lit batch-parse ./input-dir ./output-dir

# Screenshot pages (for vision model input)
lit screenshot document.pdf -o ./screenshots
lit screenshot document.pdf --target-pages "1,3,5" --dpi 300 -o ./screenshots
```

## Output Formats

| Format | Use case |
|--------|----------|
| `text` (default) | Plain text extraction, feeding into prompts |
| `json` | Structured output with bounding boxes, useful for layout-aware tasks |
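
The choice between the two formats can be scripted. The following is a minimal sketch, assuming the destination file's extension is a good enough signal for the desired format. `pick_format`, `parse_to`, and the `LIT` variable are hypothetical helpers, not part of LiteParse; the only `lit` options used are the `--format json` and `-o` flags from the Quick Reference.

```shell
# Hypothetical helpers for illustration; not part of LiteParse.

# Command to invoke; override with LIT=echo for a dry run.
LIT="${LIT:-lit}"

# Choose an output format from the destination file name.
pick_format() {
  case "$1" in
    *.json) echo "json" ;;
    *)      echo "text" ;;
  esac
}

# Parse an input file to an output file, adding --format json only
# when the destination ends in .json (text is the default format).
parse_to() {
  input="$1"
  output="$2"
  if [ "$(pick_format "$output")" = "json" ]; then
    "$LIT" parse "$input" --format json -o "$output"
  else
    "$LIT" parse "$input" -o "$output"
  fi
}
```

With these definitions, `parse_to document.pdf output.json` would run `lit parse document.pdf --format json -o output.json`. Setting `LIT=echo` beforehand prints the arguments instead of running the parser, which may be useful as a dry run.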
## OCR Behavior

## Supported File Types

Works natively: **PDF**
Requires **LibreOffice** (`brew install --cask libreoffice`): .docx, .doc, .xlsx, .xls, .pptx, .ppt, .odt, .csv
Requires **ImageMagick** (`brew install imagemagick`): .jpg, .png, .gif, .bmp, .tiff, .webp
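
The mapping above can be checked before a file is handed to `lit`. This is a rough sketch rather than anything LiteParse ships: `required_tool` and `tool_available` are hypothetical names, and the executable names probed (`soffice` for LibreOffice, `magick` or `convert` for ImageMagick) are an assumption about how those packages are commonly installed, not a statement about how LiteParse locates them.

```shell
# Hypothetical helpers for illustration; not part of LiteParse.

# Map a file name to the external tool the list above says it needs.
required_tool() {
  # Lowercase the extension so REPORT.DOCX and report.docx match alike.
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    pdf) echo "none" ;;
    docx|doc|xlsx|xls|pptx|ppt|odt|csv) echo "libreoffice" ;;
    jpg|png|gif|bmp|tiff|webp) echo "imagemagick" ;;
    *) echo "unknown" ;;
  esac
}

# Succeed if the tool needed for the file appears to be on PATH.
# Assumption: the executable names below; adjust for your system.
tool_available() {
  case "$(required_tool "$1")" in
    none) return 0 ;;
    libreoffice) command -v soffice >/dev/null 2>&1 ;;
    imagemagick)
      command -v magick >/dev/null 2>&1 || command -v convert >/dev/null 2>&1 ;;
    *) return 1 ;;
  esac
}
```

A caller might guard a parse with `tool_available report.docx && lit parse report.docx`, printing the relevant `brew install` hint from the list above when the check fails.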
## Installation Notes

## Workflow Tips

## Limitations

## Reference

See `references/output-examples.md` for sample JSON/text output structure.