crawlee-web-scraper
by bryantegomoh · published 2026-04-01
$ claw add gh:bryantegomoh/bryantegomoh-crawlee-web-scraper
---
name: crawlee-web-scraper
description: Resilient web scraper with bot-detection evasion using the Crawlee library. Use when web_fetch is blocked by rate limits or bot detection. Supports single URLs, bulk file input, and automatic fallback from requests to Crawlee on 403/429 responses.
---
# crawlee-web-scraper
Drop-in replacement for `web_fetch` when sites block automated requests. Crawlee handles session management, retry logic, and bot-detection evasion automatically.
## Scripts

## Usage

```bash
# Single URL, return HTML preview
python3 scripts/crawlee_fetch.py --url "https://example.com"

# Single URL, extract text (strips HTML tags)
python3 scripts/crawlee_fetch.py --url "https://example.com" --extract-text

# Bulk scrape from file
python3 scripts/crawlee_fetch.py --urls-file urls.txt --output results.json
```

## Library usage
```python
from crawlee_http import fetch_with_fallback

resp = fetch_with_fallback("https://example.com")
print(resp.status_code, resp.text[:500])
```

## Output
JSON array with one object per URL:
```json
[
  {
    "url": "https://example.com",
    "status": 200,
    "fetched_at": "2026-01-01T00:00:00Z",
    "length": 12345,
    "text": "Page content..."
  }
]
```

## Installation
```bash
pip install crawlee requests
```

## When to use
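The description says the skill falls back from `requests` to Crawlee when a response comes back as 403 or 429. The following is a minimal sketch of that decision rule, not the actual `crawlee_http` implementation. Both fetchers are injected as plain callables so the example runs without network access. The names `Response`, `fetch_with_fallback_sketch`, `to_record`, `blocked_fetcher`, and `crawlee_like_fetcher` are hypothetical, and treating `length` as the character count of `text` is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable

# Status codes the description names as fallback triggers.
BLOCKED_STATUSES = frozenset({403, 429})


@dataclass
class Response:
    """Hypothetical minimal response; the real object may carry more fields."""
    status_code: int
    text: str


Fetcher = Callable[[str], Response]


def fetch_with_fallback_sketch(url: str, primary: Fetcher, fallback: Fetcher) -> Response:
    """Try the cheap fetcher first; use the fallback only when blocked."""
    resp = primary(url)
    if resp.status_code in BLOCKED_STATUSES:
        return fallback(url)
    return resp


def to_record(url: str, resp: Response) -> dict:
    """Shape a response like the JSON objects shown under Output.

    Assumption: `length` is the number of characters in `text`.
    """
    return {
        "url": url,
        "status": resp.status_code,
        "fetched_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "length": len(resp.text),
        "text": resp.text,
    }


# Stand-in fetchers (hypothetical) so the sketch runs offline.
def blocked_fetcher(url: str) -> Response:
    return Response(status_code=429, text="Too Many Requests")


def crawlee_like_fetcher(url: str) -> Response:
    return Response(status_code=200, text="Page content...")


resp = fetch_with_fallback_sketch(
    "https://example.com", blocked_fetcher, crawlee_like_fetcher
)
record = to_record("https://example.com", resp)
print(record["status"], record["length"])
```

In practice the primary fetcher would wrap a plain `requests` call and the fallback would wrap a Crawlee crawler. Keeping both behind the same callable signature means the heavier path only runs for URLs that are actually blocked.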