by asterisk622 · published 2026-04-01

```bash
claw add gh:asterisk622/asterisk622-xiaoding-dinstein-tech-news-digest---
```

---
name: tech-news-digest
description: Generate tech news digests with unified source model, quality scoring, and multi-format output. Six-source data collection from RSS feeds, Twitter/X KOLs, GitHub releases, GitHub Trending, Reddit, and web search. Pipeline-based scripts with retry mechanisms and deduplication. Supports Discord, email, and markdown templates.
version: "3.15.0"
homepage: https://github.com/draco-agent/tech-news-digest
source: https://github.com/draco-agent/tech-news-digest
metadata:
  openclaw:
    requires:
      bins: ["python3"]
      optionalBins: ["mail", "msmtp", "gog", "gh", "openssl", "weasyprint"]
      env:
        - name: TWITTER_API_BACKEND
          required: false
          description: "Twitter API backend: 'official', 'twitterapiio', or 'auto' (default: auto)"
        - name: X_BEARER_TOKEN
          required: false
          description: Twitter/X API bearer token for KOL monitoring (official backend)
        - name: TWITTERAPI_IO_KEY
          required: false
          description: twitterapi.io API key for KOL monitoring (twitterapiio backend)
        - name: TAVILY_API_KEY
          required: false
          description: Tavily Search API key (alternative to Brave)
        - name: WEB_SEARCH_BACKEND
          required: false
          description: "Web search backend: auto (default), brave, or tavily"
        - name: BRAVE_API_KEYS
          required: false
          description: Brave Search API keys (comma-separated for rotation)
        - name: BRAVE_API_KEY
          required: false
          description: Brave Search API key (single-key fallback)
        - name: GITHUB_TOKEN
          required: false
          description: GitHub token for higher API rate limits (auto-generated from GitHub App if not set)
        - name: GH_APP_ID
          required: false
          description: GitHub App ID for automatic installation token generation
        - name: GH_APP_INSTALL_ID
          required: false
          description: GitHub App installation ID for automatic token generation
        - name: GH_APP_KEY_FILE
          required: false
          description: Path to the GitHub App private key PEM file
    tools:
      - python3: Required. Runs data collection and merge scripts.
      - mail: Optional. msmtp-based mail command for email delivery (preferred).
      - gog: Optional. Gmail CLI for email delivery (fallback if mail is not available).
    files:
      read:
        - config/defaults/: Default source and topic configurations
        - references/: Prompt templates and output templates
        - scripts/: Python pipeline scripts
        - <workspace>/archive/tech-news-digest/: Previous digests for dedup
      write:
        - /tmp/td-*.json: Temporary pipeline intermediate outputs
        - /tmp/td-email.html: Temporary email HTML body
        - /tmp/td-digest.pdf: Generated PDF digest
        - <workspace>/archive/tech-news-digest/: Saved digest archives
---
# Tech News Digest
Automated tech news digest system with unified data source model, quality scoring pipeline, and template-based output generation.
## Quick Start
1. **Configuration Setup**: Default configs are in `config/defaults/`. Copy to workspace for customization:
```bash
mkdir -p workspace/config
cp config/defaults/sources.json workspace/config/tech-news-digest-sources.json
cp config/defaults/topics.json workspace/config/tech-news-digest-topics.json
```
2. **Environment Variables**:
- `TWITTERAPI_IO_KEY` - twitterapi.io API key (optional, preferred)
- `X_BEARER_TOKEN` - Twitter/X official API bearer token (optional, fallback)
- `TAVILY_API_KEY` - Tavily Search API key, alternative to Brave (optional)
- `WEB_SEARCH_BACKEND` - Web search backend: auto|brave|tavily (optional, default: auto)
- `BRAVE_API_KEYS` - Brave Search API keys, comma-separated for rotation (optional)
- `BRAVE_API_KEY` - Single Brave key fallback (optional)
- `GITHUB_TOKEN` - GitHub personal access token (optional, improves rate limits)
3. **Generate Digest**:
```bash
# Unified pipeline (recommended) — runs all 6 sources in parallel + merge
python3 scripts/run-pipeline.py \
--defaults config/defaults \
--config workspace/config \
--hours 48 --freshness pd \
--archive-dir workspace/archive/tech-news-digest/ \
--output /tmp/td-merged.json --verbose --force
```
4. **Use Templates**: Apply Discord, email, or PDF templates to merged output
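As a sketch of step 4, a minimal renderer for the merged output might look like the following. The item fields assumed here (`title`, `url`, `topics`, `score`) are illustrative only; consult the actual output of `run-pipeline.py` (or `summarize-merged.py`) for the real schema.

```python
import json

def render_markdown(merged: dict, top: int = 5) -> str:
    # Assumed shape: {"items": [{"title": ..., "url": ..., "topics": [...], "score": ...}]}
    items = sorted(merged.get("items", []),
                   key=lambda i: i.get("score", 0), reverse=True)
    lines = ["# Tech News Digest", ""]
    for it in items[:top]:
        topics = ", ".join(it.get("topics", []))
        lines.append(f"- [{it['title']}]({it['url']})" + (f" ({topics})" if topics else ""))
    return "\n".join(lines)

# Usage after the pipeline has run:
#   with open("/tmp/td-merged.json") as f:
#       print(render_markdown(json.load(f)))
```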
## Configuration Files
### `sources.json` - Unified Data Sources
```json
{
  "sources": [
    {
      "id": "openai-rss",
      "type": "rss",
      "name": "OpenAI Blog",
      "url": "https://openai.com/blog/rss.xml",
      "enabled": true,
      "priority": true,
      "topics": ["llm", "ai-agent"],
      "note": "Official OpenAI updates"
    },
    {
      "id": "sama-twitter",
      "type": "twitter",
      "name": "Sam Altman",
      "handle": "sama",
      "enabled": true,
      "priority": true,
      "topics": ["llm", "frontier-tech"],
      "note": "OpenAI CEO"
    }
  ]
}
```

### `topics.json` - Enhanced Topic Definitions
```json
{
  "topics": [
    {
      "id": "llm",
      "emoji": "🧠",
      "label": "LLM / Large Models",
      "description": "Large Language Models, foundation models, breakthroughs",
      "search": {
        "queries": ["LLM latest news", "large language model breakthroughs"],
        "must_include": ["LLM", "large language model", "foundation model"],
        "exclude": ["tutorial", "beginner guide"]
      },
      "display": {
        "max_items": 8,
        "style": "detailed"
      }
    }
  ]
}
```

## Scripts Pipeline
### `run-pipeline.py` - Unified Pipeline (Recommended)
```bash
python3 scripts/run-pipeline.py \
  --defaults config/defaults [--config CONFIG_DIR] \
  --hours 48 --freshness pd \
  --archive-dir workspace/archive/tech-news-digest/ \
  --output /tmp/td-merged.json --verbose --force
```

### Individual Scripts (Fallback)
#### `fetch-rss.py` - RSS Feed Fetcher
```bash
python3 scripts/fetch-rss.py [--defaults DIR] [--config DIR] [--hours 48] [--output FILE] [--verbose]
```

#### `fetch-twitter.py` - Twitter/X KOL Monitor
```bash
python3 scripts/fetch-twitter.py [--defaults DIR] [--config DIR] [--hours 48] [--output FILE] [--backend auto|official|twitterapiio]
```

#### `fetch-web.py` - Web Search Engine
```bash
python3 scripts/fetch-web.py [--defaults DIR] [--config DIR] [--freshness pd] [--output FILE]
```

#### `fetch-github.py` - GitHub Releases Monitor
```bash
python3 scripts/fetch-github.py [--defaults DIR] [--config DIR] [--hours 168] [--output FILE]
```

#### `fetch-github.py --trending` - GitHub Trending Repos
```bash
python3 scripts/fetch-github.py --trending [--hours 48] [--output FILE] [--verbose]
```

#### `fetch-reddit.py` - Reddit Posts Fetcher
```bash
python3 scripts/fetch-reddit.py [--defaults DIR] [--config DIR] [--hours 48] [--output FILE]
```

#### `enrich-articles.py` - Article Full-Text Enrichment
```bash
python3 scripts/enrich-articles.py --input merged.json --output enriched.json [--min-score 10] [--max-articles 15] [--verbose]
```

#### `merge-sources.py` - Quality Scoring & Deduplication
```bash
python3 scripts/merge-sources.py --rss FILE --twitter FILE --web FILE --github FILE --reddit FILE
```

#### `validate-config.py` - Configuration Validator
```bash
python3 scripts/validate-config.py [--defaults DIR] [--config DIR] [--verbose]
```

#### `generate-pdf.py` - PDF Report Generator
```bash
python3 scripts/generate-pdf.py --input report.md --output digest.pdf [--verbose]
```

#### `sanitize-html.py` - Safe HTML Email Converter
```bash
python3 scripts/sanitize-html.py --input report.md --output email.html [--verbose]
```

#### `source-health.py` - Source Health Monitor
```bash
python3 scripts/source-health.py --rss FILE --twitter FILE --github FILE --reddit FILE --web FILE [--verbose]
```

#### `summarize-merged.py` - Merged Data Summary
```bash
python3 scripts/summarize-merged.py --input merged.json [--top N] [--topic TOPIC]
```

## User Customization
### Workspace Configuration Override
Place custom configs in `workspace/config/` to override defaults:
- Sources with same `id` → user version takes precedence
- Sources with new `id` → appended to defaults
- Topics with same `id` → user version completely replaces default
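These precedence rules can be sketched as follows. This is one plausible reading, not the skill's actual loader: sources here are merged field-by-field (so a partial override like `{"id": ..., "enabled": false}` keeps the default's other fields), while topics would be replaced wholesale.

```python
def merge_sources(defaults: list, overrides: list) -> list:
    """Apply workspace source overrides to the default source list."""
    by_id = {src["id"]: dict(src) for src in defaults}
    for ov in overrides:
        if ov["id"] in by_id:
            by_id[ov["id"]].update(ov)   # same id: user fields take precedence
        else:
            by_id[ov["id"]] = dict(ov)   # new id: appended to defaults
    return list(by_id.values())
```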
### Example Workspace Override
`workspace/config/tech-news-digest-sources.json`:

```json
{
  "sources": [
    {
      "id": "simonwillison-rss",
      "enabled": false,
      "note": "Disabled: too noisy for my use case"
    },
    {
      "id": "my-custom-blog",
      "type": "rss",
      "name": "My Custom Tech Blog",
      "url": "https://myblog.com/rss",
      "enabled": true,
      "priority": true,
      "topics": ["frontier-tech"]
    }
  ]
}
```

## Templates & Output
- Discord template: `references/templates/discord.md`
- Email template: `references/templates/email.md`
- PDF template: `references/templates/pdf.md`
## Default Sources (151 total)
All sources pre-configured with appropriate topic tags and priority levels.
## Dependencies

```bash
pip install -r requirements.txt
```

**Optional but Recommended**: `feedparser` and `jsonschema` (listed in `requirements.txt`).

**All scripts work with the Python 3.8+ standard library only.**
## Monitoring & Operations
### Health Checks
```bash
# Validate configuration
python3 scripts/validate-config.py --verbose

# Test RSS feeds
python3 scripts/fetch-rss.py --hours 1 --verbose

# Check Twitter API
python3 scripts/fetch-twitter.py --hours 1 --verbose
```

### Archive Management
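Deduplication against previously archived digests (the `--archive-dir` flag passed to the pipeline) can be sketched like this. The archive layout assumed here, JSON files containing an `items` list, is illustrative rather than the skill's actual format.

```python
import json
from pathlib import Path

def previously_seen_urls(archive_dir: str) -> set:
    """Collect normalized URLs from archived digests so today's run can skip repeats."""
    seen = set()
    for path in Path(archive_dir).glob("*.json"):
        try:
            for item in json.loads(path.read_text()).get("items", []):
                seen.add(item.get("url", "").rstrip("/").lower())
        except (json.JSONDecodeError, OSError):
            continue  # skip unreadable archives rather than failing the run
    return seen
```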
### Error Handling
## API Keys & Environment
Set in `~/.zshenv` or similar:
```bash
# Twitter (at least one key required for the Twitter source)
export TWITTERAPI_IO_KEY="your_key"        # twitterapi.io key (preferred)
export X_BEARER_TOKEN="your_bearer_token"  # Official X API v2 (fallback)
export TWITTER_API_BACKEND="auto"          # auto|twitterapiio|official (default: auto)

# Web Search (optional, enables web search layer)
export WEB_SEARCH_BACKEND="auto"           # auto|brave|tavily (default: auto)
export TAVILY_API_KEY="tvly-xxx"           # Tavily Search API (free 1000/mo)

# Brave Search (alternative)
export BRAVE_API_KEYS="key1,key2,key3"     # Multiple keys, comma-separated rotation
export BRAVE_API_KEY="key1"                # Single-key fallback
export BRAVE_PLAN="free"                   # Override rate-limit detection: free|pro

# GitHub (optional, improves rate limits)
export GITHUB_TOKEN="ghp_xxx"              # PAT (simplest)
export GH_APP_ID="12345"                   # Or use a GitHub App for auto-token
export GH_APP_INSTALL_ID="67890"
export GH_APP_KEY_FILE="/path/to/key.pem"
```

## Cron / Scheduled Task Integration
### OpenClaw Cron (Recommended)
The cron prompt should **NOT** hardcode the pipeline steps. Instead, reference `references/digest-prompt.md` and only pass configuration parameters. This ensures the pipeline logic stays in the skill repo and is consistent across all installations.
#### Daily Digest Cron Prompt
```text
Read <SKILL_DIR>/references/digest-prompt.md and follow the complete workflow to generate a daily digest.

Replace placeholders with:
- MODE = daily
- TIME_WINDOW = past 1-2 days
- FRESHNESS = pd
- RSS_HOURS = 48
- ITEMS_PER_SECTION = 3-5
- ENRICH = true
- BLOG_PICKS_COUNT = 3
- EXTRA_SECTIONS = (none)
- SUBJECT = Daily Tech Digest - YYYY-MM-DD
- WORKSPACE = <your workspace path>
- SKILL_DIR = <your skill install path>
- DISCORD_CHANNEL_ID = <your channel id>
- EMAIL = (optional)
- LANGUAGE = English
- TEMPLATE = discord

Follow every step in the prompt template strictly. Do not skip any steps.
```

#### Weekly Digest Cron Prompt
```text
Read <SKILL_DIR>/references/digest-prompt.md and follow the complete workflow to generate a weekly digest.

Replace placeholders with:
- MODE = weekly
- TIME_WINDOW = past 7 days
- FRESHNESS = pw
- RSS_HOURS = 168
- ITEMS_PER_SECTION = 10-15
- ENRICH = true
- BLOG_PICKS_COUNT = 3-5
- EXTRA_SECTIONS = 📊 Weekly Trend Summary (2-3 sentences summarizing macro trends)
- SUBJECT = Weekly Tech Digest - YYYY-MM-DD
- WORKSPACE = <your workspace path>
- SKILL_DIR = <your skill install path>
- DISCORD_CHANNEL_ID = <your channel id>
- EMAIL = (optional)
- LANGUAGE = English
- TEMPLATE = discord

Follow every step in the prompt template strictly. Do not skip any steps.
```

#### Why This Pattern?
#### Multi-Channel Delivery Limitation
OpenClaw enforces **cross-provider isolation**: a single session can only send messages to one provider (e.g., Discord OR Telegram, not both). If you need to deliver digests to multiple platforms, create **separate cron jobs** for each provider:
```text
# Job 1: Discord + Email
- DISCORD_CHANNEL_ID = <your-discord-channel-id>
- EMAIL = user@example.com
- TEMPLATE = discord

# Job 2: Telegram DM
- DISCORD_CHANNEL_ID = (none)
- EMAIL = (none)
- TEMPLATE = telegram
```

Replace the `DISCORD_CHANNEL_ID` delivery with the target platform's delivery in the second job's prompt.
This is a security feature, not a bug — it prevents accidental cross-context data leakage.
## Security Notes
### Execution Model
This skill uses a **prompt template pattern**: the agent reads `digest-prompt.md` and follows its instructions. This is the standard OpenClaw skill execution model — the agent interprets structured instructions from skill-provided files. All instructions are shipped with the skill bundle and can be audited before installation.
### Network Access
The Python scripts make outbound requests to:

- configured RSS feeds
- the Twitter/X API (official or twitterapi.io)
- the GitHub API
- the Reddit JSON API
- the Brave Search and Tavily Search APIs
No data is sent to any other endpoints. All API keys are read from environment variables declared in the skill metadata.
### Shell Safety
Email delivery uses `send-email.py` which constructs proper MIME multipart messages with HTML body + optional PDF attachment. Subject formats are hardcoded (`Daily Tech Digest - YYYY-MM-DD`). PDF generation uses `generate-pdf.py` via `weasyprint`. The prompt template explicitly prohibits interpolating untrusted content (article titles, tweet text, etc.) into shell arguments. Email addresses and subjects must be static placeholder values only.
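A minimal sketch of that MIME construction, using only the standard library. This is a hypothetical helper mirroring what `send-email.py` is described as doing, not the script's actual code:

```python
from email.message import EmailMessage
from pathlib import Path
from typing import Optional

def build_digest_email(subject: str, html_body: str,
                       pdf_path: Optional[str] = None) -> EmailMessage:
    """Build a multipart message: plain-text fallback, HTML body, optional PDF."""
    msg = EmailMessage()
    msg["Subject"] = subject   # static format string, never built from fetched data
    msg.set_content("This digest requires an HTML-capable mail client.")
    msg.add_alternative(html_body, subtype="html")
    if pdf_path and Path(pdf_path).exists():
        msg.add_attachment(Path(pdf_path).read_bytes(),
                           maintype="application", subtype="pdf",
                           filename=Path(pdf_path).name)
    return msg
```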
### File Access
Scripts read from `config/` and write to `workspace/archive/`. No files outside the workspace are accessed.
## Support & Troubleshooting
### Common Issues
1. **RSS feeds failing**: Check network connectivity, use `--verbose` for details
2. **Twitter rate limits**: Reduce sources or increase interval
3. **Configuration errors**: Run `validate-config.py` for specific issues
4. **No articles found**: Check time window (`--hours`) and source enablement
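For transient failures like items 1 and 2, the pipeline scripts are described as using retry mechanisms; the general pattern looks like this (parameters and helper name are illustrative, not the skill's actual code):

```python
import time

def with_retry(fetch, attempts: int = 3, backoff: float = 2.0):
    """Call fetch() until it succeeds, sleeping backoff * 2**i seconds
    between failures."""
    for i in range(attempts):
        try:
            return fetch()
        except Exception:
            if i == attempts - 1:
                raise                       # out of attempts: surface the error
            time.sleep(backoff * (2 ** i))  # 2s, 4s, 8s, ... with defaults

# Usage with a stdlib fetch:
#   import urllib.request
#   body = with_retry(lambda: urllib.request.urlopen(url, timeout=30).read())
```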
### Debug Mode
All scripts support `--verbose` flag for detailed logging and troubleshooting.
### Performance Tuning
## Security Considerations
### Shell Execution
The digest prompt instructs agents to run Python scripts via shell commands. All script paths and arguments are skill-defined constants — no user input is interpolated into commands. Two scripts use `subprocess`:
1. `openssl dgst -sha256 -sign` for JWT signing (only if `GH_APP_*` env vars are set — signs a self-constructed JWT payload, no user content involved)
2. `gh auth token` CLI fallback (only if `gh` is installed — reads from gh's own credential store)
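Item 1 can be sketched as follows. This is a hypothetical reimplementation of the inline signing step, using the standard GitHub App JWT claims (`iat`, `exp`, `iss`) and RS256 via the `openssl` CLI; the skill's actual code may differ:

```python
import base64
import json
import subprocess
import time

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def github_app_jwt(app_id: str, key_file: str) -> str:
    """Sign a short-lived App JWT with the private key via the openssl CLI."""
    now = int(time.time())
    header = b64url(json.dumps({"alg": "RS256", "typ": "JWT"}).encode())
    claims = b64url(json.dumps(
        {"iat": now - 60, "exp": now + 540, "iss": app_id}).encode())
    signing_input = f"{header}.{claims}"
    sig = subprocess.run(
        ["openssl", "dgst", "-sha256", "-sign", key_file],
        input=signing_input.encode(), capture_output=True, check=True,
    ).stdout
    return f"{signing_input}.{b64url(sig)}"
```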
No user-supplied or fetched content is ever interpolated into subprocess arguments. Email delivery uses `send-email.py` which builds MIME messages programmatically — no shell interpolation. PDF generation uses `generate-pdf.py` via `weasyprint`. Email subjects are static format strings only — never constructed from fetched data.
### Credential & File Access
Scripts do **not** directly read `~/.config/`, `~/.ssh/`, or any credential files. All API tokens are read from environment variables declared in the skill metadata. The GitHub auth cascade is:
1. `$GITHUB_TOKEN` env var (you control what to provide)
2. GitHub App token generation (only if you set `GH_APP_ID`, `GH_APP_INSTALL_ID`, and `GH_APP_KEY_FILE` — uses inline JWT signing via `openssl` CLI, no external scripts involved)
3. `gh auth token` CLI (delegates to gh's own secure credential store)
4. Unauthenticated (60 req/hr, safe fallback)
If you prefer no automatic credential discovery, simply set `$GITHUB_TOKEN` and the script will use it directly without attempting steps 2-3.
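The cascade reads as follows in code. A sketch only: the helper name and step-2 placeholder are hypothetical, and the actual script's logic may differ in details:

```python
import os
import shutil
import subprocess
from typing import Optional

def generate_app_token() -> Optional[str]:
    # Placeholder for step 2: sign a JWT with $GH_APP_KEY_FILE and exchange
    # it for an installation token via the GitHub API (not shown here).
    return None

def resolve_github_token() -> Optional[str]:
    """Resolve a GitHub token following the four-step cascade above."""
    if os.environ.get("GITHUB_TOKEN"):          # 1. explicit token always wins
        return os.environ["GITHUB_TOKEN"]
    if all(os.environ.get(v) for v in
           ("GH_APP_ID", "GH_APP_INSTALL_ID", "GH_APP_KEY_FILE")):
        return generate_app_token()             # 2. GitHub App token
    if shutil.which("gh"):                      # 3. gh CLI credential store
        out = subprocess.run(["gh", "auth", "token"],
                             capture_output=True, text=True)
        if out.returncode == 0 and out.stdout.strip():
            return out.stdout.strip()
    return None                                 # 4. unauthenticated (60 req/hr)
```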
### Dependency Installation
This skill does **not** install any packages. `requirements.txt` lists optional dependencies (`feedparser`, `jsonschema`) for reference only. All scripts work with Python 3.8+ standard library. Users should install optional deps in a virtualenv if desired — the skill never runs `pip install`.
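A common way to get that behavior is a try-import with a stdlib fallback. This is a generic sketch of the pattern, assuming `feedparser` as the optional dependency; it is not the skill's actual parsing code:

```python
try:
    import feedparser                  # optional, more forgiving RSS parser
    HAVE_FEEDPARSER = True
except ImportError:
    HAVE_FEEDPARSER = False
import xml.etree.ElementTree as ET     # stdlib fallback

def parse_titles(rss_bytes: bytes):
    """Return item titles from an RSS document."""
    if HAVE_FEEDPARSER:
        return [e.title for e in feedparser.parse(rss_bytes).entries]
    root = ET.fromstring(rss_bytes)
    # First <title> is the channel's own; the rest belong to items.
    return [t.text or "" for t in root.iter("title")][1:]
```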
### Input Sanitization
### Network Access
Scripts make outbound HTTP requests to configured RSS feeds, Twitter API, GitHub API, Reddit JSON API, Brave Search API, and Tavily Search API. No inbound connections or listeners are created.