ai-media - AI Media Generation
Full-stack AI media generation powered by a GPU server (RTX 3090/3080/2070S).
by bowen31337 · published 2026-03-22
$ claw add gh:bowen31337/bowen31337-ai-media
Capabilities
1. **Image Generation** — Photorealistic images via ComfyUI (z-image, Juggernaut XL)
2. **Video Generation** — Video synthesis via ComfyUI (AnimateDiff, LTX-2)
3. **Talking Heads** — Animated talking faces via SadTalker
4. **Voice Synthesis** — Natural TTS via Voxtral (whisper.cpp)
GPU Server
Usage
Generate Image
./scripts/image.sh "lady on beach at sunset" realistic
./scripts/image.sh "cyberpunk cityscape" artistic
**Arguments:**
**Output:** Path to generated image (e.g., `/data/ai-stack/output/image_001.png`)
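When calling the script from another script, it helps to capture the reported path. The following is a minimal sketch, assuming `image.sh` prints the output path as the last line of stdout and exits non-zero on failure; this README confirms neither. `generate_image` and `IMAGE_SCRIPT` are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
# Sketch: wrap image.sh and capture the path it reports.
# Assumption: the script prints the output path as its last stdout line.

IMAGE_SCRIPT="${IMAGE_SCRIPT:-./scripts/image.sh}"  # hypothetical override hook

generate_image() {
  local prompt="$1" style="${2:-realistic}" out
  # Run the script and propagate any failure to the caller.
  out="$("$IMAGE_SCRIPT" "$prompt" "$style")" || return 1
  # Keep only the last line, assumed to be the image path.
  printf '%s\n' "$out" | tail -n 1
}
```

A caller could then write `image_path="$(generate_image "lady on beach at sunset" realistic)"` and check the exit status before using the path.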
Generate Video
./scripts/video.sh "waves crashing on shore" animatediff 4
./scripts/video.sh "city traffic timelapse" ltx2 8
**Arguments:**
**Output:** Path to generated video (e.g., `/data/ai-stack/output/video_001.mp4`)
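Because a video run can be expensive, a caller may want to reject a mistyped model name before starting. This sketch assumes `animatediff` and `ltx2`, the two names used in the examples above, are the only accepted values; the README does not list the full set. `generate_video`, `is_known_video_model`, and `VIDEO_SCRIPT` are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: reject unknown model names before calling video.sh.
# Assumption: animatediff and ltx2 are the only accepted model names.

VIDEO_SCRIPT="${VIDEO_SCRIPT:-./scripts/video.sh}"  # hypothetical override hook

is_known_video_model() {
  case "$1" in
    animatediff|ltx2) return 0 ;;
    *) return 1 ;;
  esac
}

generate_video() {
  local prompt="$1" model="$2" length="$3"
  if ! is_known_video_model "$model"; then
    echo "unknown video model: $model" >&2
    return 2
  fi
  # The third argument is passed through unchanged; its unit is not documented here.
  "$VIDEO_SCRIPT" "$prompt" "$model" "$length"
}
```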
Generate Talking Head
./scripts/talking-head.sh "Hello, I'm Agent" gentle input.jpg
./scripts/talking-head.sh "Welcome to the future" neutral photo.png
**Arguments:**
**Output:** Path to talking head video (e.g., `/data/ai-stack/output/talking_001.mp4`)
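The talking-head examples take a local image file as the third argument, so a pre-flight check can catch a wrong path before any GPU work begins. This is a sketch under that assumption; `generate_talking_head` and `TALKING_HEAD_SCRIPT` are hypothetical names, and the second argument (`gentle` or `neutral` in the examples) is passed through without interpretation.

```shell
#!/usr/bin/env bash
# Sketch: verify the input image is readable before calling talking-head.sh.

TALKING_HEAD_SCRIPT="${TALKING_HEAD_SCRIPT:-./scripts/talking-head.sh}"  # hypothetical override hook

generate_talking_head() {
  local text="$1" style="$2" image="$3"
  if [ ! -r "$image" ]; then
    echo "input image not readable: $image" >&2
    return 2
  fi
  # Arguments are forwarded in the same order as the README examples.
  "$TALKING_HEAD_SCRIPT" "$text" "$style" "$image"
}
```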
Generate Audio
./scripts/audio.sh "This is a test message" en male
./scripts/audio.sh "Bonjour le monde" fr female
**Arguments:**
**Output:** Path to audio file (e.g., `/data/ai-stack/output/audio_001.wav`)
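For several lines of speech, the script can be driven from a simple list. The sketch below reads `lang|voice|text` records from stdin, a hypothetical input format chosen for illustration, and calls the script once per record. It assumes `audio.sh` exits non-zero on failure; `batch_audio` and `AUDIO_SCRIPT` are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: run audio.sh once per "lang|voice|text" line read from stdin.
# Assumption: audio.sh signals failure through a non-zero exit status.

AUDIO_SCRIPT="${AUDIO_SCRIPT:-./scripts/audio.sh}"  # hypothetical override hook

batch_audio() {
  local lang voice text failures=0
  while IFS='|' read -r lang voice text || [ -n "$lang" ]; do
    [ -n "$text" ] || continue   # skip blank or malformed lines
    # Redirect stdin so the script cannot consume the remaining records.
    "$AUDIO_SCRIPT" "$text" "$lang" "$voice" </dev/null || failures=$((failures + 1))
  done
  # Succeed only if every record was processed without error.
  [ "$failures" -eq 0 ]
}
```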
Models Available
Image Models
Video Models
Talking Head Models
Voice Models
Dependencies
All dependencies are pre-installed on the GPU server:
Error Handling
Scripts will:
Performance
Future Enhancements
---
**Status:** Active development
**Maintainer:** Agent
**GPU Server:** ${GPU_USER}@${GPU_HOST}
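
The server address above is built from two environment variables, `GPU_USER` and `GPU_HOST`. A small guard can fail early when either is unset. This is a sketch only: `gpu_target` is a hypothetical helper, and how the scripts actually connect to the server is not stated in this README.

```shell
#!/usr/bin/env bash
# Sketch: build the user@host target, failing early if a variable is missing.

gpu_target() {
  if [ -z "${GPU_USER:-}" ] || [ -z "${GPU_HOST:-}" ]; then
    echo "GPU_USER and GPU_HOST must both be set" >&2
    return 1
  fi
  printf '%s@%s\n' "$GPU_USER" "$GPU_HOST"
}
```

If the server is reached over SSH (an assumption), a quick check might be `ssh "$(gpu_target)" nvidia-smi`.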