Video STT Skill
name: video-stt
by damiencronw · published 2026-03-22
$ claw add gh:damiencronw/damiencronw-video-stt
---
name: video-stt
description: "Extract audio from video URLs and transcribe using STT (Speech-to-Text). Supports local Whisper or cloud APIs. Use when: user provides a video URL and wants to know what is being said, transcribing YouTube videos, podcasts, or any video with audio."
metadata:
  {
    "openclaw": { "emoji": "🎬" },
    "version": "1.0.0",
  }
---
# Video STT Skill
Extract audio from a video URL and convert it to text (Speech-to-Text).
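
The flow described above has two stages: pull the audio track out of the video, then run speech-to-text on it. The following is a minimal Python sketch of that flow, not the skill's actual `stt.sh` implementation. It assumes `yt-dlp` and `ffmpeg` are available on the PATH and that the `openai-whisper` package is installed. The function names `build_download_command` and `transcribe_url` and the default model `base` are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path


def build_download_command(url: str, out_dir: str) -> list[str]:
    """Build a yt-dlp command that extracts the audio track as WAV."""
    return [
        "yt-dlp",
        "-x",                      # extract audio only
        "--audio-format", "wav",   # convert to WAV (yt-dlp calls ffmpeg)
        "-o", str(Path(out_dir) / "audio.%(ext)s"),
        url,
    ]


def transcribe_url(url: str, model_name: str = "base") -> str:
    """Download the audio for `url` and return the Whisper transcript text."""
    import whisper  # imported lazily so the helper above works without it

    with tempfile.TemporaryDirectory() as tmp:
        subprocess.run(build_download_command(url, tmp), check=True)
        audio_path = Path(tmp) / "audio.wav"
        model = whisper.load_model(model_name)
        result = model.transcribe(str(audio_path))
        return result["text"].strip()
```

Keeping the command construction separate from the download makes it possible to inspect or log the exact `yt-dlp` invocation before anything is fetched.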
## Requirements

## Quick Start

    # Change to the scripts directory
    cd ~/.openclaw/workspace/skills/video-stt/scripts
    # Run a transcription
    bash stt.sh "VIDEO_URL"

## Usage

    # Basic usage
    bash stt.sh "https://youtube.com/watch?v=xxx"
    # Specify an output file
    bash stt.sh "https://youtube.com/watch?v=xxx" -o output.txt
    # Use a local Whisper model
    bash stt.sh "https://youtube.com/watch?v=xxx" --local
    # Use a cloud API
    bash stt.sh "https://youtube.com/watch?v=xxx" --api openai

## Supported Models
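
### Local (free)

### Cloud API

## Output Formats

Plain text is the default output. Other formats are optional, for example `srt` via `--format srt`.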
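
SRT, which the examples select with `--format srt`, is a sequence of numbered cues, each with a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range followed by the cue text. The sketch below shows one way to render timed segments in that layout. It assumes segments shaped like the `segments` entries returned by `openai-whisper` (dicts with `start`, `end`, and `text`); the function names are hypothetical and this is not taken from the skill's scripts.

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render segments (dicts with start, end, text) as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

For example, `format_timestamp(3661.5)` yields `01:01:01,500`.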

## Environment Variables

    # OpenAI (if using the cloud API)
    export OPENAI_API_KEY="sk-xxx"
    # Or use SiliconFlow (cheaper)
    export SILICONFLOW_API_KEY="xxx"

## Examples
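
As a hedged illustration of how a script might consume the API keys exported above, the sketch below looks up the key for a chosen provider and fails with a clear message when it is missing. The provider name `openai` appears in the usage commands; treating `siliconflow` as a provider name, and the helper `resolve_api_key` itself, are assumptions rather than documented behavior of the skill.

```python
import os

# Hypothetical mapping from provider name to its environment variable.
# "siliconflow" as a value for --api is an assumption.
API_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "siliconflow": "SILICONFLOW_API_KEY",
}


def resolve_api_key(provider: str, env=None) -> str:
    """Return the API key for `provider`, or raise with a clear message."""
    env = os.environ if env is None else env
    try:
        var_name = API_KEY_VARS[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
    key = env.get(var_name, "")
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before using --api {provider}")
    return key
```

Passing `env` explicitly keeps the lookup easy to test without touching the real process environment.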

    # Transcribe a YouTube video
    bash stt.sh "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
    # Specify a model
    bash stt.sh "https://youtube.com/watch?v=xxx" --model medium
    # Save as SRT
    bash stt.sh "https://youtube.com/watch?v=xxx" --format srt

## Python Dependencies

Use uv to manage the Python environment:

    # Create a virtual environment
    uv venv
    # OpenAI Whisper is published on PyPI as openai-whisper, not whisper
    uv pip install yt-dlp openai-whisper ffmpeg-python
    # Run
    uv run python stt.py "VIDEO_URL"
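
The flags shown earlier for `stt.sh` (`-o`, `--local`, `--api`, `--model`, `--format`) could be mirrored in a Python entry point. The sketch below is one possible `argparse` layout. It is an assumption that `stt.py` accepts the same options; the long form `--output`, the `txt` default, and making `--local` and `--api` mutually exclusive are illustrative choices, not documented behavior.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the stt.sh flags (hypothetical for stt.py)."""
    parser = argparse.ArgumentParser(
        prog="stt.py",
        description="Transcribe the audio of a video URL.",
    )
    parser.add_argument("url", help="video URL to transcribe")
    parser.add_argument("-o", "--output", help="write the transcript to this file")

    # Assumption: a run uses either a local model or a cloud API, not both.
    backend = parser.add_mutually_exclusive_group()
    backend.add_argument("--local", action="store_true", help="use a local Whisper model")
    backend.add_argument("--api", metavar="PROVIDER", help="use a cloud API, e.g. openai")

    parser.add_argument("--model", help="Whisper model name, e.g. medium")
    parser.add_argument("--format", default="txt", help="output format, e.g. srt")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```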