HomeBrowseUpload
← Back to registry
// Skill profile

AI Video Remix Skill

name: ai-video-remix

by abu-shotai · published 2026-04-01

API集成
Total installs
0
Stars
★ 0
Last updated
2026-04
// Install command
$ claw add gh:abu-shotai/abu-shotai-ai-video-remix
View on GitHub
// Full documentation

---

name: ai-video-remix

description: AI-driven video remix generator that uses ShotAI semantic search + LLM planning + Remotion rendering to produce styled video compositions from a user's local video library. Use when the user asks to create a video remix, highlight reel, travel vlog, sports highlight, nature montage, or any styled video cut from their library. Triggers on requests like "帮我做一个混剪", "make a travel vlog from my library", "create a sports highlight", or "generate a video with my footage". Requires ShotAI (local MCP server) to be running. Works with any OpenAI-compatible LLM API or falls back to heuristic mode with no API key.

---

# AI Video Remix Skill

Generate styled video compositions from a local ShotAI video library using natural language.

Prerequisites

See [references/setup.md](references/setup.md) for full installation instructions, including:

  • ShotAI download and setup
  • ffmpeg installation
  • yt-dlp installation (for auto music)
  • Node.js dependencies
  • Quick Start

    cd /path/to/ai-video-editor
    cp .env.example .env    # fill in SHOTAI_URL, SHOTAI_TOKEN, and optionally AGENT_PROVIDER
    npm install
    npx tsx src/skill/cli.ts "帮我做一个旅行混剪"

    Pipeline (8 steps)

    1. **Agent: parseIntent** — LLM extracts theme, selects composition, optionally overrides music style

    2. **Agent: refineQueries** — LLM rewrites per-slot search terms to match library content

    3. **ShotAI: pickShots** — Semantic search per slot, scored by similarity+duration+mood, best shot selected

    4. **Music: resolveMusic** — yt-dlp YouTube search+download, or local MP3 if `--bgm` provided

    5. **ffmpeg: extractClip** — Each shot trimmed to independent `.mp4` clip file

    6. **Agent: annotateClips** — LLM assigns per-clip visual effect params (tone, dramatic, kenBurns, caption)

    7. **File Server** — HTTP server serves clips to Remotion renderer

    8. **Remotion: render** — Composition rendered to final MP4

    CLI Usage

    npx tsx src/skill/cli.ts "<request>" [options]
    
    Options:
      --composition <id>   Override composition (skip LLM selection)
      --bgm <path>         Local MP3 path (skip YouTube search)
      --output <dir>       Output directory (default: ./output)
      --lang <zh|en>       Output language: zh Chinese (default) / en English
                           Affects: video title, per-clip captions & location labels, attribution line
      --probe              Scan library first, let LLM plan slots from actual content

    Compositions

    | ID | Label | Best For |

    |----|-------|----------|

    | `CyberpunkCity` | 赛博朋克夜景 | Neon city, night scenes, sci-fi |

    | `TravelVlog` | 旅行 Vlog | Multi-city travel with location cards |

    | `MoodDriven` | 情绪驱动混剪 | Fast/slow emotion cuts |

    | `NatureWild` | 自然野生动物 | BBC nature documentary style |

    | `SwitzerlandScenic` | 瑞士风光 | Alpine/scenic travel with captions |

    | `SportsHighlight` | 体育集锦 | ESPN-style with goal captions |

    Modes

    **Standard mode** (default): LLM picks composition + generates search queries from registry templates.

    **Probe mode** (`--probe`): Scans library videos first (names, shot samples, mood/scene tags), then LLM generates custom slots tailored to what actually exists.

    Choose probe mode when: library content is unknown, user wants "best of my library", or standard slots return low-quality shots.

    Environment Variables

    See [references/config.md](references/config.md) for all environment variables and LLM provider setup.

    Troubleshooting & Quality Tuning

    See [references/tuning.md](references/tuning.md) for solutions to:

  • Clip boundary flicker / 1–2 frame flash at cuts
  • Red flash artifact in CyberpunkCity (GlitchFlicker on short clips)
  • Low-quality or off-topic shots
  • Music download failures
  • **Recommended `.env` defaults for best quality:**

    MIN_SCORE=0.5    # filter short/low-quality shots

    Writing ShotAI Search Queries

    ShotAI uses semantic search powered by AI-generated tags and embedding vectors. Query quality is the single biggest factor in shot relevance — invest time here.

    Query construction rules

    **Always write full sentences or rich phrases, never bare keywords.**

    The search engine understands semantic similarity (`"ocean"` matches `"sea"`, `"waves"`, `"shoreline"`), so richer context produces better recall.

    | Quality | Example | When to use |

    |---------|---------|-------------|

    | ⭐ Detailed description | `"A white seagull with spread wings gliding smoothly over calm blue ocean water, golden sunset light reflecting on the waves"` | Best precision — use for hero shots |

    | ⭐ Full sentence | `"A seagull flying gracefully over the ocean at sunset"` | Good balance of precision and recall |

    | Short phrase | `"seagull flying over ocean"` | Acceptable fallback |

    | Single keyword | `"seagull"` | Avoid — low precision, noisy results |

    What to include in a query

    Describe the **visual content** of the ideal shot across these dimensions:

  • **Subject**: what/who is in frame (`a lone hiker`, `city traffic at night`, `athlete celebrating`)
  • **Action**: what is happening (`walking slowly through fog`, `speeding through intersection`, `jumping with arms raised`)
  • **Environment**: location, setting, time of day (`rain-soaked Tokyo street`, `mountain meadow at golden hour`, `empty stadium under floodlights`)
  • **Mood / atmosphere**: emotional tone (`melancholic`, `tense`, `euphoric`, `serene`)
  • **Camera feel**: implied movement or framing (`wide establishing shot`, `tight close-up`, `slow pan`, `handheld shaky`)
  • Not all dimensions are needed every time — include whichever are most distinctive for the shot you want.

    The refineQueries step

    When the agent runs `refineQueries`, it rewrites the composition's default slot queries to better match the user's actual library. Apply these principles:

    1. **Start from the slot's semantic intent** — what emotional or narrative role does this shot play in the composition?

    2. **Incorporate any context from the user's request** — location names, event names, specific subjects mentioned

    3. **Expand synonyms** — if the slot says `"water"`, try `"river flowing through forest"` or `"lake reflecting mountains"` based on what the library likely contains

    4. **Avoid negations** — `"not indoors"` does not work; instead describe the positive version (`"outdoor daylight scene"`)

    5. **One query per slot** — make it specific rather than trying to cover multiple scenarios

    Examples: slot query → refined query

    Slot default: "city at night"
    User request: "帮我做一个东京旅行混剪"
    Refined:      "Neon-lit Tokyo street at night, pedestrians crossing under glowing signs, rain reflections on pavement"
    
    Slot default: "nature landscape"
    User request: "trip to Patagonia last month"
    Refined:      "Dramatic Patagonia mountain landscape, snow-capped peaks under stormy clouds, vast open wilderness"
    
    Slot default: "athlete in action"
    User request: "basketball highlight from last game"
    Refined:      "Basketball player driving to the hoop, explosive movement, crowd in background blurred"

    Adding a New Composition

    See [references/composition-guide.md](references/composition-guide.md) to add a new Remotion composition to the registry.

    Safety and Fallback

  • If `SHOTAI_URL` or `SHOTAI_TOKEN` is unset, display a warning: "ShotAI MCP server is not configured. Set `SHOTAI_URL` and `SHOTAI_TOKEN` in your `.env` file. Download ShotAI at https://www.shotai.io."
  • If the ShotAI MCP server returns an error (connection refused, HTTP 4xx/5xx), display the error message and stop — do not fabricate shot results.
  • Never fabricate video file paths, shot timestamps, or similarity scores.
  • If music download fails (yt-dlp error or network unreachable), suggest using `--bgm <local.mp3>` to provide a local audio file instead.
  • If Remotion render fails, display the error output and suggest checking Node.js version (18+) and that all clip files were extracted successfully.
  • If the LLM provider is unreachable, fall back to heuristic mode: use composition default queries directly without refinement, and skip `annotateClips` (use composition default effect params).
  • License

    MIT-0 — Free to use, modify, and redistribute. No attribution required.

    See https://spdx.org/licenses/MIT-0.html

    // Comments
    Sign in with GitHub to leave a comment.
    // Related skills

    More tools from the same signal band