
Nightingale Karaoke Skill

name: nightingale-karaoke

by adisinghstudent · published 2026-04-01

Tags: Development Tools, Data Processing
Last updated: 2026-04
// Install command
$ claw add gh:adisinghstudent/adisinghstudent-nightingale-karaoke
// Full documentation

---
name: nightingale-karaoke
description: ML-powered Karaoke app in Rust using Bevy, WhisperX, and Demucs for stem separation, lyrics transcription, and pitch scoring.
triggers:
- "nightingale karaoke"
- "add karaoke to my music library"
- "build karaoke app with rust"
- "stem separation with demucs whisper"
- "nightingale bevy karaoke scoring"
- "ML karaoke from audio files"
- "configure nightingale karaoke profiles"
- "troubleshoot nightingale setup"
---

# Nightingale Karaoke Skill

> Skill by [ara.so](https://ara.so) — Daily 2026 Skills collection.

Nightingale is a self-contained, ML-powered karaoke application written in Rust (Bevy engine). It scans a local music folder, separates vocals from instrumentals (UVR Karaoke model or Demucs), transcribes lyrics with word-level timestamps (WhisperX), and plays back with synchronized highlighting, real-time pitch scoring, player profiles, and GPU shader / video backgrounds. Everything — ffmpeg, Python, PyTorch, ML models — is bootstrapped automatically on first launch.

---

## Installation

### Pre-built Binary (Recommended)

Download the latest release from the [Releases page](https://github.com/rzru/nightingale/releases) for your platform and run it.

**macOS only:** remove the quarantine attribute after extracting:

```bash
xattr -cr Nightingale.app
```

### Build from Source

**Prerequisites:**

- Rust 1.85+ (edition 2024)
- Linux additionally needs: `libasound2-dev libudev-dev libwayland-dev libxkbcommon-dev`

```bash
git clone https://github.com/rzru/nightingale
cd nightingale

# Optimized build (release profile)
cargo build --release

# Run directly
./target/release/nightingale
```

### Release Packaging

```bash
# Linux / macOS
scripts/make-release.sh

# Windows (PowerShell)
powershell -ExecutionPolicy Bypass -File scripts/make-release.ps1
```

Outputs a `.tar.gz` (Linux/macOS) or `.zip` (Windows) ready for distribution.

---

## First Launch / Bootstrap

On first run, Nightingale downloads and configures:

- `ffmpeg` binary
- `uv` (Python package manager)
- Python 3.10 via uv
- PyTorch + WhisperX + audio-separator in a virtual environment
- UVR Karaoke ONNX model and WhisperX `large-v3` model

This takes **2–10 minutes** depending on network speed. A progress screen is shown in-app.

To force re-bootstrap at any time:

```bash
./nightingale --setup
```

Bootstrap completion is marked by `~/.nightingale/vendor/.ready`.
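
A wrapper script or launcher could check this marker before starting the app. The sketch below is not taken from the Nightingale source: `ready_marker` and `is_bootstrapped` are hypothetical helper names, and the home directory is passed in as an argument so the snippet needs only the standard library.

```rust
use std::path::{Path, PathBuf};

/// Build the marker path documented above: <home>/.nightingale/vendor/.ready
/// (hypothetical helper, not part of Nightingale).
pub fn ready_marker(home: &Path) -> PathBuf {
    home.join(".nightingale").join("vendor").join(".ready")
}

/// Returns true when the marker exists, which this document describes as
/// the sign that bootstrap has completed.
pub fn is_bootstrapped(home: &Path) -> bool {
    ready_marker(home).exists()
}
```

If `is_bootstrapped` returns `false`, running `./nightingale --setup` is the documented way to trigger the bootstrap again.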

---

## CLI Flags

| Flag | Description |
|---|---|
| `--setup` | Force re-run of the first-launch bootstrap (re-downloads vendor deps) |

---

## Keyboard & Gamepad Controls

### Navigation

| Action | Keyboard | Gamepad |
|---|---|---|
| Move | Arrow keys | D-pad / Left stick |
| Confirm | Enter | A (South) |
| Back | Escape | B (East) / Start |
| Switch panel | Tab | — |
| Search | Type to filter | — |

### Playback

| Action | Keyboard | Gamepad |
|---|---|---|
| Pause / Resume | Space | Start |
| Exit to menu | Escape | B (East) |
| Toggle guide vocals | G | — |
| Guide volume up/down | + / - | — |
| Cycle background | T | — |
| Cycle video flavor | F | — |
| Toggle microphone | M | — |
| Next microphone | N | — |
| Toggle fullscreen | F11 | — |

---

## Configuration

### Main Config

Located at `~/.nightingale/config.json`. Edit directly or via in-app settings.

```json
{
  "music_folder": "/home/user/Music",
  "separator": "uvr",
  "guide_vocal_volume": 0.3,
  "background_theme": "plasma",
  "video_flavor": "nature",
  "default_profile": "Alice"
}
```

**`separator` options:** `"uvr"` (default, preserves backing vocals) | `"demucs"`

**`background_theme` options:** `"plasma"`, `"aurora"`, `"waves"`, `"nebula"`, `"starfield"`, `"video"`, `"source_video"`

**`video_flavor` options:** `"nature"`, `"underwater"`, `"space"`, `"city"`, `"countryside"`
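
When editing `config.json` by hand, it can help to check option values against the lists above before launching. The following is a minimal sketch of such a check, not Nightingale's own validation logic; `check_option` and the three constants are hypothetical names, and the allowed values are copied from this document.

```rust
/// Allowed values for each option, copied from the lists in this document.
pub const SEPARATORS: [&str; 2] = ["uvr", "demucs"];
pub const BACKGROUND_THEMES: [&str; 7] = [
    "plasma", "aurora", "waves", "nebula", "starfield", "video", "source_video",
];
pub const VIDEO_FLAVORS: [&str; 5] = ["nature", "underwater", "space", "city", "countryside"];

/// Check one key/value pair (hypothetical helper).
/// Keys other than the three enumerated options are reported as unknown.
pub fn check_option(key: &str, value: &str) -> Result<(), String> {
    let allowed: &[&str] = match key {
        "separator" => &SEPARATORS,
        "background_theme" => &BACKGROUND_THEMES,
        "video_flavor" => &VIDEO_FLAVORS,
        other => return Err(format!("unknown option: {other}")),
    };
    if allowed.contains(&value) {
        Ok(())
    } else {
        Err(format!("invalid {key}: {value:?} (expected one of {allowed:?})"))
    }
}
```

For example, `check_option("separator", "demucs")` returns `Ok(())`, while a misspelled theme name returns an `Err` listing the accepted values.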

### Profiles

Located at `~/.nightingale/profiles.json`:

```json
{
  "profiles": [
    {
      "name": "Alice",
      "scores": {
        "blake3_hash_of_song": {
          "stars": 4,
          "score": 87250,
          "played_at": "2026-03-18T21:00:00Z"
        }
      }
    }
  ]
}
```

### Pixabay Video Backgrounds (Dev)

The API key is embedded in release builds. For local development, create a `.env` file at the project root:

```bash
# .env
PIXABAY_API_KEY=<your-pixabay-api-key>
```

The release script (`make-release.sh`) sources `.env` automatically.

---

## Data Storage Layout

```text
~/.nightingale/
├── cache/              # Per-song stems, transcripts, lyrics (keyed by blake3 hash)
├── config.json         # App settings
├── profiles.json       # Player profiles and per-song scores
├── videos/             # Pre-downloaded Pixabay video backgrounds
├── sounds/             # Sound effects
├── vendor/
│   ├── ffmpeg          # ffmpeg binary
│   ├── uv              # uv binary
│   ├── python/         # Python 3.10
│   ├── venv/           # ML virtualenv (WhisperX, Demucs, audio-separator)
│   ├── analyzer/       # Python analyzer scripts
│   └── .ready          # Bootstrap completion marker
└── models/
    ├── torch/          # Demucs model weights
    ├── huggingface/    # WhisperX large-v3 weights
    └── audio_separator/ # UVR Karaoke ONNX model
```

Cache keys are **blake3 hashes** of the source file. Re-analysis only triggers if the file changes or its cache entry is manually invalidated.
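
Because a cache entry is addressed by a hash string, a tool that deletes entries may want to confirm the string really looks like a BLAKE3 digest first. The sketch below uses only the standard library; `is_blake3_hex` and `song_cache_dir` are hypothetical helpers, and the one-directory-per-hash layout under `cache/` is inferred from the invalidation examples elsewhere in this document.

```rust
use std::path::{Path, PathBuf};

/// A default-length BLAKE3 digest is 32 bytes, i.e. 64 hex characters.
pub fn is_blake3_hex(s: &str) -> bool {
    s.len() == 64 && s.bytes().all(|b| b.is_ascii_hexdigit())
}

/// Resolve <home>/.nightingale/cache/<hash> (hypothetical helper).
/// Returns None for anything that is not a plain hex digest, so that a
/// stray value such as "../vendor" can never become part of the path.
pub fn song_cache_dir(home: &Path, hash: &str) -> Option<PathBuf> {
    if is_blake3_hex(hash) {
        Some(home.join(".nightingale").join("cache").join(hash))
    } else {
        None
    }
}
```

Rejecting non-hex input up front matters mainly when the resulting path is handed to a recursive delete.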

---

## Supported File Formats

**Audio:** `.mp3`, `.flac`, `.ogg`, `.wav`, `.m4a`, `.aac`, `.wma`

**Video:** `.mp4`, `.mkv`, `.avi`, `.webm`, `.mov`, `.m4v`

For video files, the audio track is extracted and its vocals are separated, and the original video automatically plays as the background.
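
A library scanner needs to decide which of these two paths a file takes. The following sketch classifies a path by extension using the lists above; `MediaKind` and `classify` are hypothetical names, and the case-insensitive comparison is an assumption rather than documented Nightingale behavior.

```rust
use std::path::Path;

/// Which processing path a file would take (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MediaKind {
    Audio,
    Video,
}

// Extensions copied from the lists in this document, without the leading dot.
const AUDIO_EXTS: [&str; 7] = ["mp3", "flac", "ogg", "wav", "m4a", "aac", "wma"];
const VIDEO_EXTS: [&str; 6] = ["mp4", "mkv", "avi", "webm", "mov", "m4v"];

/// Classify a file by its extension; returns None for unsupported files
/// and for paths without an extension.
pub fn classify(path: &Path) -> Option<MediaKind> {
    // Lowercasing is an assumption so that "Song.MP3" is also accepted.
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    if AUDIO_EXTS.contains(&ext.as_str()) {
        Some(MediaKind::Audio)
    } else if VIDEO_EXTS.contains(&ext.as_str()) {
        Some(MediaKind::Video)
    } else {
        None
    }
}
```

Extension checks are only a first filter; a file with a supported extension can still fail to decode.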

---

## Hardware Acceleration

PyTorch backend is auto-detected:

| Backend | Device | Notes |
|---|---|---|
| CUDA | NVIDIA GPU | Fastest; ~2–5 min/song |
| MPS | Apple Silicon | macOS; WhisperX alignment falls back to CPU |
| CPU | Any | Always works; ~10–20 min/song |

The UVR Karaoke model uses ONNX Runtime with CUDA (NVIDIA) or CoreML (Apple Silicon) automatically.

---

## Processing Pipeline

```text
Audio/Video file
       │
       ▼
 UVR Karaoke (ONNX) or Demucs (PyTorch)
       │  vocals.ogg + instrumental.ogg
       ▼
 LRCLIB API  ──▶  Synced lyrics fetch (if available)
       │
       ▼
 WhisperX large-v3  ──▶  Transcription + word-level timestamps
       │
       ▼
 Bevy App (Rust)
   - Plays instrumental audio
   - Synchronized word highlighting
   - Real-time pitch detection & scoring
   - GPU shader / video backgrounds
   - Scoreboards per profile
```

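---

## Code Patterns

### Adding a New Background Theme (Bevy System)

```rust
// In your Bevy plugin, register a new background variant.
// `AppState` is assumed to be the app's existing state enum with a `Playing`
// variant; import it from wherever it is defined in your crate.
use bevy::prelude::*;

/// Marker component for entities that belong to the custom background.
#[derive(Component)]
pub struct MyCustomBackground;

/// Spawns the background entity when playback starts.
pub fn spawn_custom_background(mut commands: Commands) {
    commands.spawn((
        MyCustomBackground,
        // ... your background components
    ));
}

pub struct CustomBackgroundPlugin;

impl Plugin for CustomBackgroundPlugin {
    fn build(&self, app: &mut App) {
        // Run the spawn system each time the app enters the Playing state.
        app.add_systems(OnEnter(AppState::Playing), spawn_custom_background);
    }
}
```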

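### Extending Config Deserialization

```rust
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct NightingaleConfig {
    pub music_folder: String,
    #[serde(default = "default_separator")]
    pub separator: StemSeparator,
    #[serde(default = "default_guide_volume")]
    pub guide_vocal_volume: f32,
}

// `unwrap_or_default()` below requires `Default`; implement it by hand so the
// fallback config uses the same defaults as the serde attributes.
impl Default for NightingaleConfig {
    fn default() -> Self {
        Self {
            music_folder: String::new(),
            separator: default_separator(),
            guide_vocal_volume: default_guide_volume(),
        }
    }
}

#[derive(Debug, Clone, Serialize, Deserialize, Default)]
#[serde(rename_all = "lowercase")]
pub enum StemSeparator {
    #[default]
    Uvr,
    Demucs,
}

fn default_guide_volume() -> f32 { 0.3 }
fn default_separator() -> StemSeparator { StemSeparator::Uvr }

// Load config; a missing or unparsable file falls back to the defaults.
fn load_config() -> NightingaleConfig {
    let path = dirs::home_dir()
        .expect("Cannot determine home directory")
        .join(".nightingale/config.json");
    let raw = std::fs::read_to_string(&path).unwrap_or_default();
    serde_json::from_str(&raw).unwrap_or_default()
}
```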

### Triggering Re-analysis Programmatically

```rust
use std::fs;

/// Remove cached stems/transcript for a song to force re-analysis.
/// Does nothing if the song has no cache directory.
fn invalidate_song_cache(song_hash: &str) {
    let cache_dir = dirs::home_dir()
        .expect("Cannot determine home directory")
        .join(".nightingale/cache")
        .join(song_hash);

    if cache_dir.exists() {
        fs::remove_dir_all(&cache_dir)
            .expect("Failed to remove cache directory");
        println!("Cache invalidated for {}", song_hash);
    }
}
```

### Computing a Song's Blake3 Hash (for Cache Lookup)

```rust
use blake3::Hasher;
use std::fs::File;
use std::io::{BufReader, Read};

/// Hash a file in 64 KiB chunks and return the digest as a hex string.
fn hash_file(path: &std::path::Path) -> String {
    let file = File::open(path).expect("Cannot open file");
    let mut reader = BufReader::new(file);
    let mut hasher = Hasher::new();
    let mut buf = [0u8; 65536];
    loop {
        let n = reader.read(&mut buf).unwrap();
        if n == 0 { break; }
        hasher.update(&buf[..n]);
    }
    hasher.finalize().to_hex().to_string()
}
```

### Profile Score Update Pattern

```rust
use serde::{Deserialize, Serialize};
use std::collections::HashMap;

#[derive(Debug, Serialize, Deserialize)]
pub struct SongScore {
    pub stars: u8,
    pub score: u32,
    pub played_at: String,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct Profile {
    pub name: String,
    pub scores: HashMap<String, SongScore>, // key = blake3 hash
}

fn update_score(profile: &mut Profile, song_hash: &str, stars: u8, score: u32) {
    profile.scores.insert(song_hash.to_string(), SongScore {
        stars,
        score,
        played_at: chrono::Utc::now().to_rfc3339(),
    });
}
```
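
The pattern above overwrites any earlier result for the same song. This document does not say whether Nightingale keeps the latest or the best score, so the variant below is only a sketch of a "keep the best" policy. `record_if_best` is a hypothetical function, and `SongScore` is redeclared here without the serde derives so the snippet compiles with the standard library alone.

```rust
use std::collections::HashMap;

/// Same fields as the SongScore shown in this document, minus serde derives.
#[derive(Debug, Clone, PartialEq)]
pub struct SongScore {
    pub stars: u8,
    pub score: u32,
    pub played_at: String,
}

/// Store `new` only if there is no earlier entry for the song or if it has a
/// strictly higher score. Returns true when the map was updated.
/// (Hypothetical policy; not confirmed as Nightingale's behavior.)
pub fn record_if_best(
    scores: &mut HashMap<String, SongScore>,
    song_hash: &str,
    new: SongScore,
) -> bool {
    let beaten = scores
        .get(song_hash)
        .map_or(true, |old| new.score > old.score);
    if beaten {
        scores.insert(song_hash.to_string(), new);
    }
    beaten
}
```

Ties keep the earlier entry, which preserves the original `played_at` timestamp for an equal score.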

---

## Troubleshooting

### Bootstrap Fails / Stuck on Setup Screen

```bash
# Force re-bootstrap
./nightingale --setup

# Or manually remove the vendor directory and restart
rm -rf ~/.nightingale/vendor
./nightingale
```

### Song Analysis Hangs or Errors

```bash
# Check the analyzer venv is healthy
~/.nightingale/vendor/venv/bin/python -c "import whisperx; print('ok')"

# Re-bootstrap if broken
./nightingale --setup
```

### macOS "App is damaged" Error

```bash
xattr -cr Nightingale.app
```

### GPU Not Being Used

- **NVIDIA:** Ensure CUDA drivers are installed and `nvidia-smi` shows your GPU.
- **Apple Silicon:** MPS is used automatically on macOS with Apple Silicon; WhisperX alignment falls back to CPU (normal behavior).
- Check `~/.nightingale/vendor/venv`: if PyTorch installed the CPU-only build, re-bootstrap after installing CUDA drivers.

### Cache Corruption / Wrong Lyrics

```bash
# Find the blake3 hash of your file (build a small tool or use b3sum)
b3sum /path/to/song.mp3

# Remove that song's cache
rm -rf ~/.nightingale/cache/<hash>
```

Then re-open the song in Nightingale to re-analyze.

### Audio Playback Issues (Linux)

Ensure ALSA, PulseAudio, or PipeWire is running. Install any missing dependencies (Debian/Ubuntu):

```bash
sudo apt install libasound2-dev libudev-dev libwayland-dev libxkbcommon-dev
```

### Video Backgrounds Not Loading

Video backgrounds are pre-downloaded during setup via the Pixabay API. For development builds, ensure `.env` contains a valid `PIXABAY_API_KEY`. If videos are missing in a release build, run `./nightingale --setup` to re-trigger the download.

---

## Platform Targets

| Platform | Target Triple |
|---|---|
| Linux x86_64 | `x86_64-unknown-linux-gnu` |
| Linux aarch64 | `aarch64-unknown-linux-gnu` |
| macOS ARM | `aarch64-apple-darwin` |
| macOS Intel | `x86_64-apple-darwin` |
| Windows x86_64 | `x86_64-pc-windows-msvc` |

Cross-compile with:

```bash
rustup target add aarch64-unknown-linux-gnu
cargo build --release --target aarch64-unknown-linux-gnu
```

---

## License

GPL-3.0-or-later. See [LICENSE](https://github.com/rzru/nightingale/blob/main/LICENSE).
