Audio Transcription with Sber Salute Speech
name: salute-speech
by chorus12 · published 2026-03-22
$ claw add gh:chorus12/chorus12-salute-speech

---
name: salute-speech
description: >
  Transcribe audio files using Sber Salute Speech async API.
  Russian-first STT with support for ru-RU, en-US, kk-KZ, ky-KG, uz-UZ.
metadata: { "openclaw": { "requires": { "bins": ["uv"], "env": ["SALUTE_AUTH_DATA"] }, "primaryEnv": "SALUTE_AUTH_DATA" } }
---
# Audio Transcription with Sber Salute Speech
Transcribe audio/video files to text with timestamps via Salute Speech async REST API.
## Requirements

- `uv` available on the host (the script is run via `uv run --with requests`).
- `SALUTE_AUTH_DATA` set in the host environment (Sber authorization data used to obtain an access token).
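The script exchanges `SALUTE_AUTH_DATA` for a short-lived access token before calling the API. A minimal sketch of that exchange, assuming the standard Salute OAuth endpoint and the `SALUTE_SPEECH_PERS` scope (both are assumptions; verify against Sber's current documentation):

```python
import os
import uuid

import requests

def get_access_token() -> str:
    # SALUTE_AUTH_DATA is the base64 client-credentials string from Sber.
    auth_data = os.environ["SALUTE_AUTH_DATA"]
    resp = requests.post(
        "https://ngw.devices.sberbank.ru:9443/api/v2/oauth",  # assumed endpoint
        headers={
            "Authorization": f"Basic {auth_data}",
            "RqUID": str(uuid.uuid4()),  # per-request identifier
            "Content-Type": "application/x-www-form-urlencoded",
        },
        data={"scope": "SALUTE_SPEECH_PERS"},  # assumed personal-tier scope
    )
    resp.raise_for_status()
    return resp.json()["access_token"]
```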
## Supported formats & encodings
| Audio encoding | Content-Type | Typical extensions |
|---------------|-------------|--------------------|
| `MP3` | `audio/mpeg` | `.mp3` |
| `PCM_S16LE` | `audio/wav` | `.wav` |
| `OPUS` | `audio/ogg` | `.ogg`, `.opus` |
| `FLAC` | `audio/flac` | `.flac` |
| `ALAW` | `audio/alaw` | `.alaw` |
| `MULAW` | `audio/mulaw` | `.mulaw` |
## Supported languages
`ru-RU`, `en-US`, `kk-KZ` (Kazakh), `ky-KG` (Kyrgyz), `uz-UZ` (Uzbek).
## Workflow
1. **Identify input files** — from the user request.
2. **Read credentials** — take `SALUTE_AUTH_DATA` from the host environment.
3. **Run transcription** — execute `salute_transcribe.py` with `uv` and the appropriate arguments (the underlying async flow is sketched after this list).
4. **Deliver results** — present a human-readable transcript with timestamps to the user and give a direct link to the output files.
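For orientation, step 3's async flow (upload, start task, poll, download) looks roughly like the sketch below. The SmartSpeech host, endpoint paths, and response field names are assumptions based on the public Salute Speech REST API, and `get_access_token` is the hypothetical helper from the Requirements section:

```python
import time

import requests

API = "https://smartspeech.sber.ru/rest/v1"  # assumed Salute Speech REST host

def transcribe_async(path: str, token: str, lang: str = "ru-RU",
                     encoding: str = "MP3", content_type: str = "audio/mpeg",
                     max_wait: int = 300) -> bytes:
    headers = {"Authorization": f"Bearer {token}"}

    # 1. Upload the audio file; the service returns a file id.
    with open(path, "rb") as f:
        up = requests.post(f"{API}/data:upload",
                           headers={**headers, "Content-Type": content_type},
                           data=f)
    up.raise_for_status()
    file_id = up.json()["result"]["request_file_id"]

    # 2. Start the async recognition task for that file.
    task = requests.post(f"{API}/speech:async_recognize", headers=headers, json={
        "options": {"language": lang, "audio_encoding": encoding},
        "request_file_id": file_id,
    })
    task.raise_for_status()
    task_id = task.json()["result"]["id"]

    # 3. Poll the task until it is DONE or the deadline passes.
    deadline = time.time() + max_wait
    while time.time() < deadline:
        st = requests.get(f"{API}/task:get", headers=headers,
                          params={"id": task_id})
        st.raise_for_status()
        result = st.json()["result"]
        if result["status"] == "DONE":
            # 4. Download the raw recognition JSON.
            dl = requests.get(f"{API}/data:download", headers=headers,
                              params={"response_file_id": result["response_file_id"]})
            dl.raise_for_status()
            return dl.content
        time.sleep(5)
    raise TimeoutError(f"task {task_id} not finished within {max_wait}s")
```

Polling against a deadline is what the `--max-wait-time` argument below controls.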
## Usage

```bash
uv run --with requests {baseDir}/salute_transcribe.py \
  --file /path/to/audio.mp3 \
  --output_dir ~/.openclaw/workspace/transcriptions \
  --lang ru-RU
```

## Arguments
| Argument | Required | Default | Description |
|----------|----------|---------|-------------|
| `--file` | **Yes** | — | Path to audio/video file |
| `--output_dir` | No | `~/.openclaw/workspace/transcriptions` | Output directory for results |
| `--lang` | No | `ru-RU` | Language code: `ru-RU`, `en-US`, `kk-KZ`, `ky-KG`, `uz-UZ` |
| `--audio-encoding` | No | `MP3` | Codec: `MP3`, `PCM_S16LE`, `OPUS`, `FLAC`, `ALAW`, `MULAW` |
| `--model` | No | `general` | Recognition model: `general` or `callcenter` |
| `--hyp-count` | No | `1` | Number of alternative hypotheses: `1` or `2` |
| `--max-wait-time` | No | `300` | Max seconds to wait for async result |
| `--print` | No | off | Also print transcription to stdout |
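For reference, the CLI surface above could be declared with `argparse` roughly as follows; names and defaults mirror the table, everything else is illustrative rather than the script's actual code:

```python
import argparse

parser = argparse.ArgumentParser(description="Transcribe audio via Salute Speech")
parser.add_argument("--file", required=True, help="Path to audio/video file")
parser.add_argument("--output_dir", default="~/.openclaw/workspace/transcriptions",
                    help="Output directory for results")
parser.add_argument("--lang", default="ru-RU",
                    choices=["ru-RU", "en-US", "kk-KZ", "ky-KG", "uz-UZ"])
parser.add_argument("--audio-encoding", default="MP3",
                    choices=["MP3", "PCM_S16LE", "OPUS", "FLAC", "ALAW", "MULAW"])
parser.add_argument("--model", default="general", choices=["general", "callcenter"])
parser.add_argument("--hyp-count", type=int, default=1, choices=[1, 2])
parser.add_argument("--max-wait-time", type=int, default=300,
                    help="Max seconds to wait for the async result")
# dest avoids shadowing the print builtin on the namespace
parser.add_argument("--print", action="store_true", dest="print_stdout",
                    help="Also print transcription to stdout")
args = parser.parse_args()
```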
## Content-Type mapping

The script's default `content_type` is `audio/mpeg` (MP3). When uploading other formats, use the Content-Type that matches the encoding in the table above (e.g. `audio/wav` for `.wav` files), either by adjusting the script or by adding extension-based detection.
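If you add that detection, one minimal sketch keyed off the formats table above (the dict and helper name are illustrative, not part of the script):

```python
from pathlib import Path

# Extension -> (audio encoding, Content-Type), per the formats table above.
ENCODINGS = {
    ".mp3": ("MP3", "audio/mpeg"),
    ".wav": ("PCM_S16LE", "audio/wav"),
    ".ogg": ("OPUS", "audio/ogg"),
    ".opus": ("OPUS", "audio/ogg"),
    ".flac": ("FLAC", "audio/flac"),
    ".alaw": ("ALAW", "audio/alaw"),
    ".mulaw": ("MULAW", "audio/mulaw"),
}

def guess_encoding(path: str) -> tuple[str, str]:
    ext = Path(path).suffix.lower()
    try:
        return ENCODINGS[ext]
    except KeyError:
        raise ValueError(f"unsupported extension: {ext}") from None
```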
## Output files
For input file `meetingABC.mp3` the script produces:
| File | Description |
|------|-------------|
| `meetingABC_recognition_orig.json` | Raw API response (full JSON with all hypotheses, timing, confidence) |
| `meetingABC_pretty.txt` | Formatted human-readable transcript with timestamps |
## Output text format

```text
[00:01 - 00:20]:
Ну, даже если сосредоточиться на идее узкой щели.
[00:20 - 00:45]:
Следующий фрагмент текста здесь.
```
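The pretty view can be derived from the raw JSON along these lines, assuming each chunk in `*_recognition_orig.json` carries `results` entries with `normalized_text` and `start`/`end` offsets formatted like `"12.48s"` (the field names are assumptions; check them against a real response):

```python
import json

def fmt(seconds: float) -> str:
    # Render seconds as MM:SS for the [MM:SS - MM:SS] header lines.
    m, s = divmod(int(seconds), 60)
    return f"{m:02d}:{s:02d}"

def pretty(raw_json: str) -> str:
    lines = []
    for chunk in json.loads(raw_json):
        for res in chunk.get("results", []):
            start = float(res["start"].rstrip("s"))
            end = float(res["end"].rstrip("s"))
            lines.append(f"[{fmt(start)} - {fmt(end)}]:")
            lines.append(res["normalized_text"])
    return "\n".join(lines)
```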