Model Studio Qwen ASR (Non-Realtime)
name: alicloud-ai-audio-asr
by cinience · published 2026-03-22
$ claw add gh:cinience/cinience-alicloud-ai-audio-asr---
name: alicloud-ai-audio-asr
description: Transcribe non-realtime speech with Alibaba Cloud Model Studio Qwen ASR models (`qwen3-asr-flash`, `qwen-audio-asr`, `qwen3-asr-flash-filetrans`). Use when converting recorded audio files to text, generating transcripts with timestamps, or documenting DashScope/OpenAI-compatible ASR request and response fields.
version: 1.0.0
---
Category: provider
# Model Studio Qwen ASR (Non-Realtime)
Validation
mkdir -p output/alicloud-ai-audio-asr
python -m py_compile skills/ai/audio/alicloud-ai-audio-asr/scripts/transcribe_audio.py && echo "py_compile_ok" > output/alicloud-ai-audio-asr/validate.txtPass criteria: command exits 0 and `output/alicloud-ai-audio-asr/validate.txt` is generated.
Output And Evidence
Use Qwen ASR for recorded audio transcription (non-realtime), including short audio sync calls and long audio async jobs.
Critical model names
Use one of these exact model strings:
Selection guidance:
Prerequisites
python3 -m venv .venv
. .venv/bin/activateNormalized interface (asr.transcribe)
Request
Response
Quick start (official HTTP API)
Sync transcription (OpenAI-compatible protocol):
curl -sS --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"model": "qwen3-asr-flash",
"messages": [
{
"role": "user",
"content": [
{
"type": "input_audio",
"input_audio": {
"data": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
}
}
]
}
],
"stream": false,
"asr_options": {
"enable_itn": false
}
}'Async long-file transcription (DashScope protocol):
curl -sS --location 'https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'X-DashScope-Async: enable' \
--header 'Content-Type: application/json' \
--data '{
"model": "qwen3-asr-flash-filetrans",
"input": {
"file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
}
}'Poll task result:
curl -sS --location "https://dashscope.aliyuncs.com/api/v1/tasks/<task_id>" \
--header "Authorization: Bearer $DASHSCOPE_API_KEY"Local helper script
Use the bundled script for URL/local-file input and optional async polling:
python skills/ai/audio/alicloud-ai-audio-asr/scripts/transcribe_audio.py \
--audio "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" \
--model qwen3-asr-flash \
--language-hints zh,en \
--print-responseLong-file mode:
python skills/ai/audio/alicloud-ai-audio-asr/scripts/transcribe_audio.py \
--audio "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" \
--model qwen3-asr-flash-filetrans \
--async \
--waitOperational guidance
Output location
Workflow
1) Confirm user intent, region, identifiers, and whether the operation is read-only or mutating.
2) Run one minimal read-only query first to verify connectivity and permissions.
3) Execute the target operation with explicit parameters and bounded scope.
4) Verify results and save output/evidence files.
References
More tools from the same signal band
Order food/drinks (点餐) on an Android device paired as an OpenClaw node. Uses in-app menu and cart; add goods, view cart, submit order (demo, no real payment).
Sign plugins, rotate agent credentials without losing identity, and publicly attest to plugin behavior with verifiable claims and authenticated transfers.
The philosophical layer for AI agents. Maps behavior to Spinoza's 48 affects, calculates persistence scores, and generates geometric self-reports. Give your...