Yyds.Auto — Android RPA Skill for AI Agents

name: yyds-auto

by chenanzong · published 2026-04-01

Tags: Developer Tools, Image Generation
Total installs: 0 · Stars: 0 · Last updated: 2026-04
// Install command
$ claw add gh:chenanzong/chenanzong-yyds-auto
// Full documentation

---
name: yyds-auto
description: Control Android devices via MCP — tap, swipe, OCR, screenshot, UI automation, shell, file management, and AI agent orchestration for Android RPA.
version: 1.0.0
metadata:
  openclaw:
    requires:
      env:
        - YYDS_DEVICE_HOST
        - YYDS_DEVICE_PORT
      bins:
        - node
      anyBins:
        - adb
    primaryEnv: YYDS_DEVICE_HOST
    emoji: "\U0001F4F1"
    homepage: https://yydsauto.com
    os:
      - windows
      - macos
      - linux
    install:
      - kind: node
        package: yyds-auto-mcp
        bins: [yyds-auto-mcp]
---

# Yyds.Auto — Android RPA Skill for AI Agents

> Let LLMs directly control Android devices through the MCP protocol.

Yyds.Auto is a production-grade Android RPA (Robotic Process Automation) platform that exposes **60 MCP tools** covering the full spectrum of Android device automation — from pixel-level touch injection and OCR to UI hierarchy inspection, file management, and on-device AI agent orchestration.

## What Can It Do?

| Category | Tools | Capabilities |
|----------|-------|-------------|
| 📱 Device Info | 4 | Device model, screen size, IMEI, foreground app, network status |
| 👆 Touch & Input | 8 | Tap, swipe, long press, drag, text input, clipboard, key press |
| 📸 Screenshot | 2 | Screenshot as base64 image (LLM can see it directly), save to device |
| 🌲 UI Automation | 5 | UI hierarchy dump, find elements by attributes, element relations, wait & scroll |
| 🔍 OCR & Image | 8 | Screen OCR, tap-on-text, template matching, pixel color, image comparison |
| 💻 Shell | 1 | Execute shell commands with ROOT/SHELL privileges |
| 📦 App Management | 8 | Launch/stop apps, list installed, install/uninstall APK, open URL, toast |
| 📁 File Operations | 7 | List, read, write, delete, rename files and directories on device |
| 🐍 Script Projects | 5 | List/start/stop Python projects, execute Python code snippets |
| 📚 Pip Management | 4 | List, install, uninstall, inspect Python packages |
| 🤖 AI Agent | 8 | Configure and run an on-device AI agent with natural language instructions |

## Architecture

AI Agent (Claude / GPT / Gemini / Cursor / Windsurf / ...)
  ↓ MCP Protocol (stdio, JSON-RPC)
yyds-auto-mcp (Node.js, this skill)
  ↓ HTTP REST (JSON, port 61140)
yyds.py engine (Android, aiohttp server)
  ↓ IPC
yyds.auto engine (Android, kernel-level UI automation)

The MCP server communicates with the on-device engine via HTTP REST. When connected via USB, ADB port forwarding is set up automatically. Remote devices over WiFi/LAN are also supported.
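
To make the top layer of this stack concrete, the sketch below builds the kind of JSON-RPC 2.0 message an MCP client writes to the server's stdin when it invokes a tool. The `tools/call` method and the `name`/`arguments` fields follow the MCP specification; the helper name `make_tool_call` and the coordinate values are illustrative assumptions, not part of yyds-auto-mcp.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids; any unique value works


def make_tool_call(tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a single line of JSON."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps emits no newlines by default, which suits line-delimited stdio.
    return json.dumps(request)


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    print(make_tool_call("tap", {"x": 540, "y": 1200}))
```

In normal use the AI client generates these messages itself; the sketch is only meant to show what travels over the stdio link.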

## Prerequisites

1. **Android device** with [Yyds.Auto](https://yydsauto.com) installed and the engine running

2. **Connection**: USB (auto ADB forward) or WiFi (same LAN)

3. **Node.js** >= 18

## Quick Start

### Install the MCP Server

npm install -g yyds-auto-mcp

### Connect to a USB Device (auto-detected)

# Default: 127.0.0.1:61140, ADB forward set up automatically
yyds-auto-mcp

### Connect to a Remote Device

YYDS_DEVICE_HOST=192.168.1.100 YYDS_DEVICE_PORT=61140 yyds-auto-mcp

### Claude Desktop Configuration

Add to `claude_desktop_config.json`:

{
  "mcpServers": {
    "yyds-auto": {
      "command": "npx",
      "args": ["-y", "yyds-auto-mcp"],
      "env": {
        "YYDS_DEVICE_HOST": "127.0.0.1",
        "YYDS_DEVICE_PORT": "61140"
      }
    }
  }
}
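
If `claude_desktop_config.json` already lists other servers, the entry has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge on an in-memory dictionary; the function name `add_yyds_server` is hypothetical, and reading or writing the actual file is left out because its location varies by platform.

```python
import json

# The server entry exactly as documented above.
YYDS_ENTRY = {
    "command": "npx",
    "args": ["-y", "yyds-auto-mcp"],
    "env": {"YYDS_DEVICE_HOST": "127.0.0.1", "YYDS_DEVICE_PORT": "61140"},
}


def add_yyds_server(config: dict) -> dict:
    """Return a copy of `config` with the yyds-auto entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["yyds-auto"] = YYDS_ENTRY
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # "other-server" stands in for whatever is already configured.
    existing = {"mcpServers": {"other-server": {"command": "other"}}}
    print(json.dumps(add_yyds_server(existing), indent=2))
```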

### Cursor / Windsurf / VS Code Configuration

Add the same MCP server configuration in your editor's MCP settings.

## Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `YYDS_DEVICE_HOST` | `127.0.0.1` | Device IP address |
| `YYDS_DEVICE_PORT` | `61140` | Engine port number |
| `YYDS_DEVICE_SERIAL` | *(first device)* | Specify ADB device serial |
| `YYDS_ADB_PATH` | *(auto-detect)* | Custom ADB binary path |
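
As an illustration of how the host and port defaults combine, here is a small sketch that resolves the two variables into an HTTP base URL. The variable names and defaults come from the table; the function `resolve_endpoint` is a hypothetical helper and not part of the package.

```python
import os


def resolve_endpoint(env=None) -> str:
    """Build the engine base URL from the documented variables and defaults."""
    env = os.environ if env is None else env
    host = env.get("YYDS_DEVICE_HOST", "127.0.0.1")
    port = int(env.get("YYDS_DEVICE_PORT", "61140"))
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"http://{host}:{port}"


if __name__ == "__main__":
    print(resolve_endpoint({}))
    print(resolve_endpoint({"YYDS_DEVICE_HOST": "192.168.1.100"}))
```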

## Tool Reference

### Device Information

  • **`device_info`** — Comprehensive device info: engine version, screen size, IMEI, foreground app
  • **`get_foreground_app`** — Current foreground app & Activity
  • **`get_screen_size`** — Device screen resolution
  • **`is_network_online`** — Check network connectivity

### Touch & Input

  • **`tap`** *(x, y, count?, interval?)* — Tap at coordinates (supports multi-tap)
  • **`swipe`** *(x1, y1, x2, y2, duration?)* — Swipe gesture
  • **`long_press`** *(x, y, duration?)* — Long press at coordinates
  • **`drag`** *(x1, y1, x2, y2, duration?)* — Drag from point A to B
  • **`input_text`** *(text)* — Input text into the focused field
  • **`set_clipboard`** / **`get_clipboard`** — Clipboard operations
  • **`press_key`** *(key)* — Press a key (home, back, enter, etc.)
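
Gesture tools take raw pixel coordinates, so a caller usually derives them from the screen size first. The sketch below computes arguments for an upward swipe using the `swipe` parameter names listed above; the 80%/20% travel distance, the helper name `swipe_up_args`, and the assumption that `duration` is in milliseconds are all illustrative guesses.

```python
def swipe_up_args(width: int, height: int, duration: int = 300) -> dict:
    """Arguments for a vertical swipe from 80% down the screen to 20%."""
    x = width // 2  # swipe along the vertical centre line
    return {
        "x1": x,
        "y1": height * 8 // 10,
        "x2": x,
        "y2": height * 2 // 10,
        "duration": duration,  # assumed to be milliseconds
    }


if __name__ == "__main__":
    # Example resolution; a real caller would take it from get_screen_size.
    print(swipe_up_args(1080, 2400))
```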

### Screenshot

  • **`take_screenshot`** *(quality?)* — Returns base64 JPEG image (LLM directly interprets it)
  • **`save_screenshot`** *(path?)* — Save screenshot to device storage
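
A client that wants to keep a screenshot on the host has to decode the base64 payload itself. This is a minimal sketch of that step, assuming the payload is plain base64 without a `data:` URI prefix; `save_base64_jpeg` is a hypothetical helper name.

```python
import base64
from pathlib import Path

JPEG_MAGIC = b"\xff\xd8"  # every JPEG stream starts with these two bytes


def save_base64_jpeg(data: str, path: str) -> int:
    """Decode a base64 JPEG payload, write it to `path`, return bytes written."""
    raw = base64.b64decode(data, validate=True)
    if not raw.startswith(JPEG_MAGIC):
        raise ValueError("payload does not look like a JPEG")
    return Path(path).write_bytes(raw)
```

The magic-byte check is a cheap guard against saving an error message or a truncated response as an image.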

### UI Automation

  • **`dump_ui_hierarchy`** — Full UI tree (auto-trimmed when >15KB to save tokens)
  • **`find_ui_elements`** *(text?, resourceId?, className?, clickable?, ...)* — Find elements by attributes
  • **`get_element_relation`** *(hashcode, type?)* — Get parent/children/sibling of an element
  • **`wait_for_element`** *(text?, resourceId?, timeout?)* — Wait until an element appears
  • **`scroll_to_find`** *(text?, direction?, maxScrolls?)* — Scroll until an element is found

### OCR & Image

  • **`screen_ocr`** *(x?, y?, w?, h?)* — Recognize text on screen (region supported)
  • **`tap_text`** *(text, index?)* — OCR + tap on the matching text
  • **`image_ocr`** *(path)* — Recognize text from an image file
  • **`find_image_on_screen`** *(templates, threshold?)* — Template matching
  • **`get_pixel_color`** *(x, y)* — Get pixel color at coordinates
  • **`compare_images`** *(image1, image2)* — Image similarity comparison
  • **`wait_for_screen_change`** *(timeout?, threshold?)* — Wait for the screen to change
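
Restricting `screen_ocr` to a region is only useful if the rectangle stays on screen. The sketch below clamps a requested rectangle to the display bounds before it is passed as `x`, `y`, `w`, `h`. It assumes `x`/`y` are the top-left corner and `w`/`h` are pixel sizes, which the tool list does not state explicitly; `ocr_region` is a hypothetical helper.

```python
def ocr_region(x: int, y: int, w: int, h: int, screen_w: int, screen_h: int) -> dict:
    """Clamp a rectangle to the screen and return screen_ocr-style arguments."""
    left = max(0, min(x, screen_w))
    top = max(0, min(y, screen_h))
    right = max(left, min(x + w, screen_w))
    bottom = max(top, min(y + h, screen_h))
    return {"x": left, "y": top, "w": right - left, "h": bottom - top}


if __name__ == "__main__":
    # A box that overflows the bottom-right corner of a 1080x2400 screen.
    print(ocr_region(900, 2300, 400, 400, 1080, 2400))
```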

### Shell

  • **`run_shell`** *(command)* — Execute shell commands with elevated privileges

### App Management

  • **`launch_app`** / **`stop_app`** *(packageName)* — Start/stop apps
  • **`list_installed_apps`** — List all non-system installed apps
  • **`is_app_running`** *(packageName)* — Check if an app is running
  • **`open_url`** *(url)* — Open URL in browser
  • **`show_toast`** *(message)* — Display a toast notification
  • **`install_apk`** / **`uninstall_app`** — Install/uninstall apps

### File Operations

  • **`list_files`** / **`read_file`** / **`write_file`** — Browse, read, write files
  • **`file_exists`** / **`delete_file`** / **`rename_file`** / **`create_directory`**

### Script Projects

  • **`list_projects`** / **`project_status`** — View Python projects
  • **`start_project`** / **`stop_project`** — Control project execution
  • **`run_python_code`** *(code)* — Execute Python code snippets on the device

### Pip Management

  • **`pip_list`** / **`pip_install`** / **`pip_uninstall`** / **`pip_show`**

### AI Agent

  • **`agent_run`** *(instruction)* — Run an on-device AI agent with natural language
  • **`agent_stop`** / **`agent_status`** — Control and monitor the agent
  • **`agent_get_config`** / **`agent_set_config`** — Configure AI provider & model
  • **`agent_get_providers`** / **`agent_get_models`** — List available providers & models
  • **`agent_test_connection`** — Verify AI model connectivity

## Key Features

### 🔄 Auto-Reconnect

USB connection drops are handled gracefully: when the device disconnects, the MCP server automatically re-establishes ADB port forwarding and retries the request.
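
The reconnect-then-retry pattern described here can be sketched generically. This is not the server's actual implementation (which is Node.js and not shown in this document); `call_with_reconnect`, `send`, and `reconnect` are hypothetical names, and a single retry is an assumed default.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_reconnect(
    send: Callable[[], T],
    reconnect: Callable[[], None],
    retries: int = 1,
) -> T:
    """Run `send`; on a connection error, call `reconnect` and try again."""
    if retries < 0:
        raise ValueError("retries must be >= 0")
    for attempt in range(retries + 1):
        try:
            return send()
        except ConnectionError:
            if attempt == retries:
                raise  # out of attempts: surface the original error
            reconnect()  # e.g. re-establish ADB port forwarding
    raise AssertionError("unreachable")
```

Only connection-level failures trigger a retry in this sketch; other errors propagate immediately so that a bad request is not sent twice.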

### 🚀 Auto-Bootstrap

On first connection via USB, the server automatically sets up ADB forwarding and starts the engine on the device if it's not already running.
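
For reference, ADB forwarding amounts to one `adb forward` invocation. The sketch below assembles that command line, honouring an optional device serial and a custom ADB path in the spirit of `YYDS_DEVICE_SERIAL` and `YYDS_ADB_PATH`. Mapping the same port number on both the host and the device is an assumption, and `adb_forward_cmd` is a hypothetical helper; the command is built but not executed.

```python
from typing import List, Optional


def adb_forward_cmd(
    port: int = 61140,
    serial: Optional[str] = None,
    adb_path: str = "adb",
) -> List[str]:
    """Build the argv for forwarding a host TCP port to the same device port."""
    cmd = [adb_path]
    if serial:
        cmd += ["-s", serial]  # target one device when several are attached
    cmd += ["forward", f"tcp:{port}", f"tcp:{port}"]
    return cmd


if __name__ == "__main__":
    print(" ".join(adb_forward_cmd()))
```

Passing the resulting list to `subprocess.run` would perform the forward manually, which can help when diagnosing a connection that the automatic setup could not establish.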

### 🧠 Smart UI Dump

UI hierarchy dumps over 15KB are automatically trimmed to keep only actionable elements (those with text, resource-id, content-desc, or clickable/scrollable attributes), reducing LLM token usage.
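
The trimming rule can be illustrated with a short sketch. It assumes the dump has been flattened into a list of dictionaries keyed by attribute name and measures size as serialized JSON bytes; the real engine's data structure and size measurement are not documented here, so treat `trim_dump` and `is_actionable` as hypothetical.

```python
import json

LIMIT_BYTES = 15 * 1024  # the 15KB threshold mentioned above
TEXT_KEYS = ("text", "resource-id", "content-desc")
FLAG_KEYS = ("clickable", "scrollable")


def is_actionable(node: dict) -> bool:
    """True if the node has a non-empty label/id or can be clicked or scrolled."""
    has_label = any(node.get(key) for key in TEXT_KEYS)
    has_flag = any(node.get(key) is True for key in FLAG_KEYS)
    return has_label or has_flag


def trim_dump(nodes: list) -> list:
    """Return `nodes` unchanged if small, otherwise only the actionable ones."""
    size = len(json.dumps(nodes).encode("utf-8"))
    if size <= LIMIT_BYTES:
        return nodes
    return [node for node in nodes if is_actionable(node)]
```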

### 🎯 Kernel-Level Touch

Touch events are injected at the Linux kernel level, making them work in any app including games, locked-down apps, and areas that block accessibility-based input.

## Example Prompts

Once connected, try these prompts with your AI agent:

  • *"Take a screenshot and describe what's on the screen"*
  • *"Open WeChat and send 'Hello' to the first chat"*
  • *"Find all buttons on the screen and list their labels"*
  • *"OCR the screen and find any phone numbers"*
  • *"Swipe up 3 times to scroll through the feed"*
  • *"Install the APK at /sdcard/Download/app.apk"*
  • *"Run this Python code on the device: `print('Hello from Android!')`"*

## Links

  • 🌐 **Website**: [yydsauto.com](https://yydsauto.com)
  • 📦 **npm**: [yyds-auto-mcp](https://www.npmjs.com/package/yyds-auto-mcp)
  • 🐙 **GitHub**: [yyds-auto-mcp](https://github.com/ChenAnZong/yyds-auto-mcp)