WaveSpeed Desktop
A cross-platform desktop application for running AI models from [WaveSpeedAI](https://wavespeed.ai), as well as many free local AI models including Z-Image.
Open-source, cross-platform application for running 1000+ AI models: image generation, video generation, face swap, digital human, motion control, and more. Features a visual workflow editor for building AI pipelines, Featured Models with smart variant switching, and 12 free creative tools. Available for **Windows**, **macOS**, **Linux**, and **Android**. The project is written primarily in TypeScript, distributed under the MIT License, and first published in 2025. Key topics include: ai, ai-image-generation, ai-image-generator, image, video.
Android App
The Android app shares the same React codebase as the desktop version, giving you access to the AI Playground, Featured Models, Creative Studio, and all 1000+ models from your phone.
- Full AI Playground with multi-tab support and all input types including camera capture
- Featured Models with smart variant switching
- Model browser with search, filter, and sort
- Creative Studio tools (face enhancement, background removal, image eraser, segment anything, media conversion)
- History, My Assets, templates, and auto-save
- 18 languages, dark/light theme, Android 5.0+
Creative Studio
12 free AI-powered creative tools that run entirely in your browser. No API key required, no usage limits, completely free. Also available as a standalone web app at wavespeed.ai/studio — fully responsive, works on desktop, tablet, and mobile browsers.
| Tool | Description |
|---|---|
| Image Enhancer | Upscale images 2x–4x using ESRGAN with slim, medium, and thick quality options |
| Video Enhancer | Frame-by-frame video upscaling with real-time progress and ETA |
| Face Enhancer | Detect faces with YOLO v8 and enhance with GFPGAN v1.4 (WebGPU accelerated) |
| Face Swapper | Swap faces using InsightFace (SCRFD + ArcFace + Inswapper) with optional GFPGAN post-processing |
| Background Remover | Remove backgrounds instantly — outputs foreground, background, and mask with individual downloads |
| Image Eraser | Remove unwanted objects with LaMa inpainting, smart crop and blend (WebGPU accelerated) |
| Segment Anything | Interactive object segmentation with point prompts using SlimSAM |
| Video Converter | Convert between MP4, WebM, AVI, MOV, MKV with codec and quality options |
| Audio Converter | Convert between MP3, WAV, AAC, FLAC, OGG with bitrate control |
| Image Converter | Batch convert between JPG, PNG, WebP, GIF, BMP with quality settings |
| Media Trimmer | Trim video and audio by selecting start and end times |
| Media Merger | Merge multiple video or audio files into one |
Visual Workflow Editor
Node-based pipeline builder for designing and executing complex AI workflows. Chain any combination of AI models, free tools, and media processing steps into automated pipelines.
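A pipeline like this is a directed graph, and it can only run if the nodes can be ordered so that each one executes after its inputs. The sketch below shows one common way to compute such an order, Kahn's algorithm, which also detects cycles. The `WorkflowEdge` shape and the `topologicalOrder` function are hypothetical names used for illustration; the application's actual engine types are not shown in this README.

```typescript
// Hypothetical edge shape: the real workflow types are not shown in this README.
interface WorkflowEdge {
  from: string; // id of the upstream node
  to: string; // id of the downstream node
}

/**
 * Kahn's algorithm: returns node ids in an order where every node comes
 * after all of its upstream nodes. Throws if the graph contains a cycle.
 */
function topologicalOrder(nodeIds: string[], edges: WorkflowEdge[]): string[] {
  const inDegree = new Map<string, number>();
  const downstream = new Map<string, string[]>();
  for (const id of nodeIds) {
    inDegree.set(id, 0);
    downstream.set(id, []);
  }
  for (const { from, to } of edges) {
    if (!inDegree.has(from) || !inDegree.has(to)) {
      throw new Error(`Edge references unknown node: ${from} -> ${to}`);
    }
    downstream.get(from)!.push(to);
    inDegree.set(to, inDegree.get(to)! + 1);
  }

  // Nodes with no pending inputs are ready to run.
  const ready = nodeIds.filter((id) => inDegree.get(id) === 0);
  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift()!;
    order.push(id);
    for (const next of downstream.get(id)!) {
      const remaining = inDegree.get(next)! - 1;
      inDegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }

  // Any node left unordered is part of (or downstream of) a cycle.
  if (order.length !== nodeIds.length) {
    throw new Error("Workflow contains a cycle");
  }
  return order;
}
```

Calling `topologicalOrder(["upload", "generate", "upscale"], [...])` with edges `upload -> generate -> upscale` yields the nodes in execution order, while a pair of edges `a -> b` and `b -> a` raises the cycle error before anything runs.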
Features
- AI Playground: Multi-tab playground with dynamic forms, batch processing (2-16x), mask drawing, LoRA support, abort control, and auto-randomized seeds
- Featured Models: Curated model families with smart variant switching — auto-selects the best variant based on inputs and toggles (Seedream 4.5, Seedance 1.5 Pro, Wan Spicy, InfiniteTalk, Kling 2.6, Nano Banana Pro, etc.)
- Model Browser: Fuzzy search, sort by popularity/name/price/type, favorites filter
- Visual Workflow Editor: Node-based pipeline builder with 20+ node types
  - Triggers (directory scan, HTTP API), AI tasks, 12 free tool nodes, processing (concat, select), group/subgraph, I/O nodes
  - Run all / run node / continue / retry / cancel / batch runs (1-99x), real-time execution monitor with cost tracking
  - Group/subgraph containers with exposed I/O, breadcrumb navigation, and workflow import
  - HTTP API mode: expose workflows as REST endpoints via the built-in HTTP server; works as a skill server for OpenClaw and other AI agents
  - Directory batch processing: auto-execute per media file in a folder
  - Prompt optimizer, guided tour, result caching, circuit breaker, cycle detection
  - Cost estimation & daily budget, import/export (JSON + SQLite), multi-tab, undo/redo, customizable output naming
- Free Tools: 12 AI-powered creative tools (no API key) — see Creative Studio above
- Z-Image: Local image generation via stable-diffusion.cpp with model downloads, progress, and logs
- Templates: Playground + workflow templates with presets, i18n search, import/export, and usage tracking
- History & Assets: Recent predictions (24h), saved outputs with tags/favorites/search, auto-save to local folder
- Media Input: File upload (drag & drop), camera capture, video/audio recording
- 18 languages, dark/light/auto theme, auto updates (stable + nightly), cross-platform (Windows, macOS, Linux, Android)
- Cross-Platform: Available for Windows, macOS, Linux, and Android
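The circuit breaker listed under the Visual Workflow Editor is a general reliability pattern: after repeated failures, further calls are blocked for a cooldown period instead of being retried immediately. The following is a minimal generic sketch of that pattern, not the application's implementation; the class name, method names, and thresholds are all assumptions.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Generic circuit breaker sketch (hypothetical; not taken from the app's source).
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";
  private readonly maxFailures: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(maxFailures: number, cooldownMs: number, now: () => number = Date.now) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
    this.now = now; // injectable clock so the behavior can be tested
  }

  /** Returns true if a call may proceed right now. */
  canRun(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      // Cooldown elapsed: let trial calls through until one reports back.
      this.state = "half-open";
    }
    return this.state !== "open";
  }

  /** A successful call closes the breaker and resets the failure count. */
  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  /** A failed trial call, or too many failures in a row, opens the breaker. */
  recordFailure(): void {
    this.failures++;
    if (this.state === "half-open" || this.failures >= this.maxFailures) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  currentState(): BreakerState {
    return this.state;
  }
}
```

A caller would check `canRun()` before each task, then report the outcome with `recordSuccess()` or `recordFailure()`. The injected `now` function is a design choice for testability; production code can rely on the `Date.now` default.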
Installation
Quick Download
Download the latest desktop (Windows, macOS, Linux) and mobile (Android) builds from the Releases page.
Platform Instructions
Windows
- Download the `.exe` (installer) or `.zip` (portable)
- Run the installer and follow the prompts, or extract the zip file
- Launch "WaveSpeed Desktop" from Start Menu or the extracted folder
macOS
- Download the `.dmg` for your chip (Apple Silicon or Intel)
- Open the `.dmg` file and drag the app to Applications
- Launch the app from Applications
Linux
- Download the `.AppImage` or `.deb`
- For AppImage: make it executable (`chmod +x *.AppImage`) and run it
- For .deb: install with `sudo dpkg -i *.deb`
Android
- Download the `.apk` file
- Open the file on your Android device
- If prompted about "Unknown sources", allow installation from this source
- Install and launch the app
- Requires Android 5.0 (API 21) or higher
Nightly Builds
Note: Nightly builds may be unstable. Use the stable releases for production use.
Development
Prerequisites
- Node.js 20+
- npm
Setup

    # Clone the repository
    git clone https://github.com/WaveSpeedAI/wavespeed-desktop.git
    cd wavespeed-desktop

    # Install dependencies
    npm install

    # Install pre-commit hooks (requires pre-commit: pip install pre-commit)
    pre-commit install

    # Start development server
    npm run dev

Scripts
| Script | Description |
|---|---|
| `npm run dev` | Start development server with hot reload |
| `npm run dev:web` | Start web-only dev server (no Electron) |
| `npm run build` | Build the application |
| `npm run build:web` | Build web-only version (no Electron) |
| `npm run build:win` | Build for Windows |
| `npm run build:mac` | Build for macOS |
| `npm run build:linux` | Build for Linux |
| `npm run build:all` | Build for all platforms |
| `npm run dist` | Build and package for distribution |
| `npm run format` | Format code with Prettier |
| `npm run format:check` | Check code formatting |
Mobile Development
The mobile app is located in the mobile/ directory and shares code with the desktop app.

    # Navigate to mobile directory
    cd mobile

    # Install dependencies
    npm install

    # Start development server
    npm run dev

    # Build and sync to Android
    npm run build && npx cap sync android

    # Open in Android Studio
    npx cap open android

See mobile/README.md for detailed mobile development guide.
Project Structure

    wavespeed-desktop/
    ├── data/templates/  # Preset workflow templates (AI generation, image/video/audio processing)
    ├── electron/  # Electron main process
    │   ├── main.ts  # Main process entry
    │   ├── preload.ts  # Preload script (IPC bridge)
    │   ├── lib/  # Local generation (sdGenerator for stable-diffusion.cpp)
    │   └── workflow/  # Workflow backend
    │       ├── db/  # SQLite database (workflow, node, edge, execution, budget, template repos)
    │       ├── engine/  # Execution engine (DAG runner, scheduler, cache, circuit breaker)
    │       ├── ipc/  # IPC handlers (workflow, execution, history, cost, storage, http-server)
    │       ├── nodes/  # Node handlers (AI task, free tools, I/O, triggers, processing, control)
    │       ├── services/  # HTTP server, model list, retry, service locator, template loader
    │       └── utils/  # File storage, hashing, save-to-assets
    ├── src/
    │   ├── api/  # API client
    │   ├── components/  # React components
    │   │   ├── ffmpeg/  # FFmpeg components
    │   │   ├── layout/  # Layout components
    │   │   ├── playground/  # Playground components
    │   │   ├── shared/  # Shared components
    │   │   ├── templates/  # Template components
    │   │   └── ui/  # shadcn/ui components
    │   ├── hooks/  # Custom React hooks
    │   ├── i18n/  # Internationalization (18 languages)
    │   ├── lib/  # Utilities (fuzzy search, schema-to-form, smart form config, etc.)
    │   ├── pages/  # Page components
    │   ├── stores/  # Zustand stores
    │   ├── types/  # TypeScript types
    │   ├── workers/  # Web Workers (upscaler, face enhancer/swapper, background remover, image eraser, segmentation, ffmpeg)
    │   └── workflow/  # Workflow frontend
    │       ├── browser/  # Browser-only workflow API (web mode without Electron)
    │       ├── components/  # Canvas, node palette, config panel, results panel, run monitor, prompt optimizer
    │       ├── hooks/  # Workflow-specific hooks (undo/redo, group adoption, free tool listener)
    │       ├── ipc/  # Type-safe IPC client
    │       ├── lib/  # Cycle detection, free tool runner, model converter, topological sort
    │       ├── stores/  # Workflow, execution, UI stores (Zustand)
    │       └── types/  # Workflow type definitions
    ├── mobile/  # Mobile app (Android)
    │   ├── src/  # Mobile-specific overrides
    │   ├── android/  # Android native project
    │   └── capacitor.config.ts
    ├── .github/workflows/  # GitHub Actions (desktop + mobile)
    └── build/  # Build resources

Tech Stack
Desktop
- Framework: Electron + electron-vite
- Frontend: React 18 + TypeScript
- Styling: Tailwind CSS + shadcn/ui
- State Management: Zustand
- HTTP Client: Axios
- Workflow Canvas: React Flow
- Workflow Database: sql.js (SQLite in-process)
- AI/ML (Free Tools): @huggingface/transformers, onnxruntime-web, @tensorflow/tfjs, upscaler (ESRGAN)
- Media Processing: @ffmpeg/core, mp4-muxer, webm-muxer
- 3D Preview: @google/model-viewer
- Local Generation: stable-diffusion.cpp (via sdGenerator)
Mobile
- Framework: Capacitor 6
- Frontend: React 18 + TypeScript (shared with desktop)
- Styling: Tailwind CSS + shadcn/ui (shared)
- Platform: Android 5.0+
Configuration
- Launch the application
- Go to Settings
- Enter your WaveSpeedAI API key
- Start using the Playground!
Get your API key from WaveSpeedAI
API Reference
The application uses the WaveSpeedAI API v3:
| Endpoint | Method | Description |
|---|---|---|
| `/api/v3/models` | GET | List available models |
| `/api/v3/{model}` | POST | Run a prediction |
| `/api/v3/predictions/{id}/result` | GET | Get prediction result |
| `/api/v3/predictions` | POST | Get prediction history |
| `/api/v3/media/upload/binary` | POST | Upload files |
| `/api/v3/balance` | GET | Get account balance |
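
The run and result endpoints above suggest a submit-then-poll flow. The sketch below illustrates that flow under several assumptions that this README does not confirm: the `https://api.wavespeed.ai` host, the `Authorization: Bearer` header, the `{ data: { id, status, outputs } }` response shape, and the `completed` / `failed` status values. The function and type names are hypothetical. The HTTP function is passed in so the logic can be exercised without a network.

```typescript
// Assumed host, auth scheme, and response shape; only the paths come from the table above.
const BASE_URL = "https://api.wavespeed.ai";

interface HttpInit {
  method?: string;
  headers?: Record<string, string>;
  body?: string;
}
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<any>;
}
type FetchLike = (url: string, init?: HttpInit) => Promise<HttpResponse>;

interface Prediction {
  id: string;
  status: string;
  outputs?: string[];
}

/** Builds the POST /api/v3/{model} request. Model ids may contain slashes, so they are not URL-encoded. */
function buildRunRequest(model: string, input: Record<string, unknown>, apiKey: string): { url: string; init: HttpInit } {
  return {
    url: `${BASE_URL}/api/v3/${model}`,
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${apiKey}`, "Content-Type": "application/json" },
      body: JSON.stringify(input),
    },
  };
}

/** Submits a prediction, then polls the result endpoint up to maxPolls times. */
async function runPrediction(
  fetchImpl: FetchLike,
  model: string,
  input: Record<string, unknown>,
  apiKey: string,
  pollMs = 1000,
  maxPolls = 60,
): Promise<Prediction> {
  const { url, init } = buildRunRequest(model, input, apiKey);
  const created = await fetchImpl(url, init);
  if (!created.ok) throw new Error(`Run request failed: HTTP ${created.status}`);
  let prediction: Prediction = (await created.json()).data;

  for (let polls = 0; ; polls++) {
    if (prediction.status === "completed") return prediction;
    if (prediction.status === "failed") throw new Error(`Prediction ${prediction.id} failed`);
    if (polls >= maxPolls) throw new Error(`Prediction ${prediction.id} did not finish after ${maxPolls} polls`);

    await new Promise<void>((resolve) => setTimeout(resolve, pollMs));
    const res = await fetchImpl(`${BASE_URL}/api/v3/predictions/${prediction.id}/result`, {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    if (!res.ok) throw new Error(`Result request failed: HTTP ${res.status}`);
    prediction = (await res.json()).data;
  }
}
```

In a runtime with a global `fetch` (modern browsers, Node.js 18+), passing `fetch` as the first argument should be enough to try this against a real key; check the official API documentation for the actual response fields before relying on them.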
The built-in workflow HTTP server also exposes:
| Endpoint | Method | Description |
|---|---|---|
| `/api/health` | GET | Health check |
| `/api/workflows/{id}/run` | POST | Trigger a workflow execution via API |
| `/api/workflows/{id}/schema` | GET | Get workflow input/output schema |
| `/schema` | GET | Get active workflow schema |
| `POST /` (any path) | POST | Run the active workflow |
Add an HTTP Trigger node to your workflow to define the API input schema (each field becomes an output port), and optionally add an HTTP Response node to customize the response. Start the server from the workflow canvas — it listens on a configurable port (default 3100) with CORS enabled.
This turns any workflow into a callable REST endpoint, making it easy to integrate with OpenClaw or other AI agent frameworks as a skill server. For example, an OpenClaw agent can call GET /api/workflows/{id}/schema to discover the workflow's input/output contract, then POST /api/workflows/{id}/run with the required fields to execute the pipeline and receive results.
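
The discover-then-run sequence described above can be sketched as a small client. The endpoint paths and the default port 3100 come from this README; the `localhost` host, the JSON request body, and the JSON response are assumptions, and `workflowUrl` and `callWorkflow` are hypothetical helper names. The response shapes are left as `unknown` because the README does not specify them.

```typescript
interface RequestOptions {
  method?: string;
  headers?: Record<string, string>;
  body?: string;
}
interface JsonResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type Fetcher = (url: string, init?: RequestOptions) => Promise<JsonResponse>;

/** Builds a workflow endpoint URL; assumes the server is reachable on localhost. */
function workflowUrl(workflowId: string, action: "schema" | "run", port = 3100): string {
  return `http://localhost:${port}/api/workflows/${encodeURIComponent(workflowId)}/${action}`;
}

/** Fetches the workflow schema, then runs the workflow with the given inputs. */
async function callWorkflow(
  fetchImpl: Fetcher,
  workflowId: string,
  inputs: Record<string, unknown>,
  port = 3100,
): Promise<{ schema: unknown; result: unknown }> {
  // Step 1: discover the input/output contract.
  const schemaRes = await fetchImpl(workflowUrl(workflowId, "schema", port));
  if (!schemaRes.ok) throw new Error(`Schema request failed: HTTP ${schemaRes.status}`);
  const schema = await schemaRes.json();

  // Step 2: run the workflow (assumes the inputs are sent as a JSON body).
  const runRes = await fetchImpl(workflowUrl(workflowId, "run", port), {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(inputs),
  });
  if (!runRes.ok) throw new Error(`Run request failed: HTTP ${runRes.status}`);
  return { schema, result: await runRes.json() };
}
```

With the server started from the workflow canvas, something like `callWorkflow(fetch, "<workflow id>", { prompt: "..." })` would perform both calls; the field names in `inputs` must match whatever the HTTP Trigger node defines.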
Contributing
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
License
MIT License - see LICENSE for details.
