Your AI-Powered Desktop Agent for Windows
Meet Marvin — your paranoid android assistant who can see your screen, control your desktop, browse the web, and execute complex multi-step tasks with just a conversation.
- Windows UI Control — Click, type, scroll, and navigate using AI-powered UI analysis
- Screen Vision — Marvin can see and understand what's on your screen
- Smart Action Planning — Multi-step task execution with real-time feedback
- Chat-Based Commands — Just tell Marvin what you want in plain English
- Voice Dictation — Real-time speech-to-text with Voxtral transcription
- Rich Markdown Responses — Code highlighting, LaTeX math, and more
- Headless Browser Automation — Navigate, search, click, and extract data from websites
- Specialized Research Agents — Trip planning, product research, and more
- Multi-Engine Search — Google and Bing support
- File System Operations — List, create, move, rename, and delete files
- Shell Command Execution — Run PowerShell commands with approval safeguards
- Large File Finder — Identify disk space hogs across your system
- Approval System — High-risk operations require explicit user consent
- Risk-Based Classification — Tools are categorized by potential impact
- Sandboxed Execution — Recommended to run in Windows Sandbox for demos
- Python 3.12+
- Windows 10/11 (for full desktop automation)
- Mistral API Key — Get one here
# Clone the repository
git clone https://github.com/yourusername/MarvOS.git
cd MarvOS
# Create a virtual environment (recommended)
python -m venv .venv
.venv\Scripts\activate
# Install with all extras
pip install -e ".[dev,windows]"
# Install Playwright browsers (for web browsing)
playwright install
Set your Mistral API key as an environment variable:
# PowerShell
$env:MISTRAL_API_KEY = "your-api-key-here"
# Or add to your profile permanently
[Environment]::SetEnvironmentVariable("MISTRAL_API_KEY", "your-api-key", "User")
python -m app.main
A sleek desktop window will open with Marvin ready to assist! 🎉
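If the app fails to start, a missing or blank API key is a likely cause. The snippet below is a minimal sketch of a pre-launch check, not part of MarvOS itself; the function name `get_mistral_api_key` is hypothetical, and only the `MISTRAL_API_KEY` variable name comes from the setup steps above.

```python
import os
from collections.abc import Mapping


def get_mistral_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key, or raise with a hint if it is missing or blank.

    Hypothetical helper for illustration; MarvOS may read the key differently.
    """
    key = env.get("MISTRAL_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "MISTRAL_API_KEY is not set. In PowerShell, run: "
            '$env:MISTRAL_API_KEY = "your-api-key-here"'
        )
    return key


# Example with an explicit mapping, so the check does not depend on the real environment.
print(get_mistral_api_key({"MISTRAL_API_KEY": "your-api-key-here"}))
```

Calling `get_mistral_api_key()` with no argument checks the real process environment instead.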
MarvOS/
├── app/
│ ├── main.py # Application entry point (pywebview)
│ ├── server.py # FastAPI backend with WebSocket support
│ ├── config.py # Model and app configuration
│ ├── agent/ # AI agent orchestration
│ │ ├── planner.py # Prompt engineering & planning
│ │ ├── runner.py # Task execution loop
│ │ └── specialized_agent_manager.py # Research agents
│ ├── hands/ # Desktop automation backends
│ │ ├── windows_use_adapter.py # Windows UI Automation
│ │ └── stub_hands.py # Safe fallback for development
│ ├── llm/ # LLM integration
│ │ └── mistral_gateway.py # Mistral API wrapper
│ ├── tools/ # Available agent tools
│ │ ├── registry.py # Tool definitions & routing
│ │ ├── filesystem.py # File operations
│ │ ├── shell.py # Command execution
│ │ └── web_browser.py # Playwright browser control
│ └── storage/ # Persistence layer
│ └── db.py # JSON-based storage
├── ui/ # Frontend assets
│ ├── index.html # Main application shell
│ ├── app.js # UI logic & WebSocket client
│ └── app.css # Styling
└── marvin-assets/ # Mascot artwork
Control how Marvin interacts with your desktop:
| Mode | Description |
|---|---|
| auto (default) | Windows automation on Windows, stub elsewhere |
| windows-use | Full Windows UI Automation |
| stub | No automation (safe for development) |
Set via environment variable:
$env:MARVOS_HANDS = "stub"
MarvOS uses multiple specialized Mistral models:
| Purpose | Model | Use Case |
|---|---|---|
| Chat | labs-mistral-small-creative | Conversational responses |
| Agent | devstral-small-latest | Tool calling & reasoning |
| Vision | mistral-medium-latest | Screenshot analysis |
| Transcription | voxtral-mini-transcribe-realtime-2602 | Real-time voice input |
| Research | mistral-large-latest | Specialized agent tasks |
MarvOS implements a tiered risk system for operations:
| Risk Level | Approval | Examples |
|---|---|---|
| 🟢 Low | Auto-approved | Read files, observe UI |
| 🟡 Medium | Prompted | Write files, shell commands |
| 🔴 High | Required | Delete files, system changes |
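The tiers above can be sketched as a small approval gate. Everything named here is hypothetical: the tool names, the `TOOL_RISK` mapping, and `is_allowed` are illustrations built from the table's examples, not the MarvOS registry. The sketch also makes two assumptions the table does not state: medium and high risk both go through the same user prompt, and unknown tools are treated as high risk.

```python
from collections.abc import Callable
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Hypothetical tool names, classified using the examples from the table.
TOOL_RISK: dict[str, Risk] = {
    "read_file": Risk.LOW,
    "observe_ui": Risk.LOW,
    "write_file": Risk.MEDIUM,
    "run_shell": Risk.MEDIUM,
    "delete_file": Risk.HIGH,
}


def is_allowed(tool: str, ask_user: Callable[[str, Risk], bool]) -> bool:
    """Auto-approve low-risk tools; defer everything else to the user."""
    # Assumption: a tool missing from the mapping is treated as high risk.
    risk = TOOL_RISK.get(tool, Risk.HIGH)
    if risk is Risk.LOW:
        return True
    return ask_user(tool, risk)


def deny_all(tool: str, risk: Risk) -> bool:
    """Stand-in for a real approval dialog that always refuses."""
    return False


print(is_allowed("read_file", deny_all))    # True
print(is_allowed("delete_file", deny_all))  # False
```

Passing the prompt in as a callback keeps the gate testable without a UI; a real dialog would replace `deny_all`.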
⚠️ Important: For demos and testing, we strongly recommend running MarvOS inside Windows Sandbox or a virtual machine to prevent unintended system modifications.
pytest
# Format and lint with Ruff
ruff check app/
ruff format app/
- Backend: FastAPI with async/await patterns
- Frontend: Vanilla JavaScript with WebSocket events
- State: JSON file-based persistence via platformdirs
MarvOS is built on the shoulders of giants. We gratefully acknowledge these open-source projects:
| Package | License | Description |
|---|---|---|
| FastAPI | MIT | High-performance async web framework |
| Uvicorn | BSD-3-Clause | Lightning-fast ASGI server |
| Pydantic | MIT | Data validation using Python type hints |
| pywebview | BSD-3-Clause | Cross-platform desktop webview wrapper |
| Pillow (PIL) | HPND | Python imaging library |
| platformdirs | MIT | Cross-platform app directory paths |
| python-multipart | Apache-2.0 | Streaming multipart form parser |
| Package | License | Description |
|---|---|---|
| mistralai | Apache-2.0 | Official Mistral AI Python client |
| Playwright | Apache-2.0 | Cross-browser automation library |
| uiautomation | Apache-2.0 | Windows UI Automation wrapper |
| fuzzywuzzy | GPL-2.0 | Fuzzy string matching |
| Package | License | Description |
|---|---|---|
| pytest | MIT | Python testing framework |
| httpx | BSD-3-Clause | Async HTTP client for testing |
| Ruff | MIT | Fast Python linter and formatter |
| Library | License | Description |
|---|---|---|
| marked.js | MIT | Markdown parser and compiler |
| DOMPurify | Apache-2.0 | XSS sanitizer for HTML |
| highlight.js | BSD-3-Clause | Syntax highlighting for code |
| KaTeX | MIT | Fast math typesetting |
| Font | License | Usage |
|---|---|---|
| Syne | OFL | Headlines & branding |
| DM Sans | OFL | Body text |
| JetBrains Mono | OFL | Code & monospace |
| Silkscreen | OFL | Pixel-art elements |
| Tool | License | Description |
|---|---|---|
| setuptools | MIT | Python packaging |
| wheel | MIT | Built-package format |
Marvin is our pixel-art mascot — a friendly (if slightly paranoid) android inspired by a certain depressed robot from a well-known sci-fi series. Despite his existential concerns, he's surprisingly helpful at automating your desktop tasks!
This project is proprietary software. All rights reserved.
For licensing inquiries, please contact the maintainers.
We welcome contributions! Please see our contributing guidelines (coming soon) for details on:
- Code style and conventions
- Testing requirements
- Pull request process
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Built with 💚 and existential dread by the MarvOS Team
"Here I am, brain the size of a planet, and they ask me to organize their Downloads folder..."
— Marvin, probably


