matei-anghel/MarvOS

🤖 MarvOS

Your AI-Powered Desktop Agent for Windows

Python 3.12+ | Windows | Powered by Mistral | Proprietary License

Meet Marvin — your paranoid android assistant who can see your screen, control your desktop, browse the web, and execute complex multi-step tasks with just a conversation.


Screenshots: Home Page (chat interface), Terminal Use, Windows Use

✨ Features

🎯 Intelligent Desktop Automation

  • Windows UI Control — Click, type, scroll, and navigate using AI-powered UI analysis
  • Screen Vision — Marvin can see and understand what's on your screen
  • Smart Action Planning — Multi-step task execution with real-time feedback

💬 Natural Conversation Interface

  • Chat-Based Commands — Just tell Marvin what you want in plain English
  • Voice Dictation — Real-time speech-to-text with Voxtral transcription
  • Rich Markdown Responses — Code highlighting, LaTeX math, and more

🌐 Web Browsing & Research

  • Headless Browser Automation — Navigate, search, click, and extract data from websites
  • Specialized Research Agents — Trip planning, product research, and more
  • Multi-Engine Search — Google and Bing support

🛠️ Powerful Tools

  • File System Operations — List, create, move, rename, and delete files
  • Shell Command Execution — Run PowerShell commands with approval safeguards
  • Large File Finder — Identify disk space hogs across your system
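
As an illustration of the Large File Finder idea, here is a minimal sketch that walks a directory tree and keeps only the biggest files. The function name `find_largest_files` and its parameters are assumptions made for this example; the README does not show how MarvOS implements the tool.

```python
import heapq
import os
from pathlib import Path


def find_largest_files(root, limit=10):
    """Return up to `limit` (size_in_bytes, path) pairs, largest first."""

    def iter_sizes():
        # os.walk silently skips directories it cannot list.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = Path(dirpath) / name
                try:
                    yield path.stat().st_size, path
                except OSError:
                    # File vanished or is unreadable; ignore it.
                    continue

    # nlargest keeps only `limit` entries in memory at a time.
    return heapq.nlargest(limit, iter_sizes(), key=lambda pair: pair[0])


if __name__ == "__main__":
    for size, path in find_largest_files(Path.home() / "Downloads", limit=5):
        print(f"{size / 1_048_576:8.1f} MiB  {path}")
```

Streaming the sizes through `heapq.nlargest` avoids building a full list of every file on disk, which matters when scanning a whole drive.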

🔒 Safety First

  • Approval System — High-risk operations require explicit user consent
  • Risk-Based Classification — Tools are categorized by potential impact
  • Sandboxed Execution — Recommended to run in Windows Sandbox for demos

🚀 Quick Start

Prerequisites

  • Python 3.12+
  • Windows 10/11 (for full desktop automation)
  • Mistral API Key (get one from the Mistral AI console)

Installation

# Clone the repository
git clone https://github.com/matei-anghel/MarvOS.git
cd MarvOS

# Create a virtual environment (recommended)
python -m venv .venv
.venv\Scripts\activate

# Install with all extras
pip install -e ".[dev,windows]"

# Install Playwright browsers (for web browsing)
playwright install

Configuration

Set your Mistral API key as an environment variable:

# PowerShell
$env:MISTRAL_API_KEY = "your-api-key-here"

# Or add to your profile permanently
[Environment]::SetEnvironmentVariable("MISTRAL_API_KEY", "your-api-key", "User")
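
A startup check along the following lines can fail fast when the key is missing. This is a sketch only: the helper name `require_api_key` is hypothetical, and the README does not say how MarvOS itself reads the variable.

```python
import os


def require_api_key(env=None, name="MISTRAL_API_KEY"):
    """Return the API key from the environment or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Set it in PowerShell with "
            f'$env:{name} = "your-api-key-here" and restart the app.'
        )
    return key
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.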

Running MarvOS

python -m app.main

A sleek desktop window will open with Marvin ready to assist! 🎉


🎨 Architecture

MarvOS/
├── app/
│   ├── main.py              # Application entry point (pywebview)
│   ├── server.py            # FastAPI backend with WebSocket support
│   ├── config.py            # Model and app configuration
│   ├── agent/               # AI agent orchestration
│   │   ├── planner.py       # Prompt engineering & planning
│   │   ├── runner.py        # Task execution loop
│   │   └── specialized_agent_manager.py  # Research agents
│   ├── hands/               # Desktop automation backends
│   │   ├── windows_use_adapter.py  # Windows UI Automation
│   │   └── stub_hands.py    # Safe fallback for development
│   ├── llm/                 # LLM integration
│   │   └── mistral_gateway.py  # Mistral API wrapper
│   ├── tools/               # Available agent tools
│   │   ├── registry.py      # Tool definitions & routing
│   │   ├── filesystem.py    # File operations
│   │   ├── shell.py         # Command execution
│   │   └── web_browser.py   # Playwright browser control
│   └── storage/             # Persistence layer
│       └── db.py            # JSON-based storage
├── ui/                      # Frontend assets
│   ├── index.html           # Main application shell
│   ├── app.js               # UI logic & WebSocket client
│   └── app.css              # Styling
└── marvin-assets/           # Mascot artwork
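
To make the split between `tools/registry.py` ("tool definitions & routing") and `agent/runner.py` ("task execution loop") more concrete, here is a rough sketch of name-based tool routing driven by a list of planned steps. The class `ToolRegistry`, the function `run_plan`, and the step format are all assumptions for illustration, not the actual MarvOS code.

```python
from typing import Any, Callable


class ToolRegistry:
    """Maps tool names to plain Python callables (hypothetical design)."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., Any]] = {}

    def register(self, name: str):
        def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
            if name in self._tools:
                raise ValueError(f"tool already registered: {name}")
            self._tools[name] = func
            return func

        return decorator

    def call(self, name: str, **kwargs: Any) -> Any:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](**kwargs)


def run_plan(registry: ToolRegistry, steps: list[dict]) -> list[Any]:
    """Execute steps in order; each step is {"tool": name, "args": {...}}."""
    results = []
    for step in steps:
        results.append(registry.call(step["tool"], **step.get("args", {})))
    return results


registry = ToolRegistry()


@registry.register("echo")
def echo(text: str) -> str:
    return text


@registry.register("add")
def add(a: int, b: int) -> int:
    return a + b
```

In a real agent the step list would come from the model's tool calls rather than being written by hand, and each call would pass through the approval checks described under Safety & Permissions.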

⚙️ Configuration Options

Hands Backend

Control how Marvin interacts with your desktop:

| Mode | Description |
| --- | --- |
| auto (default) | Windows automation on Windows, stub elsewhere |
| windows-use | Full Windows UI Automation |
| stub | No automation (safe for development) |

Set via environment variable:

$env:MARVOS_HANDS = "stub"
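
The mode selection described above could be resolved roughly as follows. This is a sketch under assumptions: the function name `resolve_hands_backend` is hypothetical, and the README does not say how an invalid value, or `windows-use` on a non-Windows machine, is actually handled.

```python
import os
import sys

VALID_MODES = {"auto", "windows-use", "stub"}


def resolve_hands_backend(env=None, platform=None):
    """Pick a backend name from MARVOS_HANDS, defaulting to "auto"."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform

    mode = env.get("MARVOS_HANDS", "").strip().lower() or "auto"
    if mode not in VALID_MODES:
        # Assumption: reject unknown values instead of silently falling back.
        raise ValueError(
            f"MARVOS_HANDS must be one of {sorted(VALID_MODES)}, got {mode!r}"
        )

    if mode == "auto":
        # sys.platform is "win32" on all supported Windows versions.
        return "windows-use" if platform == "win32" else "stub"
    return mode
```

An explicit `stub` setting always wins, so automation can be switched off on a Windows development machine without changing any code.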

AI Models

MarvOS uses multiple specialized Mistral models:

| Purpose | Model | Use Case |
| --- | --- | --- |
| Chat | labs-mistral-small-creative | Conversational responses |
| Agent | devstral-small-latest | Tool calling & reasoning |
| Vision | mistral-medium-latest | Screenshot analysis |
| Transcription | voxtral-mini-transcribe-realtime-2602 | Real-time voice input |
| Research | mistral-large-latest | Specialized agent tasks |
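
One way to keep this purpose-to-model mapping in a single place is a small frozen dataclass, sketched below. The model names come from the table above; the `ModelConfig` class and `model_for` helper are hypothetical and may not match what `app/config.py` actually contains.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class ModelConfig:
    """Default model per purpose, copied from the table above."""

    chat: str = "labs-mistral-small-creative"
    agent: str = "devstral-small-latest"
    vision: str = "mistral-medium-latest"
    transcription: str = "voxtral-mini-transcribe-realtime-2602"
    research: str = "mistral-large-latest"


def model_for(purpose: str, config: ModelConfig = ModelConfig()) -> str:
    """Look up the model name for a purpose such as "vision"."""
    models = asdict(config)
    if purpose not in models:
        raise ValueError(f"unknown purpose {purpose!r}; expected one of {sorted(models)}")
    return models[purpose]


# Overriding one model without mutating the defaults:
custom = replace(ModelConfig(), research="mistral-medium-latest")
```

Because the dataclass is frozen, overrides create a new object and the shared defaults stay unchanged.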

🛡️ Safety & Permissions

MarvOS implements a tiered risk system for operations:

| Risk Level | Approval | Examples |
| --- | --- | --- |
| 🟢 Low | Auto-approved | Read files, observe UI |
| 🟡 Medium | Prompted | Write files, shell commands |
| 🔴 High | Required | Delete files, system changes |

⚠️ Important: For demos and testing, we strongly recommend running MarvOS inside Windows Sandbox or a virtual machine to prevent unintended system modifications.
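
The tiered scheme in the table above can be pictured as a small approval gate. The sketch below is an assumption-laden illustration: the tool names in `TOOL_RISK` are hypothetical, and since the README does not spell out how "Prompted" differs from "Required", both tiers are treated here as needing an explicit yes from the user.

```python
from enum import Enum
from typing import Callable


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Hypothetical tool names, classified after the examples in the table.
TOOL_RISK = {
    "read_file": Risk.LOW,
    "observe_ui": Risk.LOW,
    "write_file": Risk.MEDIUM,
    "run_shell": Risk.MEDIUM,
    "delete_file": Risk.HIGH,
}


def classify(tool_name: str) -> Risk:
    """Unknown tools default to HIGH so new tools fail safe."""
    return TOOL_RISK.get(tool_name, Risk.HIGH)


def is_approved(tool_name: str, ask_user: Callable[[str, Risk], bool]) -> bool:
    """Auto-approve low-risk tools; ask the user for everything else."""
    risk = classify(tool_name)
    if risk is Risk.LOW:
        return True
    return bool(ask_user(tool_name, risk))
```

Defaulting unclassified tools to the highest tier means a forgotten registry entry results in an extra prompt rather than an unreviewed action.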


🧪 Development

Running Tests

pytest

Code Quality

# Format and lint with Ruff
ruff check app/
ruff format app/

Project Structure

  • Backend: FastAPI with async/await patterns
  • Frontend: Vanilla JavaScript with WebSocket events
  • State: JSON file-based persistence, stored in an application directory resolved by platformdirs
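
For readers unfamiliar with this style of persistence, here is a minimal sketch of JSON state stored under a platformdirs location. The file name `state.json` and the helper names are assumptions; `app/storage/db.py` may be organized differently. The `platformdirs` import is kept inside `default_state_path` so the load and save helpers work with any path.

```python
import json
import os
import tempfile
from pathlib import Path


def default_state_path(app_name="MarvOS"):
    """Per-user data location; requires the third-party platformdirs package."""
    from platformdirs import user_data_dir

    # appauthor=False avoids a duplicated folder level on Windows.
    return Path(user_data_dir(app_name, appauthor=False)) / "state.json"


def load_state(path):
    """Return the stored dict, or an empty dict if nothing was saved yet."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        return {}


def save_state(path, state):
    """Write atomically: dump to a temp file, then replace the target."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2)
    os.replace(tmp_name, path)
```

Writing to a temporary file first means a crash mid-save leaves the previous state file intact instead of a half-written one.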

📦 Open Source Acknowledgments

MarvOS is built on the shoulders of giants. We gratefully acknowledge these open-source projects:

🐍 Python Core

| Package | License | Description |
| --- | --- | --- |
| FastAPI | MIT | High-performance async web framework |
| Uvicorn | BSD-3-Clause | Lightning-fast ASGI server |
| Pydantic | MIT | Data validation using Python type hints |
| pywebview | BSD-3-Clause | Cross-platform desktop webview wrapper |
| Pillow (PIL) | HPND | Python imaging library |
| platformdirs | MIT | Cross-platform app directory paths |
| python-multipart | Apache-2.0 | Streaming multipart form parser |

🤖 AI & Automation

| Package | License | Description |
| --- | --- | --- |
| mistralai | Apache-2.0 | Official Mistral AI Python client |
| Playwright | Apache-2.0 | Cross-browser automation library |
| uiautomation | Apache-2.0 | Windows UI Automation wrapper |
| fuzzywuzzy | GPL-2.0 | Fuzzy string matching |

🧰 Development Tools

| Package | License | Description |
| --- | --- | --- |
| pytest | MIT | Python testing framework |
| httpx | BSD-3-Clause | Async HTTP client for testing |
| Ruff | MIT | Fast Python linter and formatter |

🌐 Frontend Libraries (CDN)

| Library | License | Description |
| --- | --- | --- |
| marked.js | MIT | Markdown parser and compiler |
| DOMPurify | Apache-2.0 OR MPL-2.0 | XSS sanitizer for HTML |
| highlight.js | BSD-3-Clause | Syntax highlighting for code |
| KaTeX | MIT | Fast math typesetting |

🎨 Fonts

| Font | License | Usage |
| --- | --- | --- |
| Syne | OFL | Headlines & branding |
| DM Sans | OFL | Body text |
| JetBrains Mono | OFL | Code & monospace |
| Silkscreen | OFL | Pixel-art elements |

🏗️ Build System

| Tool | License | Description |
| --- | --- | --- |
| setuptools | MIT | Python packaging |
| wheel | MIT | Built-package format |

🎭 Meet Marvin

Marvin is our pixel-art mascot — a friendly (if slightly paranoid) android inspired by a certain depressed robot from a well-known sci-fi series. Despite his existential concerns, he's surprisingly helpful at automating your desktop tasks!


📜 License

This project is proprietary software. All rights reserved.

For licensing inquiries, please contact the maintainers.


🤝 Contributing

We welcome contributions! Please see our contributing guidelines (coming soon) for details on:

  • Code style and conventions
  • Testing requirements
  • Pull request process

Built with 💚 and existential dread by the MarvOS Team

"Here I am, brain the size of a planet, and they ask me to organize their Downloads folder..."
— Marvin, probably

About

MarvOS is the intelligent layer between you and the infinite complexity of Windows. Meet Marvin - he has a brain the size of a planet, and you’re asking him to traverse the digital bureaucracy of your operating system.
