Note
PRODUCTION READY - Clean Architecture
This repository features a clean, production-ready architecture with proper dependency injection and comprehensive testing:
- Zero test_mode anti-pattern - Completely eliminated across entire codebase
- Real functionality testing - All features tested with proper mocking and isolation
- Security features - Comprehensive safety checks with full test coverage
- Cross-platform support - Tested on Windows, Linux, WSL2, and headless environments
Ready for production use with clean dependency injection architecture and 100% working functionality.
A powerful MCP (Model Context Protocol) server that provides computer use tools for Claude and other MCP-compatible clients. Control your computer programmatically with comprehensive safety checks and visual analysis.
- 14 Computer Control Tools: Screenshot, click, type, key press, scroll, drag, wait, automate, plus 6 X server management tools
- X Server Management: Built-in support for X server installation, configuration, and lifecycle management
- Visual Analysis: Deep analysis of screen content and intelligent action planning
- Safety Checks: Comprehensive protection against dangerous commands and sensitive data exposure
- Cross-Platform Support: Works on Linux, macOS, and Windows (with WSL2)
- WSL2 Integration: Automatic X11 forwarding configuration for Windows Subsystem for Linux
- Virtual Display Support: Headless operation with Xvfb for CI/CD environments
- MCP Native: Seamless integration with Claude and other MCP-compatible clients
This is a custom MCP server that needs to be installed from source:
# Clone the repository
git clone https://github.com/Sundeepg98/computer-use-mcp.git
cd computer-use-mcp
# Install dependencies
pip install -r requirements.txt
# Test the server (optional)
python3 start_mcp_server.py
# Add to Claude Code
claude mcp add -s user computer-use -- python3 $(pwd)/start_mcp_server.py
We plan to publish this as an npm and pip package in the future:
# Coming soon:
# npm install -g computer-use-mcp
# pip install computer-use-mcp
Computer Use MCP requires an X server for screenshot and interaction capabilities. The package includes automatic X server detection and management.
Run the included setup script to check your X server configuration:
./setup_xserver.sh
For Windows Subsystem for Linux users:
- Install X Server on Windows (choose one):
- Configure the X server:
  - Launch with "Multiple windows" mode
  - Disable access control
  - Allow connections from WSL2
- Set DISPLAY in WSL2:
  export DISPLAY=$(grep nameserver /etc/resolv.conf | awk '{print $2}'):0.0
- Add to ~/.bashrc for persistence (the escaped \$ keeps the command from being expanded before it is written to the file):
  echo "export DISPLAY=\$(grep nameserver /etc/resolv.conf | awk '{print \$2}'):0.0" >> ~/.bashrc
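The DISPLAY derivation above can also be written in Python, which may be convenient when scripting WSL2 setup. This is a minimal sketch, not code from this repository: the function name `display_from_resolv_conf` is hypothetical, and it assumes the Windows host is the first `nameserver` entry in /etc/resolv.conf (the shell one-liner makes the same assumption, but would print every entry if there were several).

```python
from typing import Optional


def display_from_resolv_conf(resolv_conf_text: str, display: str = "0.0") -> Optional[str]:
    """Return '<host>:<display>' for the first nameserver entry, or None."""
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Mirrors `grep nameserver | awk '{print $2}'`, but ignores
        # commented-out lines because their first field is not "nameserver".
        if len(fields) >= 2 and fields[0] == "nameserver":
            return f"{fields[1]}:{display}"
    return None
```

To use it, read /etc/resolv.conf, pass the text to the function, and assign the result to `os.environ["DISPLAY"]` before launching the server.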
For CI/CD or headless environments, use Xvfb:
# Install Xvfb
sudo apt-get install xvfb
# Start virtual display
Xvfb :99 -screen 0 1920x1080x24 &
export DISPLAY=:99
Or use the built-in MCP tools:
- install_xserver - Install required packages
- start_xserver - Start virtual display
- xserver_status - Check status
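For scripted headless setups, the Xvfb invocation above could be assembled in Python. The sketch below is illustrative only and is not how the built-in tools are implemented; the helper names `xvfb_command` and `env_with_display` are hypothetical, and the defaults simply repeat the values from the shell example.

```python
import os
from typing import Dict, List


def xvfb_command(display_num: int = 99, width: int = 1920,
                 height: int = 1080, depth: int = 24) -> List[str]:
    """Build the argv for `Xvfb :N -screen 0 WxHxD`."""
    return ["Xvfb", f":{display_num}", "-screen", "0", f"{width}x{height}x{depth}"]


def env_with_display(display_num: int = 99) -> Dict[str, str]:
    """Copy the current environment and point DISPLAY at the virtual display."""
    env = dict(os.environ)
    env["DISPLAY"] = f":{display_num}"
    return env
```

A caller would typically start the display with `subprocess.Popen(xvfb_command())` and then launch the server with `subprocess.run([...], env=env_with_display())`, so the parent shell's DISPLAY stays untouched.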
# Clone the repository first
git clone https://github.com/Sundeepg98/computer-use-mcp.git
cd computer-use-mcp
# Add to Claude Code using official command
claude mcp add -s user computer-use -- python3 $(pwd)/start_mcp_server.py
If you prefer manual configuration, add to your Claude Code configuration:
{
"mcpServers": {
"computer-use": {
"type": "stdio",
"command": "python3",
"args": ["/path/to/computer-use-mcp/start_mcp_server.py"],
"env": {}
}
}
}
Note: Replace /path/to/computer-use-mcp/ with the actual path where you cloned the repository.
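To avoid editing the path by hand, the same entry could be generated from the clone location. This is a small sketch under the assumption that the configuration shape is exactly the JSON shown above; the function name `mcp_server_entry` is hypothetical and not part of the repository.

```python
import json
from pathlib import Path


def mcp_server_entry(repo_dir: str) -> dict:
    """Return the mcpServers entry with an absolute path to the launcher."""
    script = Path(repo_dir).expanduser().resolve() / "start_mcp_server.py"
    return {
        "mcpServers": {
            "computer-use": {
                "type": "stdio",
                "command": "python3",
                "args": [str(script)],
                "env": {},
            }
        }
    }


# Run from inside the cloned repository to print a ready-to-paste entry.
print(json.dumps(mcp_server_entry("."), indent=2))
```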
After adding the MCP server, verify it's working:
# Check MCP server status
claude mcp list
# You should see:
# computer-use: python3 /path/to/start_mcp_server.py - ✓ Connected
# Get detailed server info
claude mcp get computer-use
from computer_use_mcp import ComputerUseServer
server = ComputerUseServer()
server.run()
Capture and analyze the current screen with visual analysis.
{
"tool": "screenshot",
"arguments": {
"analyze": "Find the submit button"
}
}
Click at specific coordinates or on described elements.
{
"tool": "click",
"arguments": {
"x": 500,
"y": 300,
"button": "left"
}
}
Type text with automatic safety validation.
{
"tool": "type",
"arguments": {
"text": "Hello, World!"
}
}
Press keyboard keys or combinations.
{
"tool": "key",
"arguments": {
"key": "Enter"
}
}
Scroll in any direction.
{
"tool": "scroll",
"arguments": {
"direction": "down",
"amount": 5
}
}
Click and drag between two points.
{
"tool": "drag",
"arguments": {
"start_x": 100,
"start_y": 100,
"end_x": 300,
"end_y": 300
}
}
Wait for a specified duration.
{
"tool": "wait",
"arguments": {
"seconds": 2.5
}
}
Automate complex tasks with intelligent planning.
{
"tool": "automate",
"arguments": {
"task": "Fill out the login form and submit"
}
}
Install required X server packages for display support.
{
"tool": "install_xserver",
"arguments": {}
}
Start a virtual X server with custom resolution.
{
"tool": "start_xserver",
"arguments": {
"display_num": 99,
"width": 1920,
"height": 1080
}
}
Stop a running X server by display name.
{
"tool": "stop_xserver",
"arguments": {
"display": ":99"
}
}
Configure X11 forwarding for WSL2 to Windows host.
{
"tool": "setup_wsl_xforwarding",
"arguments": {}
}
Get status of X servers and display configuration.
{
"tool": "xserver_status",
"arguments": {}
}
Test the current display configuration.
{
"tool": "test_display",
"arguments": {}
}
The server includes comprehensive safety checks:
- Command Blocking: Prevents dangerous system commands (rm -rf, format, etc.)
- Credential Protection: Detects and masks passwords, API keys, and tokens
- PII Detection: Identifies and protects SSNs, credit cards, emails
- Path Traversal Prevention: Blocks access to sensitive system directories
- URL Validation: Prevents malicious URL schemes (javascript:, file://, etc.)
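To make three of the checks listed above concrete, here is a deliberately small sketch. It is not the repository's safety_checks.py: the patterns, the blocked-scheme set, and the function names are all assumptions chosen for illustration, and a real deny-list would need to be much broader.

```python
import re
from urllib.parse import urlparse

# Illustrative patterns only (assumptions, not the project's actual rules).
DANGEROUS_COMMANDS = [re.compile(r"\brm\s+-(?:rf|fr)\b")]
CREDENTIAL = re.compile(r"(?i)\b(password|api[_-]?key|token)\s*[=:]\s*\S+")
BLOCKED_SCHEMES = {"javascript", "file"}


def is_dangerous_command(text: str) -> bool:
    """Command blocking: flag text that matches a known-destructive pattern."""
    return any(pattern.search(text) for pattern in DANGEROUS_COMMANDS)


def mask_credentials(text: str) -> str:
    """Credential protection: keep the key name, replace the value."""
    return CREDENTIAL.sub(r"\1=***", text)


def is_blocked_url(url: str) -> bool:
    """URL validation: reject schemes on the block list."""
    return urlparse(url.strip()).scheme in BLOCKED_SCHEMES
```

A block list is used here only because the list above is phrased that way; an allow-list of schemes such as http and https is usually the stricter design.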
Add to your .mcp.json as shown in the Configuration section.
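Outside Claude, the tool examples in this document can be sent to the server as JSON-RPC 2.0 messages. In the MCP specification a tool is invoked with the `tools/call` method, whose parameters carry the tool under `name` rather than `tool`. The sketch below only builds the request lines; the helper names are hypothetical, and whether this server expects an `initialize` exchange before accepting calls is not stated here, so treat it as a starting point for debugging.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC ids only need to be unique per session


def jsonrpc_request(method: str, params: dict) -> str:
    """Serialize one JSON-RPC 2.0 request as a single line."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids),
               "method": method, "params": params}
    return json.dumps(message)


def tool_call(tool: str, arguments: dict) -> str:
    """Wrap a {"tool", "arguments"} example as an MCP tools/call request."""
    return jsonrpc_request("tools/call", {"name": tool, "arguments": arguments})


print(jsonrpc_request("tools/list", {}))
print(tool_call("click", {"x": 500, "y": 300, "button": "left"}))
```

Each printed line can be piped to `python3 start_mcp_server.py` on stdin, one request per line.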
# Using the launcher script
import subprocess
subprocess.run(['python3', 'start_mcp_server.py'])
# Or directly with the MCP module
import sys
sys.path.append('src')
from mcp.mcp_server import main
main()
# Run the MCP server directly
python3 start_mcp_server.py
# Or test tools via JSON-RPC (for debugging)
echo '{"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}}' | python3 start_mcp_server.py
Run the comprehensive test suite:
# All tests
python3 -m pytest tests/
# Specific test categories
python3 -m pytest tests/test_safety.py
python3 -m pytest tests/test_visual.py
python3 -m pytest tests/test_mcp_protocol.py
# With coverage
python3 -m pytest --cov=src tests/
computer-use-mcp/
├── src/mcp/                        # Clean MCP implementation
│   ├── mcp_server.py               # Main MCP server
│   ├── computer_use_refactored.py  # Core with dependency injection
│   ├── factory_refactored.py       # Factory pattern for DI
│   ├── safety_checks.py            # Safety validation
│   ├── visual_analyzer.py          # Visual analysis
│   ├── abstractions/               # Protocol definitions
│   └── implementations/            # Platform implementations
│       ├── screenshot/             # Screenshot providers
│       └── input/                  # Input providers
├── start_mcp_server.py             # Server launcher
├── tests/                          # Comprehensive test suite
│   ├── test_mcp_protocol.py        # MCP protocol tests
│   ├── test_safety_security.py     # Security tests
│   └── test_refactored_example.py  # DI pattern tests
├── examples/
│   ├── basic_usage.py              # Simple examples
│   └── advanced_automation.py      # Complex workflows
└── docs/
    ├── API.md                      # API documentation
    └── REFACTORING_GUIDE.md        # Architecture details
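The layout above separates protocol definitions from platform implementations and wires them together through a factory. The following sketch shows the general shape of that pattern with `typing.Protocol`; every class and function name in it is hypothetical and chosen for illustration, so it should not be read as the contents of computer_use_refactored.py or factory_refactored.py.

```python
from typing import Protocol


class ScreenshotProvider(Protocol):
    """Hypothetical abstraction, in the spirit of abstractions/."""

    def capture(self) -> bytes: ...


class FakeScreenshotProvider:
    """Test double that returns canned bytes instead of touching a display."""

    def __init__(self, data: bytes = b"fake-png") -> None:
        self.data = data
        self.calls = 0

    def capture(self) -> bytes:
        self.calls += 1
        return self.data


class ComputerUse:
    """Core object that receives its provider instead of constructing one."""

    def __init__(self, screenshot: ScreenshotProvider) -> None:
        self._screenshot = screenshot

    def take_screenshot(self) -> dict:
        image = self._screenshot.capture()
        return {"status": "success", "size_bytes": len(image)}


def create_for_tests() -> ComputerUse:
    """Factory that injects the fake provider (hypothetical name)."""
    return ComputerUse(FakeScreenshotProvider())
```

Because the core depends only on the protocol, tests can exercise real code paths with a fake provider, with no test-mode flag inside the core itself.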
- Never run with elevated privileges unless absolutely necessary
- Review automation scripts before execution
- Use environment variables for sensitive data
- Enable additional safety checks for production use
- Monitor logs for suspicious activity
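As one way to follow the environment-variable advice above, an automation script can read secrets at startup and fail early when one is missing. This is a generic sketch; the helper name `require_secret` and the variable name in the usage note are placeholders, not settings this project defines.

```python
import os


def require_secret(name: str) -> str:
    """Return the named environment variable, or fail with a clear error."""
    value = os.environ.get(name)
    if not value:
        # Fail fast rather than continuing with an empty credential.
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

For example, `require_secret("MY_SERVICE_TOKEN")` keeps the token out of the script and out of any automation task text that might be logged.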
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see LICENSE for details.
- Anthropic for the MCP protocol
- The open source community for invaluable contributions
- Screenshot may require X server configuration on Linux
- WSL2 requires additional display setup
- Some keyboard shortcuts may not work in all applications
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: sundeepg8@gmail.com
Built with Clean Architecture - Production Ready with Dependency Injection