Commit 9b9c817

feat(drone): modernize drone for CLI + Rerun + replay (#1520)
* feat(drone): add CLI registration and RerunBridge support

  Phase 1: CLI Registration + Rerun Integration
  - Create blueprint directory structure (dimos/robot/drone/blueprints/)
  - Add drone_basic blueprint (DroneConnectionModule + DroneCameraModule + RerunBridge)
  - Add drone_agentic blueprint (full stack with tracking + agent + web)
  - Register both blueprints in all_blueprints.py
  - Replace FoxgloveBridge with RerunBridgeModule
  - Add replay support via global_config.replay flag
  - Maintain existing stream names for backward compatibility

  Blueprints now support:
  - dimos run drone-basic [--replay]
  - dimos run drone-agentic [--replay]

* feat(drone): remove deprecated main() and mark old Drone class deprecated

  Phase 2: Module Cleanup
  - Remove main() function from drone.py (CLI now handles this)
  - Remove __main__ block
  - Add deprecation warning to old Drone(Robot) class
  - Recommend using drone_basic or drone_agentic blueprints instead

  The old class-based Robot pattern is being phased out in favor of blueprint composition. Existing code using Drone() will still work but should migrate to the new blueprints.

* feat(drone): delete deprecated Drone class and add ClockSyncConfigurator

  - Remove deprecated Drone(Robot) class entirely from drone.py
  - Delete drone.py file (functionality moved to blueprints)
  - Update __init__.py to remove Drone export
  - Add ClockSyncConfigurator to both blueprints (fixes LCM autoconf errors)
  - Update test_drone.py to skip TestDroneFullIntegration (tested deprecated class)
  - All remaining tests pass (22 passed, 1 skipped)

  The Drone class-based pattern is fully deprecated. Use drone_basic or drone_agentic blueprints instead.

* fix(drone): remove invalid autoconf=True from LCM() in rerun config

  - Changed LCM(autoconf=True) to LCM() in both blueprints
  - Matches Go2 blueprint pattern
  - Fixes TypeError when loading blueprints
  - Verified: dimos --replay run drone-basic works (modules deploy successfully)

* fix(drone): restore agentic blueprint to match original drone_agentic() signature

  - Replace conditional viewer logic with direct FoxgloveBridge.blueprint()
  - Use WebsocketVisModule.blueprint() instead of websocket_vis() alias
  - Preserve exact module composition: DroneConnectionModule, DroneCameraModule, DroneTrackingModule, WebsocketVisModule, FoxgloveBridge, GoogleMapsSkillContainer, OsmSkill, agent, web_input
  - Preserve exact remappings: (DroneTrackingModule, video_input → video), (DroneTrackingModule, cmd_vel → movecmd_twist)
  - Keep function-based _make_drone_agentic() with all original default params
  - Module-level drone_agentic instance for CLI registry compatibility
  - Export DRONE_SYSTEM_PROMPT in __all__

* feat(drone): add split Rerun blueprint (camera + 3D view)

  Opens camera feed automatically on startup instead of requiring manual panel navigation. Horizontal split: Camera (1/3) + 3D world (2/3).

* fix(drone): correct Rerun camera origin to world/video

  DroneCameraModule subscribes on 'video' stream, not 'color_image'. The Rerun entity path is world/video.

* fix(google-maps): gracefully handle missing API key

  GoogleMapsSkillContainer now logs a warning instead of crashing when GOOGLE_MAPS_API_KEY is not set. The module deploys but returns 'not configured' for skill calls.

* fix(rerun): convert BGR/BGRA to RGB before sending to Rerun

  Some Rerun versions don't render BGR color_model correctly, causing blue-tinted video. Convert BGR→RGB and BGRA→RGBA in _format_to_rerun() with numpy channel swap.

* Revert "fix(rerun): convert BGR/BGRA to RGB before sending to Rerun"

  This reverts commit eda0390.

* fix(drone): correct replay BGR label and guard OsmSkill RemoteIn

  - FakeDJIVideoStream: re-tag replay frames from BGR to RGB (GStreamer outputs RGB but Aug 2025 recording used default BGR label)
  - OsmSkill: guard .subscribe() with hasattr check for RemoteIn compatibility when distributed across workers

* fix(drone): use primitive types in move() skill for Pydantic schema compat

  Replace Vector3 parameter with x/y/z floats so Agent.on_system_modules() can generate JSON schema for the skill. Vector3 is an IsInstanceSchema which Pydantic cannot serialize.

* fix(drone): remove n_workers=4, add viewer/replay support to agentic blueprint

  - Remove wrapper function, flatten to module-level autoconnect
  - Remove .global_config(n_workers=4) that caused OsmSkill RemoteIn crash
  - Remove .configurators(ClockSyncConfigurator())
  - Add conditional viz: Rerun when --viewer rerun, Foxglove when --viewer foxglove
  - Add --replay support (connection_string='replay')

* fix(drone): rewrite agentic blueprint to match Go2 pattern

  Clean autoconnect() composition with _vis sub-blueprint for conditional viewer selection. No _modules list, no .insert() hack.

* refactor(drone): compose agentic on top of basic blueprint

  - drone_agentic now imports drone_basic and layers tracking + skills + agent
  - Removed n_workers=4 and ClockSyncConfigurator from drone_basic
  - Removed duplicate vis/replay logic from drone_agentic
  - Removed fill_mode='wireframe' (invalid Rerun FillMode)
  - Matches Go2 composition pattern

* docs(drone): update README for blueprint architecture, fix mypy

  - Rewrite README for CLI-based usage (dimos run drone-basic/agentic)
  - Document blueprint composition pattern
  - Add indoor/outdoor mode, replay, Rerun/Foxglove visualization
  - Keep RosettaDrone setup verbatim
  - Add return type annotations to fix mypy no-untyped-def

* fix(rerun): fix rate limiter _last_log init and test cleanup

  - Lazy-init _last_log in _on_message instead of start() so tests can call _on_message without calling start() first
  - Fix test mock to use spec=RerunConvertible so messages pass through the visual override pipeline
  - Add bridge.stop() cleanup to prevent thread leak warnings

* revert: remove bridge.py rate limiter changes from drone PR

  These changes (rate limiter + test) were incorrectly added by the subagent in the drone modernization branch. They belong in a separate PR.

* botched rebase revert
1 parent 5bebf73 commit 9b9c817

14 files changed: +394 -603 lines changed
dimos/agents/skills/google_maps_skill_container.py

Lines changed: 14 additions & 1 deletion
@@ -34,7 +34,15 @@ class GoogleMapsSkillContainer(Module):

     def __init__(self) -> None:
         super().__init__()
-        self._client = GoogleMaps()
+        try:
+            self._client = GoogleMaps()
+        except ValueError:
+            from dimos.utils.logging_config import setup_logger
+
+            setup_logger().warning(
+                "GOOGLE_MAPS_API_KEY not set — GoogleMapsSkillContainer disabled"
+            )
+            self._client = None  # type: ignore[assignment]
         self._started = True
         self._max_valid_distance = 20000  # meters

@@ -72,6 +80,8 @@ def where_am_i(self, context_radius: int = 200) -> str:

         result = None
         try:
+            if self._client is None:
+                return "Google Maps is not configured (missing API key)."
             result = self._client.get_location_context(location, radius=context_radius)
         except Exception:
             return "There is an issue with the Google Maps API."
@@ -102,6 +112,9 @@ def get_gps_position_for_queries(self, queries: list[str]) -> str:

         for query in queries:
             try:
+                if self._client is None:
+                    latlon = None
+                    continue
                 latlon = self._client.get_position(query, location)
             except Exception:
                 latlon = None
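
The change above follows an "optional client" pattern: construction failure is downgraded to a warning, and each skill checks for the missing client before use. A minimal standalone sketch of that pattern is below. `FakeMapsClient` and `MapsSkills` are hypothetical names used only for illustration; they are not part of DimOS, and the real `GoogleMaps` client may behave differently.

```python
import logging
from typing import Optional, Tuple

logger = logging.getLogger(__name__)


class FakeMapsClient:
    """Hypothetical stand-in for a maps client that raises ValueError without a key."""

    def __init__(self, api_key: Optional[str]) -> None:
        if not api_key:
            raise ValueError("API key not set")
        self.api_key = api_key

    def get_position(self, query: str) -> Tuple[float, float]:
        # Placeholder coordinates; a real client would call a remote API here.
        return (0.0, 0.0)


class MapsSkills:
    """Deploys even without a key; each skill call reports the missing configuration."""

    def __init__(self, api_key: Optional[str]) -> None:
        self._client: Optional[FakeMapsClient]
        try:
            self._client = FakeMapsClient(api_key)
        except ValueError:
            logger.warning("API key not set; maps skills disabled")
            self._client = None

    def locate(self, query: str) -> str:
        # Guard before every use of the client, mirroring the diff above.
        if self._client is None:
            return "Google Maps is not configured (missing API key)."
        lat, lon = self._client.get_position(query)
        return f"{query}: {lat}, {lon}"
```

The trade-off is that a misconfigured deployment no longer fails at startup, so the warning log and the returned message are the only signals that the key is missing.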

dimos/agents/skills/osm.py

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -38,7 +38,12 @@ def __init__(self) -> None:
3838

3939
def start(self) -> None:
4040
super().start()
41-
self._disposables.add(self.gps_location.subscribe(self._on_gps_location)) # type: ignore[arg-type]
41+
if hasattr(self.gps_location, "subscribe"):
42+
self._disposables.add(self.gps_location.subscribe(self._on_gps_location)) # type: ignore[arg-type]
43+
else:
44+
logger.warning(
45+
"OsmSkill: gps_location stream does not support direct subscribe (RemoteIn)"
46+
)
4247

4348
def stop(self) -> None:
4449
super().stop()
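
The guard above is a capability check: subscribe only when the stream object exposes `subscribe`. A small sketch of the same idea follows, using hypothetical `LocalStream` and `RemoteStream` classes as stand-ins; the real DimOS stream types (including `RemoteIn`) are not reproduced here.

```python
from typing import Any, Callable, List, Optional


class LocalStream:
    """Hypothetical in-process stream that supports direct subscription."""

    def __init__(self) -> None:
        self._callbacks: List[Callable[[Any], None]] = []

    def subscribe(self, callback: Callable[[Any], None]) -> Callable[[], None]:
        self._callbacks.append(callback)
        # Return a handle that removes the callback again.
        return lambda: self._callbacks.remove(callback)

    def publish(self, value: Any) -> None:
        for cb in list(self._callbacks):
            cb(value)


class RemoteStream:
    """Hypothetical remote handle that has no subscribe() method."""


def try_subscribe(
    stream: Any, callback: Callable[[Any], None]
) -> Optional[Callable[[], None]]:
    """Return an unsubscribe handle, or None if the stream cannot be subscribed to."""
    if hasattr(stream, "subscribe"):
        return stream.subscribe(callback)
    return None
```

Note that in the fallback branch no callback is registered at all, so a skill guarded this way will not receive updates from such a stream; the diff above only logs a warning in that case.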

dimos/robot/all_blueprints.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -50,6 +50,8 @@
5050
"demo-object-scene-registration": "dimos.perception.demo_object_scene_registration:demo_object_scene_registration",
5151
"demo-osm": "dimos.mapping.osm.demo_osm:demo_osm",
5252
"demo-skill": "dimos.agents.skills.demo_skill:demo_skill",
53+
"drone-agentic": "dimos.robot.drone.blueprints.agentic.drone_agentic:drone_agentic",
54+
"drone-basic": "dimos.robot.drone.blueprints.basic.drone_basic:drone_basic",
5355
"dual-xarm6-planner": "dimos.manipulation.manipulation_blueprints:dual_xarm6_planner",
5456
"keyboard-teleop-piper": "dimos.robot.manipulators.piper.blueprints:keyboard_teleop_piper",
5557
"keyboard-teleop-xarm6": "dimos.robot.manipulators.xarm.blueprints:keyboard_teleop_xarm6",
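
Each registry value has the form `module.path:attribute`. How the DimOS CLI actually resolves these strings is not shown in this commit; the sketch below is one plausible way to do it with the standard library, and the `resolve` helper is a hypothetical name.

```python
import importlib
from typing import Any


def resolve(target: str) -> Any:
    """Import 'package.module:attribute' and return the named attribute."""
    module_path, sep, attr = target.partition(":")
    if not sep or not module_path or not attr:
        raise ValueError(f"expected 'module:attribute', got {target!r}")
    # Import lazily so that only the requested blueprint's module is loaded.
    module = importlib.import_module(module_path)
    return getattr(module, attr)
```

For example, `resolve("math:sqrt")` returns the standard library's `math.sqrt`. Storing strings rather than imported objects keeps the registry cheap to load, since a module is imported only when its entry is requested.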

dimos/robot/drone/README.md

Lines changed: 130 additions & 93 deletions
@@ -1,48 +1,58 @@
 # DimOS Drone Module

-Complete integration for DJI drones via RosettaDrone MAVLink bridge with visual servoing and autonomous tracking capabilities.
+DJI drone integration via RosettaDrone MAVLink bridge, with visual servoing, autonomous tracking, and LLM agent control.

 ## Quick Start

-### Test the System
 ```bash
-# Test with replay mode (no hardware needed)
-python dimos/robot/drone/drone.py --replay
+# Replay mode (no hardware needed)
+dimos --replay run drone-basic

-# Real drone - indoor (IMU odometry)
-python dimos/robot/drone/drone.py
+# Agentic mode with replay
+dimos --replay run drone-agentic

-# Real drone - outdoor (GPS odometry)
-python dimos/robot/drone/drone.py --outdoor
+# Real drone — indoor (velocity-based odometry)
+dimos run drone-basic
+
+# Real drone — outdoor (GPS-based odometry)
+dimos run drone-basic --set outdoor=true
+
+# Agentic with LLM control
+dimos run drone-agentic
 ```

-### Python API Usage
-```python
-from dimos.robot.drone.drone import Drone
+To interact with the agent, run `dimos humancli` in a separate terminal.

-# Connect to drone
-drone = Drone(connection_string='udp:0.0.0.0:14550', outdoor=True) # Use outdoor=True for GPS
-drone.start()
+## Blueprints

-# Basic operations
-drone.arm()
-drone.takeoff(altitude=5.0)
-drone.move(Vector3(1.0, 0, 0), duration=2.0) # Forward 1m/s for 2s
+### `drone-basic`
+Connection + camera + visualization. The foundation layer.

-# Visual tracking
-drone.tracking.track_object("person", duration=120) # Track for 2 minutes
+| Module | Purpose |
+|--------|---------|
+| `DroneConnectionModule` | MAVLink communication, movement skills |
+| `DroneCameraModule` | Camera intrinsics, image processing |
+| `WebsocketVisModule` | Web-based visualization |
+| `RerunBridgeModule` / `FoxgloveBridge` | 3D viewer (selected by `--viewer`) |

-# Land and cleanup
-drone.land()
-drone.stop()
-```
+**Indoor vs Outdoor:** By default, the drone uses velocity integration for odometry (indoor mode). For outdoor flights with GPS, set `outdoor=true` — this switches to GPS-only positioning which is more reliable in open environments but less precise for close-range maneuvers.
+
+### `drone-agentic`
+Composes on top of `drone-basic`, adding autonomous capabilities:
+
+| Module | Purpose |
+|--------|---------|
+| `DroneTrackingModule` | Visual servoing & object tracking |
+| `GoogleMapsSkillContainer` | GPS-based navigation skills |
+| `OsmSkill` | OpenStreetMap queries |
+| `Agent` | LLM agent (default: GPT-4o) |
+| `WebInput` | Web/CLI interface for human commands |

 ## Installation

-### Python Package
+### Python (included with DimOS)
 ```bash
-# Install DimOS with drone support
-pip install -e .[drone]
+pip install -e ".[drone]"
 ```

 ### System Dependencies
@@ -56,11 +66,13 @@ sudo apt-get install -y gstreamer1.0-tools gstreamer1.0-plugins-base \
 sudo apt-get install liblcm-dev
 ```

-### Environment Setup
+### Environment
 ```bash
-export DRONE_IP=0.0.0.0 # Listen on all interfaces
-export DRONE_VIDEO_PORT=5600
-export DRONE_MAVLINK_PORT=14550
+# Required for agentic blueprint
+export OPENAI_API_KEY=sk-...
+
+# Optional
+export GOOGLE_MAPS_API_KEY=... # For GoogleMapsSkillContainer
 ```

 ## RosettaDrone Setup (Critical)
@@ -124,26 +136,35 @@ DJI Drone ← Wireless → DJI Controller ← USB → Android Device ← WiFi

 ### Module Structure
 ```
-drone.py # Main orchestrator
-├── connection_module.py # MAVLink communication & skills
-├── camera_module.py # Video processing
-├── tracking_module.py # Visual servoing & object tracking
-├── mavlink_connection.py # Low-level MAVLink protocol
-└── dji_video_stream.py # GStreamer video capture
+dimos/robot/drone/
+├── blueprints/
+│ ├── basic/drone_basic.py # Base blueprint (connection + camera + vis)
+│ └── agentic/drone_agentic.py # Agentic blueprint (composes on basic)
+├── connection_module.py # MAVLink communication & skills
+├── camera_module.py # Camera processing & intrinsics
+├── drone_tracking_module.py # Visual servoing & object tracking
+├── drone_visual_servoing_controller.py # PID-based visual servoing
+├── mavlink_connection.py # Low-level MAVLink protocol
+└── dji_video_stream.py # GStreamer video capture + replay
 ```

 ### Communication Flow
 ```
 DJI Drone → RosettaDrone → MAVLink UDP → connection_module → LCM Topics
-→ Video UDP → dji_video_stream → tracking_module
+→ Video UDP → dji_video_stream → tracking_module
 ```

 ### LCM Topics
-- `/drone/odom` - Position and orientation
-- `/drone/status` - Armed state, battery
-- `/drone/video` - Camera frames
-- `/drone/tracking/cmd_vel` - Tracking velocity commands
-- `/drone/tracking/overlay` - Visualization with tracking box
+- `/video` — Camera frames (`sensor_msgs.Image`)
+- `/odom` — Position and orientation (`geometry_msgs.PoseStamped`)
+- `/movecmd_twist` — Velocity commands (`geometry_msgs.Twist`)
+- `/gps_location` — GPS coordinates (`LatLon`)
+- `/gps_goal` — GPS navigation target (`LatLon`)
+- `/tracking_status` — Tracking module state
+- `/follow_object_cmd` — Object tracking commands
+- `/color_image` — Processed camera image
+- `/camera_info` — Camera intrinsics
+- `/camera_pose` — Camera pose in world frame

 ## Visual Servoing & Tracking

@@ -160,7 +181,6 @@ drone.tracking.stop_tracking()
 ```

 ### PID Tuning
-Configure in `drone.py` initialization:
 ```python
 # Indoor (gentle, precise)
 x_pid_params=(0.001, 0.0, 0.0001, (-0.5, 0.5), None, 30)
@@ -180,35 +200,66 @@ Parameters: `(Kp, Ki, Kd, (min_output, max_output), integral_limit, deadband_pix

 ## Available Skills

+All skills are exposed to the LLM agent via the `@skill` decorator on `DroneConnectionModule`:
+
 ### Movement & Control
-- `move(vector, duration)` - Move with velocity vector
-- `takeoff(altitude)` - Takeoff to altitude
-- `land()` - Land at current position
-- `arm()/disarm()` - Arm/disarm motors
-- `fly_to(lat, lon, alt)` - Fly to GPS coordinates
+- `move(x, y, z, duration)` — Move with velocity (m/s)
+- `takeoff(altitude)` — Takeoff to altitude
+- `land()` — Land at current position
+- `arm()` / `disarm()` — Arm/disarm motors
+- `set_mode(mode)` — Set flight mode (GUIDED, LOITER, etc.)
+- `fly_to(lat, lon, alt)` — Fly to GPS coordinates

 ### Perception
-- `observe()` - Get current camera frame
-- `follow_object(description, duration)` - Follow object with servoing
+- `observe()` — Get current camera frame
+- `follow_object(description, duration)` — Follow object with visual servoing
+- `is_flying_to_target()` — Check if navigating to GPS target
+
+## Replay Mode
+
+Replay data includes:
+- **2,148 video frames** (640×360 RGB, ~71s at 30fps)
+- **4,098 MAVLink telemetry frames** (~136s)
+
+Stored as `TimedSensorStorage` pickle files in `data/drone/`. Downloaded automatically on first use.
+
+```bash
+# Basic replay
+dimos --replay run drone-basic
+
+# Agentic replay (requires OPENAI_API_KEY)
+dimos --replay run drone-agentic
+```
+
+## Visualization
+
+### Rerun Viewer (Recommended)
+```bash
+dimos --viewer rerun run drone-basic
+```
+Split layout with camera feed + 3D world view. Includes static drone body visualization and LCM transport integration.
+
+### Foxglove Studio
+```bash
+dimos --viewer foxglove run drone-basic
+```
+Connect Foxglove Studio to `ws://localhost:8765` to see:
+- Live video with tracking overlay
+- 3D drone position
+- Telemetry plots
+- Transform tree

-### Tracking Module
-- `track_object(name, duration)` - Track and follow object
-- `stop_tracking()` - Stop current tracking
-- `get_status()` - Get tracking status
+### Web Visualization
+Always available at `http://localhost:7779` via `WebsocketVisModule`.

 ## Testing

-### Unit Tests
 ```bash
+# Unit tests
 pytest -s dimos/robot/drone/
-```

-### Replay Mode (No Hardware)
-```python
-# Use recorded data for testing
-drone = Drone(connection_string='replay')
-drone.start()
-# All operations work with recorded data
+# Replay integration test
+dimos --replay run drone-basic
 ```

 ## Troubleshooting
@@ -228,47 +279,33 @@ drone.start()
 - Increase lighting for better detection
 - Adjust PID gains for environment
 - Check `max_lost_frames` in tracking module
-- Monitor with Foxglove on `ws://localhost:8765`
+
+### Agent Not Responding
+- Check `OPENAI_API_KEY` is set
+- Run `dimos humancli` to send commands
+- Check logs for `on_system_modules` errors

 ### Wrong Movement Direction
 - Don't modify coordinate conversions
 - Verify with: `pytest test_drone.py::test_ned_to_ros_coordinate_conversion`
 - Check camera orientation assumptions

-## Advanced Features
+## Network Ports

-### Coordinate Systems
+| Port | Protocol | Purpose |
+|------|----------|---------|
+| 14550 | UDP | MAVLink commands/telemetry |
+| 5600 | UDP | Video stream |
+| 7779 | WebSocket | DimOS web visualization |
+| 8765 | WebSocket | Foxglove bridge |
+| 7667 | UDP | LCM messaging |
+
+## Coordinate Systems
 - **MAVLink/NED**: X=North, Y=East, Z=Down
 - **ROS/DimOS**: X=Forward, Y=Left, Z=Up
 - Automatic conversion handled internally

-### Foxglove Visualization
-Connect Foxglove Studio to `ws://localhost:8765` to see:
-- Live video with tracking overlay
-- 3D drone position
-- Telemetry plots
-- Transform tree
-
-## Network Ports
-- **14550**: MAVLink UDP
-- **5600**: Video stream UDP
-- **8765**: Foxglove WebSocket
-- **7667**: LCM messaging
-
-## Development
-
-### Adding New Skills
-Add to `connection_module.py` with `@skill` decorator:
-```python
-@skill
-def my_skill(self, param: float) -> str:
-    """Skill description for LLM."""
-    # Implementation
-    return "Result"
-```
-
-### Modifying PID Control
-Edit gains in `drone.py` `_deploy_tracking()`:
+## Modifying PID Control
 - Increase Kp for faster response
 - Add Ki for steady-state error
 - Increase Kd for damping
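
To make the gain advice concrete, here is a minimal PID sketch with output clamping, an optional integral limit, and a deadband, loosely following the parameter tuple shown in the README's PID Tuning section. It is an illustration only: the `PID` class, its argument names, and the update rule are assumptions and do not reproduce the controller in `drone_visual_servoing_controller.py`.

```python
from typing import Optional, Tuple


class PID:
    """Illustrative PID controller (hypothetical, not the DimOS implementation)."""

    def __init__(
        self,
        kp: float,
        ki: float,
        kd: float,
        output_limits: Tuple[float, float],
        integral_limit: Optional[float] = None,
        deadband: float = 0.0,
    ) -> None:
        self.kp, self.ki, self.kd = kp, ki, kd
        self.output_limits = output_limits
        self.integral_limit = integral_limit
        self.deadband = deadband
        self._integral = 0.0
        self._prev_error: Optional[float] = None

    def update(self, error: float, dt: float) -> float:
        """Return the clamped control output for one step; dt must be positive."""
        # Errors inside the deadband are treated as zero to avoid jitter.
        if abs(error) < self.deadband:
            error = 0.0
        self._integral += error * dt
        if self.integral_limit is not None:
            limit = self.integral_limit
            self._integral = max(-limit, min(limit, self._integral))
        # No derivative term on the first step, when there is no previous error.
        derivative = 0.0 if self._prev_error is None else (error - self._prev_error) / dt
        self._prev_error = error
        output = self.kp * error + self.ki * self._integral + self.kd * derivative
        low, high = self.output_limits
        return max(low, min(high, output))
```

With the indoor tuple from the README, `PID(0.001, 0.0, 0.0001, (-0.5, 0.5), None, 30)`, an error smaller than 30 produces no output, and large errors saturate at the ±0.5 limits. Raising `kp` scales the response to the current error, `ki` accumulates persistent error, and `kd` reacts to how quickly the error changes.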

dimos/robot/drone/__init__.py

Lines changed: 0 additions & 1 deletion
@@ -21,7 +21,6 @@
     submod_attrs={
         "camera_module": ["DroneCameraModule"],
         "connection_module": ["DroneConnectionModule"],
-        "drone": ["Drone"],
         "mavlink_connection": ["MavlinkConnection"],
     },
 )
