FreeAskWorld Logo

FreeAskWorld Simulator (AAAI 2026 Oral)

An Interactive and Closed-Loop Simulator for Human-Centric Embodied AI

arXiv · HuggingFace Dataset · Apache License · Closed-Loop Framework · FreeAD Project · FreeAskAgent Project

FreeAskWorld is an interactive simulation framework that integrates large language models (LLMs) for high-level planning and socially grounded interaction in embodied AI.

System Overview

People Simulation Framework

FreeAskWorld Homepage


Project Milestones

  • ๐Ÿ“ Paper Publication: Published the main research paper describing FreeAskWorld.
  • ๐Ÿ“Š Data Processing Code Release: Released code for preprocessing, data cleaning, and annotation pipelines.
  • ๐ŸŽฅ Presentation Video: Released project presentation video.
  • ๐Ÿ› ๏ธ Simulator Code Release: Publish the core simulation code for developers and external collaborators.
  • ๐Ÿค– Agent Robot Integration: Integrate agent interfaces (OpenClaw / Codex / Claude / custom) to access and interact with robots inside the FreeAskWorld simulation environment.
  • ๐Ÿ“š Usage Tutorial: Create a comprehensive tutorial for using the FreeAskWorld simulator, including setup, configuration, and example workflows.
  • ๐Ÿง‘โ€๐Ÿ’ป API Documentation: Provide thorough documentation of the simulatorโ€™s API for seamless integration and extension.
  • ๐ŸŽฎ Steam Release: Prepare and publish the FreeAskWorld simulator on Steam for broader accessibility.

Agent integration note: the current recommended local path is ROS2-first, because the local Unity simulator is configured in ROS2 mode on 127.0.0.1:10000 (see docs/agent_ros2_integration.md). When external ROS2 runtime dependencies are present, the ROS2 path provides a live-capable rclpy transport; otherwise it falls back to scaffold-only behavior. For live ROS2 usage, a repo-local .ros2_venv (or an equivalent ROS-compatible Python environment) may be needed to avoid an rclpy ABI mismatch with ROS Humble; the wrapper at scripts/agent_ros2_cli.sh auto-activates .ros2_venv when present. The additive closed_loop websocket Agent bridge remains available, but for this Unity setup it should be treated as experimental and future-facing rather than the primary runtime path.

Agent config entry paths (for OpenClaw / Claude Code / Codex / custom agent adapters):

  • ROS2 integration doc: docs/agent_ros2_integration.md
  • ROS2 package: integrations/agent_ros2/
  • ROS2 wrapper CLI: scripts/agent_ros2_cli.sh
  • Legacy closed-loop agent bridge prototypes are archived under archived/closed_loop-agent/ and are not part of the current recommended control path.

The Agent ROS2 integration supports automated installation: AI agents can discover the paths above and install and configure the integration without manual steps.
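As a sketch of what that auto-discovery could look like, an agent might first verify that the documented entry points exist before attempting an install. The loop below is illustrative only and is not a script shipped in the repo:

```shell
#!/usr/bin/env bash
# Illustrative sketch: verify the documented Agent ROS2 entry points
# before attempting an automated install. Not a repo-provided script.
check_entry_paths() {
  for p in docs/agent_ros2_integration.md integrations/agent_ros2 scripts/agent_ros2_cli.sh; do
    if [ -e "$p" ]; then
      echo "found: $p"
    else
      echo "missing: $p"
    fi
  done
}

check_entry_paths
```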

Canonical entry command for auto-discovery:

  • bash scripts/agent_ros2_cli.sh --help

First-time setup notes (read this before install)

FreeAskWorld now includes a repo-owned local runtime path for ROS2-first live testing.

Requirements

Option 1 — Agent install (recommended)

In an agent session, use this exact instruction:

Install all envs by scripts/setup_envs.sh

Option 2 — User command line install

From the repo root, run:

cd ~/research/FreeAskWorld && bash scripts/setup_envs.sh

What scripts/setup_envs.sh does:

  • checks whether a repo-local environment already exists
  • reuses it if present instead of blindly creating a new one
  • otherwise creates .ros2_venv with Python 3.10
  • installs the minimal Python packages needed for live testing
  • warns clearly if ROS2 Humble is not installed yet
  • points you to the manual ROS2 setup guide if system ROS2 is missing
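The reuse-or-create decision in the list above can be sketched as follows. This is an illustration of the described behavior, not the actual contents of scripts/setup_envs.sh:

```shell
#!/usr/bin/env bash
# Sketch of the reuse-or-create logic described above; the real
# scripts/setup_envs.sh may differ in detail.
VENV_DIR="${VENV_DIR:-.ros2_venv}"

decide_env_action() {
  # Reuse an existing repo-local venv instead of blindly recreating it.
  if [ -x "$1/bin/python" ]; then
    echo "reuse"
  else
    echo "create"
  fi
}

if [ "$(decide_env_action "$VENV_DIR")" = "create" ]; then
  echo "would create $VENV_DIR with Python 3.10 and install live-test deps"
else
  echo "reusing existing $VENV_DIR"
fi

# Warn clearly when system ROS2 Humble is missing.
if ! command -v ros2 >/dev/null 2>&1; then
  echo "WARNING: ROS2 not found on PATH; see docs/ros2_setup.md" >&2
fi
```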

After setup:

source .ros2_venv/bin/activate
scripts/start_local_runtime.sh
scripts/status_local_runtime.sh
curl http://127.0.0.1:8787/healthz
STEP_SECONDS=2 OBSERVE_SECONDS=1 scripts/run_live_smoke.sh
scripts/stop_local_runtime.sh

run_live_smoke.sh now visibly executes all major player actions for a few seconds each and prints step-by-step results plus observation summaries.

Shortest interactive player-control examples:

scripts/player_cmd.sh status
scripts/player_cmd.sh observe 1
scripts/player_cmd.sh forward 0.5
scripts/player_cmd.sh left 30
scripts/player_cmd.sh right 30
scripts/player_cmd.sh around
scripts/player_cmd.sh stop
scripts/player_cmd.sh wait 1
scripts/player_cmd.sh ask "Where is the target?"
scripts/player_cmd.sh action '{"action":"move_forward","parameters":{"distance_m":0.5}}'
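For orientation, the shorthand commands above correspond to raw JSON actions like the one on the last line. A hypothetical mapping is sketched below; only move_forward is taken from the example above, and the other action and parameter names are guesses that may not match the real protocol in scripts/player_cmd.sh:

```shell
#!/usr/bin/env bash
# Hypothetical shorthand-to-JSON mapping. "move_forward" matches the raw
# action example above; "turn_left"/"turn_right"/"stop" are assumed names.
build_action() {
  case "$1" in
    forward) printf '{"action":"move_forward","parameters":{"distance_m":%s}}' "$2" ;;
    left)    printf '{"action":"turn_left","parameters":{"angle_deg":%s}}' "$2" ;;
    right)   printf '{"action":"turn_right","parameters":{"angle_deg":%s}}' "$2" ;;
    stop)    printf '{"action":"stop","parameters":{}}' ;;
    *)       echo "unknown shorthand: $1" >&2; return 1 ;;
  esac
}

build_action forward 0.5
echo
```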

Expected behavior for the minimal checks above:

  • --help prints CLI usage.
  • status --output-json runs even without Unity connected.
  • In scaffold-only mode, transport_ready: false does not by itself mean the repo is broken.
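To act on that last point programmatically, a caller can read transport_ready from the JSON status and treat false as scaffold-only mode rather than a failure. A minimal sketch, assuming status --output-json emits a top-level transport_ready boolean:

```shell
#!/usr/bin/env bash
# Sketch: classify the status JSON without failing in scaffold-only mode.
# Assumes a top-level "transport_ready" boolean in the status output.
classify_status() {
  python3 -c '
import json, sys
status = json.load(sys.stdin)
if status.get("transport_ready", False):
    print("live transport ready")
else:
    print("scaffold-only mode (not an error by itself)")
'
}

echo '{"transport_ready": false}' | classify_status
```

In live use the input would come from python -m integrations.agent_ros2.cli status --output-json instead of the hard-coded echo.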

For live validation, use:

scripts/start_local_runtime.sh
STEP_SECONDS=2 OBSERVE_SECONDS=1 scripts/run_live_smoke.sh

This visibly runs the main player actions (forward, left, right, around, wait, ask, stop), prints each step result, captures observations between steps, and writes a JSON report.

If --ros2-live fails immediately on a fresh machine, check these first:

  • ROS2 Humble must be installed manually; follow docs/ros2_setup.md.
  • .ros2_venv exists and includes at least numpy plus the local runtime Python deps.
  • The Unity/ROS2 backend is actually running and reachable.
  • The machine allows DDS/UDP/shared-memory transport required by ROS2 middleware.
  • The ROS log directory is writable (for example, set ROS_LOG_DIR=/tmp/roslog if needed).
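The checklist above can be automated as a quick pre-flight script. The sketch below follows the assumptions in the list (repo-local .ros2_venv, backend on 127.0.0.1:10000) and is not a repo-provided tool:

```shell
#!/usr/bin/env bash
# Sketch of a pre-flight check for --ros2-live failures; mirrors the
# checklist above. Adjust paths and ports to your setup.
preflight() {
  command -v ros2 >/dev/null 2>&1 \
    || echo "MISSING: ros2 on PATH (install ROS2 Humble; see docs/ros2_setup.md)"
  [ -x ".ros2_venv/bin/python" ] \
    || echo "MISSING: .ros2_venv (run scripts/setup_envs.sh)"
  log_dir="${ROS_LOG_DIR:-$HOME/.ros/log}"
  mkdir -p "$log_dir" 2>/dev/null
  [ -w "$log_dir" ] \
    || echo "NOT WRITABLE: $log_dir (try ROS_LOG_DIR=/tmp/roslog)"
  # Reachability of the Unity/ROS2 backend on the documented port.
  python3 -c 'import socket,sys; s=socket.socket(); s.settimeout(1); sys.exit(0 if s.connect_ex(("127.0.0.1",10000))==0 else 1)' \
    || echo "UNREACHABLE: 127.0.0.1:10000 (is the Unity/ROS2 backend running?)"
  echo "preflight done"
}

preflight
```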

🎥 Demos

Simulator Presentation: demonstrates the main functions of this simulator.


📥 Download Simulator Presentation Video

Simulator APP Presentation: demonstrates the main functions of this simulator.


📥 Download APP Presentation Video

ROS2 Example: demonstrates ROS2 RGBD SLAM in our simulator.


📥 Download ROS2 Example Video

📌 Introduction

As embodied intelligence progresses, simulation platforms must evolve beyond low-level physics toward human-centric, socially interactive environments.
FreeAskWorld introduces:

  • A closed-loop interactive simulator
  • A scalable human-agent world modeling framework
  • A modular data generation pipeline
  • A new benchmark: Direction Inquiry Task, extending VLN to active question-asking & guidance following

This repo contains simulator code and baseline models from our AAAI 2026 paper.


✨ Key Features

| Feature | Description |
| --- | --- |
| 🤖 LLM-Powered Agents | Intention modeling, reasoning, natural dialog, instruction generation |
| 🚶 Realistic Humans | Personalized profiles, schedules, motion & navigation styles |
| 🌦️ Dynamic World | Weather, lighting, traffic, and scene randomization |
| 🔁 Closed-Loop Sync | WebSocket-based state exchange for real-time model interaction |
| 🧩 Direction Inquiry Task | Agents ask for help, interpret human guidance, adapt plans |
| 📦 Large-Scale Data | 6 tasks · 16 object categories · 63,429 frames · 17+ hours |
| 🔄 Data Generation Pipeline | Modular pipeline for generating embodied AI data |

Synthetic Data Generation


We used Unity Perception (Borkman et al. 2021) to build a rich and diverse synthetic dataset that includes multiple annotation types and data modalities. The dataset is designed to support a wide range of vision, navigation, and human–computer interaction tasks, and contains both dense per-frame annotations and global scene-level metadata. The main components are:

  • Visual annotations: 2D/3D bounding boxes, instance segmentation, and semantic segmentation.
  • Geometric annotations: depth maps and surface normal maps for scene geometry.
  • Visual observations: panoramic RGB images and six 90° perspective views.
  • Interaction data: natural language instructions, dialog histories, and agent trajectories.
  • Spatial representations: 2D occupancy heatmaps for mapping and localization.
  • Environment metadata: map boundaries, semantic regions, and other contextual information.

The dataset covers 16 common object categories (e.g., vehicles, pedestrians, street furniture). By combining 2D occupancy heatmaps (encoding static layout) with 3D bounding boxes (capturing dynamic entity positions) and the provided world coordinates, we can accurately reconstruct simulated scenes to create a comprehensive digital twin. This reconstructed environment supports open-loop evaluations similar to nuScenes (Caesar et al. 2020), and is particularly suited for unstructured environments as in FreeAD (Peng et al. 2025). The dataset enables a broad spectrum of downstream tasks including navigation planning, behavior prediction, and human–computer interaction studies.

The figures below illustrate occupancy map generation and sample synthetic data:

Occupancy Map Generation Contrast

Synthetic Data Examples

🚀 Getting Started

Quick Start (recommended for first-time users)

git clone https://github.com/AIR-DISCOVER/FreeAskWorld
cd FreeAskWorld

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Minimal smoke check
python -m integrations.agent_ros2.cli --help
python -m integrations.agent_ros2.cli status --output-json

# Recommended wrapper for ROS2 live mode
bash scripts/agent_ros2_cli.sh --help

If you only want to verify that the repo-level agent interface is wired correctly, the smoke checks above are the fastest starting point. If you want full live interaction with the Unity simulator, continue with the ROS2 runtime notes below.

For a setup guide matching the currently working local ROS2 environment, see docs/ros2_setup.md. A shorter player-control wrapper is also available at scripts/player_cmd.sh. Environment setup is unified under scripts/setup_envs.sh.

For the current local Unity configuration, use the ROS2-first Agent integration scaffold described in docs/agent_ros2_integration.md and implemented under integrations/agent_ros2. This matches the simulator's ROS2 mode on 127.0.0.1:10000.

For the additive agent compatibility bridge on top of the existing closed_loop websocket stack, see closed_loop/README.md. It adds HTTP, CLI, and MCP-friendly access without replacing the current Unity-facing protocol or baseline behavior.

How to Run

📊 Proactive VLN Results

Models fine-tuned on FreeAskWorld demonstrate enhanced semantic understanding and interaction competency. However, a significant gap to human performance remains, especially in high-level reasoning and social navigation.

Closed-Loop Navigation Performance (Table 4 from Paper)

| Method | TL (m) | SR (%) | SPL | NE (m) | OSR (%) | ONE (m) | NDI |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Human (no asking) | 47.5 | 40.2 | 38.2 | 18.3 | 41.3 | 11.3 | 0.0 |
| Human (asking) | 59.9 | 82.6 | 71.2 | 3.49 | 82.6 | 1.63 | 0.78 |
| ETPNav | 31.2 | 0.0 | 0.0 | 32.9 | 0.0 | 28.7 | 0.0 |
| BEVBert | 14.6 | 0.0 | 0.0 | 31.0 | 0.0 | 29.0 | 0.0 |
| ETPNav-FT | 33.6 | 0.0 | 0.0 | 31.6 | 1.1 | 27.1 | 0.0 |
| BEVBert-FT | 18.7 | 0.0 | 0.0 | 30.0 | 0.0 | 28.5 | 0.0 |

License

FreeAskWorld is licensed under the Apache 2.0 License.
