feat: add Qwen3.5-9B task config and version bumps#277

Open
dzorlu wants to merge 59 commits into feat/vl-multimodal-support from feat/qwen3.5-9b

Conversation


@dzorlu dzorlu commented Mar 4, 2026

Summary

  • Add task YAML (openenv-fleet-grpo-qwen3_5-9b.yaml) for Qwen/Qwen3.5-9B — a natively multimodal model (early fusion, GatedDeltaNet hybrid attention) as a drop-in replacement for Qwen3-VL-8B
  • Bump vLLM from ==0.13.0 to >=0.16.1.dev0 (nightly required; Qwen3.5 support landed after the 0.16.0 branch cut, so the first stable release with it will be v0.17.0)
  • Bump transformers from >=4.51.0 to >=4.57.0 (Qwen3.5 model class registration)
  • Task YAML uses --extra-index-url https://wheels.vllm.ai/nightly in both the setup and run sections to resolve nightly wheels

Risk areas (no pre-emptive changes — test first)

  • collective_rpc for weight sync (vllm_engine.py:338-342) — internal API; may break across the four-minor-version vLLM jump
  • output_processor.request_states (vllm_engine.py:318-326) — internal API
  • OpenAI serving imports (vllm_engine.py:16-43) — existing try/except should absorb changes

Verified (no changes needed)

  • model_wrapper.py:100 — hasattr(model_config, "vision_config") correctly detects Qwen3.5-9B as VL
  • generators/utils.py — chat template uses the same <|im_start|> format
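
The `vision_config` check above can be sketched standalone (using `SimpleNamespace` stand-ins for the HF config objects, since the real check runs against `AutoConfig` output):

```python
from types import SimpleNamespace

def is_vision_language(model_config):
    # Multimodal HF configs nest a vision_config; text-only configs don't.
    return hasattr(model_config, "vision_config")

vl_config = SimpleNamespace(vision_config=SimpleNamespace())  # stand-in for Qwen3.5-9B
text_config = SimpleNamespace()                               # stand-in for a text-only model
```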

Test plan

  • uv sync --extra vllm --extra-index-url https://wheels.vllm.ai/nightly resolves successfully
  • AutoConfig.from_pretrained("Qwen/Qwen3.5-9B") has vision_config
  • vllm serve Qwen/Qwen3.5-9B starts without error on nightly
  • Launch training run with the new task YAML on a test cluster
  • Verify weight sync (collective_rpc) works end-to-end
  • Switch to vllm>=0.17.0 once stable release ships (~mid March 2026)

🤖 Generated with Claude Code

Deniz and others added 30 commits March 3, 2026 18:38
Add task YAML for Qwen/Qwen3.5-9B (natively multimodal, early fusion)
and bump dependencies to support it:
- vllm: ==0.13.0 → >=0.16.1.dev0 (nightly required until v0.17.0)
- transformers: >=4.51.0 → >=4.57.0 (Qwen3.5 model class support)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
vLLM nightly (0.16.1rc1.dev) requires torch==2.10.0, conflicting
with the previous torch==2.9.0 pin.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The global bump broke sglang resolution since sglang==0.4.8.post1
pins transformers==4.52.3. Move the bump into the vllm extra where
it's needed — uv resolves conflicting extras in separate splits.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Revert pyproject.toml vllm extra to stable pins (vllm==0.13.0,
  torch==2.9.0) so uv sync resolves cleanly across all extras
- Override with nightly via uv pip install in the task YAML only
- Use python -m instead of uv run --isolated in run section to
  avoid re-resolving against pyproject.toml
- Set MAX_ATTEMPTS=1 in workflow to fail fast on setup errors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
uv sync installs torchvision for torch==2.9.0, then the nightly
override bumps torch to 2.10.0 causing ABI mismatch
(torchvision::nms operator not found).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The latest nightly (.dev202) only has an aarch64 wheel. unsafe-best-match
lets uv check both the nightly and PyPI indexes to find a version with
x86_64 wheels.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
8x B200/H200 exhausted everywhere. Add 4x fallbacks so SkyPilot
can grab whatever is available. Restore retry logic since
provisioning failures are transient.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
torchvision installed by uv sync (for torch 2.9.0) was not being
properly overridden because the pip install line only had the vllm
nightly index. torchvision needs the pytorch cu128 index to get a
build matching torch 2.10.0.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
uv pip install was seeing torchvision as already satisfied (installed
by uv sync for torch 2.9.0) and skipping the upgrade. Split into two
steps and use --reinstall-package to force re-resolution from cu128.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
uv pip's resolution was keeping the old torchvision (built for torch
2.9.0) even with --reinstall-package. Switch to pip with
--force-reinstall --no-deps from cu128 index to guarantee matching
torch+torchvision pair. Falls back to nightly cu128 if stable doesn't
have the version.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use vllm's recommended install method which automatically handles
torch+torchvision ABI compatibility instead of manual version juggling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
uv run re-syncs the venv from pyproject.toml before executing, which
reverts torch from 2.10.0 (installed by vllm nightly) back to 2.9.0
(pinned in pyproject.toml). This causes the torchvision ABI mismatch
(operator torchvision::nms does not exist) at runtime even though
setup correctly installs matching versions.

Changed both `uv run python` calls to plain `python` since the venv
is already activated and has the correct packages installed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
vLLM nightly pulls in Ray 2.44+ which removed
ray.experimental.collective.util.get_address_and_port.
Fall back to a simple socket-based implementation that does the
same thing: get node IP via ray.util and bind to a free port.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
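
A minimal sketch of that fallback (`get_address_and_port` matches the removed Ray helper's name; the UDP-connect trick for finding the node IP is a stand-in for `ray.util.get_node_ip_address()`):

```python
import socket

def get_address_and_port():
    """Return (node_ip, free_port), replacing the helper removed in Ray 2.44+."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # No packets are sent; connecting just selects the outbound interface.
        s.connect(("8.8.8.8", 80))
        address = s.getsockname()[0]
    except OSError:
        address = "127.0.0.1"
    finally:
        s.close()
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((address, 0))  # port 0 asks the OS for any free port
        port = srv.getsockname()[1]
    return address, port
```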
flash-attn 2.8.3 only supports torch<=2.9, but vLLM nightly requires
torch 2.10. No prebuilt wheels exist for torch 2.10+cu130.

- Uninstall flash-attn after vllm nightly install
- Use SDPA attention (PyTorch's built-in F.scaled_dot_product_attention
  which includes FlashAttention v2 as an internal backend)
- Disable sample packing (requires flash_attention_2 attn impl)
- vLLM uses FlashInfer for its attention kernels, not flash-attn

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
uv sync installs flashinfer-jit-cache 0.5.3+cu128, but vllm nightly
upgrades flashinfer to 0.6.4 without updating the JIT cache package.
Remove the stale cache so flashinfer 0.6.4 regenerates it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
transformers 4.57.6 (latest stable) doesn't have qwen3_5 in its auto
model mapping. FSDP workers call AutoConfig.from_pretrained() which
requires the model type to be registered. Install from HF main branch
until a stable release (>=4.58.0) includes Qwen3.5 support.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ce retries to 1

AutoModelForVision2Seq was removed in transformers 5.0 (main branch).
Replace with AutoModelForImageTextToText, falling back to the old name
for older transformers versions. Also reduce retry attempts to 1 for
faster debugging iteration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
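
The newest-name-first fallback can be sketched generically (the `SimpleNamespace` below stands in for an older transformers module that only exposes the legacy class):

```python
from types import SimpleNamespace

def resolve_auto_class(module, preferred, fallback):
    """Return the first auto-class available on `module`, trying the newest
    name first: AutoModelForImageTextToText (transformers >= 5.0), then
    AutoModelForVision2Seq for older versions."""
    for name in (preferred, fallback):
        cls = getattr(module, name, None)
        if cls is not None:
            return cls
    raise AttributeError(f"neither {preferred} nor {fallback} is available")

# Stand-in for an old transformers module that only has the legacy name:
legacy_transformers = SimpleNamespace(AutoModelForVision2Seq=object)
```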
accelerate's register_empty_parameter passes _is_hf_initialized to
Parameter.__new__() which torch 2.10 doesn't accept. Installing
accelerate from git main fixes this compatibility issue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
accelerate's register_empty_parameter passes param.__dict__ (including
_is_hf_initialized from transformers) as kwargs to Parameter.__new__(),
which torch 2.10+ rejects. Patch Parameter.__new__ to accept and ignore
extra kwargs. Also revert accelerate-from-source (code patch handles it).

Ref: verl-project/verl#4522

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
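
The patch pattern looks roughly like this (a sketch: the real fix targets `torch.nn.Parameter.__new__` and the `_is_hf_initialized` kwarg; `Strict` here is a hypothetical stand-in so the example runs without torch):

```python
def patch_new_to_ignore(cls, *bad_kwargs):
    """Wrap cls.__new__ so it silently drops the named kwargs that the
    base __new__ rejects (shape of the verl-project/verl#4522 workaround)."""
    orig_new = cls.__new__
    def patched_new(klass, *args, **kwargs):
        for key in bad_kwargs:
            kwargs.pop(key, None)  # discard kwargs __new__ won't accept
        return orig_new(klass, *args, **kwargs)
    cls.__new__ = staticmethod(patched_new)

class Strict:
    """Stand-in for torch.nn.Parameter: rejects unknown kwargs in __new__."""
    def __new__(cls, **kwargs):
        if kwargs:
            raise TypeError(f"unexpected kwargs: {kwargs}")
        return object.__new__(cls)

patch_new_to_ignore(Strict, "_is_hf_initialized")
```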
transformers 5.0/main returns a set for _no_split_modules instead of a
list. Convert to list before subscripting with [0].

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
vLLM nightly uses pidfd_getfd for inter-process CUDA memory sharing
during collective_rpc weight sync. This syscall requires ptrace
permissions blocked in containerized environments (RunPod). Set
PR_SET_PTRACER via sitecustomize.py so all Python processes (Ray
workers, vLLM engines) get the permission on startup.

Ref: verl-project/verl#3377

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ainer compat

pidfd_getfd syscall is blocked by seccomp in containerized envs (RunPod).
Two fixes:
- VLLM_ENABLE_V1_MULTIPROCESSING=0: keeps engine in single process, avoids CUDA IPC
- PYTORCH_CUDA_ALLOC_CONF=expandable_segments:False: avoids pidfd_getfd in allocator IPC path

Also removes failed sitecustomize.py prctl workaround (seccomp blocks at kernel level).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rnels)

FlashInfer needs nvcc to JIT-compile GDN attention kernels for Qwen3.5.
SkyPilot containers may not have /usr/local/cuda; detect from common
CUDA paths and fall back to apt-get nvidia-cuda-toolkit.

Also removes unnecessary VLLM_ENABLE_V1_MULTIPROCESSING=0 from YAML
(already set by vllm_engine.py) and restores expandable_segments:True.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
After apt-get install nvidia-cuda-toolkit, nvcc is at /usr/bin/nvcc
but CUDA_HOME was never set. Derive CUDA_HOME from nvcc binary path
as fallback. Fixes bash unbound variable error with set -euo pipefail.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
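
A Python sketch of the same derivation (the YAML does this in bash; `ensure_cuda_home` is a hypothetical helper name):

```python
import os
import shutil

def cuda_home_from_nvcc(nvcc_path):
    # CUDA_HOME sits two directory levels above the nvcc binary,
    # e.g. /usr/local/cuda/bin/nvcc -> /usr/local/cuda, /usr/bin/nvcc -> /usr
    return os.path.dirname(os.path.dirname(nvcc_path))

def ensure_cuda_home():
    # Only derive a fallback when CUDA_HOME is genuinely unset, so nothing
    # downstream trips over an unbound variable.
    if "CUDA_HOME" not in os.environ:
        nvcc = shutil.which("nvcc")
        if nvcc:
            os.environ["CUDA_HOME"] = cuda_home_from_nvcc(nvcc)
```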
…uption)

apt-get nvidia-cuda-toolkit was breaking the Python venv (wandb import
failed after install). Instead: detect nvcc via find, fall back to pip
nvidia-cuda-nvcc-cu12. Adds diagnostics if nvcc can't be found.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…-cu12)

The find command couldn't locate nvcc in site-packages. Use Python's
nvidia.cuda_nvcc module path directly to derive CUDA_HOME.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…da_nvcc

nvidia.cuda_nvcc is a namespace package with __file__=None.
Use __path__[0] to get the package directory path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
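
The `__file__`-vs-`__path__` distinction can be sketched generically (nvidia.cuda_nvcc is only present on machines with the pip package installed, so the helper takes any package name):

```python
import importlib
import os

def package_dir(pkg_name):
    """On-disk directory of a package. PEP 420 namespace packages (like
    nvidia.cuda_nvcc) have __file__ = None, so fall back to __path__[0]."""
    mod = importlib.import_module(pkg_name)
    if getattr(mod, "__file__", None):
        return os.path.dirname(mod.__file__)
    return list(mod.__path__)[0]  # __path__ is a _NamespacePath, not a plain list
```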
nvcc binary from pip package may not be in PATH or may lack execute
permission. Use full path and add ls/chmod for diagnostics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Deniz and others added 29 commits March 4, 2026 04:11
…ns nvcc

nvidia-cuda-nvcc-cu12 only has ptxas/headers, NOT nvcc binary.
The CUDA 13 package (nvidia-cuda-nvcc) includes the full compiler.
torch 2.10+cu130 from vLLM nightly needs CUDA 13 nvcc for FlashInfer JIT.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…n torch 2.10)

vLLM asserts that expandable_segments:True is not set when using its memory pool.
See pytorch/pytorch#147851.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
FlashInfer JIT runs inside Ray actor processes which have different
working directories. Relative CUDA_HOME paths (`.venv/...`) cause
`/bin/sh: 1: .venv/.../nvcc: not found` errors in Ninja builds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pip CUDA packages (nvidia-cuda-nvcc) have incomplete headers: cuda_fp16.h
references nv/target, which isn't found. Install cuda-nvcc-12-8 from NVIDIA's
official apt repo, which provides a complete, working toolkit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
NVIDIA's CUDA repo uses x86_64 directory naming, not dpkg's amd64.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
FlashInfer's fmha_gen kernels need cublasLt.h (from libcublas-dev-12-8)
and nvrtc.h (from cuda-nvrtc-dev-12-8) in addition to cuda-nvcc-12-8.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Transformers from source (post-5.0) changed apply_chat_template to
return BatchEncoding by default instead of List[int]. This caused:
- "unsupported operand type(s) for +: 'BatchEncoding' and 'list'"
- "TypeError: new(): invalid data type 'str'" (fatal crash)

Also: require 8x GPU, increase max_input_length to 131K, bump
gpu_memory_utilization to 0.85.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
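
The version-tolerant fix can be sketched as a small normalizer (assuming, as the errors above suggest, that the BatchEncoding is mapping-like with token ids under "input_ids"):

```python
def to_token_ids(result):
    """Normalize apply_chat_template output across transformers versions:
    pre-5.0 returns List[int]; from-source/5.0 returns a BatchEncoding,
    read here via its "input_ids" mapping key."""
    if isinstance(result, list):
        return result
    ids = result["input_ids"]
    # Batched encodings nest one id list per sequence; unwrap the single case.
    if ids and isinstance(ids[0], list):
        return ids[0]
    return list(ids)
```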
forums-homes is a tool_use env incorrectly tagged as computer_use in v5,
causing init failures (no 'computer' tool) during CUA training. Removed
175 forums-homes tasks (112 CU + 63 TU) and uploaded as v51 to S3.

- Upload v51 dataset to s3://fleet-internal-datasets/v51/openenv/
- Add forums-homes to EXCLUDED_ENVS in prepare_dataset.py
- Update GHA workflow default to v51, restore MAX_ATTEMPTS=4
- Add Dataset Versions section to fleet-training.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
forums-homes removal is handled at the dataset level (v51) not in code.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Restored all_tool_use.json and all_tasks.json to match v5 exactly.
Only all_computer_use.json has forums-homes removed (112 tasks).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added vast, primeintellect to provider list. Use H200:8 (not H200-SXM:8)
to match the naming convention from the 8b-8gpu task YAML.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- RunPod/Lambda/Vast: H200-SXM:8 (not H200:8)
- PrimeIntellect: H200:8
- Vast: no B200, only H200-SXM:8
- Nebius: B200:8 and H200-SXM:8

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove PrimeIntellect ($0 wallet), Vast (no B200/H200-SXM), Nebius
(not enabled on runner). Only RunPod (H200-SXM:8, B200:8) and
Lambda (B200:8) are viable.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The nebius SDK was not being installed on the GHA runner, causing
`sky check nebius` to fail. Added [nebius] extra to skypilot-nightly.
Restored Vast, Nebius, and PrimeIntellect GPU providers in task YAML.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The prime CLI only sets api_key, not team_id. Without team_id,
API calls hit the personal context ($0) instead of team ($250).
Write config.json directly with both fields.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove xlam-70b, glm-4.7-flash, and harbor-grpo-qwen3-8b — no longer used.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
H200/B200 availability is tight — H100s are much more widely available
and 9B fits fine on 8x H100-80GB.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PrimeIntellect provisioning is unreliable — remove from both H200 and H100 entries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
v52 filters computer_use to instacart (120), walmart (145), zillow (348) —
the 3 envs with shortest avg trajectories, for training signal with small models.
Tool-use dataset carried over from v51 unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
H100s don't support Qwen3.5's GatedDeltaNet kernels (FlashInfer JIT).
Only B200/H200 GPUs work for Qwen3.5 training.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
6x B200/H200 preferred, 8x fallback. All params are dynamic via
$SKYPILOT_NUM_GPUS_PER_NODE so no other changes needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
vLLM uses paged attention (no padding), so this just raises the
ceiling on conversation length. Safe with batched=false.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
72 eval trajectories with a 50%+ timeout rate were taking >1 hr before
training could start. 12 * 3 samples = 36 trajectories should be
much faster.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tracks generate time vs env.step time per trajectory, logged to WandB:
- timing/total_generate_secs, timing/total_env_step_secs
- timing/avg_generate_per_turn_secs, timing/avg_env_step_per_turn_secs
- timing/pct_env_step (% of trajectory time spent in env interaction)
- timing/num_turns

Also reduces eval_batch_size from 24 to 12 for 9B config.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
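
A sketch of the per-trajectory timing split (the `timing/*` keys mirror the WandB metric names listed above; `TurnTimer` itself is a hypothetical helper):

```python
import time

class TurnTimer:
    """Accumulate generate vs env.step wall time for one trajectory."""
    def __init__(self):
        self.generate_secs = 0.0
        self.env_step_secs = 0.0
        self.num_turns = 0

    def timed(self, bucket, fn, *args, **kwargs):
        # Wrap either the generate call or env.step and bank its wall time.
        t0 = time.perf_counter()
        result = fn(*args, **kwargs)
        elapsed = time.perf_counter() - t0
        if bucket == "generate":
            self.generate_secs += elapsed
        else:
            self.env_step_secs += elapsed
        return result

    def metrics(self):
        total = self.generate_secs + self.env_step_secs
        turns = max(self.num_turns, 1)  # avoid division by zero
        return {
            "timing/total_generate_secs": self.generate_secs,
            "timing/total_env_step_secs": self.env_step_secs,
            "timing/avg_generate_per_turn_secs": self.generate_secs / turns,
            "timing/avg_env_step_per_turn_secs": self.env_step_secs / turns,
            "timing/pct_env_step": 100 * self.env_step_secs / total if total else 0.0,
            "timing/num_turns": self.num_turns,
        }
```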
… GPUs

- MAX_INPUT_LENGTH 262144→131072: caps max training sequence length to reduce
  activation memory during loss.backward() (OOM at 29.62 GiB allocation)
- Reorder GPU preferences: 8x before 6x for better FSDP memory sharding

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>