OOM issue during training #24

@my3636

Hi, thanks for releasing this great project!

I get an out-of-memory (OOM) error when running run_scripts/padt_ovd_3b_sft.sh with:

  • 2 × RTX 4090D (24GB)
  • per_device_train_batch_size = 1

Could you please clarify:

  1. Is the LLM (Qwen2.5-VL-3B-Instruct) fine-tuned (fully/partially) or frozen during training?
  2. What hyperparameters should be adjusted to make this runnable, or what is the minimum GPU memory / GPU requirement?
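
As rough context for the memory question, here is a back-of-the-envelope sketch of how much GPU memory the model states alone might need. Everything in it is an assumption rather than something taken from the training script: the nominal 3e9 parameter count (the real checkpoint, including the vision encoder, may differ), bf16 mixed precision with AdamW (about 16 bytes per trainable parameter), and the optional even sharding of states across GPUs (as ZeRO-3-style setups do). The function name `estimate_state_memory_gib` is hypothetical.

```python
GIB = 1024 ** 3

# Assumed byte costs per parameter (not read from the training script):
#   bf16 weights (2) + bf16 grads (2) + fp32 master weights (4)
#   + fp32 Adam first moment (4) + fp32 Adam second moment (4) = 16
BYTES_FULL_FINETUNE = 2 + 2 + 4 + 4 + 4
# A frozen model only needs its bf16 weights.
BYTES_FROZEN = 2


def estimate_state_memory_gib(num_params, bytes_per_param, num_shards=1):
    """Per-GPU memory (GiB) for model states only.

    Activations, CUDA context, and fragmentation are NOT included,
    so the result is a lower bound on real usage.
    """
    return num_params * bytes_per_param / num_shards / GIB


NUM_PARAMS = 3e9  # assumption: nominal "3B" parameter count

full_gib = estimate_state_memory_gib(NUM_PARAMS, BYTES_FULL_FINETUNE)
sharded_gib = estimate_state_memory_gib(NUM_PARAMS, BYTES_FULL_FINETUNE, num_shards=2)
frozen_gib = estimate_state_memory_gib(NUM_PARAMS, BYTES_FROZEN)

print(f"full fine-tune, 1 GPU:           {full_gib:.1f} GiB")
print(f"full fine-tune, sharded 2 GPUs:  {sharded_gib:.1f} GiB per GPU")
print(f"frozen weights only:             {frozen_gib:.1f} GiB")
```

Under these assumptions, fully fine-tuning without sharding would not fit on a single 24GB card, and even sharding the states across two cards leaves very little room for activations. This is why the frozen vs. fine-tuned question matters for the hardware requirement.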

Thanks in advance!
