failed to run on colab: ModulesToSaveWrapper has no attribute embed_tokens #621

@Vostredamus

I am trying to train a LoRA in Colab, based on the example provided in the repository, and it fails with the error in the title. My current yml is as follows:

base_model: /content/drive/MyDrive/models/ehartforddolphin-2.2.1-mistral-7b
base_model_config: /content/drive/MyDrive/models/ehartforddolphin-2.2.1-mistral-7b
model_type: MistralForCausalLM
tokenizer_type: LlamaTokenizer
is_mistral_derived_model: true

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: /content/drive/MyDrive/samantha.json
    type: sharegpt
    conversation: chatml

dataset_prepared_path: last_run_prepared
val_set_size: 0.05
output_dir: /content/drive/MyDrive/samantha

adapter: lora
lora_model_dir:

sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true

lora_r: 512
lora_alpha: 512
lora_dropout: 0.05
lora_target_modules: q_proj, k_proj, v_proj, o_proj, gate_proj
lora_target_linear: true
lora_modules_to_save: embed_tokens, lm_head

wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

mlflow_experiment_name: test-test

gradient_accumulation_steps: 1
micro_batch_size: 1
num_epochs: 4
optimizer: paged_adamw_32bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: false
fp16: true
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: false

warmup_steps: 50
eval_steps: 0.05
eval_table_size:
eval_table_max_new_tokens:
saves_per_epoch: 10
save_steps:
debug:
deepspeed:
weight_decay: 0.1
fsdp:
fsdp_config:
special_tokens:
  bos_token: ""
  eos_token: "<|im_end|>"
  unk_token: ""
tokens:
  - "<|im_start|>"
  - "<|im_end|>"

Many thanks in advance!
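
A note on the config as pasted: `lora_target_modules` and `lora_modules_to_save` appear as comma-separated values on a single line. If they are written that way in the actual file, YAML reads each one as a single string (for example `"embed_tokens, lm_head"`) rather than as a list of two names. Whether that is related to the error is not established here. The sketch below is only an illustration of normalizing such a value; `as_module_list` is a hypothetical helper, not part of axolotl or PEFT.

```python
def as_module_list(value):
    """Return module names as a list.

    Hypothetical helper for illustration: accepts either a real list
    (as a YAML sequence would produce) or a comma-separated string
    (as a single-line YAML scalar would produce).
    """
    if value is None:
        return []
    if isinstance(value, str):
        # Split on commas and drop surrounding whitespace and empty entries.
        return [name.strip() for name in value.split(",") if name.strip()]
    return [str(name).strip() for name in value]


# A single-line scalar becomes a proper list of names.
print(as_module_list("embed_tokens, lm_head"))   # ['embed_tokens', 'lm_head']
# A value that is already a list passes through unchanged.
print(as_module_list(["q_proj", "k_proj"]))      # ['q_proj', 'k_proj']
```

Writing the values as a YAML sequence in the config (one `- name` entry per line) avoids the ambiguity without any helper.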
