
Updating model parameters while collecting gradients for Fisher in gfwsvd/collect_bert_task_grads.py #1

@karmazov

Description


The overridden `training_step()` (line 76 of `collect_bert_task_grads.py`) backpropagates the loss through the model with `self.accelerator.backward(loss)` (line 95). As far as I can tell from the Hugging Face `Trainer` class, this loss is then used inside `Trainer.train()` to update the model weights. The weights therefore change after every mini-batch, which might lead to an incorrect Fisher calculation. I tried running `collect_bert_task_grads.py`, but the trainer fails with an unexpected-argument error. That aside, I mainly want to confirm: are the model weights being trained while gradients are collected for the Fisher information matrix?
If yes, won't that make the Fisher calculation wrong?

Here are references to the Hugging Face `Trainer`, which calls `optimizer.step()`:

- function definition of `inner_training_loop()` --> Link
- call to your overridden `training_step()` inside `inner_training_loop()` --> Link
- call to `optimizer.step()` on the model --> Link
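For contrast, here is a minimal sketch (not the repo's code) of how the diagonal empirical Fisher is usually accumulated in plain PyTorch: the loss is backpropagated to populate `p.grad`, the squared gradients are accumulated, and `optimizer.step()` is never called, so the weights stay fixed throughout. The function name `accumulate_fisher` and the loop structure are my own illustration, not taken from `collect_bert_task_grads.py`.

```python
import torch

def accumulate_fisher(model, data_loader, loss_fn):
    """Accumulate the diagonal empirical Fisher, i.e. the mean of squared
    per-parameter gradients over mini-batches.

    Crucially, no optimizer.step() is ever called here, so the model
    weights are identical before and after the loop.
    """
    fisher = {
        name: torch.zeros_like(p)
        for name, p in model.named_parameters()
        if p.requires_grad
    }
    n_batches = 0
    for inputs, targets in data_loader:
        model.zero_grad()                      # clear stale gradients
        loss = loss_fn(model(inputs), targets)
        loss.backward()                        # fills p.grad; does NOT change weights
        for name, p in model.named_parameters():
            if p.grad is not None:
                fisher[name] += p.grad.detach() ** 2
        n_batches += 1
    model.zero_grad()
    return {name: f / max(n_batches, 1) for name, f in fisher.items()}

# Usage: a tiny linear model on synthetic data; the weights are unchanged
# after Fisher collection because backward() alone never updates them.
torch.manual_seed(0)
model = torch.nn.Linear(4, 2)
before = {n: p.detach().clone() for n, p in model.named_parameters()}
data = [(torch.randn(8, 4), torch.randn(8, 2)) for _ in range(3)]
fisher = accumulate_fisher(model, data, torch.nn.functional.mse_loss)
weights_unchanged = all(
    torch.equal(p.detach(), before[n]) for n, p in model.named_parameters()
)
```

If the Hugging Face `Trainer`'s normal loop runs around the overridden `training_step()`, `optimizer.step()` would be called after each batch, and the gradients from batch *k* would be taken at weights already perturbed by batches 1..*k-1*, which is exactly the concern raised above.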
