The overridden `training_step()` (line 76 of `collect_bert_task_grads.py`) backpropagates the loss through the model with `self.accelerator.backward(loss)` (line 95). From my reading of the Hugging Face `Trainer` class, this loss is later used in `Trainer.train()` to update the model weights. That means the weights change after every mini-batch, which might make the Fisher calculation incorrect. I tried running `collect_bert_task_grads.py`, but the trainer fails with an unexpected-argument error. That error aside, I mainly want to confirm: were the model weights being trained while gradients were collected for the Fisher information matrix?
If yes, won't that make the Fisher calculation wrong?
Here are references to the Hugging Face `Trainer` code that ends up calling `optimizer.step()`:
definition of `inner_training_loop()` --> Link
call to your overridden `training_step()` inside `inner_training_loop()` --> Link
call to `optimizer.step()` on the model --> Link
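For contrast, here is a minimal sketch of what I would expect weight-preserving Fisher collection to look like: gradients are accumulated in `p.grad` via `backward()`, but `optimizer.step()` is never called, so the parameters stay fixed at their initial values. This is my own illustration, not code from the repository; the function name and the assumption that each batch's forward pass returns an object with a `.loss` attribute (as Hugging Face models do) are mine:

```python
import torch

def collect_diag_fisher(model, dataloader, device="cpu"):
    """Estimate the diagonal of the Fisher information matrix by
    averaging squared per-parameter gradients over the dataloader.
    No optimizer is involved, so the model weights never change."""
    model.eval()  # freeze dropout / batch-norm statistics
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()
              if p.requires_grad}
    n_batches = 0
    for batch in dataloader:
        model.zero_grad()
        # Assumes the model returns an output object exposing `.loss`
        outputs = model(**{k: v.to(device) for k, v in batch.items()})
        outputs.loss.backward()  # fills p.grad; weights are untouched
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
        n_batches += 1
    model.zero_grad()
    return {n: f / n_batches for n, f in fisher.items()}
```

If the `Trainer` loop instead calls `optimizer.step()` between batches, each batch's gradient is taken at a different point in weight space, so the accumulated squares no longer estimate the Fisher at the original (pre-trained) parameters.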