Unexpected shape issue in Hessian-Vector computation #8

@stalhabukhari


Hi!

Thank you for making the source code of your work available. I tried to use the library for an application involving a 3D network architecture and ran into the following issue:

```
********** Commencing Hessian Computation **********
Traceback (most recent call last):
  File "hessian_analysis.py", line 181, in <module>
    hessianObj.analyze(model_checkpoint_filepath)
  File "/media/ee/DATA/Repositories/PyHessian/hessian_analysis.py", line 70, in analyze
    top_eigenvalues, top_eigenvectors = hessian_comp.eigenvalues(top_n=self.top_n)
  File "/media/ee/DATA/Repositories/PyHessian/pyhessian/hessian.py", line 167, in eigenvalues
    Hv = hessian_vector_product(self.gradsH, self.params, v)
  File "/media/ee/DATA/Repositories/PyHessian/pyhessian/utils.py", line 88, in hessian_vector_product
    retain_graph=True)
  File "/home/ee/anaconda3/envs/torch13/lib/python3.6/site-packages/torch/autograd/__init__.py", line 197, in grad
    grad_outputs_ = _make_grads(outputs, grad_outputs_)
  File "/home/ee/anaconda3/envs/torch13/lib/python3.6/site-packages/torch/autograd/__init__.py", line 32, in _make_grads
    if not out.shape == grad.shape:
AttributeError: 'float' object has no attribute 'shape'
```

Interestingly, the issue does not occur at the first call to back-propagation via loss.backward(); rather, it occurs at the subsequent call to torch.autograd.grad().

I believe that the float object in question is the 0. manually inserted when param.grad is None in the following routine:

```python
def get_params_grad(model):
    """
    get model parameters and corresponding gradients
    """
    params = []
    grads = []
    for param in model.parameters():
        if not param.requires_grad:
            continue
        params.append(param)
        grads.append(0. if param.grad is None else param.grad + 0.)
    return params, grads
```
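If that diagnosis is right, one possible workaround (my own sketch, not an official fix from the library) is to substitute a zero tensor of the parameter's shape instead of the Python float, so the shape comparison inside torch.autograd.grad has a tensor to work with:

```python
import torch
import torch.nn as nn

def get_params_grad(model):
    """
    Get model parameters and corresponding gradients, substituting a zero
    tensor (rather than a Python float) when a parameter has no gradient.
    """
    params = []
    grads = []
    for param in model.parameters():
        if not param.requires_grad:
            continue
        params.append(param)
        # torch.zeros_like(param) matches param's shape, dtype, and device,
        # so it can pass the `out.shape == grad.shape` check in autograd
        grads.append(torch.zeros_like(param) if param.grad is None
                     else param.grad + 0.)
    return params, grads

# quick sanity check on a freshly initialized layer (all .grad are None)
params, grads = get_params_grad(nn.Linear(3, 1))
print(all(isinstance(g, torch.Tensor) for g in grads))
```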

If I am right, it is even more mind-boggling that a plain Python float can pass PyTorch's data-type checks at all (I had mistakenly mixed up the outputs and inputs arguments of torch.autograd.grad). Could you kindly advise on what I can do here?
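For reference, here is a minimal standalone reproduction of the failure mode (my own construction, not code from the library): placing a Python float among the outputs handed to torch.autograd.grad trips the per-output shape handling before any real differentiation happens.

```python
import torch

w = torch.randn(3, requires_grad=True)
loss = (w ** 2).sum()

# first-order gradients, keeping the graph for a second differentiation
grads = list(torch.autograd.grad(loss, [w], create_graph=True))
grads.append(0.)  # a float standing in for a parameter whose .grad was None

v = [torch.randn(3), torch.randn(3)]  # one vector per "gradient" output
err = None
try:
    torch.autograd.grad(grads, [w], grad_outputs=v, retain_graph=True)
except Exception as exc:  # the exact exception type varies by torch version
    err = exc
print(type(err).__name__ if err is not None else "no error")
```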

P.S. hessian_analysis.py is a wrapper I wrote around the library for my use case. I verified the wrapper by running a 2-layer neural network on a regression task.
