Hi!
Thank you for making the source code of your work available. I tried to use the library for an application involving a 3D network architecture and ran into the following issue:
```
********** Commencing Hessian Computation **********
Traceback (most recent call last):
  File "hessian_analysis.py", line 181, in <module>
    hessianObj.analyze(model_checkpoint_filepath)
  File "/media/ee/DATA/Repositories/PyHessian/hessian_analysis.py", line 70, in analyze
    top_eigenvalues, top_eigenvectors = hessian_comp.eigenvalues(top_n=self.top_n)
  File "/media/ee/DATA/Repositories/PyHessian/pyhessian/hessian.py", line 167, in eigenvalues
    Hv = hessian_vector_product(self.gradsH, self.params, v)
  File "/media/ee/DATA/Repositories/PyHessian/pyhessian/utils.py", line 88, in hessian_vector_product
    retain_graph=True)
  File "/home/ee/anaconda3/envs/torch13/lib/python3.6/site-packages/torch/autograd/__init__.py", line 197, in grad
    grad_outputs_ = _make_grads(outputs, grad_outputs_)
  File "/home/ee/anaconda3/envs/torch13/lib/python3.6/site-packages/torch/autograd/__init__.py", line 32, in _make_grads
    if not out.shape == grad.shape:
AttributeError: 'float' object has no attribute 'shape'
```
Interestingly, the issue does not occur at the first call to back-propagation via `loss.backward()`; rather, it occurs at the call to `torch.autograd.grad()`.
I believe that the float object in question is the `0.` manually inserted when `param.grad` is `None` in the following routine:
Lines 61 to 72 in c2e49d2

```python
def get_params_grad(model):
    """
    get model parameters and corresponding gradients
    """
    params = []
    grads = []
    for param in model.parameters():
        if not param.requires_grad:
            continue
        params.append(param)
        grads.append(0. if param.grad is None else param.grad + 0.)
    return params, grads
```
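In case it helps frame the discussion, one workaround I am considering (my own sketch, not part of the library) is to skip parameters whose `.grad` is `None` entirely, so that no plain float ever enters `gradsH` — though I am unsure whether dropping those parameters affects the Hessian computation downstream:

```python
import torch

def get_params_grad_skip_none(model):
    """Hypothetical variant of get_params_grad that skips parameters
    without gradients instead of inserting the Python float 0."""
    params, grads = [], []
    for param in model.parameters():
        # Skip frozen parameters and parameters that received no
        # gradient (e.g. unused branches of the architecture).
        if not param.requires_grad or param.grad is None:
            continue
        params.append(param)
        grads.append(param.grad + 0.)
    return params, grads
```

With this, every entry of `grads` is a tensor with a well-defined `.shape`, so `_make_grads` never sees a float; whether silently omitting those parameters is acceptable for the eigenvalue computation is exactly what I would like your guidance on.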
If I am right, it is even more mind-boggling that a float is able to pass PyTorch's data-type checks on the `outputs` and `inputs` arguments of `torch.autograd.grad`. Kindly guide me on what I can do here.
P.S. `hessian_analysis.py` is a wrapper I wrote around the library for my use case. I verified the wrapper by running a 2-layer neural network on a regression task.