Unexpected shape issue in Hessian-Vector computation #8

@stalhabukhari


Hi!

Thank you for making the source code of your work available. I tried to use the library for an application involving a 3D network architecture and ran into the following issue:

```
********** Commencing Hessian Computation **********
Traceback (most recent call last):
  File "hessian_analysis.py", line 181, in <module>
    hessianObj.analyze(model_checkpoint_filepath)
  File "/media/ee/DATA/Repositories/PyHessian/hessian_analysis.py", line 70, in analyze
    top_eigenvalues, top_eigenvectors = hessian_comp.eigenvalues(top_n=self.top_n)
  File "/media/ee/DATA/Repositories/PyHessian/pyhessian/hessian.py", line 167, in eigenvalues
    Hv = hessian_vector_product(self.gradsH, self.params, v)
  File "/media/ee/DATA/Repositories/PyHessian/pyhessian/utils.py", line 88, in hessian_vector_product
    retain_graph=True)
  File "/home/ee/anaconda3/envs/torch13/lib/python3.6/site-packages/torch/autograd/__init__.py", line 197, in grad
    grad_outputs_ = _make_grads(outputs, grad_outputs_)
  File "/home/ee/anaconda3/envs/torch13/lib/python3.6/site-packages/torch/autograd/__init__.py", line 32, in _make_grads
    if not out.shape == grad.shape:
AttributeError: 'float' object has no attribute 'shape'
```

Interestingly, the issue does not occur at the first call to back-propagation via loss.backward(); rather, it occurs at the subsequent call to torch.autograd.grad().

I believe that the float object in question is the 0. manually inserted when param.grad is None in the following routine:

```python
def get_params_grad(model):
    """
    get model parameters and corresponding gradients
    """
    params = []
    grads = []
    for param in model.parameters():
        if not param.requires_grad:
            continue
        params.append(param)
        grads.append(0. if param.grad is None else param.grad + 0.)
    return params, grads
```
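If that diagnosis is right, one possible workaround (my own sketch, not an official fix from the library) is to substitute a zero tensor of the parameter's shape instead of the Python float, so the shape comparison inside torch.autograd.grad has a tensor to work with:

```python
import torch
import torch.nn as nn

def get_params_grad(model):
    """
    Get model parameters and corresponding gradients, substituting a zero
    tensor (rather than a Python float) when a parameter has no gradient.
    """
    params = []
    grads = []
    for param in model.parameters():
        if not param.requires_grad:
            continue
        params.append(param)
        # torch.zeros_like(param) matches param's shape, dtype, and device,
        # so it can pass the `out.shape == grad.shape` check in autograd
        grads.append(torch.zeros_like(param) if param.grad is None
                     else param.grad + 0.)
    return params, grads

# quick sanity check on a freshly initialized layer (all .grad are None)
params, grads = get_params_grad(nn.Linear(3, 1))
print(all(isinstance(g, torch.Tensor) for g in grads))
```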

If I am right, it is even more mind-boggling that a plain Python float can pass PyTorch's data-type checks at all (I had mistakenly mixed up the outputs and inputs arguments of torch.autograd.grad). Could you kindly advise on what I can do here?
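For reference, here is a minimal standalone reproduction of the failure mode (my own construction, not code from the library): placing a Python float among the outputs handed to torch.autograd.grad trips the per-output shape handling before any real differentiation happens.

```python
import torch

w = torch.randn(3, requires_grad=True)
loss = (w ** 2).sum()

# first-order gradients, keeping the graph for a second differentiation
grads = list(torch.autograd.grad(loss, [w], create_graph=True))
grads.append(0.)  # a float standing in for a parameter whose .grad was None

v = [torch.randn(3), torch.randn(3)]  # one vector per "gradient" output
err = None
try:
    torch.autograd.grad(grads, [w], grad_outputs=v, retain_graph=True)
except Exception as exc:  # the exact exception type varies by torch version
    err = exc
print(type(err).__name__ if err is not None else "no error")
```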

P.S. hessian_analysis.py is a wrapper I wrote around the library for my use case. I verified the wrapper by running a 2-layer neural network on a regression task.
