I find the reduction method that was chosen for NLLLoss quite unintuitive.
This introduces a weird interdependence between the chosen class weights and the chosen batch size (and worse: the influence of the class weights depends on which ground-truth classes are present in the mini-batch).
Extreme case with the current implementation: with batch size one, it does not matter which class weights I choose; my net will always see the same gradients.
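To make this concrete, here is a minimal sketch of the batch-size-one case (all values are made up; I am using the newer reduction= argument, which is the current equivalent of the reduce= flag):

```python
import torch
import torch.nn.functional as F

# Toy log-probabilities for a single sample over 3 classes (arbitrary values).
log_probs = F.log_softmax(torch.tensor([[0.2, 1.5, -0.3]]), dim=1)
target = torch.tensor([1])

for weight in (torch.tensor([1.0, 1.0, 1.0]), torch.tensor([1.0, 10.0, 1.0])):
    loss = F.nll_loss(log_probs, target, weight=weight, reduction='mean')
    # With class weights, the 'mean' reduction computes sum(w_i * l_i) / sum(w_i),
    # which collapses to l_i for a single sample: the weight cancels out,
    # so the loss (and hence the gradients) is the same for any choice of weights.
    print(loss.item())  # prints the same value both times
```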
In other words, I would expect F.nll_loss(..., reduce=True) == torch.mean(F.nll_loss(..., reduce=False)), but this does not hold when using non-uniform class weights.
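A small sketch of what I mean (again with made-up values; reduction='mean' / reduction='none' correspond to reduce=True / reduce=False):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy batch: 4 samples, 3 classes, non-uniform class weights (all arbitrary).
log_probs = F.log_softmax(torch.randn(4, 3), dim=1)
target = torch.tensor([0, 1, 2, 1])
weight = torch.tensor([1.0, 2.0, 5.0])

# Current behavior: weighted mean, i.e. sum(w_i * l_i) / sum(w_i).
reduced = F.nll_loss(log_probs, target, weight=weight, reduction='mean')

# What I would expect: plain mean of the per-sample weighted losses,
# i.e. sum(w_i * l_i) / N.
per_sample = F.nll_loss(log_probs, target, weight=weight, reduction='none')
expected = per_sample.mean()

print(reduced.item(), expected.item())  # differ whenever the weights are non-uniform
```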
The documentation of CrossEntropyLoss also says that "The losses are averaged across observations for each minibatch." This sentence in particular is very misleading with the current implementation if you are using class weights.
I can only guess that this implementation was chosen so that the loss value does not change when you change the class weights (which makes runs with different class weights more comparable when you are only looking at the loss value). But it comes at the cost of a very unintuitive treatment of class weights that, in my opinion, is not worth it.

