A simple, handwritten token transformer I wrote for some mathematics research has turned up a gradient bug in v0.21.0-pre.1. Loss plot for identical model settings, good under v0.20.1 and bad under v0.21.0-pre.1:
I'm still hunting down the root cause, but thought I should flag it early.
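One generic way to localize a gradient bug like this is a finite-difference check: compare the analytic gradient against central differences of the loss and see where they diverge. This is a minimal, framework-agnostic sketch using a toy quadratic loss, not the actual transformer or the library in question:

```python
import numpy as np

def loss(w, x, y):
    # Toy quadratic loss standing in for the real model's loss.
    return 0.5 * np.sum((x @ w - y) ** 2)

def analytic_grad(w, x, y):
    # Closed-form gradient of the loss above; in practice this is
    # whatever the framework's backward pass produces.
    return x.T @ (x @ w - y)

def finite_diff_grad(f, w, eps=1e-6):
    # Central differences: one pair of loss evaluations per parameter.
    g = np.zeros_like(w)
    for i in range(w.size):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (f(w + e) - f(w - e)) / (2 * eps)
    return g

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 3))
y = rng.standard_normal(8)
w = rng.standard_normal(3)

g_exact = analytic_grad(w, x, y)
g_numeric = finite_diff_grad(lambda w_: loss(w_, x, y), w)

# The two should agree to roughly eps-level precision; a large
# discrepancy on some parameter flags a broken gradient there.
print(np.max(np.abs(g_exact - g_numeric)))
```

Running the same check against both versions' backward passes, per parameter group, narrows down which op's gradient regressed.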