Describe the bug
After enabling gather_rounds to improve checkpoint load performance, we routinely see numerical instability in the workload.
We recommend removing this feature, since it is undertested and unstable. We have run into this repeatedly during hero runs.