Releases: microsoft/nnscaler

v0.8

22 Aug 03:08
1585c15

Release Notes

Features

  • cli: Added support for loading auto-merged checkpoints when world size changes. Introduced iterable dataset support and stateful dataloader. @0xWJ
  • AutoDist: Added support for pipeline stage configuration (pipeline_nstages option). @liuzhe-lz
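
The notes above do not describe how the stateful dataloader is implemented. As a rough illustration of what "stateful" means for an iterable data source, here is a minimal pure-Python sketch. The `CountingStream` class is hypothetical and not part of nnscaler; the `state_dict` / `load_state_dict` method names are an assumption that follows the usual PyTorch convention.

```python
class CountingStream:
    """Hypothetical iterable data source that can checkpoint its position."""

    def __init__(self, items):
        self.items = list(items)
        self.position = 0  # index of the next item to yield

    def __iter__(self):
        # Continue from the stored position instead of always restarting at 0.
        while self.position < len(self.items):
            item = self.items[self.position]
            self.position += 1
            yield item

    def state_dict(self):
        # Everything needed to resume iteration later.
        return {"position": self.position}

    def load_state_dict(self, state):
        self.position = state["position"]


# Consume part of the stream, then snapshot its state.
stream = CountingStream(range(10))
it = iter(stream)
first = [next(it) for _ in range(4)]
saved = stream.state_dict()

# A fresh instance restored from the snapshot continues where the first stopped.
resumed = CountingStream(range(10))
resumed.load_state_dict(saved)
rest = list(resumed)
```

Saving this kind of state alongside the model checkpoint is what lets an interrupted run resume without replaying or skipping samples.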

Breaking Changes

  • Hooks System:

    • Added new hooks for logging metrics. @0xWJ
    • Simplified existing hook parameters (some parameters removed). @0xWJ

Bug Fixes

  • AutoDist:

    • Fixed issues with recompute modules and re_profile estimation. @zyeric
    • Corrected split info calculation. @zyeric
    • Fixed data parallel test stability. @0xWJ
    • Ensured all GPUs are used when model is small. @liuzhe-lz
    • Removed reliance on cppimport (replaced with pybind11). @zyeric
  • Torch/PyTorch:

    • Fixed compatibility with torch.compile in autograd/tracer. @0xWJ
    • Fixed torch.arange operator issue. @0xWJ
    • Fixed torch.load compatibility for PyTorch 2.6 (weights_only default). @0xWJ
  • Trainer: Fixed argument resolution when using command-line overrides. @0xWJ
  • AutoDist autocast: Corrected behavior under mixed precision. @zyeric
  • Parser: Fixed issues with function registration and apex integration. @zyeric
  • Adapter Generation: Fixed dij index bug (resolves #36). @zyeric
  • Example: Fixed annotation bug in the DeepSeek-Coder-V2-Lite example (resolves #37). @zyeric
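
For context on the `torch.load` entry: PyTorch 2.6 changed the default of `weights_only` from `False` to `True`, so code that relied on the old default can fail when a checkpoint contains arbitrary pickled objects. The notes do not say how nnscaler addressed this. The sketch below only shows the general pattern of passing `weights_only` explicitly so that behavior does not depend on the installed PyTorch version. The helper names and the `trusted` flag are hypothetical.

```python
import io

import torch


def save_checkpoint(state):
    """Serialize a checkpoint into an in-memory buffer (stand-in for a file)."""
    buf = io.BytesIO()
    torch.save(state, buf)
    buf.seek(0)
    return buf


def load_checkpoint(buf, trusted=False):
    """Load a checkpoint with weights_only set explicitly.

    trusted=False restricts unpickling to tensors and plain containers.
    trusted=True allows arbitrary pickled objects, so use it only for
    checkpoints from a source you control.
    """
    return torch.load(buf, map_location="cpu", weights_only=not trusted)


# A checkpoint made of tensors and plain Python values loads in the restricted mode.
state = {"step": 3, "weight": torch.ones(2, 2)}
restored = load_checkpoint(save_checkpoint(state))
```

Being explicit here keeps the call working the same way before and after the 2.6 default change.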

Improvements

  • AutoDist:

    • Refined recompute modules. @zyeric
    • Enhanced robustness when SPMD follow fails. @zyeric
  • Examples:

    • Refined Ring-Attention example. @zyeric
    • Added Diff-Transformer example. @yyl9510
    • Added LongRope example. @J-shang

v0.6

10 Jan 02:53
2368540

Merge v0.6 (#24)

v0.5

27 Nov 07:14
cd36c4b

Merge v0.5 (#14)