WIP: SFT (local backend)#530

Open
Kovbo wants to merge 59 commits into main from sft-local-backend

Conversation

@Kovbo Kovbo commented Jan 22, 2026

No description provided.

@Kovbo Kovbo requested a review from angkywilliam January 22, 2026 02:48
@Kovbo Kovbo marked this pull request as ready for review January 22, 2026 21:42
src/art/types.py Outdated


class SFTConfig(pydantic.BaseModel):
learning_rate: float = 1e-4
Collaborator

Remove custom_lr_schedule
Make learning_rate: float | list[float]
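A minimal sketch of that field shape, assuming pydantic v2 and keeping the current 1e-4 default (only the field change is shown):

import pydantic


class SFTConfig(pydantic.BaseModel):
    # A single float is a constant peak LR; a list supplies an explicit
    # per-batch schedule, replacing custom_lr_schedule.
    learning_rate: float | list[float] = 1e-4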

Used to identify where assistant turns begin (train on responses only).
"""

instruction_part: str
Collaborator

Can we keep this class empty for now?
Unsure whether instruction_part and response_part are a good fit for an experimental feature.

batch_size = 2 # Default to 2 for SFT

# Determine learning rates
if config.custom_lr_schedule and len(config.custom_lr_schedule) > 0:
Collaborator

  1. Refactor/remove custom_lr_schedule; learning_rate becomes float | list[float].
  2. Add validation that the number of learning rates equals the number of batches.
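Not the existing API, just a sketch of how that validation could look once the batch count is known (the function and argument names are placeholders):

def resolve_learning_rates(
    learning_rate: float | list[float], num_batches: int
) -> list[float]:
    # Expand a scalar LR into a constant per-batch list, or validate an
    # explicit per-batch schedule against the number of batches.
    if isinstance(learning_rate, float):
        return [learning_rate] * num_batches
    if len(learning_rate) != num_batches:
        raise ValueError(
            f"Expected {num_batches} learning rates, got {len(learning_rate)}"
        )
    return list(learning_rate)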


# Save checkpoint after training
# Name checkpoint by final training step: starting_step + num_batches
final_step = get_step_from_dir(self.output_dir) + len(sft_batches)
Collaborator

The checkpoint step should still be incremented by 1.
Checkpoint step != gradient step.
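If I read the suggestion right, the checkpoint name would advance by one per train_sft call rather than by the number of batches, something like:

# One checkpoint per train_sft call, regardless of how many
# gradient steps the batches produced.
final_step = get_step_from_dir(self.output_dir) + 1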

response_part="<|im_start|>assistant\n",
),
# Qwen 3 models (with thinking tokens)
"Qwen/Qwen3-8B": ModelConfig(
Collaborator

  1. How did we decide to support all of these models?
  2. Prefer to keep it simple and start with the models that are widely used in the OpenPipe Platform and ART?
  3. Research the Qwen chat template; iirc <think></think> only shows up in the last turn. We may need to remove <think></think> from response_part for Qwen.
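For point 3, a hypothetical illustration of what that entry could look like without the think tokens, mirroring the ChatML-style entries above (values are assumptions, not the current config):

"Qwen/Qwen3-8B": ModelConfig(
    instruction_part="<|im_start|>user\n",
    # No <think> tokens here, so the marker matches every assistant turn,
    # not only the final one where the template keeps the thinking block.
    response_part="<|im_start|>assistant\n",
),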

Collaborator Author

I kept only OpenPipe/Qwen3-14B-Instruct for now because it’s the only model with a custom chat template. All other mainstream models should be recognized by the detect_chat_template_parts function.

Also, I don’t feel strongly about this, but I did some research and didn’t find good arguments for using different default learning rates for different models. The general consensus online seems to be to start with 2e-4 with a linear/cosine scheduler.
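For context, one way the response marker can be auto-derived from a tokenizer's chat template; this is only an illustration of the idea, not necessarily how detect_chat_template_parts is implemented (model name chosen arbitrarily):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
messages = [{"role": "user", "content": "hi"}]

# Render the same conversation with and without the generation prompt;
# the suffix added by the second rendering is the assistant marker,
# e.g. "<|im_start|>assistant\n" for ChatML-style templates.
without_prompt = tokenizer.apply_chat_template(messages, tokenize=False)
with_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
response_part = with_prompt[len(without_prompt):]
print(repr(response_part))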

progress_bar.close()


def iterate_file(
Collaborator

Have iterate_file take in an epoch parameter.
See the following PR for reference.
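A minimal sketch of an epoch-aware iterate_file, assuming JSONL input (the epochs parameter name and the json.loads parsing stand in for the existing _parse_jsonl_line helper):

import json
from collections.abc import Iterator


def iterate_file(file_path: str, epochs: int = 1) -> Iterator[dict]:
    # Re-read the file once per epoch so nothing is held in memory
    # beyond the current line.
    for _epoch in range(epochs):
        with open(file_path) as f:
            for line in f:
                line = line.strip()
                if line:
                    yield json.loads(line)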

yield _parse_jsonl_line(line)


async def train_sft_from_file(
Collaborator
@angkywilliam angkywilliam Jan 24, 2026

Modify this so the user can keep training running after closing their laptop (rough skeleton below):

  1. iterate_file(file, epoch)
  2. Calculate the LR schedule
  3. Call train_sft:
    3.1 Write to local disk
    3.2 Upload to a wandb artifact
    3.3 Call the train_sft API
    3.4 Monitor training status
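A rough skeleton of that flow under the suggestions above; every attribute and helper here (epochs on the config, build_lr_schedule) is a placeholder, not the existing API:

async def train_sft_from_file(model, file_path: str, config) -> None:
    # 1. Stream rows for the requested number of epochs (iterate_file as
    #    sketched earlier; "epochs" on the config is a placeholder field).
    trajectories = iterate_file(file_path, epochs=config.epochs)

    # 2. Calculate the per-batch LR schedule up front and store it as
    #    learning_rate: list[float], per the earlier suggestion.
    config.learning_rate = build_lr_schedule(config)  # placeholder helper

    # 3. Hand off to the backend so training outlives the local session:
    #    write data to disk, upload it as a wandb artifact, call the
    #    train_sft API, then monitor the remote training status.
    await model.train_sft(trajectories, config)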

Kovbo and others added 4 commits January 26, 2026 19:45
Resolved conflicts:
- pyproject.toml: kept tinker deps and newer weave version from sft branch
- src/art/backend.py: kept Protocol signatures from main, added _train_sft method
- src/art/serverless/backend.py: kept SFT imports (Trajectory, SFTConfig)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@Kovbo Kovbo changed the title SFT (local backend) WIP: SFT (local backend) Feb 2, 2026
from ..utils.model_config import get_instruction_response_parts

# Get instruction/response parts (from config or auto-detect)
instruction_part = dev_config.get("instruction_part", None)
Collaborator

Remove this? dev_config no longer has instruction_part and response_part.

"""Model-specific configuration for chat templates and training defaults."""


def detect_chat_template_parts(
Collaborator

Do we need to add support for handling OpenPipe/Qwen3-14B-Instruct here?

return

# Prepare dataset: shuffle per epoch, concatenate, and calculate learning rates
all_trajectories, learning_rates = prepare_sft_dataset(
Collaborator

Won’t this blow up memory for large files?
E.g. a 1GB file trained for 10 epochs will take at least 10GB on its own, before any system overhead.

We need to keep the trajectories as an iterator (streaming) all the way from reading the file to calling train_sft, so we never materialize the entire 10GB in memory. Using iterate_file with epochs should address this.

Collaborator Author

Yeah, it was not yet optimized for large-file streaming; I just added that in the latest PR.

A summary of how it works (a condensed sketch of the queue hand-off follows at the end):

  1. Count rows without loading data
    get_file_row_count(file_path) scans the file and counts non-empty lines.
    This gives us row_count to calculate the LR schedule without loading the data into memory.

  2. Calculate the learning rate schedule upfront

    total_trajectories = row_count × epochs
    total_batches = ceil(total_trajectories / batch_size)
    warmup_steps = total_batches × warmup_ratio

    full_schedule = create_lr_schedule(total_batches, peak_lr, ...)
    learning_rates = full_schedule[initial_step:]  # Slice for resuming

    The full LR schedule (one value per batch) is pre-computed as a list.

  3. Create a streaming trajectory generator
    trajectories = iterate_file(file_path, epochs, shuffle_buffer_size, initial_skip)
    This returns a generator (not a list). It:
    • Reads the file line by line
    • Uses buffer-based shuffling (fills a buffer of 10k items, randomly pops one)
    • Repeats for each epoch with a different random seed (seed + epoch)

  4. Pass the generator + LR list to the backend
    await model.train_sft(trajectories, config)  # trajectories is a generator

  5. Backend batches and tokenizes on the fly
    The create_sft_batches() generator:
    • Collects trajectories into batches of size batch_size
    • Tokenizes each batch immediately
    • Yields SFTBatch objects

  6. Producer thread feeds a queue
    It may look ugly, but in the local backend we run training in a subprocess, and we cannot send a generator to that subprocess, so we can't just call unsloth/service.train_sft() and pass a generator of batches directly.
    There were two options:
    1. Break unsloth's service.train_sft into smaller functions (setup, training, cleanup), iterate over batches on the client side, and send each batch object to a training function individually.
    2. Use a queue: create a queue, put batches into it from a producer thread, and pass the queue to train_sft.
    I went with the second approach.

  7. The service trains batch by batch

    while batch := queue.get():
        forward pass → loss → backward pass → optimizer step
        yield metrics
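A condensed sketch of the queue hand-off in steps 6–7, with dummy batches standing in for create_sft_batches() and the real training step; the bounded queue keeps memory usage flat:

import queue
import threading


def feed_batches(batch_iter, q: queue.Queue) -> None:
    # Producer thread: drain the (possibly huge) batch generator into a
    # bounded queue, then signal completion with a sentinel.
    for batch in batch_iter:
        q.put(batch)
    q.put(None)


def train_from_queue(q: queue.Queue):
    # Consumer (the training service): pull batches until the sentinel,
    # do one training step per batch, yield metrics as it goes.
    while (batch := q.get()) is not None:
        yield {"num_items": len(batch)}  # stand-in for forward/backward/step


# Demo with dummy batches; in the PR the producer is fed by
# create_sft_batches() and the consumer runs inside the unsloth service.
batches = ([i, i + 1] for i in range(0, 6, 2))
batch_queue: queue.Queue = queue.Queue(maxsize=4)
threading.Thread(target=feed_batches, args=(batches, batch_queue), daemon=True).start()
for metrics in train_from_queue(batch_queue):
    print(metrics)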
