Add support for LFM2 models by xenova · Pull Request #1367 · huggingface/transformers.js

xenova · 2025-07-16T20:07:15Z

Example usage:

import { pipeline, TextStreamer } from "@huggingface/transformers";

// Create a text generation pipeline
const generator = await pipeline(
  "text-generation",
  "onnx-community/LFM2-350M-ONNX",
  { dtype: "q4" },
);

// Define the list of messages
const messages = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "What is the capital of France?" },
];

// Generate a response
const output = await generator(messages, {
  max_new_tokens: 512,
  do_sample: false,
  streamer: new TextStreamer(generator.tokenizer, { skip_prompt: true, skip_special_tokens: true }),
});
console.log(output[0].generated_text.at(-1).content);
// The capital of France is Paris. It is a vibrant city known for its historical landmarks, art, fashion, and gastronomy.

List of converted models:

HuggingFaceDocBuilderDev · 2025-07-16T20:09:23Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

* ONNX Runtime improvements (experimental native webgpu; fix iOS) (#1231)
* customize the wasm paths
* update implementation
* allow using 'webgpu' in nodejs binding
* update version of onnxruntime-node
* Upgrade onnxruntime-web to same version as onnxruntime-node
* Update list of supported devices

---------

Co-authored-by: Joshua Lochner <26504141+xenova@users.noreply.github.com>

* customize the wasm paths (#1250)
* customize the wasm paths
* update implementation
* [internal] Add is_decoder option to session retrieval for preferred output location
* Update tests
* Formatting
* Bump ort versions
* Bump onnxruntime-node version
* Bump versions
* Bump ORT versions
* Bump versions
* Only check webgpu fp16 for non-node environments
* Fix
* Assume node supports webgpu
* Update ORT node support comment
* Relax test strictness
* Update conversion script versions
* Downgrade onnxslim
* cleanup
* Update package-lock.json
* Update onnxruntime versions
* Update post-build script
* Use built-in session release function
* Call garbage collection after each tokenizer test
* Do not double-throw error
* Fix race-condition in build process with file removal
* Update versions
* Bump jinja version
* [version] Update to 3.6.3
* Bump jinja version to support new features
* [version] Update to 3.6.3
* Add support for LFM2 models (#1367)
* Use prefix in lfm2 output location (#1369)
* Update package-lock.json
* Run `npm audit fix`
* Add special tokens in text-generation pipeline if tokenizer requires (#1370)
* Add special tokens in text-generation pipeline if tokenizer requires
* Fix logits processors tests
* Update bundles.test.js
* Update comment
* Formatting
* Add support for ModernBERT Decoder (#1371)
* Use from/to buffer instead of string
  Actually fixes #1343
* Add support for Voxtral (#1373)
* Support longform voxtral processing (#1375)
* [version] Update to 3.7.0
* Add support for Arcee (#1377)
* Optimize tensor.slice() (#1381)
* Optimize tensor.slice()
  The performance of executing `tensor.slice()` is very poor, especially for the 'logits' tensor with large dimensions.
  ```
  const logits = outputs.logits.slice(null, -1, null);
  ```
  This is because the current implementation of the `slice` method manually iterates through each element and calculates its indices, which is very time-consuming when the tensor shape is large. Cases like `slice(null, -1, null)`, where the slicing operation is contiguous along certain dimensions, can be optimized with a bulk copy using `TypedArray.subarray()` and `TypedArray.set()`.
* nit
* Add a few more tensor slice unit tests

---------

Co-authored-by: Joshua Lochner <26504141+xenova@users.noreply.github.com>

---------

Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
Co-authored-by: Wanming Lin <wanming.lin@intel.com>
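
To make the bulk-copy idea from the `tensor.slice()` commit message concrete, here is a minimal sketch. It is not the code merged in #1381: `sliceAxis1` is a hypothetical helper, and it assumes a row-major tensor of shape `[batch, seq, hidden]` backed by a typed array. Slicing a range along axis 1 leaves one contiguous run of elements per batch, so each run can be copied with `subarray()` and `set()` instead of computing an index for every element.

```javascript
// Hypothetical helper for illustration only (not the transformers.js implementation).
// Copies positions [start, end) along axis 1 of a row-major [batch, seq, hidden] tensor.
function sliceAxis1(data, dims, start, end) {
  const [batch, seq, hidden] = dims;
  const rows = end - start;
  const runLength = rows * hidden; // contiguous elements to copy per batch
  const out = new data.constructor(batch * runLength);

  for (let b = 0; b < batch; ++b) {
    // Offset of the first selected position within batch b.
    const srcOffset = (b * seq + start) * hidden;
    // subarray() creates a view without copying; set() performs the bulk copy.
    out.set(data.subarray(srcOffset, srcOffset + runLength), b * runLength);
  }
  return { data: out, dims: [batch, rows, hidden] };
}

// Usage: keep only the last position of each sequence.
const exampleDims = [2, 3, 4];
const exampleData = Float32Array.from({ length: 24 }, (_, i) => i);
const lastPosition = sliceAxis1(exampleData, exampleDims, 2, 3);
console.log(lastPosition.dims); // [ 2, 1, 4 ]
console.log(Array.from(lastPosition.data)); // [ 8, 9, 10, 11, 20, 21, 22, 23 ]
```

The loop runs once per batch rather than once per element, which is where the saving comes from when the last dimension is large, as it is for logits.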

yukiarimo · 2025-09-18T05:54:28Z

How to convert?

Add support for LFM2 models

8288cee

xenova merged commit 1d08e91 into main Jul 17, 2025
4 checks passed

xenova deleted the add-lfm2 branch July 17, 2025 16:16

xenova mentioned this pull request Jul 17, 2025

Use prefix in LFM2 output location for WebGPU #1369

Merged

xenova merged 1 commit into main from add-lfm2
