Migrate transformers cli to Typer #41487

Merged
LysandreJik merged 32 commits into main from switch-transformers-cli-to-typer
Oct 16, 2025

Conversation

@Wauplin
Contributor

@Wauplin Wauplin commented Oct 9, 2025

This PR migrates the transformers CLI to Typer.

Typer is a package built on top of click by the creator of FastAPI. It is already a dependency of huggingface_hub, which means it is also a dependency of transformers. Typer simplifies argument definition and enforces consistency through type annotations, which should help with maintenance. The benefit for users is the built-in autocompletion feature that lets someone type transformers chat [TAB][TAB] to see the available options. The --help output is also improved.

By migrating to Typer, a longer-term goal is to delegate some aspects of CLI installation and auto-update to huggingface_hub. That will come at a later stage and doesn't have to be tied to the v5 release.
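To illustrate the kind of argument definition Typer enables, here is a minimal sketch. The command names, options, and version string below are illustrative only, not the actual transformers implementation:

```python
# Minimal Typer sketch: commands are plain functions, and type annotations
# drive parsing, validation, --help generation, and shell completion.
# Names below are illustrative, not the real transformers CLI code.
import typer

app = typer.Typer(help="Transformers CLI (sketch)")


@app.command()
def download(model_id: str, revision: str = "main") -> None:
    """Download a model and its tokenizer from the Hub."""
    typer.echo(f"Downloading {model_id} at revision {revision}")


@app.command()
def version() -> None:
    """Print CLI version."""
    typer.echo("5.0.0 (sketch)")


if __name__ == "__main__":
    app()
```

With this layout, `--help` lists both commands with their docstrings, and the `--install-completion` / `--show-completion` options shown above come for free from Typer.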

CLI --help

transformers --help
Usage: transformers [OPTIONS] COMMAND [ARGS]...

  Transformers CLI

Options:
  --install-completion  Install completion for the current shell.
  --show-completion     Show completion for the current shell, to copy it or
                        customize the installation.
  --help                Show this message and exit.

Commands:
  add-fast-image-processor  Add a fast image processor to a model.
  add-new-model-like        Add a new model to the library, based on an...
  chat                      Chat with a model from the command line.
  download                  Download a model and its tokenizer from the Hub.
  env                       Print information about the environment.
  run                       Run a pipeline on a given input file.
  serve                     Run a FastAPI server to serve models...
  version                   Print CLI version.

Side notes

Noted down some stuff while working on it. Can be addressed in later PRs.

  1. Any command, even a simple transformers env, is currently very slow in both the previous and the new CLI. This is due to the torch import, whether or not torch is actually used. I do think this is poor UX, especially if we want something like transformers chat as the entrypoint for any OpenAI-compatible server.
    This is not really specific to the CLI, but to lazy imports in general. I narrowed it down to is_torch_available actually importing the package, not just checking for its existence.

  2. The current transformers serve + transformers chat twin commands are really nice: one starts a server, the other launches a chat interface. However, I feel the current UX for chat is too bloated, since it covers both the case where a server is already running AND starting a new server from a model id (or path). I do think transformers chat should only consume an existing API. It would make the whole implementation much cleaner and the interface leaner for the end user (currently there are 4-5 arguments just to provide the model name, path, address, port, and host, instead of a single "url" argument).
    Since this would be a breaking change, v5 is the perfect timing to address it.

=> EDIT: this is now done in this PR. transformers chat no longer serves a model (making it much simpler):

transformers chat https://router.huggingface.co/v1 HuggingFaceTB/SmolLM3-3B
  3. The transformers serve feature is currently only available as a CLI. I do believe it would be best to move it to its own module so that someone could call it programmatically (e.g. from transformers import serve in a notebook).
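The availability check mentioned in note 1 can be done without importing the package at all, using importlib.util.find_spec. This is a hedged sketch of the idea, not the actual transformers implementation (which caches results and handles more edge cases):

```python
# Sketch: check that torch is installed without importing (and paying for) it.
# find_spec only queries the import machinery's finders; it does not execute
# the package, so the check stays fast even for heavy dependencies.
import importlib.util


def is_torch_available() -> bool:
    """Return True if torch is importable, without actually importing it."""
    return importlib.util.find_spec("torch") is not None
```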

TODO

  • (minor) do not import typer_factory from private internal (requires an update in huggingface_hub first)
  • delete ./commands folder and remove transformers-legacy CLI (that I currently use for testing)
  • adapt remaining CLI tests
  • transformers chat UI-only (not serving)
  • do not use classes for chat and serve? + expose them as modules (e.g. from transformers import chat, serve)

@Wauplin Wauplin requested review from a team and LysandreJik October 9, 2025 16:28
@Wauplin Wauplin added the for_v5? label Oct 9, 2025
@Wauplin Wauplin marked this pull request as draft October 9, 2025 16:28
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@gante
Contributor

gante commented Oct 11, 2025

@Wauplin in general LGTM 👍

Regarding your notes:

  1. imports were reworked very recently in #41268 -- maybe it helped?
  2. imo makes sense to have chat being consume-only :)
  3. also makes sense to have serve as its own module (it would also simplify testing)

@LysandreJik LysandreJik self-assigned this Oct 13, 2025
@Wauplin
Contributor Author

Wauplin commented Oct 13, 2025

imports were reworked very recently #41268 -- maybe it helped?

Good to know! The changes in that PR look nice. I tried again to run a simple transformers version and we still import torch by default, at least in src/transformers/utils/generic.py (L45 and L369). I tried to hack it to import torch only when needed, but then I didn't know what to do with _torch_pytree. So torch is still not lazy-loaded, even though we are getting closer. I won't work on it in this PR (happy to help later).

imo makes sense to have chat being consume-only :)

Yay! Will work on that since I also got informal approval from @LysandreJik

also makes sense to have serve as its own module (it would also simplify testing)

Will check that again but might be for a future PR.

@Wauplin
Contributor Author

Wauplin commented Oct 13, 2025

@gante @LysandreJik should now be ready for review. The revamped transformers chat is now UI-only (it no longer includes the serving part). I tried to keep most of the logic intact, though I do think some parts were a bit broken/untested. The interface can be tested like this:

transformers chat https://router.huggingface.co/v1 HuggingFaceTB/SmolLM3-3B

@Wauplin Wauplin marked this pull request as ready for review October 13, 2025 16:07
@Wauplin
Contributor Author

Wauplin commented Oct 13, 2025

Note: I'm not 100% sure that the slow tests are running. @ydshieh would it be possible to trigger them please? (or anyone else with the permissions?)

@LysandreJik
Member

LysandreJik commented Oct 14, 2025

Reviewing it in batches:

  • The changes from the previous arg parsing to typer are very welcome
  • The chat changes are also coherent from an overview
  • Need to play with chat
  • Need to check the serve changes
  • Need to play with serve

@LysandreJik
Member

While I agree we can remove chat's support for launching models, I think we should raise helpful errors when doing so; for example, if I run the following command, which used to work:

transformers chat google/vaultgemma-1b

I get the following error:

Usage: transformers chat [OPTIONS] BASE_URL MODEL_ID [GENERATE_FLAGS]...
Try 'transformers chat --help' for help.

Error: Missing argument 'MODEL_ID'.

which isn't super clear: I'd tell the user that this path isn't supported anymore and to please launch a transformers serve session alongside it.

@LysandreJik
Member

For transformers serve x chat to work well we likely need to merge the following in this PR: #41446

It currently throws an error that is fixed by the above ^ I'll merge it into main shortly

@LysandreJik LysandreJik force-pushed the switch-transformers-cli-to-typer branch from d299b27 to 0756797 Compare October 15, 2025 09:39
tests + fixup
@LysandreJik LysandreJik force-pushed the switch-transformers-cli-to-typer branch from 0756797 to cee44ab Compare October 15, 2025 09:42
@Wauplin
Contributor Author

Wauplin commented Oct 15, 2025

While I agree we can remove chat's support for launching models, I think we should raise helpful errors when doing so; for example, if I run the following command, which used to work:

transformers chat google/vaultgemma-1b

I get the following error:

Usage: transformers chat [OPTIONS] BASE_URL MODEL_ID [GENERATE_FLAGS]...
Try 'transformers chat --help' for help.

Error: Missing argument 'MODEL_ID'.

which isn't super clear: I'd tell the user that this path isn't supported anymore and to please launch a transformers serve session alongside it.

@LysandreJik I addressed this comment in 05d9515. It's not entirely straightforward, but it works. I didn't want to simply make the argument optional and then raise an error when it's not passed, as that would have changed the --help section in a misleading way.

The error message is now:

> transformers chat google/vaultgemma-1b
Error: Missing argument 'MODEL_ID'.

Launching a server directly from the `transformers chat` command is no longer supported. Please use `transformers serve` to launch a server. Use --help for more information.
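One way to produce such a message with click (the layer beneath Typer) is to intercept MissingParameter while parsing, keeping the argument required so --help stays accurate. This is a hypothetical sketch of the approach; the actual fix in 05d9515 may differ:

```python
# Sketch: append migration guidance when MODEL_ID is missing, without making
# the argument optional (so --help still shows it as required).
# Illustrative only, not the actual transformers implementation.
import click

GUIDANCE = (
    "Launching a server directly from the `transformers chat` command is no "
    "longer supported. Please use `transformers serve` to launch a server. "
    "Use --help for more information."
)


class ChatCommand(click.Command):
    """Command subclass that enriches the 'Missing argument' error."""

    def parse_args(self, ctx, args):
        try:
            return super().parse_args(ctx, args)
        except click.MissingParameter as exc:
            if exc.param is not None and exc.param.name == "model_id":
                # Re-raise with the original message plus the guidance.
                raise click.UsageError(
                    exc.format_message() + "\n\n" + GUIDANCE, ctx=exc.ctx
                )
            raise


@click.command(cls=ChatCommand)
@click.argument("base_url")
@click.argument("model_id")
def chat(base_url: str, model_id: str) -> None:
    click.echo(f"Connecting to {base_url} with model {model_id}")
```

Running `chat google/vaultgemma-1b` then fills base_url with the model id, leaves model_id missing, and fails with both the standard "Missing argument 'MODEL_ID'." line and the migration guidance.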

@LysandreJik
Member

Awesome, thanks @Wauplin !

@LysandreJik LysandreJik merged commit af2a66c into main Oct 16, 2025
23 checks passed
@LysandreJik LysandreJik deleted the switch-transformers-cli-to-typer branch October 16, 2025 11:29
@Wauplin Wauplin mentioned this pull request Oct 16, 2025
ngazagna-qc pushed a commit to ngazagna-qc/transformers that referenced this pull request Oct 23, 2025
* Add typer-slim as explicit dependency

* Migrate CLI to Typer

* code quality

* bump release candidate

* adapt test_cli.py

* Remove ./commands + adapt tests

* fix quality

* consistency

* doctested

* do not serve model in chat

* style

* will it fix them?

* fix test

* capitalize classes

* Rebase

* Rebase

* tests + fixup

tests + fixup

* csutom error message

* fix ?

* should be good

* fix caplog globally

* inner caplog

* last attempt

* Retry

* Let's try with capsys disabled

---------

Co-authored-by: Lysandre <hi@lysand.re>
SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Jan 23, 2026