QoL: use uv for package installs and limit fineweb download to 500M tokens#222

Open
alexandermorgan wants to merge 3 commits into KellerJordan:master from alexandermorgan:master

@alexandermorgan
This is a simple quality-of-life PR, not a new record. It changes the two bash commands in the README so that they use uv to install the Python packages. For the Docker case, this also required minor changes to the Dockerfile. These changes cut the Docker package install time from 3m30s to 59s for me. This PR also limits the number of tokens downloaded from FineWeb to 500M, which is sufficient given the recent token efficiency improvements.
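As a rough illustration of the 500M-token limit, the sketch below converts a token budget into a number of data shards to fetch. The shard size of 100M tokens and the variable names are assumptions made for this example, not values taken from this PR; adjust them to match the actual download script.

```shell
# Token budget described in this PR.
TOKENS_WANTED=500000000

# ASSUMPTION: each downloaded shard holds 100M tokens (hypothetical value).
TOKENS_PER_SHARD=100000000

# Ceiling division, so a budget that is not a whole multiple still gets covered.
NUM_SHARDS=$(( (TOKENS_WANTED + TOKENS_PER_SHARD - 1) / TOKENS_PER_SHARD ))

echo "$NUM_SHARDS"   # prints 5 under the assumption above
```

Under these assumptions the result would be passed to the download script as its shard count.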

Why use --system in the uv pip install command? Because uv otherwise defaults to installing into a virtualenv, which is not needed in this case (assuming people are renting a hardware instance for the runs).
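A minimal sketch of the non-Docker install path, wrapped in a function so that nothing is installed until it is called. The requirements.txt file name and the install_deps helper are assumptions for illustration; the exact command in the README may differ.

```shell
# ASSUMPTION: dependencies are listed in requirements.txt at the repo root.
REQ_FILE="requirements.txt"

# Hypothetical helper; defining it installs nothing.
install_deps() {
    # Bootstrap uv itself with the existing pip.
    pip install uv
    # --system targets the system Python rather than a virtualenv.
    uv pip install --system -r "$REQ_FILE"
}
```

Calling install_deps on a freshly rented instance would then perform the install. Inside an existing virtualenv, the --system flag would normally be dropped.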

Why doesn't the Docker command pip install uv like the other command does? Because this is the way the Astral team recommends installing uv in a Docker container.

Why use uv pip install instead of uv sync plus a pyproject.toml file? The performance difference is negligible, so uv pip install is preferred because it is a simple drop-in replacement.
