
Pin transitive dependencies #1495

@pquentin

Description

In the Python packaging world, libraries are handled quite differently from applications.

|  | Libraries | Applications |
| --- | --- | --- |
| Publish | Libraries are published as wheels on a package index (either a private registry or PyPI) | Private applications are usually published as Docker images before deployment |
| Dependencies | Libraries should not pin their dependencies, in order to play well with other libraries | Applications should pin all dependencies, including transitive ones, to make sure that we test the dependencies we ship |
| Tooling | Libraries are expected to use setuptools or hatch | Applications are expected to use Pipenv or Poetry |

However, Rally is in a rough spot, because it is an application that we publish on PyPI like a library. We currently use hatch, which was designed primarily for libraries, yet we pin our install requirements. This is perfectly fine in my opinion, because publishing to PyPI is very convenient, but we are fighting our tools. (And to be clear, I think this issue is orthogonal to #1420.)

If we were using pip-compile, Pipenv or Poetry to build a private application as a Docker image, then those two operations would be easy:

  1. upgrading our dependencies (which lets us pick up the latest features and avoid bugs that have already been fixed upstream)
  2. pinning all transitive dependencies (to make sure that when a pull request passes CI, the exact same dependency versions are the ones we ship)
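
With pip-tools, both operations boil down to maintaining one abstract input file and compiling it. The file contents below are illustrative placeholders, not Rally's actual dependency list:

```
# requirements.in -- abstract, hand-maintained
elasticsearch[async]
tabulate

# requirements.txt -- generated with `pip-compile --upgrade requirements.in`;
# every transitive dependency comes out pinned to the version that was tested
elasticsearch[async]==7.17.4
urllib3==1.26.9          # via elasticsearch
certifi==2022.5.18       # via elasticsearch
tabulate==0.8.9
```

Re-running `pip-compile --upgrade` performs operation 1; the pinned output file is operation 2.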

It's actually point 2 that prompted this issue. We don't pin urllib3, and #1493 used an import introduced in urllib3 1.26.7 last year. The Rally CI worked fine, because it resolved the latest urllib3 version (1.26.9). But our nightly environments had urllib3 1.25.8, installed back in 2020, and every `pip install --upgrade .[develop]` call kept that version, because pip's default upgrade strategy (`only-if-needed`) leaves already-satisfied transitive dependencies alone. This caused a benchmark failure because the import was not found. The fix will be to pin urllib3 manually, but that does not scale to all our existing dependencies.
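
The failure mode is easy to reproduce in miniature: new code introduces a version floor that a stale environment does not satisfy. A minimal sketch (the version numbers are the ones from this issue; the `at_least` helper is hypothetical, not Rally code):

```python
def at_least(installed: str, floor: str) -> bool:
    """Naive check that an installed 'x.y.z' version meets a minimum version."""
    as_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return as_tuple(installed) >= as_tuple(floor)

# CI resolved the latest urllib3, so the new import was available...
print(at_least("1.26.9", "1.26.7"))  # True
# ...but the nightly environment kept its 2020-era version, and the import failed.
print(at_least("1.25.8", "1.26.7"))  # False
```

Nothing in CI exercises the older version, so the gap only shows up in the environment that was installed first and upgraded least.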

To avoid this problem in the future, how can we pin our dependencies and have a mechanism to update them? There is no widely adopted tool that supports doing that and writing the result to pyproject.toml. Indeed, this authoritative post on setup.py insists that you should never pin dependencies like we do. And since this is seen as a big anti-pattern by everyone working on Python packaging, all proposals to make our use case easier are usually rejected or ignored (flit, pip-tools, setuptools, poetry).

Here's a proposal that solves our issue:

  1. Declare pinned dependencies in pyproject.toml
  2. Declare dependencies in a requirements.in file, leaving most of them abstract (but not the Elasticsearch Python client)
  3. Write or reuse a tool that pins the requirements with pip-tools (which is lighter than Pipenv/Poetry), then reads the pinned requirements and writes them to setup.cfg / pyproject.toml. We can call it manually from time to time.
  4. While we're at it, develop and test dependencies can also move to requirements files - this will be abstracted away by make install anyway
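
Step 3 could be a small script along these lines: parse the `name==version` pins out of pip-compile's output and render them as a dependencies array for pyproject.toml. This is a hypothetical sketch of such a tool, not an existing one, and the package names and versions are illustrative:

```python
import re


def parse_pins(compiled: str) -> list[str]:
    """Extract 'name==version' pins from pip-compile output, skipping comments."""
    pins = []
    for line in compiled.splitlines():
        line = line.split("#", 1)[0].strip()  # drop '# via ...' annotations
        if line and "==" in line:
            pins.append(line)
    return pins


def render_dependencies(pins: list[str]) -> str:
    """Render the pins as a pyproject.toml [project] dependencies array."""
    body = ",\n".join(f'  "{pin}"' for pin in sorted(pins))
    return f"dependencies = [\n{body},\n]"


# Illustrative pip-compile output, embedded here to keep the sketch self-contained.
compiled = """\
# This file is autogenerated by pip-compile
certifi==2022.5.18
    # via elasticsearch
urllib3==1.26.9
    # via elasticsearch
elasticsearch==7.17.4
"""

# Prints a dependencies = [...] array with certifi, elasticsearch and urllib3 pinned.
print(render_dependencies(parse_pins(compiled)))
```

Writing the rendered array back into pyproject.toml (for example with a TOML-preserving editor such as tomlkit) is the only part that needs care, since the rest of the file must survive round-trips.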

(I also considered using OpenStack pbr - https://docs.openstack.org/pbr/latest/user/index.html but that would force us to keep using setuptools.)
