Engineering

Switching a Big Python Library from setup.py to pyproject.toml

March 06, 2025
Nate Nowack
Senior Software Engineer

On December 29th, 2022, zanie opened this issue to suggest we (prefecthq/prefect) migrate from the Python packaging setup of old (e.g. setup.py, setup.cfg) to the now-standard pyproject.toml paradigm for defining project and build configuration.

Beyond being modern best practice, it's just a lot more convenient to have everything - all your pytest, ruff, etc. config - in the same place. And in case you don't trust my opinion, here's another reason: uv's top-level API (sync, lock, etc.) requires a pyproject.toml 🙂.
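To make the "everything in one place" idea concrete before we dig in, here's a minimal, purely illustrative pyproject.toml - the table names are standard, but the project name and values are made up:

```toml
[project]
name = "my-library"            # hypothetical project
version = "0.1.0"
requires-python = ">=3.9"
dependencies = ["httpx>=0.27"]

# tool configuration lives right alongside the package metadata
[tool.pytest.ini_options]
testpaths = ["tests"]

[tool.ruff]
line-length = 100
```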

Here we are, over 2 years later 😅 but we've finally come around to it. prefect is a large codebase that historically required these config files for linting, packaging, tests, etc.:

  • .ruff.toml
  • requirements.txt
  • requirements-dev.txt
  • requirements-client.txt
  • requirements-otel.txt
  • requirements-markdown-tests.txt
  • setup.cfg
  • setup.py
  • versioneer.py
  • MANIFEST.in

And now, all of that is replaced by a single pyproject.toml file.

So, How'd We Do It?

Let's walk through each major aspect of our package configuration and how we migrated it, focusing on the benefits we've seen.

Package Metadata, Dependencies, and Extras

Previously, we had a setup.py that handled our package metadata and dependencies by reading from multiple requirements files:

```python
from pathlib import Path
import versioneer
from setuptools import find_packages, setup

def read_requirements(file: str) -> list[str]:
    requirements: list[str] = []
    if Path(file).exists():
        requirements = open(file).read().strip().split("\n")
    return requirements

client_requires = read_requirements("requirements-client.txt")
install_requires = read_requirements("requirements.txt")[1:] + client_requires
dev_requires = read_requirements("requirements-dev.txt")
otel_requires = read_requirements("requirements-otel.txt")

setup(
    name="prefect",
    description="Workflow orchestration and management.",
    packages=find_packages(where="src"),
    package_dir={"": "src"},
    python_requires=">=3.9",
    install_requires=install_requires,
    extras_require={
        "dev": dev_requires,
        "otel": otel_requires,
        "aws": "prefect-aws>=0.5.0",
        # ... many more extras
    },
)
```

This approach had several drawbacks:

  • Multiple sources of truth for dependencies (easy to make inconsistent updates)
  • Harder to track which dependencies were needed for what purpose (e.g. when installing extras in CI)

Now with hatch (a modern Python build tool), all of this lives directly in pyproject.toml:

```toml
[project]
name = "prefect"
description = "Workflow orchestration and management."
requires-python = ">=3.9"
dependencies = [
    "aiosqlite>=0.17.0,<1.0.0",
    "alembic>=1.7.5,<2.0.0",
    # ... more dependencies
]

[project.optional-dependencies]
aws = ["prefect-aws"]
# ... many more extras

[dependency-groups]
dev = ["..."] # all of our dev dependencies
```

Note that dev is in the [dependency-groups] table and not in [project.optional-dependencies] - dependency groups (standardized in PEP 735) allow you to group dependencies in a way that won't be exposed in published project metadata. The dev group receives a little bit of special treatment from uv (which we'll see later on when we run our tests).

This consolidation creates a single source of truth for our dependencies, making it immediately clear what's required for each domain of the project. It also enables us to use modern tools like uv that can automatically manage our environment based on this configuration.

Managing Integration Packages with uv Workspaces

Prefect has a core library and multiple integration packages (AWS, GCP, Kubernetes, etc.) that live in the same repository. With our new setup, we've configured each integration package with its own pyproject.toml, while using uv's source references to link them together during development.

In our main pyproject.toml, we define paths to all integration packages:

```toml
[tool.uv.sources]
prefect-aws = { path = "src/integrations/prefect-aws" }
prefect-azure = { path = "src/integrations/prefect-azure" }
prefect-gcp = { path = "src/integrations/prefect-gcp" }
# ... other integrations
```

And in each integration package's pyproject.toml, we reference the main Prefect package:

```toml
# In src/integrations/prefect-aws/pyproject.toml
[tool.uv.sources]
prefect = { path = "../../../" }
```

This approach allows us to:

  1. Maintain separate package configurations for each integration
  2. Develop and test integrations against the local Prefect codebase
  3. Keep dependencies properly isolated while still having a unified development experience
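Putting those pieces together, a simplified sketch of an integration package's pyproject.toml might look like this - the real files carry more metadata, and the version bound here is made up:

```toml
# src/integrations/prefect-aws/pyproject.toml (illustrative sketch)
[project]
name = "prefect-aws"
dependencies = [
    "prefect>=3.0.0",  # resolved from PyPI in published metadata...
]

[tool.uv.sources]
prefect = { path = "../../../" }  # ...but from the local checkout during development
```

The key trick is that the [tool.uv.sources] override only applies during local development; the published wheel still depends on prefect from PyPI.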

Once you have the prefect repo cloned, install dependencies for all integrations by running:

```bash
uv sync --all-extras
```

Because integrations depend on prefect from PyPI, the old setup was especially painful when working on integrations locally. You needed to do an editable install from an integration's root, then go back to the project root and install prefect as editable to pick up changes from core. Beyond the tedium, this often broke editors' understanding of the resulting venv in annoying ways.

Version Management

Version management used to be handled by a customized versioneer.py with more configuration in setup.cfg:

```ini
[versioneer]
VCS = git
style = pep440
versionfile_source = src/prefect/_version.py
versionfile_build = prefect/_version.py
version_regex = ^(\d+\.\d+\.\d+(?:[a-zA-Z0-9]+(?:\.[a-zA-Z0-9]+)*)?)$
```

When evaluating modern alternatives, we initially looked at hatch-vcs, but found it didn't offer the same level of customization we needed to maintain continuity with our existing versioning scheme. Specifically, we needed to:

  1. Support our existing version format for backward compatibility
  2. Generate a version file with additional metadata (build date, git commit)
  3. Handle development versions with specific formatting

After exploring several options, we settled on versioningit, which provides the flexibility we needed while integrating nicely with hatch:

```toml
[tool.hatch.version]
source = "versioningit"

[tool.versioningit.vcs]
match = ["[0-9]*.[0-9]*.[0-9]*", "[0-9]*.[0-9]*.[0-9]*.dev[0-9]*"]
default-tag = "0.0.0"

[tool.versioningit.format]
distance = "{base_version}+{distance}.{vcs}{rev}"
dirty = "{base_version}+{distance}.{vcs}{rev}.dirty"
distance-dirty = "{base_version}+{distance}.{vcs}{rev}.dirty"
```
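To make those format templates concrete, here's a quick sketch of how a between-releases version string gets assembled - versioningit derives fields like these from git state, but the values below are hypothetical:

```python
# Hypothetical template fields; versioningit computes these from the repo
fields = {
    "base_version": "3.2.0",  # most recent matching tag
    "distance": 5,            # commits since that tag
    "vcs": "g",               # marker for git
    "rev": "abc1234",         # short commit hash
}

# Same templates as our [tool.versioningit.format] table
distance_format = "{base_version}+{distance}.{vcs}{rev}"
dirty_format = "{base_version}+{distance}.{vcs}{rev}.dirty"

print(distance_format.format(**fields))  # 3.2.0+5.gabc1234
print(dirty_format.format(**fields))     # 3.2.0+5.gabc1234.dirty
```

So a checkout five commits past the 3.2.0 tag gets a PEP 440 local version like 3.2.0+5.gabc1234, with a .dirty suffix if there are uncommitted changes.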

One particularly nice feature of versioningit is the ability to write version information to a file during build time using a custom script. This allowed us to maintain backward compatibility with code that relied on our previous version information format.

<details>
<summary>custom script</summary>

```python
import textwrap
from datetime import datetime, timezone
from pathlib import Path
from subprocess import CalledProcessError, check_output
from typing import Any


def write_build_info(
    project_dir: str | Path, template_fields: dict[str, Any], params: dict[str, Any]
) -> None:
    """
    Write the build info to the project directory.
    """
    path = Path(project_dir) / params.get("path", "src/prefect/_version.py")

    try:
        git_hash = check_output(["git", "rev-parse", "HEAD"]).decode().strip()
    except CalledProcessError:
        git_hash = "unknown"

    build_dt_str = template_fields.get(
        "build_date", datetime.now(timezone.utc).isoformat()
    )
    version = template_fields.get("version", "unknown")
    dirty = "dirty" in version

    build_info = textwrap.dedent(
        f"""\
            # Generated by versioningit
            __version__ = "{version}"
            __build_date__ = "{build_dt_str}"
            __git_commit__ = "{git_hash}"
            __dirty__ = {dirty}
            """
    )

    with open(path, "w") as f:
        f.write(build_info)
```

</details>

```toml
[tool.versioningit.write]
method = { module = "write_build_info", value = "write_build_info", module-dir = "tools" }
path = "src/prefect/_build_info.py"
```
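For a sense of what that produces, here's a sketch that renders the same template with hypothetical values and then executes the generated source the way an import of the written file would - the version string and hashes below are made up:

```python
import textwrap

# Hypothetical values standing in for what versioningit would supply
version = "3.2.0+5.gabc1234"
build_date = "2025-03-06T00:00:00+00:00"
git_hash = "abc1234def5678"
dirty = "dirty" in version

# Same shape of template as the custom write script
build_info = textwrap.dedent(
    f"""\
    # Generated by versioningit
    __version__ = "{version}"
    __build_date__ = "{build_date}"
    __git_commit__ = "{git_hash}"
    __dirty__ = {dirty}
    """
)

# Executing the generated source mimics importing the written module
namespace: dict[str, object] = {}
exec(build_info, namespace)
print(namespace["__version__"])  # 3.2.0+5.gabc1234
print(namespace["__dirty__"])    # False
```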

Build Configuration

Our old build configuration was split between setup.py and setup.cfg. Now it's all handled by hatch:

```toml
[build-system]
requires = ["hatchling", "versioningit"]
build-backend = "hatchling.build"

[tool.hatch.build]
artifacts = ["src/prefect/_build_info.py", "src/prefect/server/ui"]

[tool.hatch.build.targets.sdist]
include = ["/src/prefect", "/README.md", "/LICENSE", "/pyproject.toml"]
```

This consolidation has significantly simplified our build process. We no longer need to maintain separate files for different aspects of the build, and the declarative nature of TOML makes it much easier to understand and modify the configuration.

One piece of nuance here is our inclusion of src/prefect/server/ui in the sdist. This is a directory that is .gitignore'd, but is generated at UI build time. We include it in the sdist so that after installing prefect from PyPI users can run the dashboard with prefect server start.

Development Tooling Configuration

Previously, tool configurations were also scattered across multiple files. In our case, we had:

  • .ruff.toml for linting configuration
  • setup.cfg for assorted tooling configuration (pytest, mypy, etc)
  • .codespellrc for spellcheck configuration via codespell

Now (similar to the dependency declarations) they're all in one place:

```toml
[tool.mypy]
plugins = ["pydantic.mypy"]
ignore_missing_imports = true
follow_imports = "skip"
python_version = "3.9"

[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "-rfEs --mypy-only-local-stub"
norecursedirs = ["*.egg-info"]
python_files = ["test_*.py", "bench_*.py"]
python_functions = ["test_*", "bench_*"]
markers = [
    "service(arg): a service integration test. For example 'docker'",
    "clear_db: marker to clear the database after test completion",
]

[tool.ruff]
...

[tool.codespell]
...
```

This consolidation makes it much easier to find and modify tooling configuration.

CI/CD Improvements

One of the most pleasant improvements we've seen from this migration is in our CI/CD process.

Previously, some or all of our CI pipelines had to:

  1. Install multiple requirements files in the correct order
  2. Implement some relatively manual dependency caching
  3. Use complex multi-step processes for different test scenarios

Looking at our GitHub Actions workflows now, we've dramatically simplified dependency installation across all our test jobs. For example, the core of our python-tests.yaml is now just:

```yaml
jobs:
  run-tests:
    steps:
      - name: Set up uv and Python ${{ matrix.python-version }}
        uses: astral-sh/setup-uv@v5
        with:
          enable-cache: true
          python-version: ${{ matrix.python-version }}
          cache-dependency-glob: "pyproject.toml"

      - name: Run tests
        run: |
          uv run pytest ${{ matrix.test-type.modules }} \
          --numprocesses auto \
          --maxprocesses 6 \
          --dist worksteal \
          --disable-docker-image-builds \
          --exclude-service kubernetes \
          --exclude-service docker \
          --durations 26
```

All by itself, uv run will inspect the project dependencies, install the dev group by default (pass --no-dev if you don't want it), and then run pytest according to our flags and the pytest config in pyproject.toml.

We use slight variations of run and sync for scenarios with varying requirements:

  • uv run pytest to simply run tests as configured by our project (requires no setup! ✨)
  • uv sync --compile-bytecode --no-editable to install deps for static analysis
  • uv sync --group benchmark --compile-bytecode to install deps for benchmarks
  • uv sync --group markdown-docs to install deps for documentation tests

Suffice it to say that uv makes everything easier - but perhaps most significantly, our workflow files are now much cleaner, which makes them easier to read and maintain.

Compare the before:

```yaml
- name: Install dependencies
  run: |
    uv pip install ".[dev]"
    uv pip install -r requirements-otel.txt
    uv pip install -r requirements-markdown-tests.txt
```

To the after:

```yaml
- name: Install dependencies
  run: uv sync --group markdown-docs --extra otel
```

💡 Using the top-level uv API allows us to more concisely and consistently install dependencies needed for different CI jobs.

Recap

This migration has delivered several concrete improvements:

  • Project configuration consolidated from nearly a dozen files into a single pyproject.toml
  • Simplified build configuration with hatch and versioningit
  • Reliable one-step contributor setup with uv sync
  • Consistent and concise installation of specific dependency groups in CI
  • Faster, more efficient workflows with modern tools
  • Clean separation between core and integration packages while maintaining a unified development experience

For teams considering a similar migration, we recommend:

  1. Start with the PyPA guide on migrating to pyproject.toml
  2. Choose tools that work well together - we found hatch, versioningit, and uv to be excellent choices for us
  3. Consider how your build and dependency management affects different aspects of your workflow, from development to CI/CD

By embracing modern packaging standards and tools, we've not only simplified our configuration but also improved the development experience for our team and contributors.

Have any questions or believe there’s a mistake in this post? Get a hold of us on GitHub!

Bonus!

This has been a library-focused blog post, but check out this great YouTube video by Hynek Schlawack where he explains his app-focused approach to structuring projects with pyproject.toml and uv.