Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support SDW components that manage dependencies using poetry #459

Closed
legoktm opened this issue Jul 31, 2023 · 11 comments · Fixed by #467
Closed

Support SDW components that manage dependencies using poetry #459

legoktm opened this issue Jul 31, 2023 · 11 comments · Fixed by #467

Comments

@legoktm
Copy link
Member

legoktm commented Jul 31, 2023

We parse requirements/requirements.txt in a few places, those need to be updated to parse poetry.lock. This should be easier since it's a TOML file, but also that means we need a TOML library for bullseye since it's only in the standard library with bookworm.

Places needing updates (may not be complete):

  • update-requirements - parses name+version out of requirements.txt
  • build-sync-wheel - passes requirements.txt over to pip download
@eloquence
Copy link
Member

I'd be game to take a stab the necessary scripting changes identified here, unless you or another team member prefer to work on it.

To recap my understanding:

  • As stated in the original poetry proposal, we're going to continue to use pip to install build-time requirements. Indeed, poetry does not yet support installing files from a local folder.
  • This is about ensuring that we can smoothly transition any of our existing repos over to Poetry.

I would suggest implementing this, for now, in a backwards-compatible manner, so we don't have to migrate all repositories to Poetry at once. We could maybe try moving a small repo like securedrop-proxy over, allowing us to test the tooling changes. Once all components use Poetry, we can remove the backwards-compatibility code.

@eloquence
Copy link
Member

I've got a poetry branch in securedrop-proxy. Next up I'll see if I can successfully build Debian packages from the poetry.lock requirements definition, building on your wip/poetry branch in this repo.

One thing I'm wondering is if we should migrate from setuptools to Poetry as a build system as part of switching to Poetry, but we can discuss that separately, for now I'm leaving the setuptools bits in place.

@legoktm
Copy link
Member Author

legoktm commented Aug 25, 2023

One thing I'm wondering is if we should migrate from setuptools to Poetry as a build system as part of switching to Poetry, but we can discuss that separately, for now I'm leaving the setuptools bits in place.

Whatever ends up being simpler I think. Normally I'm in favor of jettisoning extra dependencies, but setuptools is key enough to the entire Python ecosystem that we're going to end up depending on it regardless. Also setuptools has a long history of being stable and reasonably well maintained.

build-sync-wheel - passes requirements.txt over to pip download

I was thinking we could just do pip download foo==0.0.0 bar==0.0.0 etc. but we've had issues with pip download in the past (e.g. #457), so long-term we could also just download the tarballs ourselves? pypi URLs aren't easy to auto download from, but Debian has a redirector set up, so stuff like wget https://pypi.debian.net/setuptools/setuptools-68.1.2.tar.gz works.

@eloquence
Copy link
Member

eloquence commented Aug 28, 2023

but also that means we need a TOML library for bullseye since it's only in the standard library with bookworm.

Turns out we're already installing the older toml library via build in the bootstrap logic:
https://github.com/freedomofpress/securedrop-builder/blob/main/workstation-bootstrap/requirements.txt#L91

I've tried parsing securedrop-proxy's poetry.lock file with both toml, tomli and tomllib and the result is always the same, byte for byte. That said, I'd suggest we use tomli as a transitional library (until Bookworm/Python 3.11 gives us tomllib), since toml is still a developer release and does not seem to have fully implemented the TOML 1.0 spec (uiri/toml#300). build itself appears to have switched to tomli.

@eloquence
Copy link
Member

poetry-support has my current WIP; I've focused on the verify-hashes script for now since it's needed to get any build working. With those changes, and the only-poetry branch in securedrop-proxy, I'm able to build a securedrop-proxy package. I'll focus on the other scripts next, building on your existing WIP work.

I ended up just sticking with toml on Python 3.9 since it doesn't really seem worth adding an extra dependency.

As I go I'm updating some parts of the documentation and code comments that seem outdated or potentially confusing (e.g., I'm avoiding references to a "PyPI mirror" since I don't think that's the best terminology to describe our use of a local wheels directory without an index).

@eloquence
Copy link
Member

eloquence commented Sep 9, 2023

Progress update:

  • The only-poetry branch in securedrop-proxy now no longer enumerates transitive dependencies in pyproject.toml (but still pins them at the expected versions in poetry.lock).
  • I've cherry-picked Kunal's work in wip/poetry. Based on that, I've continued the consolidation work of parsing logic in utils.py. It's a bit of a mess right now; I'll do another cleanup/consolidation pass on it.
  • That said, I was able to now produce valid build-requirements.txt for securedrop-proxy using either input format.

Side quest:

  • Sadly (because we want to use them interchangeably as future-proofing for Python 3.11), the toml.load and tomllib.load function operate slightly differently -- the former expects a non-binary file object, the latter expects a binary one. To avoid using a different library or special-casing, I'm going with loads for now.
  • I did briefly explore adding tomli to the bootstrap. However, that resulted in some fun circular dependency hell (tomli violates PEP517 Build Requirements hukkin/tomli#154, which later versions of flit resolved by vendoring tomli, but we're stuck with an earlier version due to our other bootstrap requirements being so old). I look forward to refreshing our build toolchain as a whole to fight back against entropy.

@eloquence
Copy link
Member

eloquence commented Sep 12, 2023

OK, I'm happier with utils.py now. Next up: build-sync-wheels, then I'll take a pass through the tests.

I was thinking we could just do pip download foo==0.0.0 bar==0.0.0

I'll try that for now.

@eloquence
Copy link
Member

eloquence commented Sep 13, 2023

I took the easy way out for the source download step: poetry export seems to work just fine to generate the requirements.txt format. By default this excludes dev dependencies. The approach doesn't seem unreasonable to me; we're just asking poetry to speak pip's language when we use pip. In contrast, where we interpret the requirements ourselves, we parse Poetry's format directly.

This means we'll want poetry in the boostrap as well, but we'll probably need that sooner or later anyway. That's not in the branch yet.

From reading about pip download's surprising behavior (pypa/pip#1884), I agree entirely that we should switch to direct download and verification. Having the download step execute package-specific code (!) seems unnecessary and likely to bite us again in future in the form of random errors. I would suggest we track this separately.

CI is now fully green on poetry-support but I want to expand the tests a bit to cover the new behavior before, finally, opening a couple of PRs. In the meantime, please don't hesitate to let me know if the approach in the poetry-support branch raises any concerns or if I've missed any major functionality.

@eloquence
Copy link
Member

This means we'll want poetry in the boostrap as well,

.. he said foolishly, taunting the dependency demons. #464 looks like it'll be a good chunk of work and we may want to have "the conversation" first, of whether we really need to localwheel our localwheel-building environment (:xzibit:). In any event, I just added a smol utility function for now that should do the trick.

@legoktm
Copy link
Member Author

legoktm commented Sep 14, 2023

poetry has a lot of dependencies (comparatively) as it doesn't vendor them like pip does, which means adding it into the bootstrap is going to be a lot of work. I would suggest we avoid needing to bootstrap poetry by just installing it from Debian packages.

The only problem with doing so is it only shipped in bookworm, so we either wait for us to drop bullseye support or implement #460 in a way that lets us run different steps of the process with different containers/OSes. So maybe in the short term we just try to avoid running poetry? :/

@eloquence
Copy link
Member

Yeah, that's the conclusion I landed on as well - poetry-support branch now does the pip download step without using poetry. I agree, if we need it down the line in this repo, using the Bookworm version makes sense to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants