_validate_pyproject has inconsistent provenance #4472

Closed
jaraco opened this issue Jul 17, 2024 · 5 comments

jaraco (Member) commented Jul 17, 2024

The code in _validate_pyproject is generated, so it includes vendored copies of the validate-pyproject project and its dependencies, but it doesn't follow the pattern for vendored dependencies, which leads to confusion when users suggest changes to that code. It's unclear to me, even after reviewing the technique for generating the package, which code is generated and which code is copied (I haven't gone as far as to review the pre_compile logic).

I'd like to move this code out of setuptools or treat it like any other vendored dependency, for clarity and consistency.

In #2825 (comment), @mgorny proposed to create a new, distinct package that contains the generated code. That approach seems suitable to me. Something like setuptools-validate-pyproject. That dependency could then be vendored just like any other.

An alternative to consider would be to move the generated code into a separate git repo and use git submodules to link it in. That would create the separation, encapsulate the generated code, and provide a clear custodial trail (users couldn't link the code in this repo and wouldn't be inclined to provide PRs to it here), but it would still be integrated into the release (as it is today). I'm not confident about this approach or what pitfalls it might entail, so I'm inclined to focus on vendoring instead.

Another option could be to avoid the static generation and instead have validate-pyproject generate the code on demand. Then setuptools could simply depend on validate-pyproject or maybe validate-pyproject[setuptools,distutils] (again, vendored per Setuptools' vendoring scheme).
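To make the contrast between static generation and on-demand generation concrete, here is a toy sketch. It is not validate-pyproject's API: the schema format and the names `TOY_SCHEMA`, `generate_source`, and `load_validator` are hypothetical, invented purely for illustration.

```python
# Toy illustration only: NOT the validate-pyproject API.
# A "schema" here is just a mapping of required key -> expected builtin type name.
TOY_SCHEMA = {"name": "str", "version": "str"}


def generate_source(schema: dict) -> str:
    """Emit Python source for a validate() function specialised to `schema`."""
    lines = ["def validate(data):"]
    for key, type_name in schema.items():
        lines.append(f"    if {key!r} not in data:")
        lines.append(f"        raise ValueError('missing key: ' + {key!r})")
        lines.append(f"    if not isinstance(data[{key!r}], {type_name}):")
        lines.append(f"        raise ValueError({key!r} + ' must be {type_name}')")
    lines.append("    return True")
    return "\n".join(lines) + "\n"


def load_validator(source: str):
    """Compile generated source in a fresh namespace and return validate()."""
    namespace: dict = {}
    exec(source, namespace)
    return namespace["validate"]


# Static generation: this string would be written to a file once and committed,
# so the generator is only needed at development time.
static_source = generate_source(TOY_SCHEMA)

# On-demand generation: the same source is produced and compiled at runtime,
# so the generator has to be available as a runtime dependency.
validate = load_validator(generate_source(TOY_SCHEMA))
```

In the static case the generated file lives in the source tree and must be regenerated whenever the schema changes; in the on-demand case nothing generated is committed, but the generator itself becomes a dependency that has to be shipped or vendored.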

@abravalheri How do you feel about having a new, separate package for the validator?

abravalheri (Contributor) commented

Thanks, Jason. I can have a look at this when I am back in another week. If I understood correctly, the only difference here would be to distribute the precompiled code in a separate package.

> which code is generated and which code is copied (I haven't gone as far as to review the pre_compile logic).

Most of the code is generated. The only files that are copied are the formats.py file and the exception-related modules: https://github.com/abravalheri/validate-pyproject/blob/c150b1583781c735d9ed1d3878f79c97fee61d71/src/validate_pyproject/pre_compile/__init__.py#L50-53 (in my mind, treating it as "mostly" vendorised is imprecise; a closer analogy is code generators like protobuf).

jaraco (Member, Author) commented Jul 20, 2024

> (in my mind, treating it as "mostly" vendorised is imprecise; a closer analogy is code generators like protobuf)

Thanks for the clarification. In that case, I feel less strongly about it, though I still think it would be nice to move the generated code into its own library or repo for better clarity. I'd like to explore the prospect of using a git submodule. Thanks for the insight; it's fine to consider this issue low priority.

abravalheri (Contributor) commented

Hi @jaraco, sorry for the delay in replying (I am currently away from my computer).

My preferred approach is something like #4364.

The reason is the following:

By collocating the JSON schemas inside the setuptools project, we make the development process more agile and more accessible for contributors.

Currently, if we want to add a new configuration or fix a bug, I need to do it in the validate-pyproject repository. This is a bit contrived.

I would still keep the compilation step and keep the generated artifacts in the source tree because they simplify dependencies and allow us to bypass the complication of vendoring (which would introduce transitive dependencies).

abravalheri (Contributor) commented

Does #4364 make the provenance of the _validate_pyproject folder clearer?

I hope it separates the concerns that come from setuptools directly (e.g. the tool.setuptools folder) and the infrastructure to compile such validations coming from _validate_pyproject.

Since the setuptools project is free to evolve the structure of tool.setuptools and any other TOML table it supports, it makes sense to me to keep the generated code and the original schemas collocated in the setuptools repository. It also makes development more agile, because any contributor can now modify the JSON schemas and run tox -e generate-validation-code to see instant results without having to wait for changes in validate-pyproject.

jaraco (Member, Author) commented Aug 12, 2024

Yes. That helps a lot. Let's see how that goes.

@jaraco closed this as completed Aug 12, 2024