PoC for metadata-based invalidation #20914

Closed
wants to merge 2 commits into from


tdyas
Contributor

@tdyas tdyas commented May 13, 2024

This PR is a proof of concept for invalidating adhoc_tool and shell_command targets based on the metadata (and not just the content) of the files in the repository.

The PR adds:

  • A new intrinsic (and rule graph support) for getting "full" metadata of a filesystem entry.
  • Use of that support for adhoc_tool and shell_command target types via a new invalidation_globs field.

Naming and API design subject to change.
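To illustrate the distinction the description draws between content-based and metadata-based invalidation, here is a minimal standalone sketch. It is not the code in this PR: the function names `content_fingerprint` and `metadata_fingerprint` are hypothetical, and the choice of which `stat` fields to hash (size, mode, mtime) is an assumption made only for illustration.

```python
import hashlib
import os
from pathlib import Path


def content_fingerprint(paths):
    """Hash only the bytes of each file (content-based invalidation)."""
    digest = hashlib.sha256()
    for path in sorted(paths):
        digest.update(str(path).encode())
        digest.update(Path(path).read_bytes())
    return digest.hexdigest()


def metadata_fingerprint(paths):
    """Hash selected stat() fields of each file (metadata-based invalidation).

    Which fields matter is an assumption of this sketch, not Pants behavior.
    """
    digest = hashlib.sha256()
    for path in sorted(paths):
        st = os.stat(path)
        record = f"{path}:{st.st_size}:{st.st_mode}:{st.st_mtime_ns}"
        digest.update(record.encode())
    return digest.hexdigest()
```

Under these assumptions, running `chmod +x` on a script leaves `content_fingerprint` unchanged but changes `metadata_fingerprint`, which is the kind of change a content-only check would miss.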

@tdyas
Contributor Author

tdyas commented Jun 1, 2024

@stuhood: I rebased this PR on top of main with your change to intrinsics to make them (appear as?) functions. I'm getting rule graph errors now: https://github.com/pantsbuild/pants/actions/runs/9332193530/job/25687760655?pr=20914#step:9:344

What do I need to do to properly add an intrinsic after the intrinsics-as-functions change?

EDITED: Solved it. The issue was getting the return type wrong in the intrinsic's type stubs.

@tdyas
Contributor Author

tdyas commented Jun 4, 2024

The intrinsic has been reworked and split out to #20996. This PR remains a proof of concept so I can still explore the best DX for metadata-based invalidation.

@huonw
Contributor

huonw commented Jun 5, 2024

We've just branched for 2.22, so merging this pull request now will come out in 2.23. Please move the release notes updates to docs/notes/2.23.x.md, if appropriate. Thank you!

tdyas added a commit that referenced this pull request Jun 20, 2024

Add a new `path_metadata_request` intrinsic rule to allow rule code to
request metadata about paths in the filesystem.

The intended uses are to support:
- Some form of metadata-based invalidation for `adhoc_tool` and
`shell_command` targets, as explored in the proof of concept in
#20914. (This PR in fact is
split out from that other PR.)
- Future potential work to avoid indefinite negative caching of
PATH-style lookups (which is a problem of the existing BinaryPath
lookups) by switching to direct monitoring of system-level paths. This
API would likely be extended to that use case. (But I am not yet
committing to any particular solution. Still, some form of metadata API
will be useful.)

Metadata is represented by the `PathMetadata` dataclass.
`PathMetadataRequest` and `PathMetadataResult` are the input/output
types, respectively, for the intrinsic.
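
As a rough picture of how these three types relate, the following plain-Python sketch models a request/result pair around a metadata record. Only the class names come from the commit message. The field names, the use of `None` for a missing path, and the helper `lookup_path_metadata` are assumptions for illustration and are not the Pants definitions or the engine intrinsic.

```python
from __future__ import annotations

import os
import stat
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PathMetadata:
    # Field names are illustrative assumptions, not the Pants definition.
    path: str
    kind: str  # "file", "directory", or "symlink"
    length: int
    is_executable: bool
    modified_ns: int


@dataclass(frozen=True)
class PathMetadataRequest:
    path: str


@dataclass(frozen=True)
class PathMetadataResult:
    # None models "the path does not exist" in this sketch.
    metadata: Optional[PathMetadata]


def lookup_path_metadata(request: PathMetadataRequest) -> PathMetadataResult:
    """Hypothetical stand-in for the intrinsic, built on os.lstat()."""
    try:
        st = os.lstat(request.path)  # lstat does not follow symlinks
    except FileNotFoundError:
        return PathMetadataResult(metadata=None)
    if stat.S_ISLNK(st.st_mode):
        kind = "symlink"
    elif stat.S_ISDIR(st.st_mode):
        kind = "directory"
    else:
        kind = "file"
    return PathMetadataResult(
        metadata=PathMetadata(
            path=request.path,
            kind=kind,
            length=st.st_size,
            is_executable=bool(st.st_mode & 0o111),  # any execute bit set
            modified_ns=st.st_mtime_ns,
        )
    )
```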

Note: `NodeKey::fs_path_to_watch` is introduced to allow
`NodeKey::PathMetadata` to watch the parent directory even though
`fs_subject` remains its configured path. This is necessary because the
watch code will error with "file not found" if the path to be watched
does not exist. The solution is to watch the parent directory and wait
for creation / removal events related to the `fs_subject`.
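
The parent-directory approach in the note above can be pictured with a small polling sketch. This is not the Rust watch code: it assumes the parent directory exists, replaces real filesystem events with a directory listing, and only borrows the name `fs_path_to_watch` from the commit message. `subject_present` and `detect_event` are hypothetical helpers.

```python
import os


def fs_path_to_watch(fs_subject: str) -> str:
    """Return the subject's parent directory, which is the path to watch.

    Assumes the parent exists even when the subject itself does not.
    """
    return os.path.dirname(os.path.abspath(fs_subject))


def subject_present(fs_subject: str) -> bool:
    """Look for the subject by listing the watched parent directory."""
    parent = fs_path_to_watch(fs_subject)
    name = os.path.basename(os.path.abspath(fs_subject))
    return name in os.listdir(parent)


def detect_event(fs_subject: str, was_present: bool):
    """Compare the previous and current state of the subject.

    Returns "created", "removed", or None if nothing changed.
    """
    now_present = subject_present(fs_subject)
    if now_present and not was_present:
        return "created"
    if was_present and not now_present:
        return "removed"
    return None
```

Because the listing targets the parent, checking a subject that does not exist yet raises no "file not found" error, and its later creation is still observed.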

@tdyas
Contributor Author

tdyas commented Jul 24, 2024

Closing since this work was landed via other PRs.

@tdyas tdyas closed this Jul 24, 2024
@tdyas tdyas deleted the fs_metadata_intrinsic branch July 24, 2024 21:48
2 participants