Add checksums to enforce immutability #250

avri-schneider · 2024-10-31T13:30:26Z

This PR enhances the TFLint setup action by allowing multiple expected SHA256 hash values for verifying the downloaded binary. With this update:

Multiple Hash Verification: The expected_sha256 input now accepts a comma-separated list of SHA256 hashes. This enables the action to validate the integrity of the TFLint binary against multiple valid hashes, adding flexibility when downloading from different sources or mirrors.
Documentation Updates:
- Updated action.yml to reflect that expected_sha256 supports multiple hashes.
- Revised README.md to include instructions for providing multiple hashes, with an example in the Usage section.
Usage Example: In the README, users can now see how to pass multiple expected hashes to the action.

This improvement ensures that the setup action can handle variations in binary hash values across sources while maintaining integrity checks. Please review the changes and confirm if any additional updates are needed.

I think this should address #16.

- Added support for SHA256 hash verification to ensure the integrity of downloaded binaries. - Introduced `expected_sha256` as an input parameter for users to supply the expected hash. - Implemented hash computation and comparison; download proceeds only if hashes match. - Improves security by preventing execution of potentially tampered files. This enhancement secures the installation process by verifying file integrity.

…tion - Introduced `expected_sha256` as a new input parameter in `action.yml`. - Allows users to specify an expected SHA256 hash for the downloaded TFLint binary. - If `expected_sha256` is provided, the action will compute and compare the hash, proceeding only if there's a match. - Enhances security by preventing execution of tampered or corrupted files. This update ensures that downloaded binaries meet integrity requirements before installation.

… feature - Updated version to 2.1.0 to reflect the new SHA256 verification feature for downloaded binaries. - Enhanced description and keywords to include SHA256 verification. - This update improves security by allowing users to verify the integrity of downloaded files with an expected SHA256 hash. Version bump ensures semantic versioning alignment with the new feature addition.

- Revised `README.md` to describe `expected_sha256` input as accepting a comma-separated list of SHA256 hashes. - Provided example usage in the `Usage` section showing how to specify multiple expected hash values. - Clarified that this change allows integrity verification against multiple valid hashes, adding flexibility for binaries from different sources or mirrors.

bendrucker

Can you cite prior art for supporting multiple hashes for a given version?

README.md

Updated documentation for the expected_sha256 parameter to clarify that multiple SHA256 hashes may be provided to support platform-specific binaries (e.g., macOS, Linux, Windows) for the same TFLint version. This ensures integrity checks are compatible across platforms in matrix workflows.

avri-schneider · 2024-10-31T21:54:20Z

prior art for supporting multiple hashes for a given version

Consider these examples of prior art for handling multiple hashes per version across platforms:

PyPI: Provides platform-specific hashes for each Python package distribution (e.g., .whl files).
Homebrew: Stores unique sha256 hashes per OS and CPU architecture in its formulae.
Docker: Uses manifest lists to assign unique hashes to platform-specific images under a single tag.

- Refactored handling of `expected_sha256` input to avoid errors when input is undefined. - Ensured `expectedHashes` is set to an array of hashes or `undefined` based on the presence of `expected_sha256`. - This fix prevents unexpected exceptions during execution if `expected_sha256` is not provided, allowing for optional integrity checks. This resolves issues with optional SHA256 verification in the `downloadCLI` function.

bendrucker

Made some minor comments. The correction from mirrors serving the same binary to multiple binaries for different OS/arch pairs is 👍🏻.

Still, we have two modern approaches available to verification that don't require hardcoding trusted checksums:

It's fine to implement explicit checksums but it would tend to benefit vastly more users to implement automatic provenance verification versus manual checksum verification. I am somewhat hesitant to ship the manual feature first.

Are you up for taking on this first, as a separate PR? Otherwise we can hold onto this and I can try to find time to implement the GH attestation API.

action.yml

README.md

src/setup-tflint.js

- Replaced `expected_sha256` input with `checksums` for clarity. - Updated `checksums` input to use newline-separated values instead of commas for better readability. - Added a dedicated example section for `checksums` usage in the documentation to guide users on its proper implementation.

avri-schneider · 2024-11-01T09:29:00Z

Made some minor comments. The correction from mirrors serving the same binary to multiple binaries for different OS/arch pairs is 👍🏻.

Still, we have two modern approaches available to verification that don't require hardcoding trusted checksums:

https://github.com/terraform-linters/tflint?tab=readme-ov-file#github-cli-recommended

https://github.com/terraform-linters/tflint?tab=readme-ov-file#cosign

It's fine to implement explicit checksums but it would tend to benefit vastly more users to implement automatic provenance verification versus manual checksum verification. I am somewhat hesitant to ship the manual feature first.

Are you up for taking on this first, as a separate PR? Otherwise we can hold onto this and I can try to find time to implement the GH attestation API.

Thank you for the details on "gh attestation" and "cosign verify-blob"—I wasn't aware of these methods and appreciate the insight.

I have a follow-up question about scenarios where companies need strict control over unvetted third-party code in CI. Pinning specific checksums of reviewed binaries provides security by ensuring that only a pre-approved, stable binary is executed for a given version, OS, and architecture.

To achieve the same security level with certificate-based verification, users would still need to fetch and verify these certificates or checksums independently and incorporate them into the workflow. This added layer could complicate workflows, as it would require manual management of certificates within each CI environment—whereas direct checksum verification offers straightforward control and security.

Otherwise, it would make sense to release the explicit checksum feature independently, as it provides immediate benefits that automatic provenance verification would not cover, even if implemented in the future.

Would you suggest any streamlined method for integrating certificate or attestation verification that would still allow the user to maintain strict control over which binaries execute in CI?

avri-schneider · 2024-11-01T12:13:24Z

It's fine to implement explicit checksums but it would tend to benefit vastly more users to implement automatic provenance verification versus manual checksum verification. I am somewhat hesitant to ship the manual feature first.

If automatic provenance verification offers a similar security level, explicit verification might not be necessary for most users. However, if the manual checksum addresses a distinct threat model, releasing it first makes sense, as it’s already implemented and could meet immediate security needs. Since the manual checksum feature is already in my PR, would there be any specific concerns about releasing it first?

remove irrelevant text

avri-schneider · 2024-11-03T10:45:33Z

@bendrucker As a side-note, users of this feature would most likely pin the action to the commit SHA - do you think the README.md should be updated accordingly?:

    - uses: terraform-linters/setup-tflint@ad6bd969c3aad663d980d8fa7d083aad8e7e758d
      name: Setup TFLint with Checksums
      with:
        tflint_version: v0.52.0
        checksums: |
          40f7ee2dbeb8e7cbd5ab7b10912f60eb14aa4fbff62603eeb67fdb5f7cbb794a
          bf758ff29b607b3fbc4a3630ea3b39df4afafe3cdb80c6d71fe528feeac2c58e
          fed6ff15ee10db34a23044ac0d4da8fdc1f2f3663b32ec85d388374dd95670aa

bendrucker · 2024-11-05T05:00:57Z

My hesitation stems from the fact that software supply chain security is complicated and most developers have a limited to bare understanding. "Simple" features can do harm by inviting users down the wrong path.

The unambiguously safe posture/tooling for an organization that wants to securely and reliably use third party software at scale is forks/mirrors. You control the source code, the builds, and the hosting of the artifacts.

reviewed binaries provides security

This is the sticking point. You can't review a binary.

If you don't have that control, the checksums only prevent the binary from changing to another binary after you adopt it. In order to trust a binary, you need to be able to prove that it originated from a specific source code revision. That's what attestation is all about.

Take this scenario:

We publish TFLint v0.100.0, with hash abc123, to GitHub Releases
A malicious actor gains access to the repository releases
The actor builds malicious TFLint binaries locally and generates new hashes
They upload the new artifacts, overwriting v0.100.0

Checksums only protect you if you adopt the version before 4 happens. Otherwise, you're liable to adopt a version poisoned with hidden malware which steals source code, credentials, etc. in the background while dutifully recording its checksum.

Signing techniques like attestation mitigate this by detecting step 4, where artifacts are published but were provably not signed in the trusted build environment. An attacker now needs to upload their malicious revision, tag it, and re-run the release workflow. I believe you can delete Action workflow run logs as a repository administrator but I don't think it's possible to scrub the metadata about the job run, which should immediately point to a compromise once examined.

Checksums are helpful but can't sufficiently secure binaries on their own. Take Terraform (Registry) for example. Providers ship as binaries. Terraform's lockfile prevents those binaries from changing on future downloads. But the Registry also requires provider authors to sign every binary (with a GPG key) when publishing to GitHub Releases. So you can at least be sure that the provider was built by someone with that GPG private key, and not just access to the author's Terraform Registry account. Same ideas but just not implemented as nicely because of the dependence on passing secrets to CI (a private key).

All that said, checksums have their place, they just suffer from especially poor usability in this context because we can't rely on a CLI to write them automatically.

users of this feature would most likely pin the action to the commit SHA

Probably right, but they should resolve the SHA for the latest version at install time. They can use https://github.com/sethvargo/ratchet or any of the other tools that handle auto-pinning in Actions workflows.

I'll find time to land this sometime soon with documentation revisions. Mostly want to emphasize in the docs that checksums guarantee immutability only, not integrity, trust, attribution, etc.

avri-schneider · 2024-11-05T07:15:19Z

The unambiguously safe posture/tooling for an organization that wants to securely and reliably use third-party software at scale is forks...

(Side note:) While GitHub forks allow organizations to control the source code and artifact hosting, they carry a risk. Forks of public repositories cannot be made private, so any secret inadvertently committed would be publicly exposed and difficult to remediate. This makes forks a double-edged sword for organizations aiming to avoid accidental exposure while managing third-party software.

This is the sticking point. You can't review a binary.

Organizations would indeed review the source code, ensure the binary was generated from that reviewed source, and then pin the binary’s checksum. For those that may not fully trust the upstream developer, the developer’s signing of binaries only partially mitigates supply-chain risks.

The safest approach remains for organizations to compile from reviewed source and store binaries in a private artifact repository. Still, some organizations might consider the likelihood of a malicious binary being undetected in the public repository to be a lower risk than allowing upstream code to run arbitrary builds in their CI environment.

For such organizations, a practical route could be to leverage provenance verification methods as you suggested, confirming that reviewed source code hashes match binaries signed by the developer, and then pinning those validated hashes. Since the process of pinning is inherently manual, there may be no real need for additional implementation. Instead, a note in the documentation advising users to verify hashes before pinning—using the methods you recommended—could suffice. Although this workflow requires manual verification, it balances security with practical CI use, as organizations can pin hashes after reviewing and approving the source and performing local provenance verification.

- Updated the README to include a note advising users to verify binary hashes independently before pinning. - Suggested using methods such as GitHub’s attestation or cosign for hash verification to ensure that only approved binaries are used in workflows. - Emphasized that the `checksums` input requires manual pinning and that users should validate hashes as an added security measure.

- Updated the `checksums` section to clarify that checksums guarantee only the immutability of the downloaded binary by verifying it has not been altered, without addressing integrity, trust, attribution, or authenticity. - Specified that multiple hashes enable cross-platform compatibility for workflows running on different platforms, architectures, or operating systems, where each may produce a unique hash. - Added guidance for users to verify the binary's source and authenticity independently using methods like GitHub’s Artifact Attestations or cosign before adding hashes to workflows.

avri-schneider · 2024-11-05T11:15:22Z

Probably right, but they should resolve the SHA for the latest version at install time. They can use https://github.com/sethvargo/ratchet or any of the other tools that handle auto-pinning in Actions workflows.

Pinning should follow a thorough code review of the reusable action to ensure no unpinned binaries, untrusted external resources, or vulnerabilities to injection attacks are present. During my review of your action, I noticed it was pulling binaries from GitHub, which required hardening. This led to my PR contribution to address these security considerations.

For organizations, maintaining an allow-list of pre-approved actions, each pinned to a specific commit SHA, enables developers to use only those actions vetted by the security team.

GitHub’s security model has a limitation where it resolves hashes to commits in forks, even if those commits haven’t been merged into the upstream repository. This allows a developer to potentially bypass security controls by creating a malicious commit in a fork. For instance, a SHA pin like terraform-linters/setup-tflint@b246abd5a435d1528cd70b4cda3542488db8f807 would technically work, even though my PR hasn’t been merged yet.

This limitation matters for organizations that might want to allow actions from a trusted organization using wildcard patterns, as it introduces a risk that unmerged, potentially harmful commits in forks could be resolved. For robust security, each action should be explicitly pinned and reviewed to avoid this potential bypass.

bendrucker · 2024-11-06T19:47:03Z

ensure the binary was generated from that reviewed source

My general hangup here is that I'm convinced virtually no one does this. Instead they copy a checksum that they never verify.

It's sort of like unverified SSL in that encryption is better than nothing but it's still not fully secure.

Still, some organizations might consider the likelihood of a malicious binary being undetected in the public repository to be a lower risk than allowing upstream code to run arbitrary builds in their CI environment.

This is the other half, it shouldn't be arbitrary, because you reviewed all the code. But likewise, I don't believe that most people are actually reviewing the content before pinning its hash. It feels secure (hashes!) but avoids any "hard" things like... reading code 🫢.

In any case, these are just my personal biases/priors on checksums, and why I'm a little hesitant to "push" users towards doing this unless they're sure they need it. Thanks for bearing with me! Made some small copy edits and ended up removing the example as it took up too much space for an "advanced" feature. Otherwise LGTM, will fix the conflicts and merge this momentarily.

bendrucker · 2024-11-06T20:00:16Z

Separately I'll try to make some time to work on automatic attestation. Attestation will return the exact commit SHA when verifying a checksum file. Users could then specify a commit SHA for verification. The action would download the specified version and verify it was built from that commit.

You can prove that the binaries came from the right repository, with a matching tag to the version, and the exact content that tag pointed to at build time. The only way the actual binary could ever change would be to re-run a previous release job. And even then, the workflow cannot change, so some upstream, unpinned dependency would have to be compromised before the re-run could compromise a build.

avri-schneider · 2024-11-06T20:48:04Z

The only way the actual binary could ever change would be to re-run a previous release job. And even then, the workflow cannot change, so some upstream, unpinned dependency would have to be compromised before the re-run could compromise a build.

What about self-hosted runners?

Thanks for bearing with me!

Likewise! :)

bendrucker · 2024-11-06T21:16:57Z

What about self-hosted runners?

Yes, the runner environment could be compromised. Since the workflow file must specify self-hosted runner labels in runs-on, you know for a given SHA whether the workflow would run on a GitHub runner or a self-hosted one.

avri-schneider · 2024-11-06T21:44:55Z

That is a very strong and undocumented constraint.

avri-schneider · 2024-11-06T22:03:46Z

Also, what prevents a maintainer from creating a self hosted runner labeled "ubuntu-latest"?

bendrucker · 2024-11-06T23:08:13Z

It's a bit hard to tell from the docs but it seems like self-hosted is a required label when passing a list of labels to runs-on. But when using runner groups, you don't need labels:

https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/managing-access-to-self-hosted-runners-using-groups

But if it's indeed possible to label your self-hosted as ubuntu-latest and magically start taking jobs away from the hosted runners, that's bad behavior and definitely something to report to GitHub.

avri-schneider · 2024-11-07T08:56:48Z

It's a bit hard to tell from the docs but it seems like self-hosted is a required label when passing a list of labels to runs-on. But when using runner groups, you don't need labels:

https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/managing-access-to-self-hosted-runners-using-groups

But if it's indeed possible to label your self-hosted as ubuntu-latest and magically start taking jobs away from the hosted runners, that's bad behavior and definitely something to report to GitHub.

tested it here - it executed the job on my self-hosted runner, which I labeled ubuntu-latest. This is a re-run of a previously successful workflow job that was executed on a github hosted runner

bendrucker · 2024-11-07T17:30:05Z

Interesting. Sloppy on GH's part but impossible to undo at this point given that someone could be depending on it.

I took a closer look at the attestation response and it looks you just need to assert against the runner environment if you want to enforce it.

gh attestation verify ~/Downloads/checksums.txt --repo terraform-linters/tflint --format json --jq '.[].verificationResult.signature.certificate.runnerEnvironment'

Loaded digest sha256:c22862d98642e3894687d0732632fe301e8053f581fb43405ceddd533a74f605 for file:///Users/ben/Downloads/checksums.txt
Loaded 1 attestation from GitHub API
✓ Verification succeeded!

github-hosted

avri-schneider · 2024-11-07T22:06:40Z

Interesting indeed, but for self-hosted runner, isn't the value of verificationResult.signature.certificate.runnerEnvironment controlled by the admin of the self-hosted runner?

bendrucker · 2024-11-08T00:42:28Z

Shouldn't be any more falsifiable than the SHA or any other attested attribute. I don't know if/where GitHub validates what was signed.

avri-schneider · 2024-11-08T12:05:17Z

I suspect this is a property that the runner provides to GitHub and they blindly sign it.

avri-schneider · 2024-11-08T13:09:48Z

Probably done here - this, as far as I can tell runs on the self-hosted runner.

avri-schneider and others added 7 commits October 31, 2024 10:50

Support multiple expected hashes

ca68eb4

Support multiple expected hashes

c757013

Support multiple expected hashes

99bad22

bendrucker reviewed Oct 31, 2024

View reviewed changes

README.md Outdated Show resolved Hide resolved

avri-schneider added 2 commits November 1, 2024 01:15

rebuild

65b066a

avri-schneider requested a review from bendrucker October 31, 2024 23:24

bendrucker reviewed Oct 31, 2024

View reviewed changes

action.yml Outdated Show resolved Hide resolved

README.md Outdated Show resolved Hide resolved

src/setup-tflint.js Outdated Show resolved Hide resolved

avri-schneider requested a review from bendrucker November 1, 2024 09:31

Update README.md

ad6bd96

remove irrelevant text

avri-schneider added 2 commits November 5, 2024 09:39

bendrucker added 4 commits November 6, 2024 11:21

copy edits

c80716d

remove example

d36097b

match action input desc to readme

85aeefe

that

c9201e9

Merge branch 'master' into feature/sha256-hash-verification

362774c

bendrucker approved these changes Nov 6, 2024

View reviewed changes

bendrucker added 2 commits November 6, 2024 13:46

tidy source

ea6155c

integration test

2432632

bendrucker added 3 commits November 6, 2024 16:21

fix multiline input test

5009ffd

build

cd2d27e

build

c821b10

bendrucker merged commit 8093687 into terraform-linters:master Nov 7, 2024
11 checks passed

bendrucker changed the title ~~Add support for multiple SHA256 hash values for TFLint download verification~~ Add checksums to enforce immutability Nov 7, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add checksums to enforce immutability #250

Add checksums to enforce immutability #250

avri-schneider commented Oct 31, 2024 •

edited by bendrucker

Loading

bendrucker left a comment

avri-schneider commented Oct 31, 2024

bendrucker left a comment

avri-schneider commented Nov 1, 2024 •

edited

Loading

avri-schneider commented Nov 1, 2024

avri-schneider commented Nov 3, 2024

bendrucker commented Nov 5, 2024

avri-schneider commented Nov 5, 2024

avri-schneider commented Nov 5, 2024

bendrucker commented Nov 6, 2024 •

edited

Loading

bendrucker commented Nov 6, 2024

avri-schneider commented Nov 6, 2024

bendrucker commented Nov 6, 2024

avri-schneider commented Nov 6, 2024

avri-schneider commented Nov 6, 2024

bendrucker commented Nov 6, 2024

avri-schneider commented Nov 7, 2024

bendrucker commented Nov 7, 2024

avri-schneider commented Nov 7, 2024

bendrucker commented Nov 8, 2024

avri-schneider commented Nov 8, 2024

avri-schneider commented Nov 8, 2024

Add checksums to enforce immutability #250

Add checksums to enforce immutability #250

Conversation

avri-schneider commented Oct 31, 2024 • edited by bendrucker Loading

bendrucker left a comment

Choose a reason for hiding this comment

avri-schneider commented Oct 31, 2024

bendrucker left a comment

Choose a reason for hiding this comment

avri-schneider commented Nov 1, 2024 • edited Loading

avri-schneider commented Nov 1, 2024

avri-schneider commented Nov 3, 2024

bendrucker commented Nov 5, 2024

avri-schneider commented Nov 5, 2024

avri-schneider commented Nov 5, 2024

bendrucker commented Nov 6, 2024 • edited Loading

bendrucker commented Nov 6, 2024

avri-schneider commented Nov 6, 2024

bendrucker commented Nov 6, 2024

avri-schneider commented Nov 6, 2024

avri-schneider commented Nov 6, 2024

bendrucker commented Nov 6, 2024

avri-schneider commented Nov 7, 2024

bendrucker commented Nov 7, 2024

avri-schneider commented Nov 7, 2024

bendrucker commented Nov 8, 2024

avri-schneider commented Nov 8, 2024

avri-schneider commented Nov 8, 2024

avri-schneider commented Oct 31, 2024 •

edited by bendrucker

Loading

avri-schneider commented Nov 1, 2024 •

edited

Loading

bendrucker commented Nov 6, 2024 •

edited

Loading