Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: run license script on pre-commit for staged files #136

Merged
merged 6 commits into from
Dec 27, 2024

Conversation

orteth01
Copy link
Contributor

@orteth01 orteth01 commented Dec 19, 2024

✅ Pull Request Checklist:

  • Included link to corresponding GitHub Issue.
  • The commit message follows conventional commit extended guidelines.
  • Added/updated unit tests and storybook for this change (for bug fixes / features).
  • Added/updated documentation (for bug fixes / features)
  • Filled out test instructions.

📝 Test Instructions:

create new file

  • create a new file and omit the copyright header
  • commit the change
  • the copyright header should be added to the staged file and included as part of the commit
  • Screenshot 2024-12-19 at 4 52 20 PM

modify existing file

  • remove the copyright header from an existing file that has it
  • commit the change
  • the copyright header should be added back to the staged file and included as part of the commit
  • Screenshot 2024-12-19 at 4 52 47 PM

remove a file

  • you should be able to remove a file and commit successfully
  • Screenshot 2024-12-19 at 4 51 43 PM

❓ Does this PR introduce a breaking change?

  • Yes
  • No

💬 Other information

  • after discussion with Brandon, decided to just do the license piece of Optimize for changed files only #4 for now
  • as suggested in Optimize for changed files only #4, i used lefthook for the precommit hook
  • scripts/license.mjs deets
    • I adjusted the script to work for files that have an interpreter directive
    • i adjusted the script to accept a '--files' argument, so that we could target only staged files in the pre-commit hook
  • we aren't currently adding the header to .yml files. should we?

parallel: true
commands:
generate-license:
run: pnpm run license "--files={staged_files}"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g. for multiple changed files, this will run

pnpm zx scripts/license.mjs "--files=packages/design-system/src/utils/css.ts packages/design-system/src/utils/events.ts"

@kalisjoshua
Copy link
Contributor

I'd rather have a pre-push hook than a pre-commit hook personally. Just attempting to open the conversation on this and make sure we are all aligned. For me a pre-commit hook is an opportunity to interrupt a devs natural flow (for me it is at least) and I don't like interrupting productivity. This might not be intrusive but if we don't already have pre-commit hooks already, this opens that door.

@orteth01
Copy link
Contributor Author

orteth01 commented Dec 20, 2024

I'd rather have a pre-push hook than a pre-commit hook personally. Just attempting to open the conversation on this and make sure we are all aligned. For me a pre-commit hook is an opportunity to interrupt a devs natural flow (for me it is at least) and I don't like interrupting productivity. This might not be intrusive but if we don't already have pre-commit hooks already, this opens that door.

because files with changes aren't "staged" in the pre-push flow (they're already committed), we can't leverage stage_fixed. So in a pre-push flow, we'd have to run the license generation script for all files and then commit the changes. With that, you could potentially end up with many extra "generate license" commits (if you push multiple times, missing the header in all the cases), and that extra noise in commit history isn't ideal. the hook could also end up changing files that aren't related to your PR at all, creating more noise in the review process. we could do some git-fu to get only the changed files, but it's just a little extra complexity compared to getting the staged files pre-commit for free.

in this case, i prefer pre-commit. that being said, i totally agree that we should be very thoughtful about what we include in a pre-commit hook. i would be against running tests pre-commit, for example. committing should be very zippy. if the general consensus is that we should avoid the pre-commit hook for this, i'm happy to adjust.

another option I loosely discussed with @brandonjpierce was doing this in a github action after code has been pushed. open to other ideas, if folks have them 👀

@belsrc
Copy link
Contributor

belsrc commented Dec 20, 2024

FYSA, since this (the ticket) is carried over from a current project. We do (not using lefthook):

  '*.{js,jsx,ts,tsx}': processCodeChunks((chunk) => [
    `prettier ${chunk.join(' ')} --write`,
    `eslint ${chunk.join(' ')} --fix --config "./eslint.config.fast.js"`,
  ]),
  '*.json': processCodeChunks((chunk) => [
    `prettier ${chunk.join(' ')} --write`,
  ]),
  '*.md': ['prettier --write'],

on precommit. And

"lint": "npx turbo run lint",
"test": "npx turbo run test",
"build": "npx turbo run build",

on prepush.

I would agree with @kalisjoshua, anything long running I would much prefer to not be on precommit as I commit often and it can get annoying. But also realize the needed for staged files in this case. So a question to you, @orteth01, what does this feel like in practice? Say with ~20 staged files that need the header.

@orteth01
Copy link
Contributor Author

So a question to you, @orteth01, what does this feel like in practice? Say with ~20 staged files that need the header.

@belsrc i just did a couple tests

type # of files affected time
Script ran on entire codebase 989 2s Screenshot 2024-12-20 at 10 53 27 AM
Script ran for subset of files 20 .6s Screenshot 2024-12-20 at 11 07 03 AM

while working on this PR and testing various cases, the commit did not feel particularly sluggish to me. but i get how a half second delay could be annoying if you're committing often

@belsrc
Copy link
Contributor

belsrc commented Dec 20, 2024

Oh, personally I am fine with those numbers. Current, in project, process is muuuch longer.

@brandonjpierce
Copy link
Contributor

brandonjpierce commented Dec 20, 2024

Oh, personally I am fine with those numbers. Current, in project, process is muuuch longer.

@belsrc Yeah that was sort of my push for lefthook in this case when speaking with @orteth01 some of the issues we have on our project can stem from one of two things:

  1. Usage of slower tooling e.g. lint-staged or nano-staged, and then things like eslint, prettier, biome, etc.
  2. Quantity of files staged (e.g. think about when we commit a large merge conf resolution)

In either scenario you have two parts of the performance equation:

  1. How fast is the tooling that is checking and running the scripts against the staged files e.g. lefthook or lint-staged
  2. How fast are the individual tasks within that e.g. eslint, prettier, biome, oxc etc etc.

We kind of have a worst case scenario in all scenarios on our current project but in this one we can optimize both sides of that equation technically.

@brandonjpierce
Copy link
Contributor

Another point for doing this at the precommit step is that a developer doesn't have to pull the remote in the event we did this work in a CI job and committed then. Less steps involved.

@kalisjoshua
Copy link
Contributor

As another perspective - and large just for devils advocate purposes - could we remove it from all source code in the repo and only add it in the published modules are part of that process. I personally dislike seeing it in the code when editing as it takes up screen space for no value and isn't easily collapsed.

Just stirring the pot. Would love to see this but not passionate about it at all. This is the last I hope to mention it.

@orteth01
Copy link
Contributor Author

orteth01 commented Dec 20, 2024

oo that's a cool idea @kalisjoshua. it would also mean that we'd never have to manually modify the header in cases where we need to e.g. when the year changes - it would always just be generated at publish time every time

@brandonjpierce
Copy link
Contributor

@kalisjoshua the Apache 2 license specifies we need to include the header in source code:

Each original source document (code and documentation, but not the LICENSE and NOTICE files) should include a short license header at the top. If the distribution contains documents not covered by an ICLA, CCLA or Software Grant (such as third-party libraries), consult the policy guide.

https://www.apache.org/legal/apply-license.html

@kalisjoshua
Copy link
Contributor

Ok. I'm convinced.

The wanna-be-lawyer in me wants to know if the hair could be split that it would be in the released code (published npm modules) while the code used to generate the released code would not need it (non-published github repo).

I know very little about the ramifications of licensing and am completely fine with being overruled by legality.

@brandonjpierce brandonjpierce merged commit 5f9dc55 into main Dec 27, 2024
3 checks passed
@brandonjpierce brandonjpierce deleted the chore/precommit branch December 27, 2024 17:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants