-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add GitHub Actions workflow #3492
Conversation
the sparse checkout is neat but I'm not sure it'll be the best option forever, there are CI checks I could imagine that involve images at least (making sure links to images from markdown point to valid files, checking that no files are >1MB, ..) for now it's cool though 👍 didn't know git has options for this |
I mostly went for the sparse checkout as a normal clone ends up fetching ~300 MB of data which seems super wasteful to do every CI run. It takes time too, but not that much (half a minute-ish, github seems to give good bandwidth) When we get to wanting to run inspections on the blobs then I'll evaluate possibilities to hack past actually needing to fetch them (first thought is inspecting the list of files somehow without actually getting them, which feels like it should be possible). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not pinning versions is very unsafe. I vote against merging this until deps are pinned.
I'm aware of potential consequences of not locking versions. As I explained in the PR description I didn't really want to introduce a 1.8k line file and the need to track dependency versions for what should essentially be a CLI tool for a text wiki that runs mostly on a github VM, but I agree you're correct in that someone might for some reason actually try to install To that end I've added the lockfile, hard-pinned all direct dependencies to latest version (and latest version only, no semver carets - although |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
As for dependency updates, we already have a lot of integrations like WhiteSource Renovate to handle automatic dependency upgrades. So let's not worry so much about the integrations. |
looks like this is easy to extend to better ui (via github api) thanks to --reporter so let's just see if it works for now. did some similar tests to yours and I got good results 🙏 |
This PR contains an alternative method of automatically reviewing Markdown style, intended to replace the existing setup with Codacy/remark initially introduced as part of #3310.
Rationale
I've already described some of the issues with Codacy once in this comment, but to reiterate:
Codacy's main downfall for our use case is the fact that it seems to mis-use remark, leading to the set of active rules being different to what is actually intended by the configuration. While I can only hazard a guess as to what is really wrong in their setup, it seems it wrongly globally installs both the
remark-cli
package and remark linting rules, which due tonpm
local/global foibles leads to our overrides of rule presets not working.The primary example of this behaviour is the 80-character line length inspection still being active globally, despite being disabled both in the global
.remarkrc.js
and the news subconfig. It is likely that other overrides also completely fail to work due to the same reason.Another concern is that for some reason, some of the time Codacy will mis-report the actual PR status, seemingly at random. This can be seen for example in this PR, wherein Codacy claims that the pull request is "positive" despite once having reported 11 issues that have not actually been addressed (not to mention, all of which were actually bogus due to the previous point) and just 1 issue resolved.
Therefore, as I cannot fix Codacy's issues and I don't want to try and fix remark so that it can work with its rules installed globally, the solution that I'm proposing is to do this ourselves by leveraging GitHub Actions.
Technical details
The workflow pipeline can be seen in the included
.yml
file. As far as I can tell this is the most optimised the pipeline can be. The steps are as follows:First, the repository is cloned using read-only credentials provided by GitHub Actions. To save time and bandwidth, the clone is partial -
git sparse-checkout
lets me only fetch the actual markdown files necessary, plus other required files (package.json
, and all remark configuration files).The reason for blanket-including all
.json
files and not onlypackage.json
is to enable rapidly switching to markdownlint in the future if deemed necessary - all of the pipeline will remain the same, save for the final command.Then, the pipeline sets up a node install and runs
npm install
to get remark dependencies. This is why this PR also includes apackage.json
- the added benefit of which is being able to relatively quickly and painlessly install the required prerequisites in case someone wants to run linting locally. I had a major WTF moment when I had to fetch the odd twentysomething packages manually.Note that
package.json
specifiesversion: latest
and has no associated lockfile. This was intentional on my part, because we're using remark as a tool and not an actual dependency, and I assume we don't want to be tracking/bumping version numbers every time. I don't know if this is how things are done in JS land; if this is a grand transgression, then I'll include it.The reason why the install always runs instead of just skipping if there are no changed files is sort of aesthetic - I could not find a good way to early-exit out of a workflow pipeline, and if I did that just as a normal shell script, then the actual linting output would get mixed with
npm
install logs, and I don't want to either mix or silence that.Finally,
remark
is ran throughnpx
only on the changed files. As for the flags:-q
makes sure only warnings and errors are output,-f
ensures that the exit code is nonzero even with warnings, therefore failing the workflow,--no-stdout
is specified because remark will spew the linted file out there, which we don't want,--report
and--color
are used so that the output is somewhat nice-looking and accessible (see examples lower down).Experimental results
I have run a few test PRs on my fork to see how the new pipeline behaves, borrowing some of the outstanding upstream PRs (sorry @cl8n @Pennek!). Here's a couple of example comparisons:
Of course this is far from a comprehensive summary, but I think it's a convincing enough start.
As a side note, when it comes to execution time the GitHub actions workflow seems to blow Codacy out of the water - while it may take up to 3 minutes for codacy to be done, the workflow is usually done in ~30 secs.
I encourage you to check the error output of both tools - GH Actions is unfortunately worse, but I did manage to get it to show context lines for easier understanding of where the actual error lies.
Tradeoffs and summary
To sum up, with this PR we gain the following benefits:
at the cost of:
A possible future gain of doing this through GH Actions ourselves is that possibly we could get better integration in the way of automated review comments or even auto-committing fixes to user changes. While that is somewhat feasible in some use cases even now (there are pre-existing actions that commit and push fixes) they do not currently work with the fork/PR workflow due to the worry that a malicious actor could leak secrets if they had access to a non-read-only access token.
Since the decision is not trivial and there are potentially a few kinks to iron out before this is ready, I've decided to open this as draft and put it up to discussion. If it's deemed necessary, I can also run a "parallel repository" and evaluate the quality of the pipeline on my fork by rebasing PRs that come into upstream on top of the changes here, so that more evidence can be collected as to whether this is a feasible way forward.