Syncing across repos
This product is part of a collection of products that are similarly managed. In order to simplify the product management across these different repositories, we've added some automations that allow us to keep parts of the repos synchronized.
We use issue and pull request templates for consistency and for preserving important knowledge:
- Issue templates help us ensure that each issue has the information necessary for a teammate to pick it up and work on it without having to track down the author to ask for more context.
- Pull request templates help us remind teammates of all the necessary steps before accepting product changes, such as accessibility reviews, outreach approval, etc.
Both of these templates capture knowledge that the people maintaining these products have learned over time.
We keep the issue and pull request templates in the various repositories synchronized so that people do not have to remember to manually make the changes in multiple repos. Instead, changing, adding, or deleting a template will cause new pull requests to be opened on all the other repositories automatically. Then people only need to approve and merge them rather than manually authoring the changes.
The synchronization is handled by a GitHub Actions workflow that executes on every push to a repo's default branch (usually main, but it depends on the repo). The workflow is located at .github/workflows/sync_templates.yml. The workflow has three steps.
In order to interact with multiple repos, we can't rely on either the automatic token created by the Action or a personal access token. Instead, we use a GitHub App to generate a short-term token for us. We use a third-party action called github-app-token to take the App credentials and use them to fetch an access token.
The github-app-token action needs to know the ID, installation ID, and private key of the GitHub App. Contact #admins-github on Slack to get those credentials.
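For readers curious about what happens with those three credentials, here is a minimal sketch of the general GitHub App authentication flow that an action like this presumably wraps. It is not the action's actual code. The function names and the example IDs are hypothetical, and the RS256 signing step is left out because it needs a third-party library.

```python
import time
from typing import Optional

GITHUB_API = "https://api.github.com"


def app_jwt_claims(app_id: str, now: Optional[int] = None) -> dict:
    """Build the claims for the JWT that a GitHub App signs with its private key."""
    now = int(time.time()) if now is None else now
    return {
        "iat": now - 60,   # backdated slightly to tolerate clock drift
        "exp": now + 540,  # GitHub rejects JWTs that expire more than 10 minutes out
        "iss": app_id,     # the App ID identifies which App is asking
    }


def installation_token_url(installation_id: int) -> str:
    """Endpoint that trades the signed JWT for a short-lived installation token."""
    return f"{GITHUB_API}/app/installations/{installation_id}/access_tokens"


# Hypothetical values; the real ones come from #admins-github.
claims = app_jwt_claims("12345", now=1_700_000_000)
url = installation_token_url(67890)
```

The signed JWT is sent as a bearer token in a POST to that URL, and the response contains the access token that the later steps use.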
To understand what needs to be synchronized, we must check out the code. So... we do that! This step is super easy.
Finally, we use the repo-file-sync action to compare this repo with the others to see if our new changes need to be pushed outwards. This action relies on a synchronization configuration file to inform it about what to sync and how.
The sync config file is located at .github/sync.yml. Documentation describing the config file is available on the action repo page.
The current configuration is a single "group" that lists all of the repos that
should be synchronized along with the files to be synced. The sync config file
itself is also synced across repos to help keep everything configured properly.
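To make the "single group" idea concrete, the sketch below models one group as plain Python data and expands it into the repo and file pairs a push would need to cover. The repo names and the template path are placeholders, not our real configuration, and skipping the source repo is an assumption of this sketch rather than documented behavior of the action.

```python
from typing import Dict, List, Tuple

# Hypothetical group: every repo listed receives every file listed.
SYNC_GROUP: Dict[str, List[str]] = {
    "repos": [
        "example-org/repo-one",
        "example-org/repo-two",
        "example-org/repo-three",
    ],
    "files": [
        ".github/sync.yml",  # the sync config itself is synced too
        ".github/pull_request_template.md",
    ],
}


def planned_pushes(group: Dict[str, List[str]], source_repo: str) -> List[Tuple[str, str]]:
    """List (target repo, file path) pairs, pushing outward from source_repo only."""
    return [
        (repo, path)
        for repo in group["repos"]
        if repo != source_repo  # one-way: never sync a repo onto itself
        for path in group["files"]
    ]


pushes = planned_pushes(SYNC_GROUP, "example-org/repo-one")
```

With three repos and two files, a change in one repo yields four pairs, two for each of the other repos.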
NOTE: This synchronization is one-way. Changes are pushed to other repos, not pulled in.
We also synchronize labels between our various product repos. This happens with a GitHub Actions workflow that runs nightly. The workflow is located at .github/workflows/sync-labels.yml. It uses the label-sync action. This synchronization is a pull rather than a push.
There is a GitHub workflow trigger called label that fires any time a label is created, edited, or deleted. Ideally our workflow would run on the label trigger instead of nightly. However, because the label-sync action pulls rather than pushes, that does not work: if we add a label to this repo and then run the workflow on that trigger, it will attempt to pull labels from the other repos, but the new label won't exist on them yet. There is an open issue on the label-sync action to allow pushing changes to other repos. If that is implemented, then we should switch to using the label trigger instead of the schedule trigger.
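The pull-versus-push problem can be shown with a toy simulation. This is only a sketch: it assumes a pull adds whatever labels the other repos have and ignores edits and deletions, which may not match how the label-sync action actually behaves. The repo and label names are made up.

```python
from typing import Dict, Set


def pull_into(target: str, repos: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return a copy of repos where target has pulled labels from every other repo."""
    updated = {name: set(labels) for name, labels in repos.items()}
    for name, labels in repos.items():
        if name != target:
            updated[target] |= labels
    return updated


# A new label "needs-review" was just created on this-repo.
repos = {
    "this-repo": {"bug", "needs-review"},
    "other-repo": {"bug"},
}

# Running a pull on this-repo right away (the label trigger) changes nothing elsewhere.
after_event = pull_into("this-repo", repos)

# Only when other-repo runs its own pull (the nightly schedule) does the label arrive.
after_nightly = pull_into("other-repo", after_event)
```

After the label-triggered run, other-repo still lacks the new label. It only shows up once other-repo performs its own pull.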
Some pages of the wikis in this collection of repos are also synced. These are the pages that should be consistent across the project:
- our guiding principles
- our release practices
- our approach to testing
- our approach to synchronization
- the members of our team
The repo-file-sync-action relies on the GitHub API to create git trees, but wiki repos are not accessible via the API. Instead, for wikis, we do the synchronization through a small amount of custom code in a GitHub Actions workflow.
Our workflow is configured to run in a matrix strategy. Our matrix consists of all the repos that should participate in syncing. Setting up a matrix strategy means that the workflow's job will run multiple times - once for each combination of the values in the provided matrix. Our matrix is one-dimensional, so the total number of job runs is equal to the number of repos we want to sync.
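As an illustration of how a matrix expands into job runs, the sketch below computes every combination with itertools.product. The function name is hypothetical and the second dimension is invented purely for contrast; our real matrix has only the list of target repos.

```python
from itertools import product
from typing import Dict, List


def matrix_jobs(matrix: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Expand a matrix into one dict of values per job run."""
    keys = list(matrix)
    return [
        dict(zip(keys, combination))
        for combination in product(*(matrix[key] for key in keys))
    ]


# One dimension: one job per repo.
one_dimensional = matrix_jobs({"target": ["18f/repo-one", "18f/repo-two"]})

# Two dimensions (hypothetical "branch" key): jobs multiply.
two_dimensional = matrix_jobs(
    {"target": ["18f/repo-one", "18f/repo-two"], "branch": ["a", "b"]}
)
```

The one-dimensional case produces exactly as many jobs as there are repos, while the two-dimensional case produces four.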
The workflow is configured to be triggered by changes to wiki pages using the gollum event. Each workflow run follows this sequence of steps:
In order to interact with multiple repos, we can't rely on either the automatic token created by the Action or a personal access token. Instead, we use a GitHub App to generate a short-term token for us. We use a third-party action called github-app-token to take the App credentials and use them to fetch an access token.
The github-app-token action needs to know the ID, installation ID, and private key of the GitHub App. Contact #admins-github on Slack to get those credentials.
Our workflow is triggered by a wiki changing, so we'll start by checking out the changed wiki to use as the source of truth. We determine the git URL of the wiki by adding .wiki to the repo name provided by GitHub when the workflow runs.
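The URL derivation might look like the sketch below. It assumes the repo name arrives in owner/name form, as it does in the GITHUB_REPOSITORY environment variable that GitHub sets for workflow runs. The function name and the fallback value are hypothetical.

```python
import os


def wiki_clone_url(repository: str, host: str = "https://github.com") -> str:
    """Build the git URL of a repo's wiki by appending .wiki to the repo name."""
    return f"{host}/{repository}.wiki.git"


# In a workflow run, GITHUB_REPOSITORY holds the repo that triggered the run.
# The fallback is a placeholder so the sketch also works outside of Actions.
source_repo = os.environ.get("GITHUB_REPOSITORY", "18f/repo-one")
source_wiki = wiki_clone_url(source_repo)
```

The same helper works for the target wiki, given the target value from the matrix.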
The target wiki is determined by the target value in the matrix. Each workflow job run within a single workflow run will have a unique target value. For example, if the matrix target was [18f/repo-one, 18f/repo-two], there would be two workflow job runs. The first's target would be 18f/repo-one and the second's target would be 18f/repo-two.
The target wiki is the one that will be updated from the source-of-truth wiki from step 2.
The list of files that should be synchronized across repos is defined in this step. We use rsync to copy those files from the source wiki to the target wiki.
We use rsync because it is better at preventing file corruption than the cp command. rsync uses checksums to ensure that the copied data is not changed from the source data. It also does partial file transfers, which is often faster, but given the small number of files and their nature, that is likely negligible for us.
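The idea of a checksum-verified copy can be sketched in a few lines. This is not how rsync is implemented and not what our workflow runs. It only shows the principle of comparing a digest of the source with a digest of the copy, and all names here are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_verified(source: Path, target: Path) -> None:
    """Copy a file, then fail loudly if the copy does not match the source."""
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"checksum mismatch after copying {source}")


def sync_pages(source_dir: Path, target_dir: Path, pages: List[str]) -> None:
    """Copy each listed wiki page from the source checkout to the target checkout."""
    for page in pages:
        copy_verified(source_dir / page, target_dir / page)
```

A plain copy would skip the comparison step and silently accept a damaged file.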
After the files are copied, we use git to determine if any files have changed. If they have, then we commit and push them directly to the target wiki. Because there is no pull request interface for wikis, the changes are made directly to the wiki's default branch.
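The "commit only if something changed" logic could be sketched as follows. It assumes the change check uses git status --porcelain, which prints nothing for a clean working tree. The actual workflow may check differently, and the function names and commit message are hypothetical.

```python
import subprocess


def has_changes(porcelain_output: str) -> bool:
    """git status --porcelain prints one line per changed file, nothing if clean."""
    return bool(porcelain_output.strip())


def commit_and_push(wiki_dir: str, message: str = "Sync wiki pages") -> bool:
    """Commit and push the target wiki checkout if any file changed."""
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=wiki_dir,
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    if not has_changes(status):
        return False  # nothing to sync, so no empty commit is created
    for command in (
        ["git", "add", "--all"],
        ["git", "commit", "-m", message],
        ["git", "push"],  # goes straight to the wiki's default branch
    ):
        subprocess.run(command, cwd=wiki_dir, check=True)
    return True
```

Returning early on a clean tree keeps unchanged wikis from accumulating empty commits.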
NOTE: This synchronization is one-way. Changes are pushed to other repos, not pulled in.