Add data/cache version checking #85

Open
andrewda opened this issue Dec 14, 2017 · 1 comment

Comments

andrewda commented Dec 14, 2017

This was discussed briefly in #78, but I'll use this issue as a way to better express my ideas around the bigger problem here. Our data.(js|min.js|yml) files are going to be changing quite often while this site is in development, not just in their values but also in their keys.

For example, suppose we make a large change to the data file whose desired effect is to move the Wikidata link of an organization to a new key:

- "wikidata_url": "//www.wikidata.org/wiki/Q170855"
+ "wikidata_link": "//www.wikidata.org/wiki/Q170855"

When we deploy the code, it will still expect a "wikidata_url" key to exist, despite the name being changed to "wikidata_link". On top of that, the caching code reuses the existing organization's object instead of making another API request to GCI (because we don't expect the returned data to change), so "wikidata_url" sticks around alongside the new "wikidata_link", leading to some very cluttered data.
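To make the clutter concrete, here is a minimal sketch of how a reused cached object could end up with both keys. The names `cachedOrg` and `buildOrg` are hypothetical and the real caching code may work differently; the only assumption is that the new code adds its key to the cached object rather than rebuilding it.

```javascript
// Hypothetical sketch: none of these names come from the project itself.
// An organization object written to the cache by the OLD version of the code.
const cachedOrg = {
  id: 1,
  name: 'Example Org',
  wikidata_url: '//www.wikidata.org/wiki/Q170855',
};

// The NEW code reuses the cached object (no new request to GCI)
// and attaches the renamed key on top of it.
function buildOrg(cached, wikidataLink) {
  return { ...cached, wikidata_link: wikidataLink };
}

const org = buildOrg(cachedOrg, '//www.wikidata.org/wiki/Q170855');

// Both the old and the new key are now present in the output.
console.log(Object.keys(org));
```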

So how do we fix this?

Pretty straightforward. Adding a CACHE_VERSION or DATA_VERSION constant in our code and writing that value into our data.(js|min.js|yml) files will allow us to check whether our data format is up to date. We would increment this version every time a breaking change is made to the data format (be it adding or removing a key, or renaming an existing key). If the DATA_VERSION we expect and the DATA_VERSION of the data file we're reading from don't match, we ignore the existing file and re-fetch all the data.
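A rough sketch of what that check could look like, assuming the data file is read as a JSON string with a top-level `version` field and an `orgs` array. The `loadCache` helper and the file layout are assumptions for illustration, not the project's actual format.

```javascript
// Hypothetical constant: bump it on every breaking change to the data format.
const DATA_VERSION = 2;

// Returns the cached organizations when the file matches the expected
// version, or null when the caller should ignore the cache and re-fetch.
function loadCache(raw, expectedVersion = DATA_VERSION) {
  let parsed;
  try {
    parsed = JSON.parse(raw);
  } catch (err) {
    return null; // unreadable cache: treat it as missing
  }
  if (!parsed || parsed.version !== expectedVersion) {
    return null; // missing or outdated version: ignore the cache
  }
  return parsed.orgs;
}

const current = loadCache(JSON.stringify({ version: 2, orgs: [{ id: 1 }] }));
const outdated = loadCache(JSON.stringify({ version: 1, orgs: [{ id: 1 }] }));
```

Here `current` holds the cached list, while `outdated` is `null`, which would be the signal to re-fetch everything.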

jayvdb commented Dec 14, 2017

Ignoring the cache whenever the schema version changes is too simplistic; our rendered version would regress too often. Instead, we would eventually need 'migrations': scripts which convert the data from one version to the next. But I don't see that being particularly important to build as an independent feature of this build system. It is a lot of work to build generically. Maybe we can find an existing system in Node, but I doubt it. It is something we can build using the Django framework for the community project.
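For illustration only, a migration step for the rename discussed above could be sketched like this. The `MIGRATIONS` table, the `migrate` function, and the `{ version, orgs }` layout are all hypothetical; a generic system would need considerably more than this.

```javascript
// Hypothetical table: the entry for N upgrades one organization
// from data version N to version N + 1.
const MIGRATIONS = {
  // Version 1 -> 2: rename "wikidata_url" to "wikidata_link".
  1: (org) => {
    const { wikidata_url, ...rest } = org;
    return wikidata_url === undefined
      ? rest
      : { ...rest, wikidata_link: wikidata_url };
  },
};

// Applies each step in order until the data reaches targetVersion.
function migrate(data, targetVersion) {
  let { version, orgs } = data;
  while (version < targetVersion) {
    const step = MIGRATIONS[version];
    if (!step) {
      throw new Error(`No migration from version ${version}`);
    }
    orgs = orgs.map(step);
    version += 1;
  }
  return { version, orgs };
}

const migrated = migrate(
  { version: 1, orgs: [{ id: 1, wikidata_url: '//www.wikidata.org/wiki/Q170855' }] },
  2
);
```

Unlike discarding the cache, this keeps the existing values and only rewrites their keys, so the rendered output does not regress while data is re-fetched.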

There is also the added complexity that if we do have downstream users of the js/yaml, we need to maintain the old keys anyway, so we would need to version the filenames if there is a breaking change to the contents.

At the moment, each PR that includes key renames should also remove the old keys. We forgot that in #83? Send a PR to kill the old names, and we are done for the moment, but we can revisit this if renames become more frequent.
