Cache regression tests in workflow #495
Overall looks good. Only one change is needed.
.github/workflows/main.yml
${{ runner.os }}-build-
${{ runner.os }}-
Should we try to restore files from caches that do not match env.cache-name?
If I understand correctly, it could load files from other workflow caches that use a similar naming scheme (Linux*). I don't think this is helpful, and it could potentially break things.
Good point. I added this cache before looking into this repo's existing caches; it's true that this could cause problems in this case, and in future cases too.
For now restore keys are:
restore-keys: |
${{ env.cache-name }}-${{ runner.os }}-${{ job.container.id }}-
${{ env.cache-name }}-${{ runner.os }}-
It is not surprising that including only the first restore key entry caused jobs to fail, since the jobs that run the tests executed in a container with a different ID than the ones that build and cache them.
Maybe in the second action we should depend only on the ${{ env.cache-name }}-${{ runner.os }}- restore-keys entry to download the tests? The exact key could never be matched, because we don't know the test files' hash, so it will always fall back to the restore key. That is the intended behaviour: download the latest test files from the current branch or master.
In this case some always-invalid key could be used. Then hashFiles would also not crash the CI.
The ${{ job.container.id }}- part should not be used here, because it is an independent Docker image that doesn't change the uploaded files.
What do you think?
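One way to read this suggestion is a restore-only step in the Run job. The sketch below is an assumption, not the code from this PR: it swaps in the `actions/cache/restore` sub-action (which downloads but never saves) in place of the `actions/cache` step discussed here. The `cache-name` value, the step name, and the `-no-exact-match` suffix are hypothetical placeholders; in particular, the placeholder is not the key originally proposed in the comment above.

```yaml
# Hypothetical Run-job fragment; names and values are illustrative assumptions.
env:
  cache-name: cache-regression-tests  # assumed value, not taken from the PR

steps:
  - name: Restore compiled riscv-tests
    uses: actions/cache/restore@v3
    with:
      path: test/external/riscv-tests
      # Placeholder key that is never expected to match exactly,
      # so the lookup always falls through to restore-keys.
      key: ${{ env.cache-name }}-${{ runner.os }}-no-exact-match
      # Prefix scoped to this cache name only; no container id.
      restore-keys: |
        ${{ env.cache-name }}-${{ runner.os }}-
```

Because a restore-only step never uploads anything, the Run job cannot create a second cache entry of its own under this prefix.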
I've changed hashFiles to hash the contents of the test/external/riscv-tests directory.
It was empty before because it tried to calculate the hash from files that didn't exist before the build. After the change it crashed in the fetching phase because it tried to follow encoding.h, which is a symlink into a submodule that wasn't fetched in that job.
I'm afraid it is still incorrect :(
After compilation, the Build action pushes a cache with id: ...-(riscv-toolchain docker id)-(hash of directory with asm tests)
Then the Run action downloads it and pushes it again to a second cache, ...-(verilator docker id)-(hash of directory with only asm tests). This is the id that would be checked next time in the Run action.
I don't know if this is the intended design. In my opinion pushing two caches is not necessary: in the Run action only the latest correct cache from Build needs to be downloaded, with the same key, which avoids all the following problems. We don't care about the verilator docker image, because it is not used to build the tests and does not output anything to the cache.
In any case, the current solution would not work when riscv-toolchain changes, because the second cache would still hit (its id does not change in any way)!
I'm not entirely sure about it, but it looks like if the tests are updated, the Run action gets a cache miss, but under the current priority it would select the ${{ env.cache-name }}-${{ runner.os }}-${{ job.container.id }}- prefix first, so it would pick the previous Run cache with a different file hash and not update to the new build files.
There is also a warning in the logs: Run regression tests Unexpected input(s) 'submodules', valid inputs are ['path', 'key', 'restore-keys', 'upload-chunk-size', 'enableCrossOsArchive', 'fail-on-cache-miss', 'lookup-only']
submodules should be in the checkout stage; that also explains why the file hashes differ between the two actions.
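The warning suggests moving `submodules` from the cache step to the checkout step. A minimal sketch, assuming `actions/checkout@v3` with a `recursive` setting (the version and value actually used in the repository are not shown in this thread). The key layout follows the "...-(docker id)-(hash)" description above, but the exact `hashFiles` pattern is an assumption.

```yaml
# Illustrative job steps; action versions and the hashFiles pattern are assumptions.
steps:
  - name: Checkout (with submodules)
    uses: actions/checkout@v3
    with:
      submodules: recursive  # assumed value; `true` would fetch only top-level submodules

  - name: Cache regression tests
    id: cache-regression
    uses: actions/cache@v3
    with:
      # `submodules` is not a valid input for this action, so it is not set here.
      path: test/external/riscv-tests
      key: ${{ env.cache-name }}-${{ runner.os }}-${{ job.container.id }}-${{ hashFiles('test/external/riscv-tests/**') }}
```

If both jobs check out the same submodule state before computing the key, `hashFiles` at least starts from the same file tree in Build and Run.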
9659cfc to e5eeddd
This was not working correctly because Also the last part of cache id (
d0ce05b to df5f9e0
df5f9e0 to 8061367
Please check JumiDeluxe#1, where I added a proposal to fix the problems reported by @piotro888.
Add need dependency.
I'm sorry this comes so late, but I have a small objection. Artifacts have one feature that caches don't: they can be downloaded. This is somewhat useful here, as one can get the compiled tests without needing the toolchain installed locally. Is there any way to keep the cache, but still be able to download the files if needed?
As discussed at the meeting, it still doesn't work correctly.
The hashFiles part of the key is different in the Build action than in the Run action when fetching the cache.
@lekcyjna123 found the probable reason for it: when submodules are fetched, the .git directory produces a different hash each time.
This behaviour causes the Run action to first fetch the cache from a partial match via a restore key and then reupload it with its own id at the end of the action (because of the cache miss). Next time, this could cause fetching the outdated Run cache (reuploaded with the Run id) instead of the Build cache (which could have changed and been updated).
The solution would be to exclude the .git directory from hashFiles, so that the keys always fully match (using ** at the end of the path was suggested at the meeting to ignore dot-dirs; this still needs to be verified).
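As an alternative to relying on how `**` treats dot-directories, `hashFiles` accepts several patterns, and a pattern starting with `!` excludes matching paths. The sketch below is untested against this repository; the exclusion patterns and the key layout are assumptions meant only to show the shape of such a key.

```yaml
# Illustrative cache step; the patterns are assumptions and were not verified in this PR.
steps:
  - name: Cache regression tests
    id: cache-regression
    uses: actions/cache@v3
    with:
      path: test/external/riscv-tests
      # The patterns starting with `!` are meant to keep git metadata out of the hash,
      # so Build and Run compute the same key for the same sources.
      key: ${{ env.cache-name }}-${{ runner.os }}-${{ hashFiles('test/external/riscv-tests/**', '!test/external/riscv-tests/**/.git', '!test/external/riscv-tests/**/.git/**') }}
```

Comparing the key printed in the Build and Run logs for the same commit would show whether the exclusion is sufficient.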
Co-authored-by: piotro888 <[email protected]>
9801315 to b67f78e
61cfe04 to 4eb22e9
test artifacts with dummy container id
4eb22e9 to b46df40
I've added a conditional step that uploads artifacts; it is executed on a cache miss, after the build:
- if: ${{ steps.cache-regression.outputs.cache-hit != 'true' }}
  run: cd test/external/riscv-tests && make
- if: ${{ steps.cache-regression.outputs.cache-hit != 'true' }}
  name: Upload riscv-tests
  uses: actions/upload-artifact@v3
  with:
    path: test/external/riscv-tests
477e688 to a52a0ba
a52a0ba to 1596cf6
1596cf6 to 10d8b4c
a94d7a8 to d5e7e0f
Targets #476
The contents of the test/external/riscv-tests directory are no longer built in every run; they are now built only on a cache miss. They are now stored using the workflow cache instead of artifacts.