[GEN-1653] Add integration tests to GH actions workflow #582

Open
wants to merge 30 commits into
base: develop
143 changes: 121 additions & 22 deletions .github/workflows/ci.yml
@@ -6,8 +6,16 @@ name: build
 on:
   push:
     branches: [main, develop, 'GEN*', 'gen*']
+    paths-ignore:
+      - '**.md' # All Markdown files
+      - '**/docs/**' # Documentation directory
+      - '.github/workflows/codeql.yml' # Code scanner
+      - '.github/workflows/build_docs.yml' # mkdocs GH workflow
Contributor

There are some other paths I think we can ignore as well: tests/testthat and .pre-commit-config.yaml. There may be more. Since this is mainly about running through the pipeline, do you think we could use `paths` instead, so the workflow only triggers on changes to the relevant script folders?


   pull_request:
+    types:
+      - opened
+      - reopened

   release:
     types:
@@ -16,9 +24,36 @@ on:
 env:
   REGISTRY: ghcr.io
   IMAGE_NAME: ${{ github.repository }}
+  TEST_PROJECT_SYNID: syn7208886

 jobs:

+  determine-changes:
+    runs-on: ubuntu-latest
+    outputs:
+      non-test-changes: ${{ steps.get-changes.outputs.non-test-changes }}
+    steps:
+      - name: Checkout Code
+        uses: actions/checkout@v3
+        with:
+          fetch-depth: 0
+
+      - name: Determine Changes
+        id: get-changes
+        run: |
+          if [ "${{ github.event_name }}" == "push" ]; then
+            echo "Handling a push event..."
+            ALL_CHANGED_FILES=$(git diff --name-only HEAD^ HEAD)
+          elif [ "${{ github.event_name }}" == "pull_request" ]; then
+            echo "Handling a pull request event..."
+            git fetch origin ${{ github.base_ref }}
+            ALL_CHANGED_FILES=$(git diff --name-only origin/${{ github.base_ref }} HEAD)
+          fi
+          echo "All Changed Files: $ALL_CHANGED_FILES"
+          NON_TEST_CHANGES=$(echo "$ALL_CHANGED_FILES" | grep -v '^tests/' || true)
+          echo "Non-Test Changes: $NON_TEST_CHANGES"
+          echo "::set-output name=non-test-changes::$NON_TEST_CHANGES"
rxu17 marked this conversation as resolved.
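The `::set-output` workflow command used in the Determine Changes step is deprecated in favor of appending `name=value` lines to the file named by `$GITHUB_OUTPUT`, and a multi-line file list does not fit on a single `name=value` line. Below is a minimal sketch of one alternative, assuming the downstream job only needs a yes/no answer: collapse the filtered list to a `true`/`false` flag before writing it. The function name `write_non_test_flag` and the flag values are illustrative and not part of this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR): decide whether any changed file
# lives outside tests/ and record the answer as a single-line step output.
#   $1: newline-separated list of changed files
#   $2: output file; in a workflow this would be "$GITHUB_OUTPUT"
write_non_test_flag() {
  local changed_files="$1"
  local output_file="$2"
  local non_test
  local flag="false"

  # Same filter as the PR: drop every path under tests/.
  # `|| true` keeps the step alive when grep finds nothing to print.
  non_test=$(printf '%s\n' "$changed_files" | grep -v '^tests/' || true)

  if [ -n "$non_test" ]; then
    flag="true"
  fi

  # One name=value line per output; append so earlier outputs are preserved.
  echo "non-test-changes=${flag}" >> "$output_file"
}
```

With a flag like this, the consuming job could compare explicitly, for example `if: needs.determine-changes.outputs.non-test-changes == 'true'`, rather than relying on a non-empty string being truthy.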

   test:
     runs-on: ubuntu-latest
     strategy:
@@ -54,34 +89,14 @@ jobs:
         # Use always() to always run this step to publish test results when there are test failures
         if: ${{ always() }}


   lint:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
       - uses: psf/black@stable
         with:
           version: "~=23.12"
-  deploy:
-    needs: [test, lint]
-    runs-on: ubuntu-latest
-    if: github.event_name == 'release'
-    permissions:
-      id-token: write
-    steps:
-      - uses: actions/checkout@v4
-      - name: Set up Python
-        uses: actions/setup-python@v5
-        with:
-          python-version: '3.x'
-      - name: Install dependencies
-        run: |
-          python -m pip install --upgrade pip
-          pip install setuptools wheel build
-      - name: Build distributions
-        run: python -m build
-      - name: Publish to pypi
-        uses: pypa/gh-action-pypi-publish@release/v1


   build-container:
     needs: [test, lint]
@@ -109,7 +124,7 @@ jobs:
         uses: docker/metadata-action@v5
         with:
           images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}

       - name: Format tags as registry refs
         id: registry_refs
         env:
@@ -129,3 +144,87 @@ jobs:
           labels: ${{ steps.meta.outputs.labels }}
           cache-from: ${{ steps.registry_refs.outputs.tags }},mode=max
           cache-to: ${{ steps.registry_refs.outputs.tags }},mode=max

+  integration-tests:
+    needs: [determine-changes, lint, test, build-container]
+    runs-on: ubuntu-latest
+    if: ${{ needs.determine-changes.outputs.non-test-changes }}
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Extract Branch Name
+        run: |
+          if [ "$GITHUB_HEAD_REF" != "" ]; then
+            echo "BRANCH_NAME=$GITHUB_HEAD_REF" >> $GITHUB_ENV
thomasyu888 marked this conversation as resolved.
+          else
+            echo "BRANCH_NAME=${GITHUB_REF#refs/heads/}" >> $GITHUB_ENV
+          fi
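The Extract Branch Name step uses `GITHUB_HEAD_REF` for pull requests and otherwise strips `refs/heads/` from `GITHUB_REF`. Docker tags may only contain letters, digits, underscores, periods, and hyphens, so a branch such as `feature/x` cannot be used as a tag verbatim. The sketch below wraps the same logic in a function and adds a sanitizing step. It assumes the image was pushed under a tag sanitized the same way; the build job's tag scheme is not fully shown in this diff, so that should be checked before relying on it. `resolve_image_tag` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR): derive an image tag from the
# same variables the "Extract Branch Name" step reads.
#   $1: value of GITHUB_HEAD_REF (empty on push events)
#   $2: value of GITHUB_REF (e.g. refs/heads/develop)
resolve_image_tag() {
  local head_ref="$1"
  local ref="$2"
  local branch

  if [ -n "$head_ref" ]; then
    branch="$head_ref"            # pull_request: source branch name
  else
    branch="${ref#refs/heads/}"   # push: strip the refs/heads/ prefix
  fi

  # Assumption: replace every character that is not valid in a Docker tag
  # with a hyphen. The real tag scheme is decided by the build job.
  printf '%s' "$branch" | tr -c 'A-Za-z0-9_.-' '-'
}
```

In the workflow step this could be used as `echo "BRANCH_NAME=$(resolve_image_tag "$GITHUB_HEAD_REF" "$GITHUB_REF")" >> "$GITHUB_ENV"`.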

+      - name: Pull Public Docker Image from GHCR
+        run: |
+          docker pull ghcr.io/sage-bionetworks/genie:${{ env.BRANCH_NAME }}
+
+      - name: Start Docker Container
+        run: |
+          docker run -d --name genie-container \
+            -e SYNAPSE_AUTH_TOKEN="${{ secrets.SYNAPSE_AUTH_TOKEN }}" \
+            ghcr.io/sage-bionetworks/genie:${{ env.BRANCH_NAME }} \
+            sh -c "while true; do sleep 1; done"
Member

thomasyu888 Jan 24, 2025

What's the motivation for this `while true` statement?

What's your rationale for starting a long-running container instead of just having five `docker run` commands?

Contributor Author

Hmm, my understanding was that we had to keep the container "open" while running the code so that everything runs in one container and state persists between steps. Thinking about it now, though, it should probably mirror Nextflow, where each step is probably a separate `docker run`.

I also wonder if this is why I'm seeing differences between these outputs and the outputs I get when I run the Nextflow pipeline (something I am investigating...).

Member

Differences in outputs?! What are the differences?

Contributor Author

rxu17 Jan 24, 2025

They are all in the clinical consortium files. It's dropping more records for some reason (at least more than the Nextflow pipeline run does).
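The alternative raised in this thread is one `docker run` per step instead of a long-lived container kept alive with `while true`. It could look roughly like the sketch below. The sketch assumes that no state has to persist on the container filesystem between steps, which is the question this thread leaves open. `run_step` and `DRY_RUN` are hypothetical names, and the default `IMAGE` tag is a placeholder for the branch-tagged image.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of this PR): run one pipeline step in a
# fresh, auto-removed container instead of `docker exec` into a shared one.
# Set DRY_RUN=1 to print the command instead of executing it.
IMAGE="${IMAGE:-ghcr.io/sage-bionetworks/genie:develop}"  # placeholder tag

run_step() {
  # --rm removes the container when the step exits, so no cleanup step is needed.
  # -e SYNAPSE_AUTH_TOKEN (no value) forwards the variable from the caller's environment.
  set -- docker run --rm -e SYNAPSE_AUTH_TOKEN "$IMAGE" "$@"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Example: the validation step from the workflow, as its own container run.
# run_step python3 /root/Genie/bin/input_to_database.py mutation \
#   --project_id syn7208886 --onlyValidate \
#   --genie_annotation_pkg /root/annotation-tools
```

Because every step starts from the image's pristine filesystem, this layout would also reveal whether any step silently depends on files left behind by an earlier one.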


+      - name: Run validation in test pipeline
+        run: |
+          docker exec genie-container \
+            python3 /root/Genie/bin/input_to_database.py \
+            mutation \
+            --project_id ${{ env.TEST_PROJECT_SYNID }} \
+            --onlyValidate \
+            --genie_annotation_pkg /root/annotation-tools
+
+      - name: Run processing on mutation data in test pipeline
+        run: |
+          docker exec genie-container \
+            python3 /root/Genie/bin/input_to_database.py mutation \
+            --project_id ${{ env.TEST_PROJECT_SYNID }} \
+            --genie_annotation_pkg /root/annotation-tools \
+            --createNewMafDatabase
+
+      - name: Run processing on non-mutation data in test pipeline
+        run: |
+          docker exec genie-container \
+            python3 /root/Genie/bin/input_to_database.py main \
+            --project_id ${{ env.TEST_PROJECT_SYNID }}
+
+      - name: Run consortium release in test pipeline
+        run: |
+          docker exec genie-container \
+            python3 /root/Genie/bin/database_to_staging.py Jan-2017 ../cbioportal TEST --test
+
+      - name: Run public release in test pipeline
+        run: |
+          docker exec genie-container \
+            python3 /root/Genie/bin/consortium_to_public.py Jan-2017 ../cbioportal TEST --test
+
+      - name: Stop and Remove Docker Container
+        run: docker stop genie-container && docker rm genie-container
+
+  deploy:
+    needs: [test, lint, build-container, integration-tests]
+    runs-on: ubuntu-latest
+    if: github.event_name == 'release'
+    permissions:
+      id-token: write
+    steps:
+      - uses: actions/checkout@v4
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.x'
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install setuptools wheel build
+      - name: Build distributions
+        run: python -m build
+      - name: Publish to pypi
+        uses: pypa/gh-action-pypi-publish@release/v1
6 changes: 3 additions & 3 deletions .github/workflows/codeql.yml
@@ -42,7 +42,7 @@ jobs:

     # Initializes the CodeQL tools for scanning.
     - name: Initialize CodeQL
-      uses: github/codeql-action/init@v2
+      uses: github/codeql-action/init@v3
       with:
         languages: ${{ matrix.language }}
         # If you wish to specify custom queries, you can do so here or in a config file.
@@ -56,7 +56,7 @@ jobs:
     # Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
     # If this step fails, then you should remove it and run the build manually (see below)
     - name: Autobuild
-      uses: github/codeql-action/autobuild@v2
+      uses: github/codeql-action/autobuild@v3

     # ℹ️ Command-line programs to run using the OS shell.
     # 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun
@@ -69,4 +69,4 @@ jobs:
     # ./location_of_script_within_repo/buildscript.sh

     - name: Perform CodeQL Analysis
-      uses: github/codeql-action/analyze@v2
+      uses: github/codeql-action/analyze@v3
2 changes: 2 additions & 0 deletions tests/test_clinical.py
@@ -1,3 +1,5 @@
+"""Test GENIE Clinical class"""
+
 import datetime
 import json
 from collections import Counter