From 386be93257f008f417874ef51cac347cdb05eb91 Mon Sep 17 00:00:00 2001 From: Weisu Yin Date: Wed, 2 Dec 2020 14:21:56 -0800 Subject: [PATCH] Github Actions for CI (#1541) * [WIP] Github Actions (#1) * incorporate autodatasets (#1496) * Add torch clarification (#1495) * Add torch clarification * fix * Fix auto detectors (#1497) * fix yolo predictor * fix predict * fix config (#1498) * Added support for AWS Batch. Added support for docker (#1474) * Added support for AWS Batch. Added support for docker * Fixed style. Removed code in commet. Updated README to include boto3 usage * Renamed template file. Removed gluon aws id * fix readme * fix * fix imports (#1499) * fix imports * fix * fix image classification * fix * fix width height * fix * fix batch size * fix * fix * none to empty string (#1502) * [WIP] Tinycoco (#1501) * Add minicoco * update jenkins for minicoco * fix * renamed mini to tiny * fix * fix * fix, add VOCDetectionTiny * fix * fix env * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * test * test * test * clean up Co-authored-by: Joshua Z. Zhang * Fix rcnn target generator (#1508) * fix not used rcnn target generator * fix lint * fix * fix * add get flops (#1509) * warmup scheduler for video torch (#1510) 1. refine warmup logic, now using cfg.CONFIG.TRAIN.USE_WARMUP to control open warmup or not. 2. fix bug in gluoncv/torch/utils/lr_policy.py 3. change training configs 4. change ddp_train_pytorch and ddp_train_shortonly_pytorch, This is tested on ec2 machines * update torchvideo model zoo (#1513) * add ir-csn-152 into torchvideo model zoo (#1515) * Revise danet.py (#1507) The dropout layer should be placed before the classification layer. 
* icnet missing background class (#1518) * Add CSN model to torch video model zoo (#1517) * add ircsn * update model zoo * fix lint * Improve auto tasks (#1523) * use in-memory pickle instead of disk file * add feature extractor for image classification * add tests * fix * fix lint * more unittests * fix * fix * Added github action and workflow for sanity check * Removed container and actions. * Added unit test * Added build docs * Fix * Fix * Fix * Fix * Test * test * Update unit test * fix * fix * fix * fix * fix * fix * fix * subclass coco * fix * fix * fix * fix * rebase conflict * fix rebase * fix * fix * add aws authentication * add aws authentication * test * test * test * test * test * fix log * test * test * test * test * test * test * fix * rebase * add tiny motorbike * fix * model zoo * test * fix docker * parallel jobs * parallel jobs * fix * add torch * add torch * fix * fix * fix * full test * full test * test build docs * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * test branch * test branch * fix * test * test * add comment Co-authored-by: Joshua Z. Zhang Co-authored-by: Yi Zhu Co-authored-by: Xinyu Li Co-authored-by: Chunhui Liu Co-authored-by: YANYI ZHANG Co-authored-by: BebDong Co-authored-by: Kuang Haofei * [WIP] Test PR (#3) * Added github action and workflow for sanity check * Removed container and actions. 
* Added unit test * Added build docs * Fix * Fix * Fix * Fix * Test * test * Update unit test * fix * fix * fix * fix * fix * fix * fix * subclass coco * fix * fix * fix * fix * rebase conflict * fix rebase * fix * fix * add aws authentication * add aws authentication * test * test * test * test * test * fix log * test * test * test * test * test * test * fix * rebase * add tiny motorbike * fix * model zoo * test * fix docker * parallel jobs * parallel jobs * fix * add torch * add torch * fix * fix * fix * full test * full test * test build docs * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * test branch * test branch * fix * test * test * add comment * test * full test * full test * full test * test (#5) * test * fix * change to 12x * test comments * change to pr_target * [WIP] Full Test (#6) * full test * test model zoo * test model zoo * full test * full test * add auto * add gpu_test.sh * test efs modelzoo * test efs modelzoo * test efs modelzoo * test without auto * test repo name * test repo name * test repo name * test repo name * test sharemem * full test (#8) * [WIP] Github Actions (#1) * incorporate autodatasets (#1496) * Add torch clarification (#1495) * Add torch clarification * fix * Fix auto detectors (#1497) * fix yolo predictor * fix predict * fix config (#1498) * Added support for AWS Batch. Added support for docker (#1474) * Added support for AWS Batch. Added support for docker * Fixed style. Removed code in commet. Updated README to include boto3 usage * Renamed template file. 
Removed gluon aws id * fix readme * fix * fix imports (#1499) * fix imports * fix * fix image classification * fix * fix width height * fix * fix batch size * fix * fix * none to empty string (#1502) * [WIP] Tinycoco (#1501) * Add minicoco * update jenkins for minicoco * fix * renamed mini to tiny * fix * fix * fix, add VOCDetectionTiny * fix * fix env * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * test * test * test * clean up Co-authored-by: Joshua Z. Zhang * Fix rcnn target generator (#1508) * fix not used rcnn target generator * fix lint * fix * fix * add get flops (#1509) * warmup scheduler for video torch (#1510) 1. refine warmup logic, now using cfg.CONFIG.TRAIN.USE_WARMUP to control open warmup or not. 2. fix bug in gluoncv/torch/utils/lr_policy.py 3. change training configs 4. change ddp_train_pytorch and ddp_train_shortonly_pytorch, This is tested on ec2 machines * update torchvideo model zoo (#1513) * add ir-csn-152 into torchvideo model zoo (#1515) * Revise danet.py (#1507) The dropout layer should be placed before the classification layer. * icnet missing background class (#1518) * Add CSN model to torch video model zoo (#1517) * add ircsn * update model zoo * fix lint * Improve auto tasks (#1523) * use in-memory pickle instead of disk file * add feature extractor for image classification * add tests * fix * fix lint * more unittests * fix * fix * Added github action and workflow for sanity check * Removed container and actions. 
* Added unit test * Added build docs * Fix * Fix * Fix * Fix * Test * test * Update unit test * fix * fix * fix * fix * fix * fix * fix * subclass coco * fix * fix * fix * fix * rebase conflict * fix rebase * fix * fix * add aws authentication * add aws authentication * test * test * test * test * test * fix log * test * test * test * test * test * test * fix * rebase * add tiny motorbike * fix * model zoo * test * fix docker * parallel jobs * parallel jobs * fix * add torch * add torch * fix * fix * fix * full test * full test * test build docs * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * test branch * test branch * fix * test * test * add comment Co-authored-by: Joshua Z. Zhang Co-authored-by: Yi Zhu Co-authored-by: Xinyu Li Co-authored-by: Chunhui Liu Co-authored-by: YANYI ZHANG Co-authored-by: BebDong Co-authored-by: Kuang Haofei * [WIP] Test PR (#3) * Added github action and workflow for sanity check * Removed container and actions. 
* Added unit test * Added build docs * Fix * Fix * Fix * Fix * Test * test * Update unit test * fix * fix * fix * fix * fix * fix * fix * subclass coco * fix * fix * fix * fix * rebase conflict * fix rebase * fix * fix * add aws authentication * add aws authentication * test * test * test * test * test * fix log * test * test * test * test * test * test * fix * rebase * add tiny motorbike * fix * model zoo * test * fix docker * parallel jobs * parallel jobs * fix * add torch * add torch * fix * fix * fix * full test * full test * test build docs * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * test branch * test branch * fix * test * test * add comment * test * full test * full test * full test * test (#5) * test * fix * change to 12x * test comments * change to pr_target * [WIP] Full Test (#6) * full test * test model zoo * test model zoo * full test * full test * add auto * add gpu_test.sh * test efs modelzoo * test efs modelzoo * test efs modelzoo * test without auto * test repo name * test repo name * test repo name * test repo name * test sharemem * test pr only on yinweisu * test pr only on yinweisu * update repo name * test pr only on yinweisu (#9) * full test on pr only yinweisu (#10) * ready to pr * fix * change doc env name * add torch to env * add yacs to env * fix path Co-authored-by: Joshua Z. 
Zhang Co-authored-by: Yi Zhu Co-authored-by: Xinyu Li Co-authored-by: Chunhui Liu Co-authored-by: YANYI ZHANG Co-authored-by: BebDong Co-authored-by: Kuang Haofei --- .github/workflows/build_docs.sh | 41 +++ .github/workflows/ci.yml | 243 ++++++++++++++++++ .github/workflows/gpu_test.sh | 16 ++ Jenkinsfile | 8 +- .../model_zoo/rcnn/faster_rcnn/faster_rcnn.py | 1 + tests/py3_mxnet.yml | 2 +- tests/py3_mxnet_ci.yml | 27 ++ tools/batch/docker/Dockerfile.gpu | 7 +- tools/batch/docker/docker_deploy.sh | 2 +- tools/batch/submit-job.py | 15 +- 10 files changed, 348 insertions(+), 14 deletions(-) create mode 100644 .github/workflows/build_docs.sh create mode 100644 .github/workflows/ci.yml create mode 100644 .github/workflows/gpu_test.sh create mode 100644 tests/py3_mxnet_ci.yml diff --git a/.github/workflows/build_docs.sh b/.github/workflows/build_docs.sh new file mode 100644 index 0000000000..57d897c491 --- /dev/null +++ b/.github/workflows/build_docs.sh @@ -0,0 +1,41 @@ +#!/usr/bin/env bash + +BRANCH=$(basename $1) +COMMIT_SHA=$2 +GIT_REPO=$3 +PR_NUMBER=$4 + +EFS=/mnt/efs + +mkdir -p ~/.mxnet/datasets +for f in $EFS/.mxnet/datasets/*; do + if [ -d "$f" ]; then + # Will not run if no directories are available + ln -s $f ~/.mxnet/datasets/$(basename "$f") + fi +done + +python3 -m pip install 'sphinx>=1.5.5' sphinx-gallery sphinx_rtd_theme matplotlib Image recommonmark scipy mxtheme + +export MXNET_CUDNN_AUTOTUNE_DEFAULT=0 +cd docs +make html +COMMAND_EXIT_CODE=$?
+sed -i.bak 's/33\\,150\\,243/23\\,141\\,201/g' build/html/_static/material-design-lite-1.3.0/material.blue-deep_orange.min.css; +sed -i.bak 's/2196f3/178dc9/g' build/html/_static/sphinx_materialdesign_theme.css; +sed -i.bak 's/pre{padding:1rem;margin:1.5rem\\s0;overflow:auto;overflow-y:hidden}/pre{padding:1rem;margin:1.5rem 0;overflow:auto;overflow-y:scroll}/g' build/html/_static/sphinx_materialdesign_theme.css + +if [[ ($BRANCH == master) && ($GIT_REPO == dmlc/gluon-cv) ]]; then + # aws s3 cp s3://gluon-cv.mxnet.io/coverage.svg build/html/coverage.svg + aws s3 sync --delete build/html/ s3://gluoncv-ci/build_docs/master/ --acl public-read --cache-control max-age=7200 + # aws s3 cp build/html/coverage.svg s3://gluon-cv.mxnet.io/coverage.svg --acl public-read --cache-control max-age=300 + # echo "Uploaded doc to http://gluon-cv.mxnet.io" + echo master +else + # aws s3 cp s3://gluoncv-ci/build_docs/$PR_NUMBER/$COMMIT_SHA/coverage.svg build/html/coverage.svg + aws s3 sync --delete build/html/ s3://gluoncv-ci/build_docs/$PR_NUMBER/$COMMIT_SHA/ --acl public-read + # echo "Uploaded doc to http://gluon-vision-staging.s3-website-us-west-2.amazonaws.com/${env.BRANCH_NAME}/${env.BUILD_NUMBER}/index.html" + echo "Uploaded doc to https://gluoncv-ci.s3-us-west-2.amazonaws.com/build_docs/$PR_NUMBER/$COMMIT_SHA/index.html" + echo $GIT_REPO: $BRANCH +fi; +exit $COMMAND_EXIT_CODE diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml new file mode 100644 index 0000000000..c3173945ac --- /dev/null +++ b/.github/workflows/ci.yml @@ -0,0 +1,243 @@ +name: CI workflow +on: + push: + branches: + - master + pull_request_target: +jobs: + unittests: + if: ${{ (github.event_name == 'push' && github.repository == 'yinweisu/gluon-cv') || (github.event_name == 'pull_request_target' && github.event.pull_request.head.repo.full_name == 'yinweisu/gluon-cv') }} + runs-on: ${{ matrix.os }} + strategy: + matrix: + os: [macos-latest, windows-latest, ubuntu-latest] + steps: + - name: Checkout 
repository(For push) + if: ${{ github.event_name == 'push' }} + uses: actions/checkout@v2 + - name: Checkout Pull Request Repository(For pull request) + if: ${{ github.event_name == 'pull_request' || github.event_name == 'pull_request_target' }} + uses: actions/checkout@v2 + with: + repository: ${{ github.event.pull_request.head.repo.full_name }} + ref: ${{ github.event.pull_request.head.ref }} + - name: Setup Miniconda + uses: conda-incubator/setup-miniconda@v2.0.0 + with: + auto-update-conda: true + python-version: 3.7 + - name: sanity-check + shell: bash -l {0} + run: | + conda env create -n gluon_cv_lint -f ./tests/pylint.yml + conda env update -n gluon_cv_lint -f ./tests/pylint.yml --prune + conda activate gluon_cv_lint + conda list + make clean + make pylint + - name: unit-test + shell: bash -l {0} + run: | + conda env create -n gluon_cv_py3_test -f tests/py3_mxnet_ci.yml + conda env update -n gluon_cv_py3_test -f tests/py3_mxnet_ci.yml --prune + conda activate gluon_cv_py3_test + conda list + export CUDA_VISIBLE_DEVICES=0 + export KMP_DUPLICATE_LIB_OK=TRUE + make clean + pip install --upgrade --force-reinstall --no-deps . 
+ env + export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64 + export MPLBACKEND=Agg + export MXNET_CUDNN_AUTOTUNE_DEFAULT=0 + export TINY_COCO=~/.mxnet/datasets/tiny_coco + export TINY_MOTORBIKE=~/.mxnet/datasets/tiny_motorbike + mkdir -p $TINY_COCO/annotations + curl -s https://gluoncv-ci.s3-us-west-2.amazonaws.com/mini_coco/sub_val.zip --output sub_val.zip + unzip -q sub_val.zip -d $TINY_COCO + mv $TINY_COCO/sub_val $TINY_COCO/val2017 + curl -s https://gluoncv-ci.s3-us-west-2.amazonaws.com/mini_coco/instances_val2017_tiny.json --output instances_val2017_tiny.json + mv instances_val2017_tiny.json $TINY_COCO/annotations + curl -s https://gluoncv-ci.s3-us-west-2.amazonaws.com/tiny_motorbike.zip --output tiny_motorbike.zip + unzip -q tiny_motorbike.zip -d $TINY_MOTORBIKE + nosetests --with-timer --timer-ok 5 --timer-warning 20 -x --with-coverage --cover-package gluoncv -v tests/unittests + model_zoo_mxnet: + if: ${{ (github.event_name == 'push' && github.repository == 'yinweisu/gluon-cv') || (github.event_name == 'pull_request_target' && github.event.pull_request.head.repo.full_name == 'yinweisu/gluon-cv') }} + needs: unittests + runs-on: ubuntu-latest + steps: + - name: checkout + uses: actions/checkout@v2 + - name: Configure AWS Credentials + uses: aws-actions/configure-aws-credentials@v1 + with: + aws-access-key-id: ${{ secrets.GLUONCV_DEV_ACCESS_ID }} + aws-secret-access-key: ${{ secrets.GLUONCV_DEV_SECRET_ACCESS_KEY }} + aws-region: us-east-1 + - name: Install dependencies + run: | + pip install --upgrade --force-reinstall --no-deps . + pip install boto3 + - name: Test model_zoo_mxnet on AWS Batch(For push) + shell: bash -l {0} + if: ${{ github.event_name == 'push' }} + run: | + echo "Start submitting job" + python ./tools/batch/submit-job.py --region us-east-1 \ + --job-type p3.2x \ + --name GluonCV-GPU-ModelZooMxnet-${{ github.ref }} \ + --source-ref ${{ github.ref }} \ + --work-dir . 
\ + --remote https://github.com/${{ github.repository }} \ + --command "chmod +x ./.github/workflows/gpu_test.sh && ./.github/workflows/gpu_test.sh gluoncv tests/model_zoo" \ + --wait + - name: Test model_zoo_mxnet on AWS Batch(For pull request) + if: ${{ github.event_name == 'pull_request' || github.event_name == 'pull_request_target' }} + run: | + echo "Start submitting job" + python ./tools/batch/submit-job.py --region us-east-1 \ + --job-type p3.2x \ + --name GluonCV-GPU-ModelZooMxnet-PR#${{ github.event.number }} \ + --source-ref ${{ github.event.pull_request.head.ref }} \ + --work-dir . \ + --remote https://github.com/${{ github.event.pull_request.head.repo.full_name }} \ + --command "chmod +x ./.github/workflows/gpu_test.sh && ./.github/workflows/gpu_test.sh gluoncv tests/model_zoo" \ + --wait + model_zoo_torch: + if: ${{ (github.event_name == 'push' && github.repository == 'yinweisu/gluon-cv') || (github.event_name == 'pull_request_target' && github.event.pull_request.head.repo.full_name == 'yinweisu/gluon-cv') }} + needs: unittests + runs-on: ubuntu-latest + steps: + - name: checkout + uses: actions/checkout@v2 + - name: Configure AWS Credentials + uses: aws-actions/configure-aws-credentials@v1 + with: + aws-access-key-id: ${{ secrets.GLUONCV_DEV_ACCESS_ID }} + aws-secret-access-key: ${{ secrets.GLUONCV_DEV_SECRET_ACCESS_KEY }} + aws-region: us-east-1 + - name: Install dependencies + run: | + pip install --upgrade --force-reinstall --no-deps . + pip install boto3 + - name: Test model_zoo_torch on AWS Batch(For push) + shell: bash -l {0} + if: ${{ github.event_name == 'push' }} + run: | + echo "Start submitting job" + python ./tools/batch/submit-job.py --region us-east-1 \ + --job-type p3.2x \ + --name GluonCV-GPU-ModelZooTorch-${{ github.ref }} \ + --source-ref ${{ github.ref }} \ + --work-dir . 
\ + --remote https://github.com/${{ github.repository }} \ + --command "chmod +x ./.github/workflows/gpu_test.sh && ./.github/workflows/gpu_test.sh gluoncv/torch tests/model_zoo_torch" \ + --wait + - name: Test model_zoo_torch on AWS Batch(For pull request) + if: ${{ github.event_name == 'pull_request' || github.event_name == 'pull_request_target' }} + run: | + echo "Start submitting job" + python ./tools/batch/submit-job.py --region us-east-1 \ + --job-type p3.2x \ + --name GluonCV-GPU-ModelZooTorch-PR#${{ github.event.number }} \ + --source-ref ${{ github.event.pull_request.head.ref }} \ + --work-dir . \ + --remote https://github.com/${{ github.event.pull_request.head.repo.full_name }} \ + --command "chmod +x ./.github/workflows/gpu_test.sh && ./.github/workflows/gpu_test.sh gluoncv/torch tests/model_zoo_torch" \ + --wait + auto: + if: ${{ (github.event_name == 'push' && github.repository == 'yinweisu/gluon-cv') || (github.event_name == 'pull_request_target' && github.event.pull_request.head.repo.full_name == 'yinweisu/gluon-cv') }} + needs: unittests + runs-on: ubuntu-latest + steps: + - name: checkout + uses: actions/checkout@v2 + - name: Configure AWS Credentials + uses: aws-actions/configure-aws-credentials@v1 + with: + aws-access-key-id: ${{ secrets.GLUONCV_DEV_ACCESS_ID }} + aws-secret-access-key: ${{ secrets.GLUONCV_DEV_SECRET_ACCESS_KEY }} + aws-region: us-east-1 + - name: Install dependencies + run: | + pip install --upgrade --force-reinstall --no-deps . + pip install boto3 + - name: Test auto on AWS Batch(For push) + shell: bash -l {0} + if: ${{ github.event_name == 'push' }} + run: | + echo "Start submitting job" + python ./tools/batch/submit-job.py --region us-east-1 \ + --job-type p3.2x \ + --name GluonCV-GPU-Auto-${{ github.ref }} \ + --source-ref ${{ github.ref }} \ + --work-dir . 
\ + --remote https://github.com/${{ github.repository }} \ + --command "chmod +x ./.github/workflows/gpu_test.sh && ./.github/workflows/gpu_test.sh gluoncv tests/auto" \ + --wait + - name: Test auto on AWS Batch(For pull request) + if: ${{ github.event_name == 'pull_request' || github.event_name == 'pull_request_target' }} + run: | + echo "Start submitting job" + python ./tools/batch/submit-job.py --region us-east-1 \ + --job-type p3.2x \ + --name GluonCV-GPU-Auto-PR#${{ github.event.number }} \ + --source-ref ${{ github.event.pull_request.head.ref }} \ + --work-dir . \ + --remote https://github.com/${{ github.event.pull_request.head.repo.full_name }} \ + --command "chmod +x ./.github/workflows/gpu_test.sh && ./.github/workflows/gpu_test.sh gluoncv tests/auto" \ + --wait + build-docs: + if: ${{ (github.event_name == 'push' && github.repository == 'yinweisu/gluon-cv') || (github.event_name == 'pull_request_target' && github.event.pull_request.head.repo.full_name == 'yinweisu/gluon-cv') }} + needs: [unittests, model_zoo_mxnet, model_zoo_torch, auto] + runs-on: ubuntu-latest + steps: + - name: checkout + uses: actions/checkout@v2 + - name: Configure AWS Credentials + uses: aws-actions/configure-aws-credentials@v1 + with: + aws-access-key-id: ${{ secrets.GLUONCV_DEV_ACCESS_ID }} + aws-secret-access-key: ${{ secrets.GLUONCV_DEV_SECRET_ACCESS_KEY }} + aws-region: us-east-1 + - name: Install dependencies + run: | + pip install --upgrade --force-reinstall --no-deps . + pip install boto3 + - name: Set SHA outputs + id: vars + run: echo "::set-output name=sha_short::$(git rev-parse --short HEAD)" + - name: Build docs on AWS Batch(For push) + shell: bash -l {0} + if: ${{ github.event_name == 'push' }} + run: | + echo "Start submitting job" + python ./tools/batch/submit-job.py --region us-east-1 \ + --job-type p3.2x \ + --name GluonCV-GPU-BuildDocs-${{ github.ref }} \ + --source-ref ${{ github.ref }} \ + --work-dir . 
\ + --remote https://github.com/${{ github.repository }} \ + --command "chmod +x ./.github/workflows/build_docs.sh && ./.github/workflows/build_docs.sh ${{ github.ref }} ${{ steps.vars.outputs.sha_short }} ${{ github.repository }} ${{ github.event.number }}" \ + --wait + - name: Build docs on AWS Batch(For pull request) + if: ${{ github.event_name == 'pull_request' || github.event_name == 'pull_request_target' }} + run: | + echo "Start submitting job" + python ./tools/batch/submit-job.py --region us-east-1 \ + --job-type p3.2x \ + --name GluonCV-GPU-BuildDocs-PR#${{ github.event.number }} \ + --source-ref ${{ github.event.pull_request.head.ref }} \ + --work-dir . \ + --remote https://github.com/${{ github.event.pull_request.head.repo.full_name }} \ + --command "chmod +x ./.github/workflows/build_docs.sh && ./.github/workflows/build_docs.sh ${{ github.event.pull_request.head.ref }} ${{ steps.vars.outputs.sha_short }} ${{ github.event.pull_request.head.repo.full_name }} ${{ github.event.number }} " \ + --wait + - name: Comment on PR + if: ${{ github.event_name == 'pull_request' || github.event_name == 'pull_request_target' }} + uses: peter-evans/create-or-update-comment@v1.4.3 + with: + issue-number: ${{ github.event.number }} + body: | + Job ${{ github.event.number }}-${{ steps.vars.outputs.sha_short }} is done. 
+ Docs are uploaded to http://gluon-vision-staging.s3-website-us-west-2.amazonaws.com/${{ github.event.number }}/${{ steps.vars.outputs.sha_short }}/index.html + \ No newline at end of file diff --git a/.github/workflows/gpu_test.sh b/.github/workflows/gpu_test.sh new file mode 100644 index 0000000000..3851059ef1 --- /dev/null +++ b/.github/workflows/gpu_test.sh @@ -0,0 +1,16 @@ +#!/usr/bin/env bash + +COVER_PACKAGE=$1 +TESTS_PATH=$2 + +EFS=/mnt/efs + +mkdir -p ~/.mxnet/models +for f in $EFS/.mxnet/models/*.params; do + ln -s $f ~/.mxnet/models/$(basename "$f") +done + +export MXNET_CUDNN_AUTOTUNE_DEFAULT=0 +export MPLBACKEND=Agg +export KMP_DUPLICATE_LIB_OK=TRUE +nosetests --with-timer --timer-ok 5 --timer-warning 20 -x --with-coverage --cover-package $COVER_PACKAGE -v $TESTS_PATH diff --git a/Jenkinsfile b/Jenkinsfile index 385f8b1056..3c0a1b3410 100644 --- a/Jenkinsfile +++ b/Jenkinsfile @@ -135,11 +135,11 @@ stage("Build Docs") { checkout scm VISIBLE_GPU=env.EXECUTOR_NUMBER.toInteger() % 8 sh """#!/bin/bash - conda env remove -n gluon_vision_docs -y + conda env remove -n gluon_cv_docs -y set -ex - conda env create -n gluon_vision_docs -f docs/build.yml - conda env update -n gluon_vision_docs -f docs/build.yml --prune - conda activate gluon_vision_docs + conda env create -n gluon_cv_docs -f docs/build.yml + conda env update -n gluon_cv_docs -f docs/build.yml --prune + conda activate gluon_cv_docs export PYTHONPATH=\${PWD} export CUDA_VISIBLE_DEVICES=${VISIBLE_GPU} env diff --git a/gluoncv/model_zoo/rcnn/faster_rcnn/faster_rcnn.py b/gluoncv/model_zoo/rcnn/faster_rcnn/faster_rcnn.py index 20e456a86b..d850b422b2 100644 --- a/gluoncv/model_zoo/rcnn/faster_rcnn/faster_rcnn.py +++ b/gluoncv/model_zoo/rcnn/faster_rcnn/faster_rcnn.py @@ -223,6 +223,7 @@ def target_generator(self): raise ValueError("`minimal_opset` enabled, target generator is not available") if not isinstance(self._target_generator, mx.gluon.Block): self._target_generator = self._target_generator() + 
self._target_generator.initialize() return self._target_generator def reset_class(self, classes, reuse_weights=None): diff --git a/tests/py3_mxnet.yml b/tests/py3_mxnet.yml index 6cb3b64b71..76d0cfb2af 100644 --- a/tests/py3_mxnet.yml +++ b/tests/py3_mxnet.yml @@ -23,4 +23,4 @@ dependencies: - opencv-python - git+https://github.com/zhanghang1989/detail-api.git#subdirectory=PythonAPI - portalocker - - autocfg + - autocfg \ No newline at end of file diff --git a/tests/py3_mxnet_ci.yml b/tests/py3_mxnet_ci.yml new file mode 100644 index 0000000000..604daf8e5d --- /dev/null +++ b/tests/py3_mxnet_ci.yml @@ -0,0 +1,27 @@ +name: gluon_cv_py3_mxnet +channels: + - conda-forge + - defaults +dependencies: + - python=3.6 + - perl + - sphinx=1.7.2 + - nose + - coverage=4.5.4 + - scipy + - cython + - pip=20.2.4 + - requests + - matplotlib + - tqdm + - pillow + - pip: + - mxnet + - coverage-badge + - awscli + - nose-timer + - opencv-python + - git+https://github.com/zhanghang1989/detail-api.git#subdirectory=PythonAPI + - portalocker + - autocfg + - boto3 diff --git a/tools/batch/docker/Dockerfile.gpu b/tools/batch/docker/Dockerfile.gpu index 1b96a8afed..2a04f60b2b 100644 --- a/tools/batch/docker/Dockerfile.gpu +++ b/tools/batch/docker/Dockerfile.gpu @@ -20,6 +20,7 @@ RUN apt-get update && apt-get install -y --no-install-recommends \ python3-pip \ python3-setuptools \ pandoc \ + libgl1-mesa-glx \ libxft-dev &&\ rm -rf /var/lib/apt/lists/* @@ -28,7 +29,11 @@ RUN pip3 install --no-cache --upgrade \ wheel \ cmake \ awscli \ - pypandoc + pypandoc \ + nose \ + nose-timer \ + torch \ + torchvision RUN git clone https://github.com/dmlc/gluon-cv WORKDIR gluon-cv ADD gluon_cv_job.sh . 
diff --git a/tools/batch/docker/docker_deploy.sh b/tools/batch/docker/docker_deploy.sh index 0335dff7e2..03cc015d58 100755 --- a/tools/batch/docker/docker_deploy.sh +++ b/tools/batch/docker/docker_deploy.sh @@ -14,7 +14,7 @@ if [ $TYPE == cpu ] || [ $TYPE == CPU ]; then elif [ $TYPE == gpu ] || [ $TYPE == GPU ]; then docker build -f Dockerfile.gpu -t gluon-cv-1:latest . docker tag gluon-cv-1:latest $AWS_ECR_REPO:latest - docker push $AWS_ECR_REPO1:latest + docker push $AWS_ECR_REPO:latest else echo "Invalid type detected. Choices: cpu, gpu" exit 1 diff --git a/tools/batch/submit-job.py b/tools/batch/submit-job.py index 2d0f698223..fcf59179f9 100644 --- a/tools/batch/submit-job.py +++ b/tools/batch/submit-job.py @@ -27,7 +27,7 @@ parser.add_argument('--saved-output', help='output to be saved, relative to working directory. ' 'it can be either a single file or a directory', - type=str, default='.') + type=str, default='None') parser.add_argument('--save-path', help='s3 path where files are saved.', type=str, default='batch/temp/{}'.format(datetime.now().isoformat())) @@ -131,10 +131,11 @@ def main(): describeJobsResponse = batch.describe_jobs(jobs=[jobId]) status = describeJobsResponse['jobs'][0]['status'] if status == 'SUCCEEDED' or status == 'FAILED': - print('=' * 80) - print('Job [{} - {}] {}'.format(jobName, jobId, status)) + print('Output [{}]:\n {}'.format(logStreamName, '=' * 80)) if logStreamName: startTime = printLogs(logGroupName, logStreamName, startTime) + 1 + print('=' * 80) + print('Job [{} - {}] {}'.format(jobName, jobId, status)) sys.exit(status == 'FAILED') elif status == 'RUNNING': @@ -142,10 +143,10 @@ def main(): if not running: running = True print('\rJob [{}, {}] is RUNNING.'.format(jobName, jobId)) - if logStreamName: - print('Output [{}]:\n {}'.format(logStreamName, '=' * 80)) - if logStreamName: - startTime = printLogs(logGroupName, logStreamName, startTime) + 1 + # if logStreamName: + # if logStreamName: + # startTime = 
printLogs(logGroupName, logStreamName, startTime) + 1 + print('\rJob [{}, {}] is still RUNNING.'.format(jobName, jobId)) elif status not in status_set: status_set.add(status) print('\rJob [%s - %s] is %-9s... %s' % (jobName, jobId, status, spin[spinner % len(spin)]),)