R Tasks for Azure DevOps

Overview

This repo contains two Azure DevOps tasks for R projects. These tasks are intended to work on any build agent.

  • InstallRPackages - Install a specified list of R packages from CRAN
  • RunRScript - Run a specified R script with support for command-line arguments

Build Status

InstallRPackages

Build Status

RunRScript

Build Status

Azure DevOps Marketplace Extensions

I. CI/CD Setup for Task Development

All tasks in this repo share two general pipeline templates during development: Build and Release.

Build

Build is a short, single-stage pipeline template primarily concerned with installing dependencies, building the TypeScript task files, and running the associated unit tests. It's intended to be run frequently during development, for example whenever a relevant change is pushed to a feature branch.

Release

Release is a longer pipeline template with three stages: Build, Dev, and Prod.

  • Build Stage - largely the same as the Build pipeline; this stage reuses the same templated steps and adds a step that packages the task for use in the next two stages.
  • Dev Stage - using the packaged task from the Build stage, this stage privately uploads the task to an AzDO org and then validates it in a more realistic scenario. It does this by invoking the new task directly, as a user would, and then running a step that checks it works as expected (e.g. using a custom script). The specific validation steps vary depending on which task is in the current context.
  • Prod Stage - assuming the previous stages have both succeeded and the pipeline is running on the master branch, this stage publishes the task to the Visual Studio Marketplace (currently configured as a private publish).

Task Context: Triggering Pipelines at the Task-level

A problem encountered early on was finding a good way to run only the pipeline steps relevant to the task being actively worked on. For instance, making a code change to one task ideally should not also trigger a pipeline run for all other tasks in the repository.

The chosen solution to this was extracting as much as possible into two generalized pipeline templates (Build and Release). These pipeline templates don't know anything about the current task and only use variables to do all their work.

Each individual task has two short YAML files in ./pipelines that define these variables and specify pipeline triggers. One references the Build pipeline template and the other references the Release pipeline template. This allows each task to define their own triggering rules while still reusing virtually all of the same pipeline logic for only their own relevant files.
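The effect of per-task path triggers can be modeled with a short sketch. This is only an illustration of the idea, not how Azure DevOps evaluates path filters: it assumes a simple directory-prefix match, and the pipeline names and the RPackagesTask directory are hypothetical.

```typescript
// Hypothetical model of per-task path triggers (illustrative names only).
interface TaskTrigger {
  pipeline: string;   // name of the task's entry-point pipeline
  pathPrefix: string; // directory containing the task's vss-extension.json
}

// Return the pipelines whose task directory contains at least one changed file.
function pipelinesToRun(changedFiles: string[], triggers: TaskTrigger[]): string[] {
  return triggers
    .filter(t => {
      const prefix = t.pathPrefix + "/";
      // Simplified prefix match; real path filters support richer patterns.
      return changedFiles.some(f => f.slice(0, prefix.length) === prefix);
    })
    .map(t => t.pipeline);
}

const triggers: TaskTrigger[] = [
  { pipeline: "run-r-script-build", pathPrefix: "RScriptTask" },
  { pipeline: "install-r-packages-build", pathPrefix: "RPackagesTask" }, // assumed dir name
];

// A change inside one task's directory selects only that task's pipeline.
const selected = pipelinesToRun(["RScriptTask/RunRScript/index.ts"], triggers);
console.log(selected);
```

Under this model, a change to a file outside every task directory (for example a top-level README) selects no task pipelines at all.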

For examples of this, take a look at the two pipeline files for RunRScript in ./pipelines.

II. Task Development

Adding a new task

Prerequisites

A note about Node versions in developing extensions for Azure DevOps:

The production Environment only uses Node10 or Node6 (by using the Node in the execution object instead of Node10).

This is quoted from the Microsoft docs on developing an Azure Pipelines task. For this reason, the tasks in this repo target Node v10.

Folder structure

In this repo, each task is its own Azure DevOps extension. The following is the general folder structure:

|-- extensionDir/ # e.g. RScriptTask
  |-- README.md
  |-- taskLogo.png
  |-- vss-extension.json
  |-- taskDir/ # e.g. RunRScript
    |-- index.ts
    |-- task.json
    |-- package.json
    |-- package-lock.json
    |-- tsconfig.json
    |-- tests/
      |-- _suite.ts

A more in-depth look at each file's role:

  • README.md - what will be displayed in the extension's marketplace page when published

  • taskLogo.png - the icon to be used for the extension in the VS marketplace

  • vss-extension.json - sometimes referred to as the extension's manifest file: defines metadata for the extension such as icons, the markdown file to use (README.md), and the task's src directory (taskDir/)

    See example manifest JSON

    In this example, our task will be called PlaceholderTask, so that name takes the place of taskDir.

    {
      "manifestVersion": 1,
      "id": "placeholder-task",
      "name": "Placeholder Task",
      "version": "0.0.1",
      "publisher": "placeholder-publisher",
      "targets": [
        {
          "id": "Microsoft.VisualStudio.Services"
        }
      ],  
      "description": "Brief task description text",
      "categories": [
        "Azure Pipelines"
      ],
      "icons": {
        "default": "taskLogo.png"
      },
      "files": [
        {
          "path": "PlaceholderTask"
        }
      ],
      "content": {
        "details": {
          "path": "README.md"
        }
      },
      "contributions": [
        {
          "id": "placeholder-task-id", // this can be anything
          "type": "ms.vss-distributed-task.task",
          "targets": [
            "ms.vss-distributed-task.tasks"
          ],
          "properties": {
            "name": "PlaceholderTask"
          }
        }
      ]
    }
  • index.ts - the entry point for the task program (runs as a Node app)

  • task.json - supplementing vss-extension.json, this describes task-specific metadata such as name, ID, author, input definitions, and execution information (e.g. Node10)

    See example task JSON
    {
      "$schema": "https://raw.githubusercontent.com/Microsoft/azure-pipelines-task-lib/master/tasks.schema.json",
      "id": "{{ GUID }}",
      "name": "PlaceholderTask",
      "friendlyName": "Placeholder Task",
      "description": "Placeholder task for demonstration!",
      "helpMarkDown": "",
      "category": "Utility",
      "author": "[email protected]",
      "version": {
        "Major": 0,
        "Minor": 0,
        "Patch": 1
      },
      "instanceNameFormat": "{{ text to be displayed in the pipeline step list }}",
      "inputs": [
        {
          "name": "stringInput1",
          "type": "string",
          "label": "Simple string input",
          "defaultValue": "",
          "required": true,
          "helpMarkDown": "Required string input named stringInput1 with no default value"
        }
      ],
      "execution": {
        "Node10": {
          "target": "index.js"
        }
      }
    }
  • package.json - tracks the task's Node dependencies as well as custom npm scripts

    See example package JSON
    {
      "name": "placeholdertask",
      "version": "1.0.0",
      "description": "",
      "main": "index.js",
      "scripts": {
        "compile": "tsc",
        "test": "mocha tests/_suite.js",
        "upload": "tfx build tasks upload --task-path ./ --overwrite",
        "delete": "tfx build tasks delete --task-id {{ GUID from task.json }}"
      },
      "author": "",
      "license": "ISC",
      "dependencies": {
        "azure-pipelines-task-lib": "^2.9.5"
      },
      "devDependencies": {
        "@types/mocha": "^7.0.2",
        "@types/node": "^14.0.6",
        "@types/q": "^1.5.4",
        "mocha": "^7.2.0",
        "sync-request": "^6.1.0",
        "typescript": "^3.9.3"
      }
    }
  • package-lock.json - describes the exact dependency tree that was generated

  • tsconfig.json - specifies root files and compiler options required to compile the task scripts to JS

  • _suite.ts - defines the unit tests for the task (uses the Mocha test framework)
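To make the role of index.ts more concrete, here is a minimal, dependency-free sketch of a task entry point that reads the stringInput1 input from the example task.json and reports a result. The TaskHost interface and fakeHost helper are hypothetical stand-ins; in a real task these calls would go through the azure-pipelines-task-lib package rather than this interface.

```typescript
// Hypothetical stand-in for the task-lib calls a task typically needs.
// A real task would use azure-pipelines-task-lib instead of this interface.
interface TaskHost {
  getInput(name: string, required: boolean): string | undefined;
  setResult(succeeded: boolean, message: string): void;
}

// Entry-point logic: read the input, do the work, report success or failure.
function run(host: TaskHost): void {
  try {
    const value = host.getInput("stringInput1", true);
    if (!value) {
      host.setResult(false, "stringInput1 is required");
      return;
    }
    host.setResult(true, `Received input: ${value}`);
  } catch (err) {
    // Covers hosts that throw when a required input is missing.
    host.setResult(false, err instanceof Error ? err.message : String(err));
  }
}

// In-memory host for trying the logic without a build agent.
function fakeHost(inputs: Record<string, string>) {
  const results: { succeeded: boolean; message: string }[] = [];
  const host: TaskHost = {
    getInput: (name) => inputs[name],
    setResult: (succeeded, message) => { results.push({ succeeded, message }); },
  };
  return { host, results };
}

const demo = fakeHost({ stringInput1: "hello world" });
run(demo.host);
console.log(demo.results);
```

Keeping the logic behind a small interface like this is one possible design choice; it lets the same function be exercised from a Mocha suite without a build agent.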

Unit tests

The tasks in this repo use the Mocha testing framework for unit tests. To run the tests for any given task, run the following from the task's source directory (containing package.json):

npm run compile # compile to JS files
npm run test # run the Mocha test suite

CI/CD

As mentioned in the previous CI/CD setup section, each new task adds two new YAML files under /pipelines defining respective variables and pipeline triggers: one for the build pipeline and one for the release pipeline.

Add entry point YAML files

placeholder-task-build.yml example

trigger:
  branches:
    include:
    - dev/*
  paths:
    include:
    - PlaceholderTaskDir/* # dir containing vss-extension.json

variables:
- group: PlaceholderTask # variable group to be created

jobs:
- template: templates/build.yml

placeholder-task-release.yml example

trigger:
  branches:
    include:
    - master
  paths:
    include:
    - PlaceholderTaskDir/* # dir containing vss-extension.json

variables:
- group: PlaceholderTask # variable group to be created
- group: Marketplace # variable group containing publishing info

stages:
- template: templates/release.yml

Create variable group

As referenced in the above YAML examples, each new task has its own variable group defined in the Azure DevOps UI. This page can be found using the blade on the left-hand side under Pipelines -> Library.

The new variable group should have the following four variables defined:

  • taskID - the id value from the task's vss-extension.json
    • e.g. placeholder-task
  • taskName - the directory name containing the task's task.json
    • e.g. PlaceholderTask
  • taskRoot - the directory name containing the task's vss-extension.json
    • e.g. PlaceholderTaskDir
  • taskSrc - the value of taskRoot/taskName
    • e.g. PlaceholderTaskDir/PlaceholderTask
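The relationship between the four variables can be sketched as follows. The variable group itself is created in the Azure DevOps UI; this helper (taskVariables is a hypothetical name) only shows how taskSrc follows from the other values.

```typescript
// The four variables expected in a task's variable group.
interface TaskVariables {
  taskID: string;   // id from vss-extension.json
  taskName: string; // directory containing task.json
  taskRoot: string; // directory containing vss-extension.json
  taskSrc: string;  // taskRoot/taskName
}

// Hypothetical helper: derive taskSrc so it cannot drift from the other values.
function taskVariables(taskID: string, taskRoot: string, taskName: string): TaskVariables {
  return { taskID, taskName, taskRoot, taskSrc: `${taskRoot}/${taskName}` };
}

const vars = taskVariables("placeholder-task", "PlaceholderTaskDir", "PlaceholderTask");
console.log(vars);
```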

The shared Marketplace variable group should be referenced in all release pipelines. It can be edited in the Azure DevOps UI and contains:

  • publisherID - publisher ID to use during private publish to Visual Studio Marketplace
  • sharedList - comma-separated list of ADO organizations to share task with after publish
  • serviceUrl - URL value in the format https://dev.azure.com/{{org}}
  • accessToken - PAT with sufficient permissions for uploading an extension to the organization

Automated versioning

Currently, the release pipeline automatically increments a task's patch version each time it publishes to the marketplace, provided a published version already exists. If the task is being published for the first time, the pipeline uses whatever version is specified in the manifest file, vss-extension.json.

To change a task's major or minor version, update it manually in the task's manifest file; the automated patch updates should pick up the new value and reset.
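The versioning rules above could be expressed roughly as follows. This is a sketch of the described behavior, not the pipeline's actual implementation; in particular, it assumes that "reset" means publishing the manifest version as-is whenever its major or minor version is ahead of the published one. The function names are hypothetical.

```typescript
type Version = [number, number, number];

// Parse "major.minor.patch" into numbers, rejecting anything else.
function parseVersion(text: string): Version {
  const parts = text.split(".");
  if (parts.length !== 3 || parts.some(p => !/^\d+$/.test(p))) {
    throw new Error(`Invalid version: ${text}`);
  }
  return [Number(parts[0]), Number(parts[1]), Number(parts[2])];
}

// Decide which version to publish next.
function nextVersion(manifest: string, published?: string): string {
  const [mMajor, mMinor, mPatch] = parseVersion(manifest);
  const manifestVersion = `${mMajor}.${mMinor}.${mPatch}`;
  // First publish: nothing to increment, so use the manifest version.
  if (published === undefined) return manifestVersion;

  const [pMajor, pMinor, pPatch] = parseVersion(published);
  // Assumption: a manual major/minor bump in the manifest wins and resets the patch.
  const manifestIsAhead = mMajor > pMajor || (mMajor === pMajor && mMinor > pMinor);
  if (manifestIsAhead) return manifestVersion;

  // Otherwise bump the patch of the currently published version.
  return `${pMajor}.${pMinor}.${pPatch + 1}`;
}

console.log(nextVersion("0.0.1"));          // first publish
console.log(nextVersion("0.0.1", "0.0.7")); // automated patch bump
console.log(nextVersion("0.1.0", "0.0.7")); // manual minor bump in the manifest
```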

Task validation / Integration testing

In addition to unit tests, the Dev stage of the release pipeline runs the validate-steps.yml template as a form of integration testing. These steps are intended to invoke the updated task as a user would in YAML:

- task: PlaceholderTask@0
  displayName: Run Placeholder Task
  condition: eq(variables['taskName'], 'PlaceholderTask')
  inputs:
    stringInput1: hello world

These steps run only when Release is running for the matching task, as determined by condition. Depending on the task, verifying that it runs without error might be enough validation. Otherwise, further validation steps for that task can be added using the same condition.

If any of these steps fail, the Dev stage will also fail and the Prod stage will not run.

Publishing

This step will only run if running from the master branch (i.e. after merging a PR or after triggering manually). Assuming all the above variables have been set, no further changes should be required for publishing.

The pipeline is currently configured to publish privately and share with select organizations defined in the Marketplace variable group.

Note: The first time a new task is published, the release pipeline will appear to fail even though it has actually succeeded. This is caused by the step that queries for the currently published version of the task, which does not exist yet. That step "succeeds with issues", and the pipeline then shows a red fail icon even though the task was published to the marketplace successfully.

III. Refactor / Todo

R Installation Cleanup

While waiting on R to be installed on the Microsoft-hosted build agents, the Dev stage of the release pipeline has been referencing a PowerShell script to manually install R on each platform: scripts/install_r.ps1

In addition to this, the pipeline uses an added step to manually set the R_LIBS_USER environment variable to a new directory. This environment variable is set by default by R and defines where to install packages to. However, when installing R manually, it's set to a directory the agent doesn't have write permissions to (depending on platform). This issue is mitigated by manually setting this environment variable to a directory the agent has full write access to.

Both of these steps are used in pipelines/templates/validate-steps.yml and can be removed once R is installed and configured on all Microsoft-hosted agent images.

InstallRPackages - Installing packages from sources other than CRAN

Prioritizing core functionality first, the task currently only supports installing packages from CRAN. However, there are numerous packages available in other locations such as GitHub, Bioconductor, etc. One approach for addressing this is to use the devtools R package to install packages from many different sources. Example usages:

devtools::install_github() # installing from GitHub
devtools::install_bioc() # installing from Bioconductor

This can be addressed in a future version of this task or done by using the RunRScript task.
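As a rough illustration of how a future version of the task might support several sources, the sketch below builds the R expression to run for each package. The PackageSpec type, the function names, and the choice of CRAN mirror are assumptions for this example and are not part of the current task.

```typescript
// Hypothetical package description; the current task only handles CRAN.
type PackageSource = "cran" | "github" | "bioc";

interface PackageSpec {
  source: PackageSource;
  name: string; // e.g. "dplyr" for CRAN or "owner/repo" for GitHub
}

// Quote a value as an R string literal.
function rString(value: string): string {
  return '"' + value.replace(/\\/g, "\\\\").replace(/"/g, '\\"') + '"';
}

// Build the R expression that installs one package from its source.
function installExpression(pkg: PackageSpec): string {
  switch (pkg.source) {
    case "cran":
      // The mirror URL is an assumption; any CRAN mirror would do.
      return `install.packages(${rString(pkg.name)}, repos = "https://cloud.r-project.org")`;
    case "github":
      return `devtools::install_github(${rString(pkg.name)})`;
    case "bioc":
      return `devtools::install_bioc(${rString(pkg.name)})`;
  }
}

const expressions = [
  { source: "cran", name: "dplyr" },
  { source: "github", name: "tidyverse/dplyr" },
] as PackageSpec[];

expressions.forEach(pkg => console.log(installExpression(pkg)));
```

Each resulting string could then be passed to Rscript -e, or written into a script that is run with the RunRScript task.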

Release Pipeline (Validation) - Referencing a task before it has been uploaded for the first time

In the Dev stage of the Release pipeline, a task is uploaded directly to an AzDO org to be validated. As part of this validation, the task is referenced in YAML as a user would use it:

- task: RunRScript@0
  displayName: Run R Script
  inputs:
    scriptPath: # path to R script
    arguments: # string of script args

The issue here is that the pipeline will not be able to run if this step exists before the task has been uploaded for the first time. The pipeline will raise an error regarding referencing a task that does not exist.

The workaround for this has been to manually trigger a Release pipeline run without the validation steps present for that specific task (validate-steps.yml). A first-time run of the Dev stage causes the pipeline to recognize the task moving forward. In order to prevent a full publish in this scenario, there is a check in the Prod stage ensuring that it will only publish to marketplace if it's running from the master branch. This allows for safe manual runs of the Release pipeline in feature branches.

After the new task has been uploaded to the AzDO org for the first time, those validation steps can be added back for all subsequent runs.

Access Tokens / Organization Permissions

I tried using my own access token with the appropriate scopes set, but when uploading a task directly to an organization for testing I kept seeing this error:

You cannot install this extension because it includes a build task, and you do not have sufficient permissions.
To proceed, you must be an administrator of All Pools.

Even when added as an administrator to all pools and as a project administrator, I still saw the error.

My current workaround is to use an access token created by someone else who already has the correct permissions. This seems to be a common issue, as can be seen in this GitHub issue thread for the tfs-cli. The accepted resolution appears to be getting added to the Project Collection Administrators group.

IV. Credit

The R logo has been sourced from the R Project website and is being used under the CC-BY-SA 4.0 license.


License

This project uses MIT licensing. All contributions in any form shall have the Open Source Community's best interest in mind. Please feel free to use, fork and contribute back to the community.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
