Run data caterer in background, allow for option of running data generation first then application
pflooky committed Jul 1, 2024
1 parent 5b270da commit 875c874
Showing 14 changed files with 449 additions and 326 deletions.
276 changes: 55 additions & 221 deletions README.md
@@ -1,234 +1,62 @@
# Create a JavaScript Action
# Data Caterer Action

[![GitHub Super-Linter](https://github.com/actions/javascript-action/actions/workflows/linter.yml/badge.svg)](https://github.com/super-linter/super-linter)
![CI](https://github.com/actions/javascript-action/actions/workflows/ci.yml/badge.svg)

Use this template to bootstrap the creation of a JavaScript action. :rocket:
Automated integration tests for any application or job.

This template includes compilation support, tests, a validation workflow,
publishing, and versioning guidance.
- Spin up any external services
- Generate production-like data
- Run data validations to ensure application or job works as expected

If you are new, there's also a simpler introduction in the
[Hello world JavaScript action repository](https://github.com/actions/hello-world-javascript-action).

## Create Your Own Action

To create your own action, you can use this repository as a template! Just
follow the below instructions:

1. Click the **Use this template** button at the top of the repository
1. Select **Create a new repository**
1. Select an owner and name for your new repository
1. Click **Create repository**
1. Clone your new repository

> [!IMPORTANT]
>
> Make sure to remove or update the [`CODEOWNERS`](./CODEOWNERS) file! For
> details on how to use this file, see
> [About code owners](https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners).
## Initial Setup

After you've cloned the repository to your local machine or codespace, you'll
need to perform some initial setup steps before you can develop your action.

> [!NOTE]
>
> You'll need to have a reasonably modern version of
> [Node.js](https://nodejs.org) handy. If you are using a version manager like
> [`nodenv`](https://github.com/nodenv/nodenv) or
> [`nvm`](https://github.com/nvm-sh/nvm), you can run `nodenv install` in the
> root of your repository to install the version specified in
> [`package.json`](./package.json). Otherwise, 20.x or later should work!
1. :hammer_and_wrench: Install the dependencies

```bash
npm install
```

1. :building_construction: Package the JavaScript for distribution

```bash
npm run bundle
```

1. :white_check_mark: Run the tests

```bash
$ npm test
PASS ./index.xtest.js
  ✓ throws invalid number (3ms)
  ✓ wait 500 ms (504ms)
  ✓ test runs (95ms)
...
```

## Update the Action Metadata

The [`action.yml`](action.yml) file defines metadata about your action, such as
input(s) and output(s). For details about this file, see
[Metadata syntax for GitHub Actions](https://docs.github.com/en/actions/creating-actions/metadata-syntax-for-github-actions).

When you copy this repository, update `action.yml` with the name, description,
inputs, and outputs for your action.

## Update the Action Code

The [`src/`](./src/) directory is the heart of your action! This contains the
source code that will be run when your action is invoked. You can replace the
contents of this directory with your own code.

There are a few things to keep in mind when writing your action code:

- Most GitHub Actions toolkit and CI/CD operations are processed asynchronously.
In `main.js`, you will see that the action is run in an `async` function.

```javascript
const core = require('@actions/core')
//...
async function run() {
  try {
    //...
  } catch (error) {
    core.setFailed(error.message)
  }
}
```

For more information about the GitHub Actions toolkit, see the
[documentation](https://github.com/actions/toolkit/blob/master/README.md).

So, what are you waiting for? Go ahead and start customizing your action!

1. Create a new branch

```bash
git checkout -b releases/v1
```

1. Replace the contents of `src/` with your action code
1. Add tests to `__tests__/` for your source code
1. Format, test, and build the action

```bash
npm run all
```

> [!WARNING]
>
> This step is important! It will run [`ncc`](https://github.com/vercel/ncc)
> to build the final JavaScript action code with all dependencies included.
> If you do not run this step, your action will not work correctly when it is
> used in a workflow. This step also includes the `--license` option for
> `ncc`, which will create a license file for all of the production node
> modules used in your project.

1. Commit your changes
## Usage

```bash
git add .
git commit -m "My first action is ready!"
1. Create YAML file `.github/workflows/integration-test.yaml`
```yaml
name: Integration Test
on:
  push:
    branches:
      - "*"
jobs:
  integration-test:
    name: Integration Test via Data Caterer
    runs-on: ubuntu-latest
    steps:
      - name: Run integration tests
        uses: data-catering/data-caterer-action@v1
```

1. Push them to your repository

```bash
git push -u origin releases/v1
1. Create YAML file `data-caterer.yaml`
1. For the latest supported services,
[check here](https://github.com/data-catering/insta-infra?tab=readme-ov-file#services).
Supported services include: activemq, airflow, cassandra, clickhouse,
cockroachdb, dagster, data-caterer, debezium, doris, druid, duckdb,
elasticsearch, flight-sql, flink, httpbin, kafka, keycloak, mage-ai,
mariadb, marquez, minio, mongodb, mysql, neo4j, pinot, postgres, prefect,
presto, rabbitmq, solace, spanner, sqlite, temporal, trino, unitycatalog,
zookeeper
```yaml
services: []
run:
  - command: ./my-app/run-app.sh
    test:
      generation:
        parquet:
          - options:
              path: /tmp/parquet/accounts
            schema:
              fields:
                - name: account_id
      validation:
        parquet:
          - options:
              path: /tmp/parquet/accounts
            validations:
              - expr: ISNOTNULL(account_id)
              - aggType: count
                aggExpr: count == 1000
```
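
To make the intent of the two validations above concrete, here is a minimal sketch in plain JavaScript. It is not how Data Caterer evaluates them; `checkAccounts` is a hypothetical helper that only mirrors what `ISNOTNULL(account_id)` and `count == 1000` are meant to assert over the generated records.

```javascript
// Hypothetical helper, not part of Data Caterer: mirrors the intent of the
// two validations in data-caterer.yaml using in-memory records.
function checkAccounts(records, expectedCount) {
  // Equivalent in spirit to `expr: ISNOTNULL(account_id)` applied to every record.
  const allHaveAccountId = records.every(
    (r) => r.account_id !== null && r.account_id !== undefined
  )
  // Equivalent in spirit to `aggType: count` with `aggExpr: count == 1000`.
  const countMatches = records.length === expectedCount
  return { allHaveAccountId, countMatches }
}

// Example input: 1000 records, each with a non-null account_id.
const records = Array.from({ length: 1000 }, (_, i) => ({
  account_id: `ACC${i}`
}))
console.log(checkAccounts(records, 1000))
```

Both checks pass for this sample; dropping a record or setting one `account_id` to `null` would fail the corresponding check.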

1. Create a pull request and get feedback on your action
1. Merge the pull request into the `main` branch

Your action is now published! :rocket:

For information about versioning your action, see
[Versioning](https://github.com/actions/toolkit/blob/master/docs/action-versioning.md)
in the GitHub Actions toolkit.

## Validate the Action

You can now validate the action by referencing it in a workflow file. For
example, [`ci.yml`](./.github/workflows/ci.yml) demonstrates how to reference an
action in the same repository.

```yaml
steps:
  - name: Checkout
    id: checkout
    uses: actions/checkout@v3
  - name: Test Local Action
    id: test-action
    uses: ./
    with:
      milliseconds: 1000
  - name: Print Output
    id: output
    run: echo "${{ steps.test-action.outputs.time }}"
```

For example workflow runs, check out the
[Actions tab](https://github.com/actions/javascript-action/actions)! :rocket:

## Usage

After testing, you can create version tag(s) that developers can use to
reference different stable versions of your action. For more information, see
[Versioning](https://github.com/actions/toolkit/blob/master/docs/action-versioning.md)
in the GitHub Actions toolkit.

To include the action in a workflow in another repository, you can use the
`uses` syntax with the `@` symbol to reference a specific branch, tag, or commit
hash.

```yaml
steps:
  - name: Checkout
    id: checkout
    uses: actions/checkout@v4
  - name: Run my Action
    id: run-action
    uses: actions/javascript-action@v1 # Commit with the `v1` tag
    with:
      milliseconds: 1000
  - name: Print Output
    id: output
    run: echo "${{ steps.run-action.outputs.time }}"
```

## Automatic Integration Tests via GitHub Actions

1. Create pull request
1. Triggers GitHub action
1. Sets up services via insta-infra
1. Data generated via data-caterer
1. Results published to pull request

## Setup

- Create new GitHub application
- Follows similar flow to renovatebot
- Creates initial pull request with template YAML file and GitHub action YAML
- Can try to detect services used (e.g. check `src/test/resources`)
- Can try to detect via dependencies (e.g. `build.gradle`, `pom.xml`,
`requirements.txt`)
- Users can manually add the services their app interacts with (e.g.
Postgres, MySQL)
- The target repository could be one or more apps/jobs
- Need to know how to run their app/job
- Use docker image by default, otherwise build and start script could be
provided by user
- Use default credentials of each service
- Ability to define custom data generation/validation with YAML

### Example Flow

GitHub Action YAML
@@ -273,3 +101,9 @@ options: #additional options
  keepAlive: true #could allow services to be kept alive after running
  deleteData: false #retain data for further investigation/debugging/testing
```

## Test Command

```shell
CONFIGURATION_FILE=example/postgres-to-csv.yaml INSTA_INFRA_FOLDER=../insta-infra BASE_FOLDER=/tmp/data-caterer-action node src/index.js
```
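
The command above passes its settings through three environment variables. As a rough sketch of how a Node script could pick them up, assuming all three are required (the source does not say whether any have defaults), a hypothetical `readSettings` helper might look like this. It is an illustration only, not the actual contents of `src/index.js`.

```javascript
// Hypothetical sketch: reads the three environment variables used in the
// test command. Treating all of them as required is an assumption.
function readSettings(env) {
  const required = ['CONFIGURATION_FILE', 'INSTA_INFRA_FOLDER', 'BASE_FOLDER']
  const missing = required.filter((name) => !env[name])
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`)
  }
  return {
    configurationFile: env.CONFIGURATION_FILE,
    instaInfraFolder: env.INSTA_INFRA_FOLDER,
    baseFolder: env.BASE_FOLDER
  }
}

// Example using the same values as the test command; a real script would
// pass process.env instead of a literal object.
const settings = readSettings({
  CONFIGURATION_FILE: 'example/postgres-to-csv.yaml',
  INSTA_INFRA_FOLDER: '../insta-infra',
  BASE_FOLDER: '/tmp/data-caterer-action'
})
console.log(settings.configurationFile)
```

Taking `env` as a parameter rather than reading `process.env` directly keeps the helper easy to exercise in tests.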
4 changes: 2 additions & 2 deletions action.yml
@@ -30,12 +30,12 @@ runs:
uses: actions/checkout@v4
with:
repository: data-catering/insta-infra
path: data-caterer-integration-test/insta-infra
path: ${{ inputs.insta_infra_folder }}
- name: Run integration tests
uses: actions/github-script@v6
with:
script: |
const { script } = require("${{github.action_path}}/dist/index.js")
const { script } = require("${{ github.action_path }}/dist/index.js")
script()
env:
CONFIGURATION_FILE: ${{ inputs.configuration_file }}
2 changes: 1 addition & 1 deletion badges/coverage.svg