refactor(operator): feature discovery
refs akash-network/support#141

Signed-off-by: Artur Troian <[email protected]>
troian committed Jan 25, 2024
1 parent af85c28 commit c127da6
Showing 82 changed files with 6,385 additions and 2,202 deletions.
100 changes: 88 additions & 12 deletions .github/workflows/integration-tests.yaml
@@ -16,47 +16,123 @@ jobs:
KIND_NAME: kube
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Setup GOPATH
run: echo "GOPATH=$GITHUB_WORKSPACE/go" >> $GITHUB_ENV
- name: Ensure GOPATH dirs
run: mkdir -p ${{ env.GOPATH }}/{src,bin,pkg}
- name: Checkout codebase
uses: actions/checkout@v4
with:
fetch-depth: 0
set-safe-directory: true
path: go/src/github.com/akash-network/provider
- name: Detect required Go version
working-directory: "${{ env.GOPATH }}/src/github.com/akash-network/provider"
run: |
toolchain=$(./script/tools.sh gotoolchain | sed 's/go*//')
echo "GOVERSION=${toolchain}" >> $GITHUB_ENV
- uses: actions/setup-go@v4
with:
go-version: "${{ env.GOVERSION }}"
- name: Setup direnv
uses: HatsuneMiku3939/direnv-action@v1
- name: Install tools
run: |
curl -sfL https://direnv.net/install.sh | bash
go install github.com/fullstorydev/grpcurl/cmd/grpcurl@latest
- name: Hook direnv to bash
run: echo 'eval "$(direnv hook bash)"' >> $HOME/.bashrc
- name: Direnv allow
run: direnv allow ${{ env.GOPATH }}/src/github.com/akash-network/provider
- name: Checkout akash-api
working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/provider
run: |
AKASH_API_VERSION=$(go list -mod=readonly -m -f '{{ .Version }}' github.com/akash-network/akash-api)
echo "AKASH_API_VERSION=$AKASH_API_VERSION" >> "$GITHUB_ENV"
- uses: actions/checkout@v4
with:
repository: "akash-network/akash-api"
fetch-depth: 0
set-safe-directory: true
path: go/src/github.com/akash-network/akash-api
ref: ${{ env.AKASH_API_VERSION }}
- name: Setup akash-api
run: |
direnv allow ${{ env.GOPATH }}/src/github.com/akash-network/akash-api
cd ${{ env.GOPATH }}/src/github.com/akash-network/akash-api
cat "$GITHUB_ENV" > "$GITHUB_WORKSPACE/env.backup"
direnv export gha >> "$GITHUB_ENV"
- name: Add akash-api to go.work
working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/akash-api
run: |
make modvendor
printf "use (\n\t.\n)\n" > ../provider/go.work
printf "\ngo ${{ env.GOVERSION }}\n" >> ../provider/go.work
printf "\nreplace (\n\tgithub.com/akash-network/akash-api => ../akash-api\n)\n" >> ../provider/go.work
cat "$GITHUB_WORKSPACE/env.backup" > "$GITHUB_ENV"
- name: Setup direnv for provider
run: direnv export gha >> "$GITHUB_ENV"
working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/provider
- name: Fetch kind version from go modules
working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/provider
run: echo "KIND_VERSION=$(go list -mod=readonly -m -f '{{ .Version }}' sigs.k8s.io/kind)" >> $GITHUB_ENV
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
- name: Go mod tidy
working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/provider
run: go mod tidy
- name: Make node scripts executable
working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/provider
run: make chmod-akash-scripts
- uses: helm/kind-action@v1
- name: Setup kind
uses: helm/kind-action@v1
with:
version: "${{ env.KIND_VERSION }}"
node_image: "kindest/node:${{ env.KINDEST_VERSION }}"
cluster_name: "${{ env.KIND_NAME }}"
config: ./_run/kube/kind-config.yaml
config: ${{ env.GOPATH }}/src/github.com/akash-network/provider/_run/kube/kind-config.yaml
- name: Configure Kind cluster
run: KUSTOMIZE_INSTALLS=akash-operator-inventory make -s -C _run/kube kube-cluster-setup-e2e-ci
- name: k8s-ingress
run: make -s -C _run/kube kind-k8s-ip
- name: Kube Environment
working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/provider
run: |
KUSTOMIZE_INSTALLS=akash-operator-inventory make -s -C _run/kube kube-cluster-setup-e2e-ci
- name: Setup K8S ingress
working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/provider
run: |
make -s -C _run/kube kind-k8s-ip
- name: K8S dump config
working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/provider
run: |
kubectl config view
- name: K8S dump cluster info
working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/provider
run: |
kubectl cluster-info
- name: K8S dump running pods
working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/provider
run: |
kubectl get pods,ingress,svc -A
- name: K8S wait for operator inventory
working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/provider
run: |
make -s -C _run/kube kube-deployment-rollout-operator-inventory
- name: K8S test operator inventory GRPC ports
working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/provider
run: |
__pod=$(kubectl -n akash-services get pods -l app.kubernetes.io/part-of=provider -l app.kubernetes.io/component=operator -l app.kubernetes.io/instance=inventory-node --no-headers -o custom-columns=":metadata.name")
kubectl -n akash-services port-forward --address 0.0.0.0 pod/$__pod 8444:grpc &
kubectl -n akash-services port-forward --address 0.0.0.0 service/operator-inventory 8445:grpc &
./script/inventory-grpc-probe.sh --host=localhost:8444 --mode=plaintext akash.inventory.v1.NodeRPC/QueryNode
./script/inventory-grpc-probe.sh --host=localhost:8445 --mode=plaintext akash.inventory.v1.ClusterRPC/QueryCluster
kubectl -n akash-services logs service/operator-inventory
- name: Run E2E Tests
run: make test-e2e-integration
working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/provider
run: |
make test-e2e-integration
- name: Run K8s Tests
run: make test-k8s-integration
- name: Post-Run Kube Environment
working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/provider
run: |
make test-k8s-integration
- name: Post-Run K8S environment
working-directory: ${{ env.GOPATH }}/src/github.com/akash-network/provider
run: |
kubectl get ns,pods,ingress,svc -A
252 changes: 252 additions & 0 deletions _docs/development-environment.md
@@ -0,0 +1,252 @@
# Setting up the development environment

**Warning**
All links to the `provider` repo point to the `gpu` branch. Once `gpu` is merged into `main`, all links need to be updated.

This page covers setting up the development environment for both the [node](https://github.com/akash-network/node) and [provider](https://github.com/akash-network/provider) repositories.
The `provider` repo hosts all the scripts because it depends on the `node` repo.
If you already know what this guide is about, feel free to skip to the [examples](#how-to-use-runbook).

## Code

Check out the code, if you have not done so already, into a location of your choice.
For this example, the repositories will be located in `~/go/src/github.com/akash-network`.

The checkout below assumes `git` is set up to use an SSH connection to GitHub.

```shell
cd ~/go/src/github.com/akash-network # all commands below assume this as current directory
git clone git@github.com:akash-network/node
git clone git@github.com:akash-network/provider
```

## Requirements

- `Go` must be installed. Both projects keep up to date with the latest Go release on their development branches.
  Both repositories use the latest version of Go; however, only the minor version always has to match.
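
As a rough way to check the minor-version requirement, the sketch below compares two Go version strings by their `major.minor` part. The helper names `go_minor` and `check_go_minor` are hypothetical and not part of either repository.

```shell
# Extract "major.minor" from a Go version string such as "go1.21.6" or "1.21".
go_minor() {
    printf '%s\n' "$1" | sed -E 's/^go//; s/^([0-9]+\.[0-9]+).*/\1/'
}

# Compare the minor version of the local toolchain ($1) with the wanted one ($2).
# Prints a short report and returns non-zero on mismatch.
check_go_minor() {
    have=$(go_minor "$1")
    want=$(go_minor "$2")
    if [ "$have" = "$want" ]; then
        echo "ok: Go $have"
    else
        echo "mismatch: have Go $have, want Go $want"
        return 1
    fi
}

# Possible usage from inside a module directory (requires Go to be installed):
#   check_go_minor "$(go env GOVERSION)" "$(go list -m -f '{{ .GoVersion }}')"
```

The usage line assumes the module's `go` directive reflects the version the project expects; the repositories may pin the toolchain differently (the CI workflow, for instance, reads it from `./script/tools.sh gotoolchain`).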

### Install tools

Run the following script to install all system-wide tools.
Currently supported host platforms:

- macOS
- Debian-based OS

PRs adding other hosts are welcome (except Windows).

```shell
./provider/script/install_dev_dependencies.sh
```

## How it works

### General behaviour

All examples are located within the [_run](https://github.com/akash-network/provider/blob/gpu/_run) directory.
[Commands](#commands) are implemented as `make` targets.

There are three ways to set up the k8s cluster:

- kind
- minikube
- ssh

Both `kind` and `minikube` are end-to-end, i.e. the configuration is capable of spinning up the cluster on the local host, whereas `ssh` expects the cluster to be configured before use.

### Runbook

There are four configuration variants, each presented as a directory within [_run](https://github.com/akash-network/provider/blob/gpu/_run).

- `kube` - uses `kind` to set up a local cluster. It is widely used by the provider's e2e tests. The provider and the node run as host services. All operators run as Kubernetes deployments.
- `single` - uses `kind` to set up a local cluster. The main difference is that both the node and the provider (and all operators) run within the k8s cluster as deployments. (At some point we will merge `single` with `kube` and call it `kind`.)
- `minikube` - not in use for now.
- `ssh` - expects the cluster to be up and running. Mainly used to test sophisticated features like `GPU` or `IP leases`.

The only difference between the environments above is how they are set up. Once running, all commands are the same.

Running through the entire runbook requires multiple terminals.
Each command is marked __t1__-__t3__ to indicate a suggested terminal number.

If at any point something goes wrong and the cluster needs to be set up again from scratch:

```shell
cd _run/<kube|single|ssh>
make kube-cluster-delete
make clean
```
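
The reset above can be wrapped in a small helper. This is only a sketch: `reset_runbook` and the `MAKE_CMD` override are hypothetical names, and it assumes it is called from the root of the `provider` repo so that `make -C _run/<runbook>` resolves.

```shell
# Reset a runbook by deleting its cluster and cleaning local state.
# MAKE_CMD can be overridden (e.g. MAKE_CMD=echo) to preview the commands.
reset_runbook() {
    runbook=$1
    case "$runbook" in
        kube|single|ssh) ;;
        *)
            echo "unknown runbook: $runbook" >&2
            return 1
            ;;
    esac
    # Only clean up if the cluster was deleted successfully.
    "${MAKE_CMD:-make}" -C "_run/$runbook" kube-cluster-delete &&
        "${MAKE_CMD:-make}" -C "_run/$runbook" clean
}
```

Running `MAKE_CMD=echo reset_runbook kube` prints the two `make` invocations without executing them, which is a cheap way to confirm the paths before deleting anything.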

### Kustomize

TBD

#### Parameters

| Name                 | Default value                                                                  | Effective on target(s)                                                                                                                   | Notes |
|:---------------------|:------------------------------------------------------------------------------:|------------------------------------------------------------------------------------------------------------------------------------------|-------|
| `SKIP_BUILD`         | `false`                                                                        |                                                                                                                                          |       |
| `DSEQ`               | `1`                                                                            | `deployment-*`<br/>`lease-*`<br/>`bid-*`<br/>`send-manifest`                                                                             |       |
| `OSEQ`               | `1`                                                                            | `deployment-*`<br/>`lease-*`<br/>`bid-*`<br/>`send-manifest`                                                                             |       |
| `GSEQ`               | `1`                                                                            | `deployment-*`<br/>`lease-*`<br/>`bid-*`<br/>`send-manifest`                                                                             |       |
| `KUSTOMIZE_INSTALLS` | Depends on runbook<br/>Refer to each runbook's `Makefile` to see default value | `kustomize-init`<br/>`kustomize-templates`<br/>`kustomize-set-images`<br/>`kustomize-configure-services`<br/>`kustomize-deploy-services` |       |

##### Keys

Each configuration creates four [keys](https://github.com/akash-network/provider/blob/gpu/_run/common.mk#L40..L43).
The keys are assigned to the targets, and under normal circumstances there is no need to alter them. However, a different key can be selected by setting `KEY_NAME`:

```shell
# create provider from **provider** key
make provider-create

# create provider from custom key
KEY_NAME=other make provider-create
```

#### How to use runbook

##### Kube

This runbook requires three terminals

1. Open runbook

__all three terminals__
```shell
cd _run/kube
```

2. Create and provision local kind cluster.

__t1 run__
```shell
make kube-cluster-setup
```
3. Start akash node

__t2 run__
```shell
make node-run
```
4. Create provider

__t1 run__
```shell
make provider-create
```

5. Start the provider

__t3 run__
```shell
make provider-run
```

7. __t1__ Create a deployment. Check that the deployment was created. Take note of the `dseq` - deployment sequence:

```shell
make deployment-create
```

```shell
make query-deployments
```

After a short time, you should see an order created for this deployment with the following command:

```shell
make query-orders
```

The Provider Services Daemon should see this order and bid on it.

```shell
make query-bids
```

8. __t1 When a bid has been created, you may create a lease__

To create a lease, run

```shell
make lease-create
```

You can see the lease with:

```shell
make query-leases
```

You should now see "pending" inventory in the provider status:

```shell
make provider-status
```

9. __t1 Distribute Manifest__

Now that you have a lease with a provider, you need to send your
workload configuration to that provider by sending it the manifest:

```shell
make send-manifest
```

You can check the status of your deployment with:

```shell
make provider-lease-status
```

You can reach your app with the following (Note: `Host:` header tomfoolery abounds)

```shell
make provider-lease-ping
```

10. __t1 Get service status__

```sh
make provider-lease-status
```

Fetch logs from deployed service (all pods)

```sh
make provider-lease-logs
```
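
Several steps above only succeed "after a short time" (orders, bids, lease status). When scripting the __t1__ steps, a generic retry helper can stand in for waiting by hand. This is a sketch; `retry` is a hypothetical helper, not something the runbooks provide.

```shell
# Run a command until it succeeds, up to a maximum number of attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    until "$@"; do
        if [ "$i" -ge "$attempts" ]; then
            return 1
        fi
        i=$((i + 1))
        sleep "$delay"
    done
}
```

For example, `retry 10 5 make provider-lease-ping` would retry the ping up to ten times, five seconds apart. That only helps if the target exits non-zero until the app is reachable, which is an assumption about the `make` targets rather than documented behaviour.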

##### Kube for e2e tests

This runbook requires two terminals

1. Open runbook

__t1__
```shell
cd _run/kube
```

2. Create and provision local kind cluster for e2e testing.

__t1 run__
```shell
make kube-cluster-setup-e2e
```

3. Run e2e tests

```shell
make test-e2e-integration
```

##### Single

##### SSH