ci: add additional disk size
Maksim Shakavin committed Jul 26, 2024
1 parent ed186fa commit 9f62e59
Showing 29 changed files with 910 additions and 122 deletions.
1 change: 1 addition & 0 deletions .envrc
@@ -1,5 +1,6 @@
#shellcheck disable=SC2148,SC2155
export KUBECONFIG="$(expand_path ./kubeconfig)"
export SOPS_AGE_KEY_FILE="$(expand_path ./age.key)"
export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES
export TALOSCONFIG="$(expand_path ./kubernetes/talos/clusterconfig/talosconfig)"
use flake
4 changes: 4 additions & 0 deletions .sops.yaml
@@ -10,3 +10,7 @@ creation_rules:
key_groups:
- age:
- "age1k5xl02aujw4rsgghnnd0sdymmwd095w5nqgjvf76warwrdc0uqpqsm2x8m"
- path_regex: .*\.sops\.ya?ml
key_groups:
- age:
- "age1k5xl02aujw4rsgghnnd0sdymmwd095w5nqgjvf76warwrdc0uqpqsm2x8m"
18 changes: 18 additions & 0 deletions .taskfiles/Ansible/Taskfile.yaml
@@ -0,0 +1,18 @@
---
# yaml-language-server: $schema=https://taskfile.dev/schema.json
version: "3"

vars:
ANSIBLE_INVENTORY_DIR: "{{.ANSIBLE_DIR}}/inventory"
ANSIBLE_PLAYBOOK_DIR: "{{.ANSIBLE_DIR}}/playbooks"

tasks:
proxmox-setup:
desc: Run Ansible setup playbook on the nodes
cmds:
- ansible-playbook -i {{.ANSIBLE_INVENTORY_DIR}}/hosts.yaml {{.ANSIBLE_PLAYBOOK_DIR}}/proxmox-setup.yaml -v

proxmox-update:
desc: Update proxmox packages
cmds:
- ansible-playbook -i {{.ANSIBLE_INVENTORY_DIR}}/hosts.yaml {{.ANSIBLE_PLAYBOOK_DIR}}/proxmox-apt-upgrade.yaml
50 changes: 32 additions & 18 deletions README.md
@@ -29,16 +29,30 @@ exploring Kubernetes and Infrastructure as Code (IaC) practices using tools like
## 📖 Table of contents

- [🍼 Overview](#-overview)
- [📖 Table of contents](#-table-of-contents)
- [📚 Documentation](#-documentation)
- [🖥️ Technological Stack](#-technological-stack)
- [🔧 Hardware](#-hardware)
- [☁️ External Dependencies](#-external-dependencies)
- [🤖 Automation](#-automation)
- [🤝 Thanks](#-thanks)
- [📖 Table of contents](#-table-of-contents)
- [📚 Documentation](#-documentation)
- [🖥️ Technological Stack](#-technological-stack)
- [🔧 Hardware](#-hardware)
- [☁️ External Dependencies](#-external-dependencies)
- [🤖 Automation](#-automation)
- [🤝 Thanks](#-thanks)

## 📚 Documentation

1. [Prerequisites](docs/prerequisites.md)
- [Cloudflare](docs/prerequisites.md#1-set-up-cloudflare)
- [Secrets store](docs/prerequisites.md#2-set-up-secrets-store)
- [UDM](docs/prerequisites.md#3-set-up-udm)
- [Discord](docs/prerequisites.md#4-get-discord-token)
- [PiHole](docs/prerequisites.md#5-set-up-pihole-and-generate-token-for-homepage)
- [NAS and Minio](docs/prerequisites.md#6-nas-set-up)
2. [Setup Guide](docs/set-up.md)
- [Install and Configure Proxmox](docs/set-up.md#install-and-configure-proxmox)
- [Create and Install Talos Images](docs/set-up.md#create-and-install-talos-images)
- [Bootstrap Kubernetes Cluster](docs/set-up.md#bootstrap-kubernetes-cluster)
- [Install Flux](docs/set-up.md#install-flux)
3. [How To](docs/howto.md)

## 🖥️ Technological Stack

| | Name | Description |
@@ -74,17 +88,17 @@ exploring Kubernetes and Infrastructure as Code (IaC) practices using tools like
<img src="https://raw.githubusercontent.com/MaksimShakavin/flux-homelab/main/docs/assets/rack.jpg" align="center" width="200px" alt="rack"/>
</details>

-| Device | Count | Disk Size | RAM | OS | Purpose |
-|----------------------------|-------|-----------|------|---------|-------------------------|
-| Lenovo M910Q Tiny i5-6500T | 3 | 256G | 32GB | Talos | Kubernetes Master Nodes |
-| Raspberry Pi 5 | 1 | | 8GB | RpiOS | DNS, SmartHome |
-| Synology RS422+ | 1 | 4x16TB | 2GB | DSM | NAS |
-| UPS 5UTRA91227 | 1 | | | | UPS |
-| UniFi UDM Pro | 1 | | | UnifiOS | Router |
-| UniFi USW PRO 24 Gen2 | 1 | | | | Switch |
-| UniFi USW Lite 8 | 1 | | | | Switch |
-| UniFi U6 In-Wall | 1 | | | | Access Point |
-| UniFi U6 Mesh | 1 | | | | Access Point |
+| Device | Count | Disk Size | RAM | OS | Purpose |
+|----------------------------|-------|------------|------|---------|-------------------------|
+| Lenovo M910Q Tiny i5-6500T | 3 | 2x1TB SSD | 32GB | Talos | Kubernetes Master Nodes |
+| Raspberry Pi 5 | 1 | | 8GB | RpiOS | DNS, SmartHome |
+| Synology RS422+ | 1 | 4x16TB HDD | 2GB | DSM | NAS |
+| UPS 5UTRA91227 | 1 | | | | UPS |
+| UniFi UDM Pro | 1 | | | UnifiOS | Router |
+| UniFi USW PRO 24 Gen2 | 1 | | | | Switch |
+| UniFi USW Lite 8 | 1 | | | | Switch |
+| UniFi U6 In-Wall | 1 | | | | Access Point |
+| UniFi U6 Mesh | 1 | | | | Access Point |

## ☁️ External Dependencies

5 changes: 5 additions & 0 deletions Taskfile.yaml
@@ -5,13 +5,17 @@ version: "3"
vars:
# Directories
KUBERNETES_DIR: "{{.ROOT_DIR}}/kubernetes"
INFRA_DIR: "{{.ROOT_DIR}}/infrastructure"
ANSIBLE_DIR: "{{.INFRA_DIR}}/ansible"
# Files
AGE_FILE: "{{.ROOT_DIR}}/age.key"
KUBECONFIG_FILE: "{{.ROOT_DIR}}/kubeconfig"
INFRA_SECRETS_FILE: "{{.INFRA_DIR}}/secrets.sops.yaml"

env:
KUBECONFIG: "{{.KUBECONFIG_FILE}}"
SOPS_AGE_KEY_FILE: "{{.AGE_FILE}}"
INFRA_SECRETS_FILE: "{{.INFRA_SECRETS_FILE}}"

includes:
kubernetes:
@@ -21,6 +25,7 @@ includes:
talos: .taskfiles/Talos/Taskfile.yaml
sops: .taskfiles/Sops/Taskfile.yaml
volsync: .taskfiles/VolSync/Taskfile.yaml
ansible: .taskfiles/Ansible/Taskfile.yaml
secrets: .taskfiles/ExternalSecrets/Taskfile.yaml

tasks:
72 changes: 72 additions & 0 deletions docs/howto.md
@@ -0,0 +1,72 @@
## How to

### Reset node ephemeral storage

If local hostpath PVs fill up a node's disk, the only way to recover is to wipe the node's ephemeral storage
completely. Run the following command:

```sh
talosctl --talosconfig=./kubernetes/bootstrap/talos/clusterconfig/talosconfig --nodes=[NODE_IP] reset --system-labels-to-wipe EPHEMERAL
```

1. Start the node from the Proxmox UI.
2. Manually delete all previous PVCs and PVs for a local-hostpath storage class that were hosted on the node.
3. Manually delete the affected pods so that they are recreated.
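
Steps 2 and 3 can be sketched with `kubectl` roughly as follows. This is a minimal sketch, not a tested procedure: the function names, the `run` helper, and the namespace, PVC, PV, and pod names passed to them are hypothetical. `run` only prints each command unless `DRY_RUN=0` is set, so the output can be reviewed first.

```sh
#!/bin/sh
# Dry-run helper (hypothetical): prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# List every PV with its storage class and bound claim, so the
# local-hostpath volumes can be picked out by eye.
list_pvs() {
  run kubectl get pv -o custom-columns=NAME:.metadata.name,CLASS:.spec.storageClassName,NAMESPACE:.spec.claimRef.namespace,CLAIM:.spec.claimRef.name
}

# List the pods scheduled on the node that was reset.
list_node_pods() {  # usage: list_node_pods NODE_NAME
  run kubectl get pods --all-namespaces --field-selector "spec.nodeName=$1"
}

# Delete one PVC, its PV, and the pod that used them.
# --wait=false avoids blocking while the pod still holds the claim.
delete_claim() {  # usage: delete_claim NAMESPACE PVC_NAME PV_NAME POD_NAME
  run kubectl delete pvc --namespace "$1" "$2" --wait=false
  run kubectl delete pv "$3" --wait=false
  run kubectl delete pod --namespace "$1" "$4"
}
```

Run `list_pvs` and `list_node_pods` first, then call `delete_claim` once per affected volume with `DRY_RUN=0` when the printed commands look right.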

### Upgrade SSD storage

1. Add a new SSD to the machine.
2. Wipe it from the Proxmox UI and press "Initialize Disk with GPT".
3. Create a new LVM volume group. LVM supports snapshots, although they are probably not needed here.
4. Add the disk as hardware to the VM. Don’t forget to disable backup for the new disk.
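
For reference, steps 3 and 4 could also be done from a shell on the Proxmox host instead of the UI. The sketch below is an assumption about how that might look, not the procedure used in this repository: the device path, volume group name, storage ID, VM ID, size, and the free `scsi1` slot are all hypothetical. Commands are only printed unless `DRY_RUN=0` is set.

```sh
#!/bin/sh
# Dry-run helper (hypothetical): prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# usage: add_ssd_volume DEVICE VG_NAME STORAGE_ID VMID SIZE_GB
add_ssd_volume() {
  # Create an LVM volume group on the already wiped disk.
  run vgcreate "$2" "$1"
  # Register the volume group as a Proxmox storage for VM disk images.
  run pvesm add lvm "$3" --vgname "$2" --content images
  # Allocate a new disk on that storage, attach it to the VM as scsi1
  # (assumed to be unused), and exclude it from backups.
  run qm set "$4" --scsi1 "$3:$5,backup=0"
}
```

For example, `add_ssd_volume /dev/sdb ssd2 ssd2-lvm 101 900` prints the three commands for a 900 GiB disk on VM 101.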

### Replace a node

1. Reset the Talos node
```sh
talosctl --talosconfig=./kubernetes/bootstrap/talos/clusterconfig/talosconfig --nodes=[node-ip] reset
```
2. Delete the node from Kubernetes
```shell
kubectl delete node [node-name]
```
3. Delete the node from the Proxmox cluster. SSH to an existing node and run:
```sh
pvecm delnode [node-name]
```
where node-name is the name from the Proxmox cluster configuration.
4. On the remaining Proxmox machines, delete the information about the node from `/etc/pve/nodes`.
5. Continue with the [setup guide](./set-up.md) up to the point where the cluster is bootstrapped.
6. Apply the configuration to the new node:
```sh
talosctl apply-config --talosconfig=./clusterconfig/talosconfig --nodes=[node-ip] --file=./clusterconfig/home-kubernetes-k8s-control-1.yaml --insecure
```
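
After the configuration is applied, it can help to wait until the replacement node reports `Ready`. A minimal sketch, assuming a hypothetical node name and a 10 minute default timeout; the `run` helper only prints the commands unless `DRY_RUN=0` is set.

```sh
#!/bin/sh
# Dry-run helper (hypothetical): prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# usage: wait_for_node NODE_NAME [TIMEOUT]
wait_for_node() {
  # Block until the node's Ready condition is true or the timeout expires.
  run kubectl wait --for=condition=Ready "node/$1" --timeout="${2:-10m}"
  # Show all nodes so the new one can be compared with the existing ones.
  run kubectl get nodes -o wide
}
```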
### Remove Cluster Info from Proxmox Node
```sh
systemctl stop pve-cluster corosync
pmxcfs -l
rm -rf /etc/corosync/*
rm /etc/pve/corosync.conf
killall pmxcfs
systemctl start pve-cluster
```
Then delete the information about the remaining nodes from `/etc/pve/nodes`.
### Set Up GitHub App for a New Repository
1. Create a GitHub app following
the [guideline](https://docs.github.com/en/apps/creating-github-apps/registering-a-github-app/registering-a-github-app).
2. Copy the app ID and save it to the `BOT_APP_ID` repository secret and to the `ACTION_RUNNER_CONTROLLER_GITHUB_APP_ID`
field of the `actions-runner-controller` 1Password item.
3. Generate a new app private key and add it to the `BOT_APP_PRIVATE_KEY` repository secret and to
the `ACTION_RUNNER_CONTROLLER_GITHUB_PRIVATE_KEY` field of the `actions-runner-controller` 1Password item in the
following format:
```
-----BEGIN RSA PRIVATE KEY-----
...
-----END RSA PRIVATE KEY-----
```
114 changes: 114 additions & 0 deletions docs/prerequisites.md
@@ -0,0 +1,114 @@
## Prerequisites

### 1. Set up Cloudflare
1. Go to [Cloudflare API Tokens](https://dash.cloudflare.com/profile/api-tokens) and create an API Token.
2. Under the `API Tokens` section, click the blue `Create Token` button.
3. Select the `Edit zone DNS` template by clicking the blue `Use template` button.
4. Under `Permissions`, click `+ Add More` and add the following permissions:
- `Zone - DNS - Edit`
- `Account - Cloudflare Tunnel - Read`
5. Limit the permissions to specific account and zone resources.
6. Click the blue `Continue to Summary` button and then the blue `Create Token` button.
7. Copy the token and save it to the secrets store under a `CF_API_TOKEN` field.
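
Before saving the token, it can be checked against Cloudflare's token verification endpoint. This is a minimal sketch: the function name is hypothetical, and it assumes the token has been exported as `CF_API_TOKEN` in the current shell.

```sh
#!/bin/sh
# Ask the Cloudflare API whether the token in CF_API_TOKEN is valid.
verify_cf_token() {
  if [ -z "${CF_API_TOKEN:-}" ]; then
    echo "CF_API_TOKEN is not set" >&2
    return 1
  fi
  # The response is JSON; a valid token is reported with "success": true.
  curl --silent --show-error \
    --header "Authorization: Bearer ${CF_API_TOKEN}" \
    "https://api.cloudflare.com/client/v4/user/tokens/verify"
}
```

Calling `verify_cf_token` without the variable set returns a non-zero status and sends no request.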

### 2. Set up secrets store
I use 1Password as the secrets store for my homelab cluster. To execute the IaC scripts that provision the
infrastructure, [1Password Connect](https://developer.1password.com/docs/connect/) must be set up separately with access
to the 1Password vault. Once the cluster setup is complete, 1Password Connect will be hosted inside the cluster.

Ensure you update `OP_CONNECT_HOST` and `OP_CONNECT_TOKEN` in the [env file](../infrastructure/secrets.sops.yaml).
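
To confirm that the two values work, the Connect server can be asked to list the vaults it has access to. A minimal sketch, assuming both values are exported in the current shell; the function name is hypothetical.

```sh
#!/bin/sh
# List the vaults visible to the 1Password Connect token.
check_op_connect() {
  if [ -z "${OP_CONNECT_HOST:-}" ] || [ -z "${OP_CONNECT_TOKEN:-}" ]; then
    echo "OP_CONNECT_HOST and OP_CONNECT_TOKEN must be set" >&2
    return 1
  fi
  # --fail makes curl exit non-zero on HTTP errors such as 401.
  curl --silent --show-error --fail \
    --header "Authorization: Bearer ${OP_CONNECT_TOKEN}" \
    "${OP_CONNECT_HOST}/v1/vaults"
}
```

The encrypted file itself can be edited with `sops infrastructure/secrets.sops.yaml`, which decrypts it into an editor and re-encrypts it on save.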

The 1Password vault should contain the following items:
<details>
<summary>1Password Vault Items</summary>

| Item name | Fields | Description |
|---------------------------|-------------------------------------------------|-----------------------------------------------------------|
| mino | MINIO_ROOT_USER | |
| | MINO_ROOT_PASSWORD | |
| | MINO_LOKI_BUCKET | |
| | MINO_LOKI_SECRET_KEY | |
| | MINO_LOKI_ACCESS_KEY | |
| | MINO_THANOS_BUCKET | |
| | MINO_THANOS_SECRET_KEY | |
| | MINO_THANOS_ACCESS_KEY | |
| cloudnative-pg | POSTGRESS_SUPER_USER | |
| | POSTGRESS_SUPER_PASS | |
| cloudflare | CLOUDFLARE_ACCOUNT_TAG | |
| | CLOUDFLARE_TUNNEL_SECRET | |
| | CLUSTER_CLOUDFLARE_TUNNEL_ID | |
| | CLOUDFLARE_HOMEPAGE_TUNNEL_SECRET | |
| | CF_API_TOKEN | |
| proxmox | username | |
| | password | |
| | HOMEPAGE_PROXMOX_USERNAME | |
| | HOMEPAGE_PROXMOX_PASSWORD | |
| actions-runner-controller | ACTION_RUNNER_CONTROLLER_GITHUB_APP_ID | |
| | ACTION_RUNNER_CONTROLLER_GITHUB_INSTALLATION_ID | |
| | ACTION_RUNNER_CONTROLLER_GITHUB_PRIVATE_KEY | In a format starting with -----BEGIN RSA PRIVATE KEY----- |
| unifipoller | username | |
| | password | |
| discord | GATUS_DISCORD_WEBHOOK | |
| | ALERTMANAGER_DISCORD_WEBHOOK | |
| gatus | GATUS_POSTGRES_USER | |
| | GATUS_POSTGRES_PASS | |
| nodered | CREDENTIAL_SECRET | Used to encrypt nodered secrets |
| overseerr | OVERSEERR_TOKEN | Used in homepage |
| pihole | HOMEPAGE_PI_HOLE_TOKEN | |
| synology | HOMEPAGE_SYNOLOGY_USERNAME | |
| | HOMEPAGE_SYNOLOGY_PASSWORD | |
| plex | PLEX_TOKEN | Used in homepage |
| prowlarr | PROWLARR_API_KEY | Used in homepage |
| | PROWLARR_POSTGRES_USER | |
| | PROWLARR_POSTGRES_PASSWORD | |
| sonarr | SONARR_API_KEY | Used in homepage |
| | SONARR_POSTGRES_USER | |
| | SONARR_POSTGRES_PASSWORD | |
| radarr | RADARR_API_KEY | Used in homepage |
| | RADARR_POSTGRES_USER | |
| | RADARR_POSTGRES_PASSWORD | |
| qbittorrent | username | |
| | password | |
| grafana | GRAFANA_POSTGRESS_USER | |
| | GRAFANA_POSTGRESS_PASS | |
</details>

### 3. Set up UDM

1. Set up the unifipoller user (TODO docs).
2. Forward port for qBittorrent (TODO docs).

### 4. Get Discord token

1. Go to Server settings -> Integrations and create two webhooks:
- Webhook for Prometheus alerts. Save its URL to the `ALERTMANAGER_DISCORD_WEBHOOK` field of the `discord` item in 1Password.
- Webhook for Gatus alerts. Save its URL to the `GATUS_DISCORD_WEBHOOK` field of the `discord` item in 1Password.
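
Each webhook can be tested by posting a message to it. This is a minimal sketch with a hypothetical function name; the message text must not contain double quotes, because it is placed into the JSON body without escaping.

```sh
#!/bin/sh
# usage: send_test_message WEBHOOK_URL [TEXT]
send_test_message() {
  if [ -z "${1:-}" ]; then
    echo "usage: send_test_message WEBHOOK_URL [TEXT]" >&2
    return 1
  fi
  # Discord webhooks accept a JSON body with a "content" field.
  curl --silent --show-error --fail \
    --header "Content-Type: application/json" \
    --data "{\"content\": \"${2:-Test message from the homelab}\"}" \
    "$1"
}
```

If the webhook is valid, the message appears in the channel selected when the webhook was created.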

### 5. Set up pihole and generate token for Homepage

1. Set up Pi-hole on a separate Raspberry Pi.
2. Generate a token for the Homepage widget in Pi-hole and save it to the `HOMEPAGE_PI_HOLE_TOKEN` field of the `pihole` item in 1Password.

### 6. NAS set up

#### Install and Configure Minio on NAS

1. **Install Synology Container Manager:**
1. Install the `Synology Container Manager` package from the Package Center.
2. Open the `Synology Container Manager` and run a Docker container using the `minio/minio` image. Ensure that port `9000` is forwarded.

2. **Create Minio Buckets:**
- Manually create the following buckets:
- `cloudnative-pg` for PostgreSQL backups.
- `loki-bucket` to store logs.
- `thanos` to store old metrics data with Thanos.
- Update the corresponding 1Password items with the necessary details.
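
The buckets can also be created with the MinIO client (`mc`) instead of the web console. A minimal sketch: the alias name `nas` is hypothetical and is assumed to have been registered beforehand with `mc alias set nas http://[NAS_IP]:9000 [ACCESS_KEY] [SECRET_KEY]`. Commands are only printed unless `DRY_RUN=0` is set.

```sh
#!/bin/sh
# Dry-run helper (hypothetical): prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# usage: create_buckets ALIAS
create_buckets() {
  # Bucket names are the ones listed above; --ignore-existing makes reruns safe.
  for bucket in cloudnative-pg loki-bucket thanos; do
    run mc mb --ignore-existing "$1/$bucket"
  done
}
```

`DRY_RUN=0 create_buckets nas` creates any of the three buckets that are missing.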

#### Configure NFS Connections

1. **Create a Shared Folder:**
1. Open the Synology Control Panel and navigate to `Shared Folders`.
2. Create a shared folder for the Kubernetes cluster.
3. Go to the folder settings and select `NFS Permissions`.
4. Add the IP addresses of all Kubernetes nodes. Select `Squash` as `No`.
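
The export can be checked from any Linux machine whose IP address was added to the NFS permissions. A minimal sketch: the host name, export path, and mount point are hypothetical, and the commands are only printed unless `DRY_RUN=0` is set (mounting usually requires root).

```sh
#!/bin/sh
# Dry-run helper (hypothetical): prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# usage: check_nfs_export NAS_HOST EXPORT_PATH MOUNT_POINT
check_nfs_export() {
  # Show which folders the NAS exports and to which clients.
  run showmount --exports "$1"
  # Mount the share to confirm that this client is allowed to use it.
  run mkdir -p "$3"
  run mount -t nfs "$1:$2" "$3"
}
```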