add mimir support
Alexander Ryabov committed Nov 19, 2023
1 parent 2befcf5 commit 5cf6f61
Showing 21 changed files with 1,359 additions and 1,846 deletions.
7 changes: 4 additions & 3 deletions .dockerignore
@@ -1,4 +1,5 @@
.gitignore
.idea/
.vscode/
data/
/.idea/
/.vscode/
/data/
/tmp/
46 changes: 32 additions & 14 deletions .github/workflows/docker.yml
@@ -1,27 +1,45 @@
name: CI
on:
push:
branches: [ master ]
branches: [master]

jobs:
build:
runs-on: ubuntu-latest

steps:
- uses: actions/setup-go@v2
with:
go-version: 1.15.1
- uses: actions/checkout@v4

- uses: actions/checkout@v2
- name: Docker meta
id: meta
uses: docker/metadata-action@v3
with:
images: sepa/thanos-kit
tags: |
type=raw,value={{date 'YYMMDD'}}
type=raw,value={{branch}}-{{sha}}
flavor: |
latest=true
- uses: actions/cache@v2
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to DockerHub
uses: docker/login-action@v3
with:
path: ~/go/pkg/mod
key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
restore-keys: |
${{ runner.os }}-go-
username: sepa
password: ${{ secrets.DOCKER_PASSWORD }}

- run: make docker-push
env:
DOCKER_USERNAME: sepa
DOCKER_PASSWORD: ${{ secrets.DOCKER_PASSWORD }}
- name: Build and push
id: docker_build
uses: docker/build-push-action@v5
with:
platforms: linux/amd64,linux/arm64
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
build-args: |
VER=${{ steps.meta.outputs.version }}
cache-from: type=gha
cache-to: type=gha,mode=max
9 changes: 5 additions & 4 deletions .gitignore
@@ -1,4 +1,5 @@
.idea/
.vscode/
thanos-kit
data/
/.idea/
/.vscode/
/thanos-kit
/data/
/tmp/
6 changes: 3 additions & 3 deletions Dockerfile
@@ -1,13 +1,13 @@
FROM golang:alpine as builder
FROM golang:1.20 as builder
WORKDIR /build
RUN apk add --no-cache make git
# try to cache deps
COPY go.mod go.sum ./
RUN go mod download -x
# resets caches
COPY . .
ARG VER
RUN make test && make build

FROM quay.io/prometheus/busybox:latest
COPY --from=builder /build/thanos-kit /bin/thanos-kit
ENTRYPOINT [ "/bin/thanos-kit" ]
ENTRYPOINT ["/bin/thanos-kit"]
25 changes: 0 additions & 25 deletions Makefile
@@ -1,21 +1,11 @@
DOCKER_IMAGE ?= 'sepa/thanos-kit'
VER ?= `git show -s --format=%cd-%h --date=format:%y%m%d`
OS ?= $(shell uname -s | tr '[A-Z]' '[a-z]')
ARCH ?= $(shell uname -m)

help: ## Displays help
@awk 'BEGIN {FS = ":.*##"; printf "\nUsage:\n make \033[36m<target>\033[0m\n\nTargets:\n"} /^[a-z0-9A-Z_-]+:.*?##/ { printf " \033[36m%-10s\033[0m %s\n", $$1, $$2 }' $(MAKEFILE_LIST)

.PHONY: deps
deps: ## Ensures fresh go.mod and go.sum
@go mod tidy
@go mod verify

.PHONY: test
test: ## Run unit tests
@go test ./...

.PHONY: build
build: ## Build binaries with version set
@CGO_ENABLED=0 go build -ldflags "-w -s \
-X github.com/prometheus/common/version.Version=${VER} \
@@ -24,20 +14,5 @@ build: ## Build binaries with version set
-X github.com/prometheus/common/version.BuildUser=${USER}@`hostname` \
-X github.com/prometheus/common/version.BuildDate=`date +%Y%m%d-%H:%M:%S`"

.PHONY: docker
ifeq ($(OS)_$(ARCH), linux_x86_64)
docker: test build
@docker build -t "thanos-kit" -f ci.dockerfile .
else
docker: ## Builds 'thanos-kit' docker with no tag
@docker build -t "thanos-kit" .
endif

.PHONY: docker-push
docker-push: docker # CI only
@echo "Pushing ver: ${VER}"
ifneq (${DOCKER_PASSWORD},)
@echo "${DOCKER_PASSWORD}" | docker login -u="${DOCKER_USERNAME}" --password-stdin
@docker tag thanos-kit ${DOCKER_IMAGE}:latest && docker push ${DOCKER_IMAGE}:latest
endif
@docker tag thanos-kit ${DOCKER_IMAGE}:${VER} && docker push ${DOCKER_IMAGE}:${VER}
93 changes: 93 additions & 0 deletions Readme.md
@@ -0,0 +1,93 @@
# thanos-kit
Tooling to work with Thanos blocks in object storage.

- **ls** - List all block ULIDs in the bucket and show each ULID as a time (same as `thanos tools bucket ls`, but with Mimir support)
- **inspect** - Inspect all blocks in the bucket in a detailed, table-like way (same as `thanos tools bucket inspect`, but with Mimir support)
- **analyze** - Analyze churn and label pair cardinality for a specific block (same as `promtool tsdb analyze`, but also shows labels suitable for a block split)
- **dump** - Dump samples from a TSDB to text format (same as `promtool tsdb dump`, but in promtext format)
- **import** - Import samples into TSDB blocks (same as `promtool tsdb create-blocks-from openmetrics`, but from promtext format). Read more about [backfill](#backfill) below.
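
For readers wondering what "ULID as a time" means: per the ULID specification, the first 10 characters of a ULID encode a 48-bit millisecond Unix timestamp in Crockford Base32. The sketch below is only an illustration of that decoding in Python; it is not how `thanos-kit` implements it, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Crockford Base32 alphabet used by ULIDs (no I, L, O, U).
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def ulid_time_ms(ulid: str) -> int:
    """Return the millisecond Unix timestamp stored in a ULID's first 10 chars."""
    if len(ulid) != 26:
        raise ValueError("a ULID is exactly 26 characters long")
    ms = 0
    for ch in ulid[:10].upper():
        ms = ms * 32 + CROCKFORD.index(ch)  # raises ValueError on invalid chars
    return ms


def ulid_time(ulid: str) -> datetime:
    """Return the ULID timestamp as a timezone-aware UTC datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(milliseconds=ulid_time_ms(ulid))


if __name__ == "__main__":
    # Example ULID taken from the ULID specification's documentation.
    print(ulid_time("01ARYZ6S41TSV4RRFFQ69G5FAV").isoformat())
```

Note that this is the block's creation time, which is not necessarily the time range of the samples it contains.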

CLI arguments are mostly the same as for `thanos`; help is available for each sub-command:
```
$ docker run sepa/thanos-kit -h
usage: thanos-kit [<flags>] <command> [<args> ...]
Tooling to work with Thanos blocks in object storage
Flags:
-h, --help Show context-sensitive help (also try --help-long and --help-man).
--version Show application version.
--log.level=info Log filtering level (info, debug)
--objstore.config-file=<file-path>
Path to YAML file that contains object store%s configuration. See format details: https://thanos.io/tip/thanos/storage.md/
--objstore.config=<content>
Alternative to 'objstore.config-file' flag (mutually exclusive). Content of YAML file that contains object store%s configuration. See format details: https://thanos.io/tip/thanos/storage.md/
Commands:
help [<command>...]
Show help.
ls [<flags>]
List all blocks in the bucket.
inspect [<flags>]
Inspect all blocks in the bucket in detailed, table-like way
analyze [<flags>] <ULID>
Analyze churn, label pair cardinality and find labels to split on
dump [<flags>] <ULID>...
Dump samples from a TSDB to text
import --input-file=INPUT-FILE --label=<name>="<value>" [<flags>]
Import samples from text to TSDB blocks
```

### Get it
Docker images are available on [Docker Hub](https://hub.docker.com/repository/docker/sepa/thanos-kit/tags)

### Backfill
([Original PR](https://github.com/prometheus/prometheus/pull/7586))
Supported input format is Prometheus text format.
You are free to export/convert your existing data to this format, into one **time-sorted** text file.

`metric{[labels]} value timestamp_ms`

For example:
```ini
k8s_ns_hourly_cost{namespace="kube-system"} 5.7 1599771600000
k8s_ns_hourly_cost{namespace="kube-system"} 5.0 1599771630000
...
```

Note that `value` can be written in either plain decimal or scientific notation, and the two forms can be mixed within one file.
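
If your data lives somewhere else, a small script can produce this file. The following is a minimal Python sketch, assuming the samples fit in memory; the `Sample` type and function names are hypothetical and not part of `thanos-kit`. It sorts samples by timestamp and escapes label values using the Prometheus text format rules (backslash, double quote, newline).

```python
from typing import Iterable, Mapping, NamedTuple


class Sample(NamedTuple):
    name: str
    labels: Mapping[str, str]
    value: float
    timestamp_ms: int


def escape_label_value(v: str) -> str:
    # Backslash must be escaped first so later escapes are not doubled.
    return v.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def format_sample(s: Sample) -> str:
    """Render one sample as `metric{labels} value timestamp_ms`."""
    if s.labels:
        pairs = ",".join(
            f'{k}="{escape_label_value(v)}"' for k, v in sorted(s.labels.items())
        )
        series = f"{s.name}{{{pairs}}}"
    else:
        series = s.name
    return f"{series} {float(s.value)!r} {int(s.timestamp_ms)}"


def render_sorted(samples: Iterable[Sample]) -> str:
    """Return all samples as text, ordered by timestamp (oldest first)."""
    ordered = sorted(samples, key=lambda s: s.timestamp_ms)
    return "".join(format_sample(s) + "\n" for s in ordered)


if __name__ == "__main__":
    data = [
        Sample("k8s_ns_hourly_cost", {"namespace": "kube-system"}, 5.0, 1599771630000),
        Sample("k8s_ns_hourly_cost", {"namespace": "kube-system"}, 5.7, 1599771600000),
    ]
    with open("data.prom", "w", encoding="utf-8") as f:
        f.write(render_sorted(data))
```

For inputs too large to sort in memory, an external sort on the timestamp column would be needed instead.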

This format is simple to produce, but it is neither optimized nor compressed, so it is normal for your data file to be huge.
For example, take a 19G OpenMetrics file with ~20k time series and 200M data points (samples) covering a 2-year period. The overall resolution is very low in this example.
The import takes around 2h, and the new uncompacted TSDB blocks total around 2.1G across 7600 blocks. When thanos-compact scans them, it automatically starts compacting them in the background. Once compaction completes (~30min), the TSDB blocks total around 970M across 80 blocks.
The size and number of blocks depend on the number of time series and the metric resolution, but these figures give an order of magnitude.
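
To put these figures in perspective, the arithmetic below derives per-sample costs and the average sample interval from the numbers quoted above. It is a rough Python sketch that assumes decimal units (1G = 10^9 bytes) and a 365-day year, neither of which the text states; the constant and function names are hypothetical.

```python
# Figures quoted in the text; unit interpretation is an assumption.
TEXT_BYTES = 19e9          # 19G input text file
UNCOMPACTED_BYTES = 2.1e9  # 2.1G of freshly imported blocks
COMPACTED_BYTES = 970e6    # 970M after compaction
SAMPLES = 200e6            # 200M samples
SERIES = 20e3              # ~20k time series
PERIOD_SECONDS = 2 * 365 * 86400  # 2 years


def bytes_per_sample(total_bytes: float, samples: float = SAMPLES) -> float:
    """Average storage cost of one sample."""
    return total_bytes / samples


def avg_sample_interval_seconds() -> float:
    """Average spacing between samples of one series over the period."""
    samples_per_series = SAMPLES / SERIES
    return PERIOD_SECONDS / samples_per_series


if __name__ == "__main__":
    print(f"text:        {bytes_per_sample(TEXT_BYTES):.2f} bytes/sample")
    print(f"uncompacted: {bytes_per_sample(UNCOMPACTED_BYTES):.2f} bytes/sample")
    print(f"compacted:   {bytes_per_sample(COMPACTED_BYTES):.2f} bytes/sample")
    print(f"avg interval: {avg_sample_interval_seconds() / 3600:.2f} hours")
```

Under these assumptions the text file costs about 95 bytes per sample, the uncompacted blocks about 10.5, and the compacted blocks just under 5, with one sample per series roughly every 1.75 hours on average. That last number matches the remark about very low resolution.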

Apart from the labels set for each metric in the text file, you also need to set Thanos metadata labels for the whole batch of blocks you are importing (think of these as the `external_labels` of the Prometheus instance that scraped the metrics in the text file).

Example command for importing data from `data.prom` (above) into the GCS bucket `bucketname`:
```bash
docker run -it --rm \
-v `pwd`:/work -w /work \
-e GOOGLE_APPLICATION_CREDENTIALS=/work/svc.json \
sepa/thanos-kit import \
--objstore.config='{type: GCS, config: {bucket: bucketname}}' \
--input-file data.prom \
--label=replica=\"prom-a\" \
--label=location=\"us-east1\"
```
Note that the compactor defaults to `--consistency-delay=30m`, which is based on file upload time (not ULID), so it can take some time before the compactor starts processing these blocks.

### Cache dir
By default, `thanos-kit` caches blocks from object storage in the `./data` directory, and the directory is not cleaned up on exit. This speeds up subsequent runs and avoids deleting user data when, for example, `--data-dir=/tmp` is used.

Important: the `dump` command downloads the specified blocks to the cache dir, but then dumps the TSDB as a whole (including blocks already present there).
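
Because of this, it can be worth checking what is already in the cache dir before running `dump`. The Python sketch below lists sub-directories whose names look like ULIDs. It assumes the cache follows the usual TSDB layout of one directory per block named by its ULID, which this README does not state explicitly; the function name is hypothetical.

```python
import re
from pathlib import Path
from typing import List

# 26 characters from the Crockford Base32 alphabet (no I, L, O, U).
ULID_RE = re.compile(r"^[0-9A-HJKMNP-TV-Z]{26}$")


def cached_blocks(data_dir: str = "./data") -> List[str]:
    """Return sorted names of ULID-like directories found in data_dir."""
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(
        p.name for p in root.iterdir() if p.is_dir() and ULID_RE.match(p.name)
    )


if __name__ == "__main__":
    for block in cached_blocks():
        print(block)
```

If the list contains blocks you do not want in the output, point `--data-dir` at an empty directory or remove them first.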

### Alternatives
- [thanos tools bucket](https://thanos.io/tip/components/tools.md/#bucket)
- [promtool tsdb](https://prometheus.io/docs/prometheus/latest/command-line/promtool/#promtool-tsdb)
- [mimirtool](https://grafana.com/docs/mimir/latest/manage/tools/mimirtool/#backfill)