Managing Kafka, Docker and Kubernetes
This page describes essential and useful commands for Kafka, Docker and Kubernetes, which we have been using in the FASTEN project.
- Kafka
  - Listing all the Kafka topics
  - Creating a Kafka topic
  - Deleting a Kafka topic
  - Info of a Kafka topic
  - Altering the number of partitions of a Kafka topic
  - Listing Kafka consumer groups
  - Checking the status of a consumer group
  - Resetting the offsets of a consumer group
  - Viewing the records of a Kafka topic
  - Getting the number of records in a Kafka topic
  - Getting the date of the last message in a Kafka topic
- Docker
  - Listing Docker images
  - Listing running Docker containers
  - Building a Docker image
  - Starting a Docker container
  - Stopping a Docker container
  - Tagging a Docker image
  - Pushing a Docker image
  - Removing a Docker image
  - Viewing logs of a Docker container
  - Viewing the resource usage of a Docker container
  - Using Bash inside a Docker container
- Kubernetes
  - Listing pods
  - Listing nodes
  - Listing namespaces
  - Listing deployments
  - Viewing info of a node
  - Making a node unschedulable
  - Creating a deployment
  - Deleting a deployment
  - Scaling a deployment
  - Deleting a pod
  - Deleting pods from a specified node
  - Cleaning evicted pods
  - Viewing logs of a pod
  - Viewing detailed description of the resources of a pod
  - Viewing resource usage of nodes
  - Using Bash inside a pod
The Confluent Kafka distribution v5.4 is set up at /opt/kafka/bin. The following commands assume that you have added this path to your $PATH. To do that, run:
export PATH=$PATH:/opt/kafka/bin
In the default FASTEN installation, Kafka is served by a dedicated cluster running both Kafka (port 9092) and Zookeeper (port 2181). The servers are named delft, samos and goteborg. You can point the following commands to any of those servers.
kafka-topics --list --zookeeper samos:2181
kafka-topics --create --zookeeper samos:2181 --replication-factor 2 --partitions 9 --topic fasten.test
This creates a Kafka topic named fasten.test with 9 partitions and a replication factor of 2. Note that FASTEN developers should prefix their topic names with fasten. (for example, fasten.test).
- The number of partitions dictates how many consumers can read from the topic in parallel. For topics that need high parallelism downstream (e.g., call graph generation), consider at least 30 partitions.
- As we run a 3-node cluster, consider making the number of partitions divisible by 3 to make full use of cluster resources.
- Too many partitions mean that Kafka spends too much time rebalancing consumer groups if a processing node fails, so choose the number of partitions wisely. To avoid cascading failures, consider increasing max.poll.interval.ms in your consumer configuration.
- Avoid replication for ephemeral topics, i.e., one-off topics, debug topics, etc.
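The divisibility advice above can be turned into a small helper. This is only a sketch: the function name round_up_partitions is hypothetical, and the default broker count of 3 is taken from the cluster size mentioned above.

```shell
#!/usr/bin/env bash
# Round a desired partition count up to the next multiple of the broker count.
# round_up_partitions is a hypothetical helper name, not part of Kafka.
round_up_partitions() {
  local wanted=$1 brokers=${2:-3}
  echo $(( (wanted + brokers - 1) / brokers * brokers ))
}

round_up_partitions 20      # prints 21
round_up_partitions 30 3    # prints 30
```

The result can then be passed to the --partitions option of kafka-topics --create.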
kafka-topics --zookeeper samos:2181 --delete --topic topic-name
Replace topic-name with the name of the Kafka topic you want to delete.
kafka-topics --describe --zookeeper samos:2181 --topic topic-name
Replace topic-name with the name of the Kafka topic whose details (number of partitions, leaders, etc.) you want to see.
kafka-topics --alter --zookeeper samos:2181 --topic fasten.test --partitions 20
This increases the number of partitions of the topic fasten.test to 20. Substitute your own topic name and number of partitions. The new partitions are created empty; existing records are not redistributed. To move the Kafka records to a topic where they are spread evenly across partitions, perform the following steps:
1- Write Kafka records to a temporary file:
kafkacat -b samos:9092/kafka -C -t topic-name -o beginning -e -q | jq '(.groupId + ":" + .artifactId + ":" + .version + "|") + (. | tostring)' | sed 's/\\//g' | sed -e 's/^"//' -e 's/"$//' > file-name.txt
2- Delete the previous topic and create a new one with the desired number of partitions.
3- Read the file from step 1 and produce its records to the newly created topic:
kafka-console-producer --broker-list samos:9092 --topic topic-name --property 'parse.key=true' --property 'key.separator=|' < file-name.txt
kafka-consumer-groups --list --bootstrap-server samos:9092
kafka-consumer-groups --bootstrap-server samos:9092 --describe --group my-consumer-group
Replace my-consumer-group with the name of your consumer group.
First, we need to find the latest message that was successfully processed. We do so by inspecting the latest message in each partition of the output topic. As per FASTEN requirements, each output message contains portions of the input message.
kafkacat -C -q -b samos:9092 -t <output_topic> -o -1 -e -f '%p %o %T %s\n'|sort -r -k3|tail -n 1
We can then serially scan the input topic (the one whose offsets we want to reset) for a unique characteristic of that latest output message.
kafkacat -C -q -b samos:9092 -t input_topic -f '%p %o %T %s\n' |grep ...
This gives us the partition, offset and timestamp of the message we want to reset our offsets to.
75 12626 1601626238839 {"input":{"date":1481642191, ....
The third value is a UTC-based timestamp with millisecond resolution. It needs to be converted to the following format: yyyy-MM-ddTHH:mm:ss.xxx
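The conversion can be done with GNU date. The sketch below uses a hypothetical helper name, ms_to_datetime, and prints the timestamp in UTC, on the assumption that the tool reads a value without a zone suffix as UTC.

```shell
#!/usr/bin/env bash
# Convert an epoch timestamp in milliseconds (kafkacat's %T) to yyyy-MM-ddTHH:mm:ss.xxx in UTC.
# ms_to_datetime is a hypothetical helper name; it requires GNU date (for -d "@seconds").
ms_to_datetime() {
  local ms=$1
  local secs=$(( ms / 1000 )) millis=$(( ms % 1000 ))
  printf '%s.%03d\n' "$(date -u -d "@${secs}" +'%Y-%m-%dT%H:%M:%S')" "$millis"
}

ms_to_datetime 1601626238839   # prints 2020-10-02T08:10:38.839
```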
Then we can run the following on our consumer group
kafka-consumer-groups --bootstrap-server samos:9092 --group <consumer_group> --reset-offsets --all-topics --to-datetime <ts>
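For extra confidence, we can check that the partition and offset reported by this command exactly match the ones identified by grepping above. Without the --execute flag, the command only prints the offsets it would set; add --execute to apply them.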
kafka-consumer-groups --bootstrap-server samos:9092 --reset-offsets --to-earliest --group my-consumer-group --topic my-topic --execute
Replace my-consumer-group and my-topic with the name of your consumer group and your topic, respectively.
kafka-console-consumer --bootstrap-server samos:9092 --from-beginning --topic topic-name
Replace topic-name with the name of the topic whose records you want to view.
kafka-run-class kafka.tools.GetOffsetShell --broker-list samos:9092 --time -1 --offsets 1 --topic topic-name | awk -F ':' '{sum += $3} END {print sum}'
Replace topic-name with the name of the topic whose records you want to count.
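Note that the command above sums the latest offsets, which equals the number of records only while no records have been removed from the start of the log (for example by retention). A hedged refinement is to subtract the earliest offsets, which GetOffsetShell reports with --time -2. The sketch below shows the arithmetic on made-up sample output; sum_offsets is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# sum_offsets reads GetOffsetShell-style lines (topic:partition:offset) on stdin
# and prints the sum of the offset column (0 for empty input).
sum_offsets() {
  awk -F ':' '{sum += $3} END {print sum + 0}'
}

# Made-up sample output for a 3-partition topic:
# latest offsets (--time -1) and earliest offsets (--time -2).
latest='fasten.test:0:120
fasten.test:1:95
fasten.test:2:110'
earliest='fasten.test:0:20
fasten.test:1:0
fasten.test:2:10'

# Approximate number of retained records = sum(latest) - sum(earliest).
echo $(( $(echo "$latest" | sum_offsets) - $(echo "$earliest" | sum_offsets) ))   # prints 295
```

Against the real cluster, each sample variable would be replaced by the output of the GetOffsetShell command with the corresponding --time value.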
date +'%Y-%m-%d %H:%M:%S' -d "@$(expr $(kafkacat -C -q -b samos:9092 -t topic-name -p 0 -o -1 -e -f '%T') / 1000)"
Replace topic-name with the name of the topic you are interested in. By default, the above command gets the date of the last message in partition 0; you can change this by changing the -p flag.
The command works in most cases, but it can throw an error if the topic is empty, and in a couple of other cases.
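One way to make the empty-topic case less noisy is to validate the timestamp before handing it to date. The following is a minimal sketch, not part of the FASTEN tooling: format_last_ts is a hypothetical helper name, and it prints UTC rather than local time so that the output is predictable.

```shell
#!/usr/bin/env bash
# format_last_ts takes the %T value printed by kafkacat (epoch milliseconds,
# possibly empty) and prints a UTC date, or "n/a" if no valid timestamp was given.
# Hypothetical helper; requires GNU date.
format_last_ts() {
  local ms=$1
  # Empty output or anything that is not a plain non-negative integer is reported as n/a.
  case $ms in
    ''|*[!0-9]*) echo "n/a"; return 0 ;;
  esac
  date -u -d "@$(( ms / 1000 ))" +'%Y-%m-%d %H:%M:%S'
}

format_last_ts ""               # prints n/a
format_last_ts 1601626238839    # prints 2020-10-02 08:10:38
```

In practice the argument would be the output of the kafkacat command above, e.g. format_last_ts "$(kafkacat -C -q -b samos:9092 -t topic-name -p 0 -o -1 -e -f '%T')".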
You can obtain a high-level overview of FASTEN-related Kafka topics by running the following script.
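#!/usr/bin/env bash
# Print a CSV overview (partitions, records, date of last message) of all FASTEN topics.
echo topic,partitions,records,latest
for topic in $(kafka-topics --list --zookeeper samos:2181 | grep fasten); do
  # Each partition appears as one line containing "Replicas" in the --describe output.
  partitions=$(kafka-topics --describe --zookeeper samos:2181 --topic "$topic" | grep -c Replicas)
  # Sum the latest offsets of all partitions.
  records=$(kafka-run-class kafka.tools.GetOffsetShell --broker-list samos:9092 --time -1 --offsets 1 --topic "$topic" | awk -F ':' '{sum += $3} END {print sum}')
  # Timestamp of the last message in partition 0, converted from milliseconds to a date.
  latest=$(date +'%Y-%m-%d %H:%M:%S' -d "@$(expr $(kafkacat -C -q -b samos:9092 -t "$topic" -p 0 -o -1 -e -f '%T') / 1000)")
  echo "$topic,$partitions,$records,$latest"
done 2>/dev/null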
docker images
docker ps -a
Note that -a also lists stopped containers; omit it to list only the running ones.
docker build -t image-name -f docker-file .
Replace image-name with your desired name and docker-file with the path to your Dockerfile.
docker run -d image-name
Replace image-name with the name of the Docker image that you want to run.
docker stop container-id
Replace container-id with the container ID of the Docker container you want to stop.
This command is useful when you want to publish your Docker image on Dockerhub.
docker tag image-tag yourhubusername/image-name
Replace image-tag and yourhubusername with the tag of your Docker image and your Dockerhub username, respectively. Use docker images to find the tag.
If you want to publish your docker image on Dockerhub, use the following command:
docker push yourhubusername/image-name
Replace yourhubusername and image-name with your username on Dockerhub and the name of your Docker image, respectively. Note that you have to tag your Docker image before pushing it to Dockerhub.
docker rmi -f image-id
Replace image-id with the ID of the Docker image that you want to remove.
docker logs container-id
Replace container-id with the container ID of your Docker container. You can also add the -f flag to the above command to follow the logs in real time.
docker stats container-id
Replace container-id with the container ID of your Docker container.
docker exec -it container-id /bin/bash
Replace container-id with the container ID of your Docker container.
kubectl get pods -n fasten
This shows pods of the fasten namespace.
kubectl get nodes
Add -o wide to the above command to get more info about the nodes. Also, use --show-labels to see the labels of the nodes.
kubectl get namespaces
kubectl get deployments -n fasten
This shows all the deployments within the namespace of fasten.
kubectl describe node node-name
Replace node-name with the name of your desired node.
kubectl cordon node-name
Replace node-name with the name of your desired node. This marks the node as unschedulable; pods already running on it are not affected.
kubectl apply -f deploy-file
Replace deploy-file with your deployment manifest file. See here for an example of a deployment manifest.
kubectl delete deployment deploy-name
Replace deploy-name with the name of your deployment.
kubectl scale deployment deploy-name --replicas=10
Replace deploy-name with the name of your deployment. Note that you can change the value of --replicas to either increase or decrease the number of pods of your deployment.
kubectl delete pod pod-name
Replace pod-name with the name of the pod that you want to delete.
kubectl drain node-name
Replace node-name with the name of your desired node. This evicts all the user pods from the specified node and marks it as unschedulable.
kubectl -n fasten delete pods --field-selector=status.phase!=Running
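This deletes every pod in the fasten namespace whose phase is not Running. That includes evicted pods (phase Failed), but also pods in the Pending or Succeeded phase.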
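If only evicted pods should be removed, one option is to filter the kubectl get pods output by its STATUS column. The sketch below demonstrates the filter on made-up sample output; evicted_pod_names is a hypothetical helper name and the pod names are invented.

```shell
#!/usr/bin/env bash
# evicted_pod_names reads `kubectl get pods --no-headers` output on stdin and prints
# the names of pods whose STATUS (third column) is Evicted. Hypothetical helper name.
evicted_pod_names() {
  awk '$3 == "Evicted" {print $1}'
}

# Made-up sample of `kubectl get pods -n fasten --no-headers` output.
sample='analyzer-5d9c7-abcde   1/1   Running   0   3d
analyzer-5d9c7-fghij   0/1   Evicted   0   2d
importer-7f6b8-klmno   0/1   Pending   0   5m'

echo "$sample" | evicted_pod_names   # prints analyzer-5d9c7-fghij
```

Against a real cluster, the names could be piped into a delete, for example kubectl get pods -n fasten --no-headers | evicted_pod_names | xargs -r kubectl -n fasten delete pod (the -r flag assumes GNU xargs).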
kubectl logs pod-name
Replace pod-name with the name of your pod. Also, add -f to the command to follow the logs of a pod in real time.
kubectl describe pod pod-name
Replace pod-name with the name of your pod.
kubectl top nodes
kubectl exec --stdin --tty pod-name -- /bin/bash
Replace pod-name with the name of your pod.