There are many options for aggregating and streaming kubernetes logs. In this part of the workshop, we are going to focus on an AWS based deployment which consists of the following:
-
Fluentd is an open source data collector providing a unified logging layer, supported by 500+ plugins connecting to many types of systems.
-
Elasticsearch is a distributed, RESTful search and analytics engine.
-
Kibana lets you visualize your Elasticsearch data.
Together, Fluentd, Elasticsearch and Kibana is also known as “EFK stack”.
In terms of architecture, Fluentd is deployed as a DaemonSet with the CloudWatch Logs plugin. The default configuration will stream system logs, Kubernetes container logs and Kubernetes API server audit logs to a CloudWatch log group. An AWS Lambda function, which is provided for you, will stream the Log Group into an Amazon Elasticsearch cluster. The logs can then be viewed using a Kibana dashboard.
In order to perform exercises in this chapter, you’ll need to deploy configurations to a Kubernetes cluster. To create an EKS-based Kubernetes cluster, use the AWS CLI (recommended). If you wish to create a Kubernetes cluster without EKS, you can instead use kops.
All configuration files for this chapter are in the 02-path-working-with-clusters/204-cluster-logging-with-EFK
directory. Make sure you change to that directory before giving any commands in this chapter.
This example creates a two instance Amazon Elasticsearch cluster named kubernetes-logs
. This cluster is created in the same region as the Kubernetes cluster and CloudWatch log group. Note that this cluster has an open access policy which will need to be locked down in production environments:
aws es create-elasticsearch-domain \ --domain-name kubernetes-logs \ --elasticsearch-version 5.5 \ --elasticsearch-cluster-config \ InstanceType=m4.large.elasticsearch,InstanceCount=2 \ --ebs-options EBSEnabled=true,VolumeType=standard,VolumeSize=100 \ --access-policies '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"AWS":["*"]},"Action":["es:*"],"Resource":"*"}]}'
It takes a little while for the cluster to be created and arrive at an active
state. The AWS Console should show the following status when the cluster is ready:
Look for the word Active
in the right column.
You could also check this via AWS CLI:
$ aws es describe-elasticsearch-domain --domain-name kubernetes-logs --query 'DomainStatus.Processing' false
If the output value is false
that means the domain has been processed and is now available to use.
A CloudWatch log group combines log streams that share the same retention, monitoring, and access control settings.
Create a CloudWatch log group:
aws logs create-log-group --log-group-name kubernetes-logs
Fluentd log group name and stream name are configured in the file templates/fluentd-configmap.yaml
. An excerpt from the file is shown:
output.conf: | <match **> # Plugin specific settings type cloudwatch_logs log_group_name kubernetes-logs log_stream_name fluentd-cloudwatch auto_create_stream true # Buffer settings buffer_chunk_limit 2M buffer_queue_limit 32 flush_interval 10s max_retry_wait 30 disable_retry_limit num_threads 8 </match>
It uses the default log group name of kubernetes-logs
and the log stream name of fluentd-cloudwatch
.
If you’ve followed the instructions in this chapter as is, then no change is required in this configuration file. However if a different log group name is used in the previous command or a different log stream name is needed, then that needs to be configured in this configuration file.
You will need to create an IAM user and set the AWS_ACCESS_KEY
, AWS_SECRET_KEY
and AWS_REGION
in the templates/fluentd-ds.yaml
file. Other options to accomplish this (in a production environment) are to use either kubernetes secrets or EC2 IAM roles.
env: - name: FLUENTD_CONFIG value: fluentd-standalone.conf - name: AWS_REGION value: $REGION - name: AWS_ACCESS_KEY value: $ACCESS_KEY - name: AWS_SECRET_KEY value: $SECRET_KEY
If creating an IAM user, the associated policy needs the following permissions
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"logs:DescribeLogGroups"
],
"Resource": [
"arn:aws:logs:us-east-1:<account>:log-group::log-stream:*"
]
},
{
"Effect": "Allow",
"Action": [
"logs:DescribeLogStreams"
],
"Resource": [
"arn:aws:logs:us-east-1:<account>:log-group:kubernetes-logs:log-stream:*"
]
},
{
"Effect": "Allow",
"Action": [
"logs:CreateLogStream",
"logs:PutLogEvents"
],
"Resource": [
"arn:aws:logs:us-east-1:<account>:log-group:kubernetes-logs:log-stream:fluentd-cloudwatch"
]
}
]
}
First create the logging namespace
$ kubectl create ns logging namespace "logging" created
Create all of the necessary service accounts and roles:
$ kubectl create -f ./templates/fluentd-service-account.yaml serviceaccount "fluentd" created $ kubectl create -f ./templates/fluentd-role.yaml clusterrole "fluentd-read" created $ kubectl create -f ./templates/fluentd-role-binding.yaml clusterrolebinding "fluentd-read" created
Then deploy Fluentd:
$ kubectl create -f ./templates/fluentd-configmap.yaml configmap "fluentd-config" created $ kubectl create -f ./templates/fluentd-svc.yaml service "fluentd" created $ kubectl create -f ./templates/fluentd-ds.yaml daemonset "fluentd" created
Watch for all of the pods to change to running status:
$ kubectl get pods -w --namespace=logging NAME READY STATUS RESTARTS AGE fluentd-00zxr 1/1 Running 0 58s fluentd-05hc0 1/1 Running 0 58s fluentd-089s4 1/1 Running 0 58s fluentd-v4vjp 1/1 Running 0 58s fluentd-zv6bv 1/1 Running 0 58s
Remember, Fluentd is deployed as a DaemonSet, i.e. one pod per worker node, so your output will vary depending on the size of your cluster. In our case, a 5 node cluster is used and so 5 pods are shown in the output.
We can now login to the AWS console → Management Tools → CloudWatch → Logs → kubernetes-logs → fluentd-cloudwatch
We should start to see logs arrive into the service and can use the search feature to looks for specific logs. It looks like as shown:
CloudWatch Logs can be delivered to other services such as Amazon Elasticsearch for custom processing. This can be achieved by subscribing to a real-time feed of log events. A subscription filter defines the filter pattern to use for filtering which log events gets delivered to Elasticsearch, as well as information about where to send matching log events to.
In this section, we’ll subscribe to the CloudWatch log events from the fluent-cloudwatch
stream from the kubernetes-logs
log group. This feed will be streamed to the Elasticsearch cluster.
Original instructions for this are available at:
The instructions below show how this can be achieved for our setup:
-
Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.
-
In the navigation pane, choose
Logs
. -
Select the log group to subscribe.
-
Choose
Actions
,Stream to Amazon Elasticsearch Service
. -
Select the IAM role
-
Click on
Next
. -
Select a Log Format:
The fields that are sent to the Elasticsearch cluster can be selected. Optionally, you can select a log stream and then click on
Test Pattern
to verify that your search filter is returning the results you expect. -
Click on
Next
-
Review all the information:
-
Click on
Next
and thenStart streaming
: -
Cloudwatch page is refreshed to show that the filter was successfully created:
In Amazon Elasticsearch console, select the Elasticsearch cluster.
Open the Kibana dashboard from the link:
Begin using the capabilites to search and visuzalize your Kubernetes cluster metrics.
Delete the resources created during this chapter:
kubectl delete ns logging
aws es delete-elasticsearch-domain --domain-name kubernetes-logs
aws logs delete-log-group --log-group-name kubernetes-logs
You are now ready to continue on with the workshop!