Features:

- HDFS High Availability (HA) via Quorum Journal Manager: allows automatic failover from the active HDFS NameNode to a standby NameNode in case of failures.
- K8s persistent volumes (PVs) for metadata: NameNodes store file system metadata on PVs, so the metadata are not lost even if both NameNode daemons are restarted.
- K8s hostPath volumes for file data: DataNodes store file data on the local disks of the K8s cluster nodes using hostPath volumes.
The main entry point is hdfs-k8s, an umbrella chart that installs the HDFS
components via subcharts:

- hdfs-namenode-k8s: installs NameNodes, the HDFS components that manage file system metadata. Supports high availability (HA) via Quorum Journal Manager (QJM).
- hdfs-datanode-k8s: installs DataNodes, which are responsible for storing file data.
- hdfs-config-k8s: a configmap containing the HDFS configuration.
- zookeeper (from https://kubernetes-charts-incubator.storage.googleapis.com): installs ZooKeeper, used to manage NameNode failover. By default, the chart runs three ZooKeeper servers.
- hdfs-journalnode-k8s: installs JournalNodes, which ensure that file system metadata are properly shared among the NameNodes. By default, the chart launches three JournalNode servers.
- hdfs-client-k8s: a pod configured to run Hadoop client commands.
- hdfs-krb5-k8s: a size-1 StatefulSet that launches a Kerberos server, which can be used to run HDFS in secure mode. Disabled by default.
- hdfs-simple-namenode-k8s: runs a simple, non-HA HDFS setup (i.e., a single NameNode daemon, no ZooKeeper and no JournalNodes). Does not support Kerberos. Disabled by default.
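You can list the subcharts declared by the umbrella chart with a standard Helm command, shown here as an optional check:
$ helm dependency list charts/hdfs-k8s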
Build the main chart:
$ helm repo add incubator \
https://kubernetes-charts-incubator.storage.googleapis.com/
$ helm dependency build charts/hdfs-k8s
ZooKeeper, JournalNode and NameNode pods need persistent volumes for storing
metadata. By default, the helm charts do not set the storage class name for
dynamically provisioned volumes, nor do they use selectors for static
persistent volumes. This means dynamic volumes will rely on the cluster's
default storage class provisioner, and, if your cluster has statically
provisioned volumes, the chart will match existing volumes based solely on
their size requirements. To override this default behavior, you can specify
storage classes for dynamic volumes, or volume selectors for static volumes.
See the values.yaml file for details.
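For instance, a dynamic storage class override could be passed at install time. The keys below are illustrative placeholders, not confirmed names; look up the actual value keys in values.yaml:
# Illustrative only: the actual persistence keys are defined in values.yaml.
$ helm install -n my-hdfs charts/hdfs-k8s \
    --set hdfs-namenode-k8s.persistence.storageClass=standard \
    --set hdfs-journalnode-k8s.persistence.storageClass=standard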
Launch the main chart. The chart release name (e.g., my-hdfs) will be the
prefix of the K8s resource names:
$ helm install -n my-hdfs charts/hdfs-k8s
Wait for all pods to be ready (some may need to restart a few times).
$ kubectl get pod -l release=my-hdfs
NAME                             READY  STATUS   RESTARTS  AGE
my-hdfs-client-c749d9f8f-d5pvk   1/1    Running  0         2m
my-hdfs-datanode-o7jia           1/1    Running  3         2m
my-hdfs-datanode-p5kch           1/1    Running  3         2m
my-hdfs-datanode-r3kjo           1/1    Running  3         2m
my-hdfs-journalnode-0            1/1    Running  0         2m
my-hdfs-journalnode-1            1/1    Running  0         2m
my-hdfs-journalnode-2            1/1    Running  0         1m
my-hdfs-namenode-0               1/1    Running  3         2m
my-hdfs-namenode-1               1/1    Running  3         2m
my-hdfs-zookeeper-0              1/1    Running  0         2m
my-hdfs-zookeeper-1              1/1    Running  0         2m
my-hdfs-zookeeper-2              1/1    Running  0         2m
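Alternatively, if your kubectl version supports it, you can block until every pod in the release reports ready; the timeout below is just a suggestion:
$ kubectl wait --for=condition=ready pod -l release=my-hdfs --timeout=10m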
By default, NameNodes and DataNodes use hostNetwork, so they can see each
other's physical IPs and preserve data locality. This can also be changed in
the values file.
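As an illustration only, disabling host networking at install time might look like the following; the real key controlling this behavior is defined in values.yaml and may be named differently:
# Illustrative only: check values.yaml for the real key controlling hostNetwork.
$ helm install -n my-hdfs charts/hdfs-k8s \
    --set global.hostNetworkEnabled=false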
Run a few checks on the client pod:
$ _CLIENT=$(kubectl get pods -l app=hdfs-client,release=my-hdfs -o name | \
cut -d / -f 2)
$ kubectl exec $_CLIENT -- hdfs dfsadmin -report
$ kubectl exec $_CLIENT -- hdfs haadmin -getServiceState nn0
$ kubectl exec $_CLIENT -- hdfs haadmin -getServiceState nn1
$ kubectl exec $_CLIENT -- hadoop fs -rm -r -f /tmp
$ kubectl exec $_CLIENT -- hadoop fs -mkdir /tmp
$ kubectl exec $_CLIENT -- sh -c \
"(head -c 100M < /dev/urandom > /tmp/random-100M)"
$ kubectl exec $_CLIENT -- hadoop fs -copyFromLocal /tmp/random-100M /tmp
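To confirm the upload landed, list the target directory:
$ kubectl exec $_CLIENT -- hadoop fs -ls /tmp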
Kerberos can be enabled by setting a few related options:
$ helm install -n my-hdfs charts/hdfs-k8s \
--set global.kerberosEnabled=true \
--set global.kerberosRealm=MYCOMPANY.COM \
--set tags.kerberos=true
This will launch all charts, including the Kerberos server. However, HDFS daemons will be blocked until Kerberos principals are created. First, create a configmap containing the common Kerberos config file:
$ _MY_DIR=~/krb5
$ mkdir -p $_MY_DIR/tmp
$ _KDC=$(kubectl get pod -l app=hdfs-krb5,release=my-hdfs --no-headers \
    -o name | cut -d / -f 2)
$ kubectl cp $_KDC:/etc/krb5.conf $_MY_DIR/tmp/krb5.conf
$ kubectl create configmap my-hdfs-krb5-config \
    --from-file=$_MY_DIR/tmp/krb5.conf
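You can verify the configmap contents before moving on:
$ kubectl describe configmap my-hdfs-krb5-config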
Then create the service principals and passwords. Kerberos requires service
principals to be host-specific. Some HDFS daemons are associated with the
physical host names of cluster nodes, say kube-n1.mycompany.com, while
others are associated with virtual service names, such as
my-hdfs-namenode-0.my-hdfs-namenode.default.svc.cluster.local. You can get
the list of these host names with something like the following:
$ _HOSTS=$(kubectl get nodes \
-o=jsonpath='{.items[*].status.addresses[?(@.type == "Hostname")].address}')
$ _HOSTS+=" $(kubectl describe configmap my-hdfs-config | \
    grep -A 1 -e dfs.namenode.rpc-address.hdfs-k8s \
      -e dfs.namenode.shared.edits.dir |
    grep "<value>" |
    sed -e "s/<value>//" \
      -e "s/<\/value>//" \
      -e "s/:8020//" \
      -e "s/qjournal:\/\///" \
      -e "s/:8485;/ /g" \
      -e "s/:8485\/hdfs-k8s//")"
Then generate per-host principal accounts and keytab files:
$ _SECRET_CMD="kubectl create secret generic my-hdfs-krb5-keytabs"
$ for _HOST in $_HOSTS; do
    kubectl exec $_KDC -- kadmin.local -q \
      "addprinc -randkey hdfs/[email protected]"
    kubectl exec $_KDC -- kadmin.local -q \
      "addprinc -randkey HTTP/[email protected]"
    kubectl exec $_KDC -- kadmin.local -q \
      "ktadd -norandkey -k /tmp/$_HOST.keytab hdfs/[email protected] HTTP/[email protected]"
    kubectl cp $_KDC:/tmp/$_HOST.keytab $_MY_DIR/tmp/$_HOST.keytab
    _SECRET_CMD+=" --from-file=$_MY_DIR/tmp/$_HOST.keytab"
  done
$ $_SECRET_CMD
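Before moving on, you can verify that the secret contains one keytab per host:
$ kubectl describe secret my-hdfs-krb5-keytabs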
Creating the keytab secret unblocks all HDFS daemon pods. Wait until they become ready. Finally, test the setup:
$ _NN0=$(kubectl get pods -l app=hdfs-namenode,release=my-hdfs -o name | \
head -1 | cut -d / -f 2)
$ kubectl exec $_NN0 -- sh -c "(apt install -y krb5-user > /dev/null)"
$ kubectl exec $_NN0 -- kinit -kt /etc/security/hdfs.keytab \
hdfs/my-hdfs-namenode-0.my-hdfs-namenode.default.svc.cluster.local@MYCOMPANY.COM
$ kubectl exec $_NN0 -- hdfs dfsadmin -report
$ kubectl exec $_NN0 -- hdfs haadmin -getServiceState nn0
$ kubectl exec $_NN0 -- hdfs haadmin -getServiceState nn1
$ kubectl exec $_NN0 -- hadoop fs -rm -r -f /tmp
$ kubectl exec $_NN0 -- hadoop fs -mkdir /tmp
$ kubectl exec $_NN0 -- hadoop fs -chmod 0777 /tmp
$ kubectl exec $_KDC -- kadmin.local -q \
"addprinc -randkey [email protected]"
$ kubectl exec $_KDC -- kadmin.local -q \
"ktadd -norandkey -k /tmp/user1.keytab [email protected]"
$ kubectl cp $_KDC:/tmp/user1.keytab $_MY_DIR/tmp/user1.keytab
$ kubectl cp $_MY_DIR/tmp/user1.keytab $_CLIENT:/tmp/user1.keytab
$ kubectl exec $_CLIENT -- sh -c "(apt install -y krb5-user > /dev/null)"
$ kubectl exec $_CLIENT -- kinit -kt /tmp/user1.keytab [email protected]
$ kubectl exec $_CLIENT -- sh -c \
"(head -c 100M < /dev/urandom > /tmp/random-100M)"
$ kubectl exec $_CLIENT -- hadoop fs -ls /
$ kubectl exec $_CLIENT -- hadoop fs -copyFromLocal /tmp/random-100M /tmp
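You can also confirm that the client is authenticating via Kerberos by listing its ticket cache (klist is provided by the krb5-user package installed above):
$ kubectl exec $_CLIENT -- klist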
Set global.dataNodeHostPath to override the default data storage directories
for DataNodes. Note that you can use a list for multiple disks:
$ helm install -n my-hdfs charts/hdfs-k8s \
--set "global.dataNodeHostPath={/scratch0/hdfs-data,/scratch1/hdfs-data}"
By default, hdfs-k8s runs a ZooKeeper chart from
https://kubernetes-charts-incubator.storage.googleapis.com. If your K8s
cluster already has a ZooKeeper quorum, you can configure hdfs-k8s to use
that instead. For instance:
$ helm install -n my-hdfs charts/hdfs-k8s \
--set condition.subchart.zookeeper=false \
--set global.zookeeperQuorumOverride=zk-0.zk-svc.default.svc.cluster.local:2181,zk-1.zk-svc.default.svc.cluster.local:2181,zk-2.zk-svc.default.svc.cluster.local:2181
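To confirm the override took effect, you can grep the generated HDFS configuration for the standard ha.zookeeper.quorum property, which should reflect the external quorum:
$ kubectl describe configmap my-hdfs-config | grep -A 1 ha.zookeeper.quorum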
Optionally, you can attach labels to some of your K8s cluster nodes so that NameNodes always run on those nodes. This gives HDFS clients outside the Kubernetes cluster a stable set of NameNode IP addresses to connect to:
$ kubectl label nodes YOUR-HOST-1 hdfs-namenode-selector=hdfs-namenode
$ kubectl label nodes YOUR-HOST-2 hdfs-namenode-selector=hdfs-namenode
Then add the nodeSelector option to the helm install command:
$ helm install -n my-hdfs charts/hdfs-k8s \
--set hdfs-namenode-k8s.nodeSelector.hdfs-namenode-selector=hdfs-namenode \
...
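You can verify which nodes carry the label:
$ kubectl get nodes -l hdfs-namenode-selector=hdfs-namenode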
To prevent some K8s cluster nodes from being the targets of a DataNode
deployment, label them with hdfs-datanode-exclude:
$ kubectl label node YOUR-CLUSTER-NODE hdfs-datanode-exclude=yes
The chart also supports the simplest possible setup: a single NameNode, with no ZooKeeper and no JournalNodes:
$ helm install -n my-hdfs charts/hdfs-k8s \
--set tags.ha=false \
--set tags.simple=true \
--set global.namenodeHAEnabled=false
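Afterwards, the release should show only the simple components, with no zookeeper or journalnode pods:
$ kubectl get pod -l release=my-hdfs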
The Kerberos setup creates a K8s secret containing the keytab files of all HDFS daemon service principals, which is mounted onto the HDFS daemon pods. You may want to restrict access to this secret using K8s RBAC.
DataNode daemons run on every cluster node. They also mount K8s hostPath
local disk volumes. You may want to restrict hostPath access via pod
security policies. See the PSP RBAC Example.