Best Practices

Francesco De Martino edited this page Mar 5, 2019 · 11 revisions

Master Instance Type

Although the master node does not run any jobs, its role and its sizing are crucial to the overall performance of the cluster.

When choosing the instance type for your master node, evaluate the following:

  • Cluster size: the master node orchestrates the scaling logic of the cluster and is responsible for attaching new nodes to the scheduler. If you need to scale the cluster up and down by a large number of nodes, give the master node some extra compute capacity.
  • Shared file systems: when using shared file systems to share artefacts between the compute nodes and the master node, keep in mind that the master node hosts the NFS server. For this reason, choose an instance type with enough network bandwidth and enough dedicated EBS bandwidth to handle your workflows.
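
The two criteria above can be turned into a simple shortlist filter. The following is only a minimal sketch, not part of the cluster tooling: the `InstanceSpec` type, the `pick_master_type` helper, and the candidate names and figures are all hypothetical placeholders, so substitute the published specifications of the instance types you are actually considering.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class InstanceSpec:
    """Hypothetical record describing one candidate instance type."""
    name: str
    vcpus: int            # compute capacity for scaling and scheduler work
    network_gbps: float   # network bandwidth available to NFS clients
    ebs_mbps: float       # dedicated EBS bandwidth behind the shared volume


# Placeholder candidates: names and numbers are illustrative, not real AWS data.
CANDIDATES: List[InstanceSpec] = [
    InstanceSpec("example.small", 2, 1.0, 500.0),
    InstanceSpec("example.medium", 8, 5.0, 2000.0),
    InstanceSpec("example.large", 32, 10.0, 7000.0),
]


def pick_master_type(
    candidates: List[InstanceSpec],
    min_vcpus: int,
    min_network_gbps: float,
    min_ebs_mbps: float,
) -> Optional[InstanceSpec]:
    """Return the smallest candidate that meets all three minimums, or None."""
    suitable = [
        c
        for c in candidates
        if c.vcpus >= min_vcpus
        and c.network_gbps >= min_network_gbps
        and c.ebs_mbps >= min_ebs_mbps
    ]
    # "Smallest" is ordered by vCPUs first, then network, then EBS bandwidth.
    return min(
        suitable,
        key=lambda c: (c.vcpus, c.network_gbps, c.ebs_mbps),
        default=None,
    )


if __name__ == "__main__":
    choice = pick_master_type(
        CANDIDATES, min_vcpus=4, min_network_gbps=2.0, min_ebs_mbps=1000.0
    )
    print(choice.name if choice else "no candidate meets the requirements")
```

The minimums themselves have to come from your own workload: estimate how many nodes the cluster adds or removes at once and how much data the compute nodes read from and write to the shared file system.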