🎉 Please see the new and improved ClearML Helm Charts repository 🎉
The clearml-server is the backend service infrastructure for ClearML. It allows multiple users to collaborate and manage their experiments. By default, ClearML is set up to work with the ClearML Demo Server, which is open to anyone and resets periodically. In order to host your own server, you will need to install clearml-server and point ClearML to it.
clearml-server contains the following components:
- The ClearML Web-App, a single-page UI for experiment management and browsing
- A RESTful API for:
  - Documenting and logging experiment information, statistics, and results
  - Querying experiment history, logs, and results
- A locally hosted file server for storing images and models, making them easily accessible from the Web-App
Use this repository to add the clearml-server chart to Helm, and then deploy clearml-server on your Kubernetes cluster using Helm.
ClearML Server requires Elasticsearch, Redis, and MongoDB services.
By default, this chart's templates include the Bitnami charts for Redis and MongoDB, and the official Elasticsearch chart (which is currently still in beta).
You can either use these defaults, or use your own deployments and set their names and ports in the appropriate sections of this chart.
In order to use your own deployment, make sure to disable the corresponding bundled one in `values.yaml` (for example, to disable Elasticsearch, set `elasticsearch.enabled=false`).
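For example, a minimal override for this case might look like the following sketch (the `elasticsearch.enabled` key is the one mentioned above; the file name is arbitrary, and you would still need to point the chart at your own Elasticsearch service in the appropriate section of `values.yaml`):

```yaml
# my-values.yaml -- illustrative override only (file name is arbitrary).
# Disable the bundled Elasticsearch chart so you can use your own deployment;
# its service name and port must still be set in the appropriate chart section.
elasticsearch:
  enabled: false
```

Pass the override file to `helm install` or `helm upgrade` with the `-f` flag.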
- A Kubernetes cluster
- Persistent Volumes for `pvc-apiserver.yaml`, `pvc-fileserver.yaml`, and `pvc-agentservices.yaml`
- Persistent Volumes for elasticsearch, mongodb, and redis (redis is optional). See the relevant information for each chart:

  - elasticsearch
  - mongodb
  - redis

  Make sure to define the following values for each PV (see the sketch after this list):

  - elasticsearch - in `values.yaml`, set `elasticsearch.persistence.enabled=true` and set `elasticsearch.volumeClaimTemplate.storageClassName` to the storageClassName used in your elasticsearch PV.
  - mongodb - to define a persistent volume for mongodb, in `values.yaml` set `mongodb.persistence.enabled=true` and set `mongodb.persistence.storageClass` to the storageClassName used in your mongodb PV. Read here for more details.
  - redis - to define a persistent volume for redis, in `values.yaml` set `redis.master.persistence.enabled=true` and set `redis.master.persistence.storageClass` to the storageClassName used in your redis PV. Read here for more details.
- `kubectl` is installed and configured (see Install and Set Up kubectl in the Kubernetes documentation)
- `helm` is installed (see Installing Helm in the Helm documentation)
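As a sketch, the persistence settings listed above map onto the following `values.yaml` structure (the keys are those documented above; the storage class names are placeholders for the storageClassName used by your own PVs):

```yaml
# Illustrative values.yaml excerpt -- storage class names are placeholders.
elasticsearch:
  persistence:
    enabled: true
  volumeClaimTemplate:
    storageClassName: "your-elasticsearch-storage-class"  # storageClassName of your elasticsearch PV

mongodb:
  persistence:
    enabled: true
    storageClass: "your-mongodb-storage-class"            # storageClassName of your mongodb PV

redis:
  master:
    persistence:
      enabled: true
      storageClass: "your-redis-storage-class"            # storageClassName of your redis PV
```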
1. Add the clearml-server repository to your Helm:

   ```bash
   helm repo add allegroai https://allegroai.github.io/clearml-server-helm-cloud-ready/
   ```

2. Confirm the clearml-server repository is now in Helm:

   ```bash
   helm search repo clearml
   ```

   The `helm search` results must include `allegroai/clearml-server-cloud-ready`.

3. Install `clearml-server-cloud-ready` on your cluster:

   ```bash
   helm install clearml-server allegroai/clearml-server-cloud-ready --namespace=clearml --create-namespace
   ```

   A `clearml` namespace is created in your cluster, and clearml-server is deployed in it.
1. If you are upgrading from the single-node version of the ClearML Server helm charts, follow these steps first:

   - Log in to the node previously labeled as `app=trains`
   - Copy each folder under `/opt/clearml/data` to its persistent volume.
   - Follow the Deploying ClearML Server instructions above to deploy ClearML Server.

2. Update using a new or updated `values.yaml` (an example override file is sketched below these steps):

   ```bash
   helm upgrade clearml-server allegroai/clearml-server-cloud-ready -f new-values.yaml
   ```

3. If there are no breaking changes, you can update your deployment to match the repository version:

   ```bash
   helm upgrade clearml-server allegroai/clearml-server-cloud-ready
   ```

Important:

- If you previously deployed a clearml-server, you may encounter errors. If so, you must first delete the old deployment using the following command:

  ```bash
  helm delete --purge clearml-server
  ```

  After running the `helm delete` command, you can run the `helm install` command.
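For reference, `new-values.yaml` in the upgrade step above is an override file of the same form used at install time; a minimal sketch, using the persistence keys documented earlier with a placeholder storage class, could be:

```yaml
# new-values.yaml -- illustrative sketch of an override file for `helm upgrade -f`.
# Include only the values you want to change from the chart defaults.
mongodb:
  persistence:
    enabled: true
    storageClass: "your-mongodb-storage-class"  # placeholder for your PV's storageClassName
```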
After clearml-server is deployed, the services expose the following node ports:

- API server on `30008`
- Web server on `30080`
- File server on `30081`
Access clearml-server by creating a load balancer and domain name with records pointing to the load balancer.
Once you have a load balancer and domain name set up, follow these steps to configure access to clearml-server on your k8s cluster:
1. Create domain records

   Create 3 records to be used for Web-App, File server, and API access, using the following rules:

   - `app.<your domain name>`
   - `files.<your domain name>`
   - `api.<your domain name>`

   (for example, `app.clearml.mydomainname.com`, `files.clearml.mydomainname.com`, and `api.clearml.mydomainname.com`)

2. Point the records you created to the load balancer

   Configure the load balancer to redirect traffic coming from the records you created:

   - `app.<your domain name>` should be redirected to k8s cluster nodes on port `30080`
   - `files.<your domain name>` should be redirected to k8s cluster nodes on port `30081`
   - `api.<your domain name>` should be redirected to k8s cluster nodes on port `30008`
You can also configure the clearml-server for:
- fixed users (users with credentials)
- non-responsive experiment watchdog settings
For detailed instructions, see the Optional Configuration section in the clearml-server repository README file.