dmwm · todor-ivanov · Apr 21, 2023 · Apr 24, 2023 · Apr 25, 2023 · Apr 25, 2023
diff --git a/docker/pypi/wmagent/Dockerfile b/docker/pypi/wmagent/Dockerfile
@@ -1,14 +1,69 @@
+FROM registry.cern.ch/cmsweb/oracle:21_5-stable as oracle
 FROM registry.cern.ch/cmsweb/dmwm-base:pypi-20230525
 MAINTAINER Valentin Kuznetsov [email protected]
+
+# Install basic OS package dependencies
 RUN apt-get update
-RUN apt-get install -y libmariadb-dev-compat libmariadb-dev apache2-utils sudo
-ENV TAG=X.Y.Z
-RUN pip install wmagent==$TAG
-ENV WDIR=/data
-ENV USER=_wmagent
-RUN useradd ${USER} && install -o ${USER} -d ${WDIR}
-RUN echo "%$USER ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
-USER ${USER}
-RUN sudo chown -R $USER.$USER $WDIR
-WORKDIR $WDIR
-CMD ["python3"]
+RUN apt-get install -y libmariadb-dev-compat libmariadb-dev apache2-utils hostname net-tools iputils-ping cron mariadb-server myproxy voms-clients rlwrap libaio1 procps && apt-get clean
+
+# copy oracle client:
+COPY --from=oracle /usr/lib/oracle /usr/lib/oracle
+ENV LD_LIBRARY_PATH=/usr/lib/oracle
+ENV PATH=$PATH:/usr/lib/oracle
+ENV PKG_CONFIG_PATH=/usr/lib/oracle
+
+# WMA_TAG to be passed at build time through `--build-arg WMA_TAG=<WMA_TAG>`. Default: None
+ARG WMA_TAG=None
+ENV WMA_TAG=${WMA_TAG}
+ENV WMA_USER=cmst1
+ENV WMA_GROUP=zh
+ENV WMA_UID=31961
+ENV WMA_GID=1399
+ENV WMA_ROOT_DIR=/data
+
+# Basic WMAgent directory structure passed to all scripts through env variables:
+# NOTE: Those should be static and depend only on $WMA_BASE_DIR
+ENV WMA_BASE_DIR=$WMA_ROOT_DIR/srv
+ENV WMA_ADMIN_DIR=$WMA_ROOT_DIR/admin/wmagent
+ENV WMA_CERTS_DIR=$WMA_ROOT_DIR/certs
+
+ENV WMA_HOSTADMIN_DIR=$WMA_ADMIN_DIR/hostadmin
+ENV WMA_CURRENT_DIR=$WMA_BASE_DIR/wmagent/current
+ENV WMA_INSTALL_DIR=$WMA_CURRENT_DIR/install
+ENV WMA_CONFIG_DIR=$WMA_CURRENT_DIR/config
+ENV WMA_MANAGE_DIR=$WMA_CONFIG_DIR/wmagent
+ENV WMA_DEPLOY_DIR=/usr/local
+ENV WMA_ENV_FILE=$WMA_DEPLOY_DIR/deploy/env.sh
+
+
+# Setting up users and privileges
+RUN groupadd -g ${WMA_GID} ${WMA_GROUP}
+RUN useradd -u ${WMA_UID} -g ${WMA_GID} -m ${WMA_USER}
+RUN install -o ${WMA_USER} -g ${WMA_GID} -d ${WMA_ROOT_DIR}
+RUN usermod -aG mysql ${WMA_USER}
+RUN rm -f /etc/mysql/mariadb.conf.d/50-server.cnf
+
+# Add WMA_USER to sudoers
+RUN echo "${WMA_USER} ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
+
+# Add all deployment needed directories
+ADD bin $WMA_DEPLOY_DIR/bin
+ADD etc $WMA_DEPLOY_DIR/etc
+
+# Add install script
+ADD install.sh ${WMA_ROOT_DIR}/install.sh
+
+# Add wmagent run script
+ADD run.sh ${WMA_ROOT_DIR}/run.sh
+
+# Install the requested WMA_TAG.
+RUN ${WMA_ROOT_DIR}/install.sh -v ${WMA_TAG}
+RUN chown -R ${WMA_USER}:${WMA_GID} ${WMA_ROOT_DIR}
+
+# Switch to the runtime directory and user
+WORKDIR ${WMA_ROOT_DIR}
+USER ${WMA_USER}
+ENV USER=$WMA_USER
+
+# Define the entrypoint. All the run.sh parameters should be passed at runtime.
+ENTRYPOINT ["./run.sh"]
diff --git a/docker/pypi/wmagent/README.md b/docker/pypi/wmagent/README.md
@@ -0,0 +1,230 @@
+## WMAgent in Docker using pypi deployment method.
+
+### Requires:
+ * Docker to be installed on the host VM (vocmsXXXX)
+ * HTCondor schedd to be installed and configured on the host VM
+ * CouchDB to be installed on the host VM
+ * MariaDB to be installed on the host VM (depending on which relational database is used: MariaDB or Oracle)
+ * Service certificates to be present at the host VM
+ * `WMAgent.secrets` file to be present at the host VM
+
+### The implementation is realized through the following files:
+ * `Dockerfile` - provides all basic requirements for the image and sets all common env variables to both `install.sh` and `run.sh`.
+ * `install.sh` - called through a `RUN` command in the `Dockerfile` and given a single build-time parameter, `WMA_TAG`
+ * `run.sh` - set as default `ENTRYPOINT` at container runtime. All agent related configuration parameters are passed as named arguments and used to (re)generate the agent configuration files. All service credentials and schedd caches are accessed via host mount points
+ * `wmagent-docker-build.sh` - simple script to be used for building a WMAgent docker image
+ * `wmagent-docker-run.sh` - simple script to be used for running a WMAgent docker container
+
+**Build options (accepted by `install.sh`):**
+* `WMA_TAG=2.2.1`
+
+**RUN options (accepted by `run.sh`):**
+* `TEAMNAME=testbed-$HOSTNAME`
+* `CENTRAL_SERVICES=cmsweb-testbed.cern.ch`
+* `AGENT_NUMBER=0`
+* `FLAVOR=mysql`
+
+
+### Building a WMAgent image
+
+The image can be built on any machine running Docker Engine.
+
+**Build command:**
+* Using the wrapper script to build WMAgent locally:
+```
+ssh vocms****
+cmst1
+cd /data
+git clone https://github.com/dmwm/CMSKubernetes.git
+cd /data/CMSKubernetes/docker/pypi/wmagent/
+./wmagent-docker-build.sh -v 2.2.1
+```
+* Using the wrapper script to build and upload WMAgent to registry.cern.ch:
+```
+./wmagent-docker-build.sh -v 2.2.1 -p
+```
+* Here is what is happening under the hood:
+```
+WMA_TAG=2.2.1
+docker build --network=host --progress=plain --build-arg WMA_TAG=$WMA_TAG -t wmagent:$WMA_TAG -t wmagent:latest /data/CMSKubernetes/docker/pypi/wmagent/ 2>&1 |tee /data/build-wma.log
+```
+**Partial output:**
+```
+...
+#4 [ 1/13] FROM registry.cern.ch/cmsweb/dmwm-base:pypi-20230314@sha256:71cf3825ed9acf4e84f36753365f363cfd53d933b4abf3c31ef828828e7bdf83
+#4 DONE 0.0s
+...
+#14 0.110 =======================================================
+#14 0.110 Starting new agent deployment with the following data:
+#14 0.110 -------------------------------------------------------
+#14 0.111 - WMAgent version : 2.2.1
+#14 0.113 - Python verson : Python 3.8.16
+#14 0.114 - Python Module Path : /usr/local/lib/python3.8/site-packages
+#14 0.114 =======================================================
+...
+#18 naming to docker.io/library/wmagent:2.2.1 done
+#18 DONE 3.3s
+```
+
+### Running a WMAgent container
+
+One needs to bind mount several directories from the host VM (vocmsXXXX).
+* /data/dockerMount/certs
+* /etc/condor (schedd runs on the host, not the container)
+* /tmp
+* /data/dockerMount/srv/wmagent/current/install (stateful service and component dirs)
+* /data/dockerMount/srv/wmagent/current/config (for persisting agent configuration data)
+* /data/dockerMount/admin/wmagent (in order to access the WMAgent.secrets)
+
+
+The install and config dirs will be initialized the first time you execute `run.sh`, and a `.dockerInit` file will be placed there to keep track of the initialization. Subsequent container restarts won't touch these directories.
+
+**Run command:**
+
+* Initialising the agent for the first time:
+```
+ssh vocms****
+cmst1
+cd /data/CMSKubernetes/docker/pypi/wmagent/
+### cleaning old agent data:
+rm -rf /data/dockerMount/srv/
+./wmagent-docker-run.sh -t <team_name> -n <agent_number> -f <db_flavour> -c <central_services> &
+```
+* Initialising the agent for the first time using a docker image from registry.cern.ch:
+```
+./wmagent-docker-run.sh -t <team_name> -n <agent_number> -f <db_flavour> -c <central_services> -p -v 2.2.1 &
+```
+* Running the agent:
+```
+./wmagent-docker-run.sh &
+```
+
+* Here is what is happening under the hood:
+```
+WMA_ROOT_DIR=/data/dockerMount
+
+dockerOpts=" \
+--network=host \
+--rm \
+--hostname=`hostname -f` \
+--name=wmagent \
+--mount type=bind,source=/etc/tnsnames.ora,target=/etc/tnsnames.ora,readonly \
+--mount type=bind,source=/etc/condor,target=/etc/condor,readonly \
+--mount type=bind,source=/tmp,target=/tmp \
+--mount type=bind,source=$WMA_ROOT_DIR/certs,target=/data/certs \
+--mount type=bind,source=$WMA_ROOT_DIR/srv/wmagent/current/install,target=/data/srv/wmagent/current/install \
+--mount type=bind,source=$WMA_ROOT_DIR/srv/wmagent/current/config,target=/data/srv/wmagent/current/config \
+--mount type=bind,source=$WMA_ROOT_DIR/admin/wmagent,target=/data/admin/wmagent/hostadmin \
+"
+
+wmaOpts=" \
+-f mysql \
+-t testbed-vocms0260 \
+-n 0 \
+-c cmsweb-testbed.cern.ch"
+
+docker run $dockerOpts wmagent $wmaOpts
+```
+
+**Partial output:**
+```
+=======================================================
+Starting WMAgent with the following initial data:
+-------------------------------------------------------
+ - WMAgent Version : 2.2.1
+ - WMAgent TeamName : testbed-vocms0260
+ - WMAgent Number : 0
+ - WMAgent Host : vocms0260.cern.ch
+ - WMAgent Config : /data/srv/wmagent/current/config
+ - WMAgent Relational DB type : oracle
+ - Python verson : Python 3.8.16
+ - Python Module Path : /usr/local/lib/python3.8/site-packages
+=======================================================
+...
+```
+
+**NOTE:**
+Currently, only one WMAgent container may run on a single agent VM. This is partially guaranteed by the `--name=wmagent` parameter in the `docker run` command above. It is possible to work around this by giving the new container a different name, but bear in mind that the consequences of doing so are unpredictable. If one tries to start two containers with the same name, the expected error is:
+```
+docker run $dockerOpts wmagent:$WMA_TAG $wmaOpts
+
+docker: Error response from daemon: Conflict. The container name "/wmagent" is already in use by container "c4c64688a75b6ac8f5cc5e4c951db324b2441ec1434f2e1d604a49d8009ff2a1". You have to remove (or rename) that container to be able to reuse that name.
+See 'docker run --help'
+```
+
+
+
+
+### Checking container status
+```
+ssh vocms****
+
+docker container ps
+CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
+78d7e1baa3df wmagent:2.2.1 "./run.sh -f oracle ..." 2 hours ago Up 2 hours wmagent
+
+```
+
+### Stopping the WMAgent container
+To stop the WMAgent container, simply kill it. The `--rm` option passed to `docker run` ensures that no leftover container remains.
+
+**Shutdown command:**
+```
+docker kill wmagent
+```
+
+### Enforce container reinitialisation at the host:
+The WMAgent needs to preserve its configuration and initialisation data permanently on the host. For this purpose we use host-to-Docker bind mounts.
+Once a specific WMAgent image has been run for the first time, it leaves a small set of `.dockerInit` files in every place on the host where permanent data (such as config files and job caches) is preserved.
+On any later restart of the container, and hence of the WMAgent itself, the initialisation steps are skipped if the
+relevant `.dockerInit` file is found and the `$WMA_BUILD_ID` hash it contains matches the `$WMA_BUILD_ID` of the container being started.
+To force the reinitialisation steps to run, delete all `.dockerInit` files and restart the wmagent container.
+
+**NOTE: This reinitialisation may result in losing previous job caches and database records**
+**Reinitialisation command:**
+```
+docker kill wmagent
+
+sudo find /data/dockerMount -name .dockerInit -delete
+
+docker run $dockerOpts wmagent:$WMA_TAG $wmaOpts
+```
+
+**Partial output:**
+```
+=======================================================
+Starting WMAgent with the following initialisation data:
+-------------------------------------------------------
+ - WMAgent Version : 2.2.1
+...
+=======================================================
+-------------------------------------------------------
+Start: Performing checks for successful Docker initialisation steps...
+WMA_BUILD_ID: 110b443165e3b5a4ba569b8a1ab063a616132602e55ba06b0c3e89a01e643f31
+dockerInitId: /data/admin/wmagent/hostadmin/.dockerInit:
+...
+ERROR
+-------------------------------------------------------
+Start: Performing Docker image to Host initialisation steps
+...
+Done: Performing Docker image to Host initialisation steps
+-------------------------------------------------------
+-------------------------------------------------------
+Start: Performing checks for successful Docker initialisation steps...
+WMA_BUILD_ID: 110b443165e3b5a4ba569b8a1ab063a616132602e55ba06b0c3e89a01e643f31
+dockerInitId: 110b443165e3b5a4ba569b8a1ab063a616132602e55ba06b0c3e89a01e643f31
+OK
+-------------------------------------------------------
+...
+```
+
+### Connecting to the container
+
+First log in to the VM and from there connect to the container:
+
+**Login sequence:**
+```
+docker exec -it wmagent /bin/bash
+...
+(WMAgent-2.2.1) [cmst1@vocms0260:current]$ manage status
+```