Change WMAgent Dockerfile to call install.sh && Split the agent init to buildtime and runtime parts #1364
Closed
todor-ivanov
wants to merge
62
commits into
dmwm:master
from
todor-ivanov:Fix_WMAgentPyPi_DcokerImage
Changes from 61 commits
Commits
bac3bc1
Change Dockerfile to call install.sh && Add initial install.sh
todor-ivanov d2e165f
Install wmagent with root in the sys default module path && Add basic…
todor-ivanov 01cb64e
Move all Dir struct at Dockerfile && Upgrade pip && Add better docstr…
todor-ivanov 5531d22
Add README file
todor-ivanov ae96d43
Add container connection instructions to the README file
todor-ivanov 11c9602
Identical option parsing for both run.sh and install.sh && Add partia…
todor-ivanov 646b396
Set WMAgent docker specific bash prompt
todor-ivanov b32c195
Move user's aliases to the install.sh script.
todor-ivanov 98e64f4
Fix basic_checks and README
todor-ivanov b7f7be4
Rename/Add WMA_* Prefix to all env vars
todor-ivanov 0d5decc
Create hostadmin mountpoint and image deployment area && Download all…
todor-ivanov e6d2b90
Implement basic Docker to Host initialisation steps and checks
todor-ivanov b960f7e
Update README
todor-ivanov 9e75581
Partial implementation of deploy_to_host && Update README
todor-ivanov 6d45160
Add runtime parameters checks && Finalize docker_to_host && Finegrain…
todor-ivanov ef4fc4e
Implement _check_wmasecrets auxiliary parser
todor-ivanov 8c981c5
Add fix about WMAgent.secrets temlpate identification for relval agents
todor-ivanov 36ab1e4
Implement doeploy_to_container function
todor-ivanov 05c2610
Add check for WMAgent.secrets checksum && Fix bug with missing .docke…
todor-ivanov 2cfcb4b
Add _init_valid aux function && Improve md5sum checks.
todor-ivanov 00dec37
Adding wmagent-docker-build.sh wmagent-docker-run.sh && Fixing bug in…
todor-ivanov abb2ca0
Add cron jobs creation at build time
todor-ivanov ef03489
Fix manage file mode && call deploy_to_agent again upon intialisation
todor-ivanov 08b2b2e
Fix WMAgent.secrets update parsing commands.
todor-ivanov c5dc401
Fix missing wmagentpy3 links
todor-ivanov 5a6df55
Add fix for missing mounts points at the host
todor-ivanov 0cf2562
Add mariaDB to the container
todor-ivanov 1d0e9d7
Add voms utils && Checks for certificate and myproxy
todor-ivanov 1247014
Improve WMAgent.secrerts parsing && Start check_databases code && Reo…
todor-ivanov f7ab0ec
Temporary fixes for broken pypi packaging
todor-ivanov ca67200
Move temp fixes to deploy_to_container && Tie install downloads to th…
todor-ivanov 5c588a1
Call activate-agent && init-agent
todor-ivanov e2fecea
Fix WMA_TAG_REG
todor-ivanov c618bb6
A really bad workaround for outdated yui library
todor-ivanov 9072f07
Add agent config tweaks && Populate agaent resource-control
todor-ivanov 55a775f
Tie default TEAMNAME with the current hostname at run.sh
todor-ivanov b4172e7
Add oracle client and databse checks && Clean leftovers and old comments
todor-ivanov 75cf2a6
Change root mountpoint to /dat/dockerMount && Typo && More comments c…
todor-ivanov e48a468
Stop downloading files from the old deployment repository && upload t…
todor-ivanov a568a0e
Move WMA_DEPLOY_DIR to /usr/local
todor-ivanov 2d7c5b8
Stop using cmsweb docker image for copying voms package files - use t…
todor-ivanov 27634ef
Start using env.sh file from the pypi package deploy/ area instead of…
todor-ivanov 7e070ca
Stop downloading utilitarian scripts and use them from the pypi packa…
todor-ivanov 33cc87c
Release manage script from origin dependency && fetch deployment and …
todor-ivanov 8fa2d29
Renew uploading agent config step
todor-ivanov 14205b5
Update README
todor-ivanov 69e67e0
Update README
todor-ivanov 525b2af
Update README
todor-ivanov b338dba
Fix changed renew_proxy path
todor-ivanov dae519b
Update README
todor-ivanov df23f19
Typo while downloading yui rpm package
todor-ivanov a2151f5
Fix permissions for editting renew_proxy.sh at runtime
todor-ivanov f6be3c9
Remove Central_cervices runtime paramter
todor-ivanov 176cb8f
Enable run/build wrapper scripts to download/upload docker images to …
todor-ivanov f589f42
Update README
todor-ivanov d149476
Add protection from missing /etc/tnsnames.ora mount for FNAL agents
todor-ivanov 0293f28
Review comments - Get rid of rpm packages && Add environment tweaks i…
todor-ivanov ef297b0
Fix bad wget download command for yui files
todor-ivanov 2bfc1f3
Fix typos && WARNING from check_docker_init
todor-ivanov d142249
Update README
todor-ivanov 88c3ad5
Resolve $WMA_ROOT_DIR at runtime for $WMA_BUILD_ID
todor-ivanov 298ef38
Export $TAG from Dockerfile && typo && TODO comments
todor-ivanov
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,14 +1,69 @@ | ||
FROM registry.cern.ch/cmsweb/oracle:21_5-stable as oracle | ||
FROM registry.cern.ch/cmsweb/dmwm-base:pypi-20230525 | ||
MAINTAINER Valentin Kuznetsov [email protected] | ||
|
||
# Install basic OS package dependencies | ||
RUN apt-get update | ||
RUN apt-get install -y libmariadb-dev-compat libmariadb-dev apache2-utils sudo | ||
ENV TAG=X.Y.Z | ||
RUN pip install wmagent==$TAG | ||
ENV WDIR=/data | ||
ENV USER=_wmagent | ||
RUN useradd ${USER} && install -o ${USER} -d ${WDIR} | ||
RUN echo "%$USER ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers | ||
USER ${USER} | ||
RUN sudo chown -R $USER.$USER $WDIR | ||
WORKDIR $WDIR | ||
CMD ["python3"] | ||
RUN apt-get install -y libmariadb-dev-compat libmariadb-dev apache2-utils hostname net-tools iputils-ping cron mariadb-server myproxy voms-clients rlwrap libaio1 procps && apt-get clean | ||
|
||
# copy oracle client: | ||
COPY --from=oracle /usr/lib/oracle /usr/lib/oracle | ||
ENV LD_LIBRARY_PATH=/usr/lib/oracle | ||
ENV PATH=$PATH:/usr/lib/oracle | ||
ENV PKG_CONFIG_PATH=/usr/lib/oracle | ||
|
||
# WMA_TAG to be passed at build time through `--build-arg WMA_TAG=<WMA_TAG>`. Default: None | ||
ARG WMA_TAG=None | ||
ENV WMA_TAG=${WMA_TAG} | ||
ENV WMA_USER=cmst1 | ||
ENV WMA_GROUP=zh | ||
ENV WMA_UID=31961 | ||
ENV WMA_GID=1399 | ||
ENV WMA_ROOT_DIR=/data | ||
|
||
# Basic WMAgent directory structure passed to all scripts through env variables: | ||
# NOTE: Those should be static and depend only on $WMA_BASE_DIR | ||
ENV WMA_BASE_DIR=$WMA_ROOT_DIR/srv | ||
ENV WMA_ADMIN_DIR=$WMA_ROOT_DIR/admin/wmagent | ||
ENV WMA_CERTS_DIR=$WMA_ROOT_DIR/certs | ||
|
||
ENV WMA_HOSTADMIN_DIR=$WMA_ADMIN_DIR/hostadmin | ||
ENV WMA_CURRENT_DIR=$WMA_BASE_DIR/wmagent/current | ||
ENV WMA_INSTALL_DIR=$WMA_CURRENT_DIR/install | ||
ENV WMA_CONFIG_DIR=$WMA_CURRENT_DIR/config | ||
ENV WMA_MANAGE_DIR=$WMA_CONFIG_DIR/wmagent | ||
ENV WMA_DEPLOY_DIR=/usr/local | ||
ENV WMA_ENV_FILE=$WMA_DEPLOY_DIR/deploy/env.sh | ||
|
||
|
||
# Setting up users and privileges | |
RUN groupadd -g ${WMA_GID} ${WMA_GROUP} | ||
RUN useradd -u ${WMA_UID} -g ${WMA_GID} -m ${WMA_USER} | ||
RUN install -o ${WMA_USER} -g ${WMA_GID} -d ${WMA_ROOT_DIR} | ||
RUN usermod -aG mysql ${WMA_USER} | ||
RUN rm -f /etc/mysql/mariadb.conf.d/50-server.cnf | ||
|
||
# Add WMA_USER to sudoers | ||
RUN echo "${WMA_USER} ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers | ||
|
||
# Add all deployment needed directories | ||
ADD bin $WMA_DEPLOY_DIR/bin | ||
ADD etc $WMA_DEPLOY_DIR/etc | ||
|
||
# Add install script | ||
ADD install.sh ${WMA_ROOT_DIR}/install.sh | ||
|
||
# Add wmagent run script | ||
ADD run.sh ${WMA_ROOT_DIR}/run.sh | ||
|
||
# Install the requested WMA_TAG. | ||
RUN ${WMA_ROOT_DIR}/install.sh -v ${WMA_TAG} | ||
RUN chown -R ${WMA_USER}:${WMA_GID} ${WMA_ROOT_DIR} | ||
|
||
# Switch to the runtime directory and user | ||
WORKDIR ${WMA_ROOT_DIR} | ||
USER ${WMA_USER} | ||
ENV USER=$WMA_USER | ||
|
||
# Define the entrypoint. All the run.sh parameters should be passed at runtime. | |
ENTRYPOINT ["./run.sh"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,230 @@ | ||
## WMAgent in Docker using pypi deployment method. | ||
|
||
### Requires: | ||
* Docker to be installed on the host VM (vocmsXXXX) | ||
* HTCondor schedd to be installed and configured at the host VM | |
* CouchDB to be installed on the host VM | ||
* MariaDB to be installed on the host VM (Depends on the type of relational database to be used MariaDB/Oracle) | ||
* Service certificates to be present at the host VM | ||
* `WMAgent.secrets` file to be present at the host VM | ||
|
||
### The implementation is realized through the following files: | ||
* `Dockerfile` - provides all basic requirements for the image and sets all common env variables to both `install.sh` and `run.sh`. | ||
* `install.sh` - called through `Dockerfile` `RUN` command and provided with a single parameter at build time `WMA_TAG` | ||
* `run.sh` - set as default `ENTRYPOINT` at container runtime. All agent related configuration parameters are passed as named arguments and used to (re)generate the agent configuration files. All service credentials and schedd caches are accessed via host mount points | ||
* `wmagent-docker-build.sh` - simple script to be used for building a WMAgent docker image | ||
* `wmagent-docker-run.sh` - simple script to be used for running a WMAgent docker container | ||
|
||
**Build options (accepted by `install.sh`):** | ||
* `WMA_TAG=2.2.1` | ||
|
||
**RUN options (accepted by `run.sh`):** | ||
* `TEAMNAME=testbed-$HOSTNAME` | ||
* `CENTRAL_SERVICES=cmsweb-testbed.cern.ch` | ||
* `AGENT_NUMBER=0` | ||
* `FLAVOR=mysql` | ||
|
||
|
||
### Building a WMAgent image | ||
|
||
The build process may happen at any machine running a Docker Engine. | ||
|
||
**Build command:** | ||
* Using the wrapper script to build WMAgent locally: | ||
``` | ||
ssh vocms**** | ||
cmst1 | ||
cd /data | ||
git clone https://github.com/dmwm/CMSKubernetes.git | ||
cd /data/CMSKubernetes/docker/pypi/wmagent/ | ||
./wmagent-docker-build.sh -v 2.2.1 | ||
``` | ||
* Using the wrapper script to build and upload WMAgent to registry.cern.ch: | ||
``` | ||
./wmagent-docker-build.sh -v 2.2.1 -p | ||
``` | ||
* Here is what is happening under the hood: | ||
``` | ||
WMA_TAG=2.2.1 | ||
docker build --network=host --progress=plain --build-arg WMA_TAG=$WMA_TAG -t wmagent:$WMA_TAG -t wmagent:latest /data/CMSKubernetes/docker/pypi/wmagent/ 2>&1 |tee /data/build-wma.log | ||
``` | ||
**Partial output:** | ||
``` | ||
... | ||
#4 [ 1/13] FROM registry.cern.ch/cmsweb/dmwm-base:pypi-20230314@sha256:71cf3825ed9acf4e84f36753365f363cfd53d933b4abf3c31ef828828e7bdf83 | ||
#4 DONE 0.0s | ||
... | ||
#14 0.110 ======================================================= | ||
#14 0.110 Starting new agent deployment with the following data: | ||
#14 0.110 ------------------------------------------------------- | ||
#14 0.111 - WMAgent version : 2.2.1 | ||
#14 0.113 - Python verson : Python 3.8.16 | ||
#14 0.114 - Python Module Path : /usr/local/lib/python3.8/site-packages | ||
#14 0.114 ======================================================= | ||
... | ||
#18 naming to docker.io/library/wmagent:2.2.1 done | ||
#18 DONE 3.3s | ||
``` | ||
|
||
### Running a WMAgent container | ||
|
||
One needs to bind mount several directories from the host VM (vocmsXXXX). | ||
* /data/dockerMount/certs | ||
* /etc/condor (schedd runs on the host, not the container) | ||
* /tmp | ||
* /data/dockerMount/srv/wmagent/current/install (stateful service and component dirs) | ||
* /data/dockerMount/srv/wmagent/current/config (for persisting agent configuration data) | ||
* /data/dockerMount/admin/wmagent (in order to access the WMAgent.secrets) | ||
|
||
|
||
The install and config directories are initialized the first time `run.sh` is executed, and a `.dockerInit` file is placed there to keep track of the initialization. Subsequent container restarts won't touch these directories. | |
|
||
**Run command:** | ||
|
||
* Initialising the agent for the first time: | ||
``` | ||
ssh vocms**** | ||
cmst1 | ||
cd /data/CMSKubernetes/docker/pypi/wmagent/ | ||
### cleaning old agent data: | ||
rm -rf /data/dockerMount/srv/ | ||
./wmagent-docker-run.sh -t <team_name> -n <agent_number> -f <db_flavour> -c <central_services> & | ||
``` | ||
* Initialising the agent for the first time using a docker image from registry.cern.ch: | ||
``` | ||
./wmagent-docker-run.sh -t <team_name> -n <agent_number> -f <db_flavour> -c <central_services> -p -v 2.2.1 & | ||
``` | ||
* Running the agent: | ||
``` | ||
./wmagent-docker-run.sh & | ||
``` | ||
|
||
* Here is what is happening under the hood: | ||
``` | ||
WMA_ROOT_DIR=/data/dockerMount | ||
|
||
dockerOpts=" \ | ||
--network=host \ | ||
--rm \ | ||
--hostname=`hostname -f` \ | ||
--name=wmagent \ | ||
--mount type=bind,source=/etc/tnsnames.ora,target=/etc/tnsnames.ora,readonly \ | ||
--mount type=bind,source=/etc/condor,target=/etc/condor,readonly \ | ||
--mount type=bind,source=/tmp,target=/tmp \ | ||
--mount type=bind,source=$WMA_ROOT_DIR/certs,target=/data/certs \ | ||
--mount type=bind,source=$WMA_ROOT_DIR/srv/wmagent/current/install,target=/data/srv/wmagent/current/install \ | ||
--mount type=bind,source=$WMA_ROOT_DIR/srv/wmagent/current/config,target=/data/srv/wmagent/current/config \ | ||
--mount type=bind,source=$WMA_ROOT_DIR/admin/wmagent,target=/data/admin/wmagent/hostadmin \ | ||
" | ||
|
||
wmaOpts=" \ | ||
-f mysql \ | ||
-t testbed-vocms0260 \ | ||
-n 0 \ | ||
-c cmsweb-testbed.cern.ch" | ||
|
||
docker run $dockerOpts wmagent $wmaOpts | ||
``` | ||
|
||
**Partial output:** | ||
``` | ||
======================================================= | ||
Starting WMAgent with the following initial data: | ||
------------------------------------------------------- | ||
- WMAgent Version : 2.2.1 | ||
- WMAgent TeamName : testbed-vocms0260 | ||
- WMAgent Number : 0 | ||
- WMAgent Host : vocms0260.cern.ch | ||
- WMAgent Config : /data/srv/wmagent/current/config | ||
- WMAgent Relational DB type : oracle | ||
- Python verson : Python 3.8.16 | ||
- Python Module Path : /usr/local/lib/python3.8/site-packages | ||
======================================================= | ||
... | ||
``` | ||
|
||
**NOTE:** | ||
Currently, only one WMAgent container may run on a single agent VM. This is partially guaranteed by the `--name=wmagent` parameter of the `docker run` command above. It is in fact possible to overcome this by giving the new container a different name, but bear in mind that the consequences of doing so are unpredictable. If one tries to start two containers with the same name, the expected error is: | |
``` | ||
docker run $dockerOpts wmagent:$WMA_TAG $wmaOpts | ||
|
||
docker: Error response from daemon: Conflict. The container name "/wmagent" is already in use by container "c4c64688a75b6ac8f5cc5e4c951db324b2441ec1434f2e1d604a49d8009ff2a1". You have to remove (or rename) that container to be able to reuse that name. | ||
See 'docker run --help' | ||
``` | ||
|
||
|
||
|
||
|
||
### Checking container status | ||
``` | ||
ssh vocms**** | ||
|
||
docker container ps | ||
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES | ||
78d7e1baa3df wmagent:2.2.1 "./run.sh -f oracle ..." 2 hours ago Up 2 hours wmagent | ||
|
||
``` | ||
|
||
### Stopping the WMAgent container | |
To stop the WMAgent container one just needs to kill it. The `--rm` option of the `docker run` command ensures that no leftover containers remain. | |
|
||
**Shutdown command:** | ||
``` | ||
docker kill wmagent | ||
``` | ||
|
||
### Enforce container reinitialisation at the host: | ||
The WMAgent needs to preserve its configuration and initialisation data permanently at the host. For this purpose we use Host to Docker bind mounts. | |
Once a specific WMAgent image has been run for the first time, it leaves a small set of .dockerInit files at all places on the host where permanent data (like config files and job caches) is preserved. | |
On any further restart of the container, and hence of the WMAgent itself, the initialisation steps are skipped if the | |
relevant .dockerInit file is found and the $WMA_BUILD_ID hash it contains matches the $WMA_BUILD_ID of the currently starting container. | |
To enforce the reinitialisation steps, one needs to delete all .dockerInit files and restart the wmagent container. | |
|
||
**NOTE: This reinitialisation may result in losing previous job caches and database records** | ||
**Reinitialisation command:** | ||
``` | ||
docker kill wmagent | ||
|
||
sudo find /data/dockerMount -name .dockerInit -delete | ||
|
||
docker run $dockerOpts wmagent:$WMA_TAG $wmaOpts | ||
``` | ||
|
||
**Partial output:** | ||
``` | ||
======================================================= | ||
Starting WMAgent with the following initialisation data: | ||
------------------------------------------------------- | ||
- WMAgent Version : 2.2.1 | ||
... | ||
======================================================= | ||
------------------------------------------------------- | ||
Start: Performing checks for successful Docker initialisation steps... | ||
WMA_BUILD_ID: 110b443165e3b5a4ba569b8a1ab063a616132602e55ba06b0c3e89a01e643f31 | ||
dockerInitId: /data/admin/wmagent/hostadmin/.dockerInit: | ||
... | ||
ERROR | ||
------------------------------------------------------- | ||
Start: Performing Docker image to Host initialisation steps | ||
... | ||
Done: Performing Docker image to Host initialisation steps | ||
------------------------------------------------------- | ||
------------------------------------------------------- | ||
Start: Performing checks for successful Docker initialisation steps... | ||
WMA_BUILD_ID: 110b443165e3b5a4ba569b8a1ab063a616132602e55ba06b0c3e89a01e643f31 | ||
dockerInitId: 110b443165e3b5a4ba569b8a1ab063a616132602e55ba06b0c3e89a01e643f31 | ||
OK | ||
------------------------------------------------------- | ||
... | ||
``` | ||
|
||
### Connecting to the container | ||
|
||
First log in to the VM, and from there connect to the container: | |
|
||
**Login sequence:** | ||
``` | ||
docker exec -it wmagent /bin/bash | ||
... | ||
(WMAgent-2.2.1) [cmst1@vocms0260:current]$ manage status | ||
``` |
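The `.dockerInit` mechanism described in the README above can be sketched roughly as follows. This is a minimal illustration, not the code from `run.sh`: the helper names `is_initialised`, `mark_initialised` and `init_if_needed` are hypothetical, and the sketch assumes each `.dockerInit` file holds nothing but the build id.

```shell
#!/bin/bash
# Hypothetical helpers illustrating the .dockerInit check; these are
# NOT the functions used by run.sh in this PR.

# Succeeds only if <dir>/.dockerInit exists and holds the given build id.
is_initialised() {
    local initFile="$1/.dockerInit"
    local buildId="$2"
    [[ -n "$buildId" && -f "$initFile" ]] || return 1
    [[ "$(< "$initFile")" == "$buildId" ]]
}

# Records the build id once the initialisation steps have finished.
mark_initialised() {
    printf '%s\n' "$2" > "$1/.dockerInit"
}

# Runs the (elided) initialisation steps only when the stored id is
# missing or differs from the id of the starting container.
init_if_needed() {
    local dir="$1" buildId="$2"
    if is_initialised "$dir" "$buildId"; then
        echo "OK: $dir already initialised"
    else
        echo "Initialising $dir"
        # ... initialisation steps would run here ...
        mark_initialised "$dir" "$buildId"
    fi
}
```

With helpers of this shape, deleting the `.dockerInit` file (as the reinitialisation command in the README does) or starting a container built from a different image both make the check fail, so the initialisation steps run again.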
We need to keep this line otherwise this will break the WMCore CD pipeline, see:
https://github.com/dmwm/WMCore/blob/master/.github/workflows/docker_images_template.yaml#L29
Perhaps you can do something like:
and it should resolve the issue.
done
This has not been done. Unresolving it.
Somehow, I have lost that change in my previous commit. Sorry, my bad. Fixing it with my next one.
This code won't work! WMCore GH action workflow will update TAG but not WMA_TAG, which is the one actually used during the whole process.
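The concern in the last comment can be illustrated with a small sketch. It assumes the workflow rewrites the `ENV TAG=` line of the Dockerfile in place with GNU `sed`; that is an assumption made for illustration, and the workflow's actual command is not reproduced here. The two Dockerfile fragments, the `update_tag` helper and the tag `2.2.1` are likewise illustrative.

```shell
#!/bin/bash
# Illustrative only: the file contents, update_tag and the tag value are assumptions.
workdir=$(mktemp -d)

# Variant A: WMA_TAG has its own default and never looks at TAG.
cat > "$workdir/Dockerfile.a" <<'EOF'
ENV TAG=X.Y.Z
ARG WMA_TAG=None
ENV WMA_TAG=${WMA_TAG}
EOF

# Variant B: WMA_TAG is derived from TAG.
cat > "$workdir/Dockerfile.b" <<'EOF'
ENV TAG=X.Y.Z
ENV WMA_TAG=$TAG
EOF

# Assumed pipeline step: replace the value on the "ENV TAG=" line (GNU sed).
update_tag() {
    sed -i -e "s/^ENV TAG=.*/ENV TAG=$2/" "$1"
}

update_tag "$workdir/Dockerfile.a" 2.2.1
update_tag "$workdir/Dockerfile.b" 2.2.1

# Show what each variant looks like after the rewrite.
grep -H 'TAG' "$workdir/Dockerfile.a" "$workdir/Dockerfile.b"
```

Under that assumption, variant A still carries `ARG WMA_TAG=None` after the rewrite, so a build without `--build-arg WMA_TAG=...` would call `install.sh -v None`. In variant B the `ENV WMA_TAG=$TAG` line resolves to the updated tag at build time. This is one possible arrangement, not necessarily the fix adopted in the PR.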