Specific instructions to run harvester with ssh rpc middleware
FaHui Lin edited this page May 23, 2019
·
6 revisions
Basic description of SSH RPC middleware: see here
The queue configuration depends on the architecture of the cluster (HPC) and the use case.
Consider an HPC with the following properties:
- Login nodes and worker nodes have no outbound connectivity
- Login nodes are accessible via SSH
- Login nodes have the same environment and mount the same shared filesystem as worker nodes
- Service processes are allowed to run on login nodes (no per-process CPU time limit or other limitations)
- DTNs (data transfer nodes) are available, with outbound connectivity and grid data transfer tools (globus, gfal, xroot, etc.)
- DTNs are accessible from login nodes
Then it suffices to run the harvester rpc_bot process on a login node of the HPC and let all harvester plugins run on that login node, i.e. remotely from the harvester instance.
That is, harvester runs the following plugins remotely:
- submitter
- monitor
- sweeper
- messenger
- preparator
- stager
The queue configuration (partial) may look like this:
    "preparator": {
        "name": "SomePreparator",
        "module": "pandaharvester.harvesterpreparator.some_preparator",
        "basePath": "/some/remote/base/path",
        "middleware": "rpc"
    },
    "submitter": {
        "name": "SlurmSubmitter",
        "module": "pandaharvester.harvestersubmitter.slurm_submitter",
        "nCore": 9600,
        "nCorePerNode": 48,
        "templateFile": "/some/remote/template.sh",
        "middleware": "rpc"
    },
    "messenger": {
        "name": "SharedFileMessenger",
        "module": "pandaharvester.harvestermessenger.shared_file_messenger",
        "accessPoint": "/some/remote/path/${workerID}",
        "middleware": "rpc"
    },
    "stager": {
        "name": "SomeStager",
        "module": "pandaharvester.harvesterstager.some_stager",
        "middleware": "rpc"
    },
    "monitor": {
        "name": "SlurmMonitor",
        "module": "pandaharvester.harvestermonitor.slurm_monitor",
        "middleware": "rpc"
    },
    "sweeper": {
        "name": "SlurmSweeper",
        "module": "pandaharvester.harvestersweeper.slurm_sweeper",
        "middleware": "rpc"
    },
    "rpc": {
        "name": "RpcHerder",
        "module": "pandaharvester.harvestermiddleware.rpc_herder",
        "remoteHost": "some.remote.host",
        "remoteBindPort": 18861,
        "numTunnels": 3,
        "sshUserName": "someusername",
        "sshPassword": null,
        "privateKey": "/some/private/key",
        "passPhrase": "somepassphrase",
        "jumpHost": "some.jump.host",
        "jumpPort": 22
    }
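
In the configuration above, each plugin section points at the middleware through its "middleware" key, and the value ("rpc") appears to name the section that defines the middleware itself. The following is a minimal sketch of a sanity check built on that assumption; it is not part of harvester. The helper names `remote_plugins` and `missing_middleware` are hypothetical, and the embedded configuration is a trimmed-down copy of the example.

```python
import json

# Hypothetical, trimmed-down queue configuration mirroring the example above.
QUEUE_CONFIG = json.loads("""
{
    "submitter": {"name": "SlurmSubmitter", "middleware": "rpc"},
    "monitor": {"name": "SlurmMonitor", "middleware": "rpc"},
    "messenger": {"name": "SharedFileMessenger", "middleware": "rpc"},
    "rpc": {"name": "RpcHerder", "remoteHost": "some.remote.host", "remoteBindPort": 18861}
}
""")


def remote_plugins(queue_config):
    """Map each section that declares a middleware to the middleware name it references."""
    return {
        section: params["middleware"]
        for section, params in queue_config.items()
        if isinstance(params, dict) and "middleware" in params
    }


def missing_middleware(queue_config):
    """Return middleware names that plugins reference but the config does not define.

    Assumption: the value of "middleware" is the key of the section that
    configures the middleware (here "rpc").
    """
    referenced = set(remote_plugins(queue_config).values())
    return sorted(name for name in referenced if name not in queue_config)


if __name__ == "__main__":
    for section, middleware in sorted(remote_plugins(QUEUE_CONFIG).items()):
        print(f"{section} -> {middleware}")
    print("undefined middleware sections:", missing_middleware(QUEUE_CONFIG))
```

Running such a check before starting harvester can catch a plugin that references a middleware section that was renamed or left out.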
Note that the paths set for plugins using rpc (e.g. the messenger accessPoint) are remote paths, i.e. paths on the HPC side.
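
To illustrate what a remote access point looks like for one worker, the sketch below fills the `${workerID}` placeholder of the example accessPoint. It assumes the placeholder is replaced by the worker's ID, and it uses Python's `string.Template` purely for illustration; this is not harvester's own substitution code, and `expand_access_point` is a hypothetical helper.

```python
from string import Template

# Remote access point from the example configuration (a path on the HPC side).
ACCESS_POINT = "/some/remote/path/${workerID}"


def expand_access_point(pattern, worker_id):
    """Fill the ${workerID} placeholder with a concrete worker ID."""
    return Template(pattern).substitute(workerID=worker_id)


if __name__ == "__main__":
    # The resulting directory must exist (or be creatable) on the HPC shared
    # filesystem, not on the host where harvester itself runs.
    print(expand_access_point(ACCESS_POINT, 1234))
```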