Example setup
Some BIIGLE modules, such as biigle/maia, may require GPU computing resources. This wiki describes how to set up a BIIGLE instance with access to GPU computing resources by walking through a working setup with the biigle/maia module.
This wiki assumes that you already have a working BIIGLE instance. Head over to biigle/distribution to get started if you don't have a running instance.
In addition, you need the NVIDIA GPU drivers, Docker, nvidia-docker2 and Docker Compose installed on the GPU machine(s) that should perform GPU computing. You can find example installation instructions here.
BIIGLE can be flexibly configured to use GPU computing resources in various scenarios. The GPU can be located on the same machine on which the BIIGLE instance is running (Same Host GPU), the GPU can be located on another machine (Single Remote Host GPU) or many GPUs can be located on many remote machines (Many Remote Host GPUs).
To enable the flexible configuration, BIIGLE makes extensive use of job queues. Jobs that should be executed on a GPU are submitted to a special "GPU" queue. Once a job is finished, the results are submitted back to a special "GPU Response" queue. Read on for a detailed description of the setup for the different scenarios:
Same Host GPU
This is the simplest setup. In this scenario, the GPU is available on the same host that runs the BIIGLE instance. However, BIIGLE still makes use of the special queues for GPU jobs and responses to distinguish between regular jobs and those that should be executed on a GPU. This requires a dedicated queue worker for GPU jobs. Follow these steps for the setup:
To install the biigle/maia module in your BIIGLE instance, take a look at the installation instructions. Since we are working with a BIIGLE production configuration based on biigle/distribution, the installation works as follows:
- Add biigle/maia to the list of Composer requirements in build.dockerfile.
- Append the following line to the modification command for the service providers array:
  && sed -i '/Insert Biigle module service providers/i Biigle\\Modules\\Maia\\MaiaServiceProvider::class,' config/app.php
- Configure the storage disks (step 4) required by biigle/maia in filesystems.php (see the sketch after this list).
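The storage disk names and drivers to use are listed in the biigle/maia installation instructions. As an illustration only, a local-storage configuration in build/config/filesystems.php could look roughly like this (the disk names and paths below are assumptions, not the authoritative ones):
// build/config/filesystems.php (excerpt); use the disk names required by biigle/maia.
'disks' => [
    // ...keep the existing BIIGLE disks...
    'maia-tp' => [ // assumed name for MAIA training proposal patches
        'driver' => 'local',
        'root' => storage_path('maia/training-proposals'),
    ],
    'maia-ac' => [ // assumed name for MAIA annotation candidate patches
        'driver' => 'local',
        'root' => storage_path('maia/annotation-candidates'),
    ],
],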
By default, biigle/maia submits jobs that should be executed on a GPU to the default queue of the gpu connection. In this scenario, no extra gpu connection is required. Instead, the regular redis connection can be used. To distinguish the jobs, we use a new gpu queue for this connection. To implement the configuration, add this to the build/.env file:
MAIA_REQUEST_QUEUE=gpu
MAIA_REQUEST_CONNECTION=redis
Similarly, biigle/maia by default submits GPU results to the default queue of the gpu-response connection. Here, we can use the default queue of the redis connection. Again, append this to build/.env:
MAIA_RESPONSE_QUEUE=default
MAIA_RESPONSE_CONNECTION=redis
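In other words, these four variables only control on which connection and queue biigle/maia dispatches its GPU jobs and results. A minimal sketch of the mechanism, assuming the module exposes the values as config keys such as maia.request_connection and maia.request_queue (the job class is hypothetical, not actual biigle/maia code):
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;

// Hypothetical queued job standing in for the actual biigle/maia GPU jobs.
class ProcessOnGpu implements ShouldQueue
{
    use Dispatchable, Queueable;

    public function handle(): void
    {
        // GPU processing would happen here.
    }
}

// With the .env values above, this pushes the job to the "gpu" queue of the
// "redis" connection, where the dedicated GPU worker (added below) picks it up.
ProcessOnGpu::dispatch()
    ->onConnection(config('maia.request_connection')) // redis
    ->onQueue(config('maia.request_queue'));          // gpu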
Finally, biigle/maia has some configuration options that can be adjusted to the computing capabilities of the machine and GPU. MAIA_AVAILABLE_BYTES can be set to the available memory of the GPU. As a rule of thumb, you should set this option to 1 GB less than the actual size of the GPU memory, as it can't be used in its entirety. MAIA_MAX_WORKERS can be set to the number of CPU cores that MAIA is allowed to use during processing. As BIIGLE runs on the same machine, you should spare some cores for it. An example configuration for a machine with 16 GB of GPU memory and 14 CPU cores looks like this:
MAIA_AVAILABLE_BYTES=15E+9
MAIA_MAX_WORKERS=10
Now we have to add a new queue worker to the production setup that is actually able to process jobs on the GPU. Create the file build/gpu-worker.dockerfile with the following content:
# Reuse the application code built for the regular BIIGLE services.
FROM biigle/build-dist AS intermediate
# Base the GPU worker on the BIIGLE GPU worker image.
FROM docker.pkg.github.com/biigle/gpus/gpus-worker
COPY --from=intermediate /etc/localtime /etc/localtime
COPY --from=intermediate /etc/timezone /etc/timezone
COPY --from=intermediate /var/www /var/www
Then add the following line to build/build.sh:
docker build -f gpu-worker.dockerfile -t biigle/gpu-worker-dist:$VERSION .
Finally, add the new queue worker service to the file docker-compose.yml:
gpu-worker:
  image: biigle/gpu-worker-dist
  user: ${USER_ID}:${GROUP_ID}
  runtime: nvidia
  restart: always
  depends_on:
    - cache
  volumes_from:
    - app
  init: true
  command: "php -d memory_limit=1G artisan queue:work --queue=gpu --sleep=5 --tries=1 --timeout=0"
This service will start a Docker container with access to the GPU, which processes all jobs that are submitted to the new gpu queue.
To finish the setup, perform the updating steps (including database migrations) for your BIIGLE instance. This will build and start the new GPU worker service, as well as apply the database migrations of biigle/maia.
Single Remote Host GPU
This is probably the most common scenario: the GPU is available on a different machine than the one that runs the BIIGLE instance. Here, too, communication between the machines happens through job queues. To enable queued jobs to be sent from one host to another, we developed the biigle/laravel-remote-queue package. Follow these steps for the setup:
Install biigle/maia in your BIIGLE instance as described above. You don't have to configure the module in this scenario.
biigle/laravel-remote-queue is a special queue driver for Laravel that enables the submission of queued jobs to another machine. In this scenario, the driver is used for the communication between the BIIGLE instance and the GPU machine. The installation in your BIIGLE instance works as follows:
- Add biigle/laravel-remote-queue to the list of Composer requirements in build.dockerfile.
- Add the following configuration options to the build/.env file:
  REMOTE_QUEUE_LISTEN=true
  REMOTE_QUEUE_ACCEPT_TOKENS=<token1>
  You can generate the random string for <token1> with the command head -c 32 /dev/urandom | base64.
- Create the file build/config/queue.php (a sketch of its content follows after this list) and append the following line to build.dockerfile:
  COPY config/queue.php /var/www/config/queue.php
  Then add this to the build/.env file:
  QUEUE_GPU_TOKEN=<token2>
  QUEUE_GPU_URL=http://<gpu-ip>/api/v1/remote-queue/
  You can generate the random string for <token2> in the same way as <token1> in the previous step. <gpu-ip> should be the IP address of the GPU machine.
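The gpu connection that build/config/queue.php should define is not spelled out above. Based on the connection format used in the Many Remote Host GPUs scenario below, a minimal sketch could look like this (keep the rest of the default queue configuration as it is):
// build/config/queue.php (excerpt); the "remote" driver is provided by biigle/laravel-remote-queue.
'connections' => [
    // ...keep the existing connections (sync, redis, ...)...
    'gpu' => [
        'driver' => 'remote',
        'queue' => 'default',
        'url' => env('QUEUE_GPU_URL'),     // http://<gpu-ip>/api/v1/remote-queue/
        'token' => env('QUEUE_GPU_TOKEN'), // <token2>
    ],
],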
To finish the installation of biigle/maia and biigle/laravel-remote-queue, perform the updating steps (including database migrations) for your BIIGLE instance.
biigle/gpus-distribution is the production setup for the BIIGLE "GPU server", which is meant to run on the machine with the GPU. The job of the GPU server is to accept queued jobs that should run on a GPU and to return the results as response jobs. This production setup already comes with biigle/maia included.
Follow the installation instructions to set up the GPU server on the machine with the GPU. In step 6 of the instructions, set the following configuration options:
REMOTE_QUEUE_ACCEPT_TOKENS=<token2>
QUEUE_GPU_RESPONSE_URL=http://<biigle-ip>/api/v1/remote-queue/
QUEUE_GPU_RESPONSE_TOKEN=<token1>
<token1> and <token2> are the random strings from the previous steps. <biigle-ip> is the IP address of the machine running your BIIGLE instance. In addition, configure the options MAIA_MAX_WORKERS and MAIA_AVAILABLE_BYTES as described above.
Important: The GPU server needs to be able to access the image files in the same way as the BIIGLE instance. This could be achieved with a shared filesystem, an (S)FTP server, a cloud object storage service or with the exclusive use of remote volumes in your BIIGLE instance. To enable the GPU server to access the same storage location for images as the BIIGLE instance, copy the storage disk configuration of filesystems.php to your production setup of the GPU server and append the following line to build.dockerfile of the GPU server:
COPY config/filesystems.php /var/www/config/filesystems.php
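As an example, if both hosts read the images from the same S3-compatible object storage, the copied disk configuration could look like this (the disk name and environment variable names are placeholders, not part of the standard setup):
// build/config/filesystems.php (excerpt); identical on the BIIGLE instance and the GPU server.
'images-s3' => [
    'driver' => 's3',
    'key' => env('IMAGES_S3_KEY'),
    'secret' => env('IMAGES_S3_SECRET'),
    'region' => env('IMAGES_S3_REGION'),
    'bucket' => env('IMAGES_S3_BUCKET'),
    'endpoint' => env('IMAGES_S3_ENDPOINT'),
],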
That's it!
Many Remote Host GPUs
If you plan to make extensive use of GPU computing resources in BIIGLE, it's advisable to use more than a single GPU. In this scenario, the biigle/laravel-round-robin-queue package is used in addition to biigle/laravel-remote-queue to distribute new queued GPU jobs evenly to many machines with GPUs. Follow these steps for the setup:
Configure your BIIGLE instance and each of the GPU machines as described in the Single Remote Host GPU scenario. biigle/gpus-distribution needs to be installed on every GPU machine.
biigle/laravel-round-robin-queue is a special queue driver for Laravel that handles the even distribution of queued jobs to a set of sub-queues. In this scenario, the driver is used to distribute GPU jobs evenly to the remote queues that are connected with the GPU machines. The installation in your BIIGLE instance works as follows:
- Add biigle/laravel-round-robin-queue to the list of Composer requirements in build.dockerfile.
- Modify the file build/config/queue.php and replace the existing gpu connection with the following new connections:
  'gpu-1' => [
      'driver' => 'remote',
      'queue' => 'default',
      'url' => env('QUEUE_GPU1_URL'),
      'token' => env('QUEUE_GPU_TOKEN'),
  ],
  'gpu-2' => [
      'driver' => 'remote',
      'queue' => 'default',
      'url' => env('QUEUE_GPU2_URL'),
      'token' => env('QUEUE_GPU_TOKEN'),
  ],
  'gpu' => [
      'driver' => 'roundrobin',
      'queue' => 'default',
      'connections' => ['gpu-1', 'gpu-2'],
  ],
  This assumes two remote hosts (gpu-1 and gpu-2) with GPUs. Add more connections for more hosts; an example follows after this list.
- Update the configuration options in the build/.env file:
  QUEUE_GPU1_URL=http://<gpu-1-ip>/api/v1/remote-queue/
  QUEUE_GPU2_URL=http://<gpu-2-ip>/api/v1/remote-queue/
  <gpu-1-ip> and <gpu-2-ip> are the IP addresses of the respective GPU machines. Add more variables if you use more machines.
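For example, a third GPU machine would get its own connection in build/config/queue.php, be listed in the round-robin connections and get a QUEUE_GPU3_URL variable in build/.env:
// Additional connection in build/config/queue.php for a third GPU host.
'gpu-3' => [
    'driver' => 'remote',
    'queue' => 'default',
    'url' => env('QUEUE_GPU3_URL'),
    'token' => env('QUEUE_GPU_TOKEN'),
],
// The round-robin connection then cycles over all three hosts.
'gpu' => [
    'driver' => 'roundrobin',
    'queue' => 'default',
    'connections' => ['gpu-1', 'gpu-2', 'gpu-3'],
],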
That's it! GPU jobs will now be processed on all available GPU machines.