pfms

a FastAPI REST service that provides somewhat bespoke inference services on medical data

Abstract

pfms is a FastAPI application that provides REST services for segmentation of input image data, in particular spleen volumetric data. Simplistically, it can be thought of as a so-called Model Server, although the technical semantics of that statement are not quite correct. Several API endpoints are provided, suited to consumption by software clients. A typical life-cycle involves uploading a very specific neural network weights file, which is used to initialize the inference engine. NIfTI volumes can then be uploaded to an inference API route, which returns a segmented NIfTI volume.

Conceptualization

Conventional MLops uses specialized terminology, such as "model server" and "inference endpoint", within the context of specialized image processing. Generally, a "model server" is a server that can accept an image, run some pre-trained "inference" on it (almost always to perform image segmentation), and return the results. To remain general, MLops servers typically communicate data as JSON representations.

The basic idea is simple: a client communicates with some remote server using HTTP POST requests to send image data to a specific API endpoint associated with a specific set of operations. The server performs some operation on this data and returns processed image data in response.
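
As a rough sketch of this pattern (the host, route, and form field name below are illustrative only, not pfms endpoints):

# Hypothetical model-server exchange: POST an image, receive a result
curl -X POST http://example-server:8000/api/v1/segment \
     -F "file=@input_image.png"                        \
     -o result.json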

pfms Specificities

Broadly speaking, pfms provides this exact behavior. However, it is uniquely tailored to providing services within the context of the pl-monai_spleenseg ChRIS plugin. Indeed, pfms uses this exact plugin as an internal module to perform the same segmentation. Moreover, unlike more conventional MLops "model servers", pfms accepts NIfTI volumes as input and returns NIfTI volumes as results. This is considerably more efficient than a JSON serialization and deserialization of payload data to encode an image.
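
To see why, note that embedding binary image data in JSON typically means base64 encoding, which inflates the payload by roughly a third. A quick local check (assuming some volume file vol.nii on hand):

# Compare raw NIfTI size against its base64-encoded size
wc -c < vol.nii             # raw byte count
base64 < vol.nii | wc -c    # roughly 4/3 the raw size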

pfms Deployment

local build

To build a local version, clone this repo and then

export UID=$(id -u)  # for bash/zsh
set UID (id -u)      # for fish shell
docker build --build-arg UID=$UID -t local/pfms .

dockerhub

To use the version available on DockerHub (which might not be available at the time of reading):

docker pull fnndsc/pfms

running

To start the services:

SESSIONUSER=localuser
docker run --gpus all --privileged          \
        --env SESSIONUSER=$SESSIONUSER      \
        --name pfms --rm -it -d             \
        -p 2024:2024                        \
        local/pfms /start-reload.sh

To start with source code debugging and live refreshing:

SESSIONUSER=localuser
docker run --gpus all --privileged          \
        --env SESSIONUSER=$SESSIONUSER      \
        --name pfms --rm -it -d             \
        -p 2024:2024                        \
        -v $PWD/pfms:/app:ro                \
        local/pfms /start-reload.sh

(note: if you pulled from DockerHub, use fnndsc/pfms instead of local/pfms)
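
Once the container is up, a quick smoke test is to hit FastAPI's auto-generated documentation page (FastAPI serves /docs by default; that this route is enabled in this deployment is an assumption):

# Verify the service is listening on port 2024
curl http://localhost:2024/docs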

pfms Usage

Upload model file in pth format

pfms can host/provide multiple "models" -- a model is understood here to be simply a pre-trained weights file in pth format as generated by pl-monai_spleenseg during a training phase. This pth file can be uploaded to pfms by POSTing the file to this endpoint:

POST :2024/api/v1/spleenseg/modelpth/?modelID=<modelID>

Note that the URL query parameter modelID is used to "name" this model.
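
A hedged sketch of such an upload with curl (the multipart field name "file" is an assumption, not confirmed; consult the server's /docs page for the exact parameter):

# Upload a pre-trained weights file, naming the model "spleenmodel"
curl -X POST "http://localhost:2024/api/v1/spleenseg/modelpth/?modelID=spleenmodel" \
     -F "file=@model.pth"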

Segment a NIfTI spleen volume

To run the segmentation on a volume using a model file, POST the NIfTI volume to the correct API endpoint, naming the model to use as a query parameter:

POST :2024/api/v1/spleenseg/NIfTIinference/?modelID=<modelID>

Here, a NIfTI volume is passed as a file upload request. The pfms instance will save/unpack this file within itself, and then run the pl-monai_spleenseg inference mode using as model weights the pth file associated with <modelID>. The resultant NIfTI file, stored within the server, is then read and streamed back to the caller, which will typically save this file to disk or do further processing.

Note that this call will block until processing completes! Processing (depending on network speed, etc.) typically takes less than 30 seconds.
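
Again as a sketch (the multipart field name "file" is an assumption), an inference request with curl could look like:

# POST a NIfTI volume for segmentation and save the returned volume
curl -X POST "http://localhost:2024/api/v1/spleenseg/NIfTIinference/?modelID=spleenmodel" \
     -F "file=@spleen.nii.gz"  \
     -o segmented.nii.gz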

Other API endpoints

Get a list of available "models"

To get a list of available models to use, simply GET the modelpth endpoint:

GET :2024/api/v1/spleenseg/modelpth
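
For example, with curl:

# List the models currently registered with the server
curl http://localhost:2024/api/v1/spleenseg/modelpth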

-30-