A Spring Boot application serves REST APIs for machine learning model inference.
Two main tasks:
- Handle image files uploaded by users and deliver preprocessing requests to Kafka streams.
- Listen to Kafka brokers for inference result messages, store the inference results into the database, and serve the associated queries from users.
The configuration, including Kafka connection parameters, can be found in application.properties.
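For instance, the Kafka connection could look like the fragment below. The keys follow Spring Boot's standard spring.kafka.* property names; the values shown are placeholders, not the actual contents of this project's application.properties.

```properties
# Kafka broker address (placeholder value; check application.properties)
spring.kafka.bootstrap-servers=localhost:9092
# Consumer group for the inference-result listener (assumed name)
spring.kafka.consumer.group-id=inference-server
```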
A successful response is returned with status code 200. Any application failure will result in an HTTP response with status code 400 and the following body:
// JSON:
{
"error": String
}
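A client can branch on exactly these two cases. The sketch below (not part of the server code; the response values are simulated) shows the convention: 200 yields the parsed body, 400 surfaces the "error" field.

```python
import json

def check_response(status: int, body: str):
    """Apply the API's convention: 200 = success, 400 = {"error": ...}."""
    if status == 200:
        return json.loads(body) if body else None
    if status == 400:
        # Failure responses carry a JSON body with an "error" string.
        raise RuntimeError(json.loads(body)["error"])
    raise RuntimeError(f"unexpected status {status}")
```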
Register the inference service.
Request body:
// JSON (Empty Body):
{
}
A UUID identifier is returned in JSON format:
// JSON:
{
"uuid": String
}
The identifier will be used in the rest of the APIs.
Upload an image for inference.
Request body is FormData with params:
- uuid, String. The service registration identifier.
- seq, int. The sequence number defined by the user, which will be included in the inference result.
- topk, int. The number of top-K predictions in the inference result.
- payload, MultipartFile. The bytes of the image file.
No response data.
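The upload request is ordinary multipart/form-data. As a minimal sketch, a client could assemble the body with only the Python standard library; the field names match the params above, while the boundary and filename are arbitrary choices.

```python
def build_upload_body(service_uuid: str, seq: int, topk: int,
                      image_bytes: bytes, boundary: str = "XBOUNDARY") -> bytes:
    """Build a multipart/form-data body with uuid, seq, topk, and payload parts."""
    parts = []
    # Plain text fields: uuid, seq, topk.
    for name, value in (("uuid", service_uuid), ("seq", str(seq)), ("topk", str(topk))):
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'
        )
    # File field: the raw image bytes.
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="payload"; '
        f'filename="image.jpg"\r\nContent-Type: application/octet-stream\r\n\r\n'
    )
    return "".join(parts).encode() + image_bytes + f"\r\n--{boundary}--\r\n".encode()
```

The matching Content-Type header would be multipart/form-data; boundary=XBOUNDARY.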
Get the inference results corresponding to the requests previously made.
Request body:
// JSON:
{
"uuid": String,
"seqStart": int,
"seqEnd": int
}
- uuid, String. The service registration identifier.
- seqStart and seqEnd, int. Define the range of sequence numbers specified in the requests previously made.
Response body:
// JSON:
{
"items": [
{
"seq": int,
"predictions": String[]
},
...
]
}
The items array might not include inference results for consecutive sequence numbers, because some inferences could still be in progress. This also implies that items could be an empty array.
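Because of those gaps, a client should poll until every requested sequence number has appeared. A small sketch of that bookkeeping (the response data here is simulated; field names match the API above):

```python
def missing_seqs(items: list, seq_start: int, seq_end: int) -> list:
    """Return the sequence numbers in [seq_start, seq_end] not yet present in items."""
    returned = {item["seq"] for item in items}
    return [s for s in range(seq_start, seq_end + 1) if s not in returned]

# seq 2 is still in progress, so it is absent from the response.
items = [{"seq": 1, "predictions": ["cat"]}, {"seq": 3, "predictions": ["dog"]}]
missing_seqs(items, 1, 3)  # → [2]
```

A client would re-issue the query until missing_seqs returns an empty list.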
Build the application:

mvn package

The built jar file can be found in the target folder, e.g. target/inference-server-0.0.1-SNAPSHOT.jar.

Run the server:

java -jar <path to .jar file>