Repo overview for solving Kaggle's Tweet Sentiment Extraction Competition.
- Trained TensorFlow roBERTa model (from Hugging Face)
- Deployed via a Flask webservice in a Docker container
- Unittests
To start the project, follow these steps:
- Clone the repo:
git clone https://github.com/erlbacsi/ML_Workflow.git
- Install dependencies:
All dependencies are listed in the 'requirements.txt' file, which is used to build the Docker image:
docker build -t webservice_image .
After that, a container can be started from the image; it runs the web service by executing the script 'webservice.py':
docker run -d -p 5000:5000 --name webservice webservice_image
Once the container is running, the web service is active and ready to respond to requests.
This is how the web service can be used:
- Prepare the sentiment input:
The input is expected in JSON format. The PowerShell script 'shell_request.ps1' is provided for this purpose. It contains a dictionary whose 'text' key can hold any text. In addition, the model requires a value for the 'sentiment' key; three labels are distinguished: 'positive', 'neutral' and 'negative'.
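The payload described above can be sketched in Python (field names 'text' and 'sentiment' are from the description; the helper itself is illustrative, not part of the repo):

```python
import json

# The model accepts exactly three sentiment labels.
ALLOWED_SENTIMENTS = ("positive", "neutral", "negative")

def build_payload(text, sentiment):
    # Build the JSON body the web service expects: a "text" field with
    # the tweet and a "sentiment" field holding one of the three labels.
    if sentiment not in ALLOWED_SENTIMENTS:
        raise ValueError(f"sentiment must be one of {ALLOWED_SENTIMENTS}")
    return json.dumps({"text": text, "sentiment": sentiment})

payload = build_payload("what a wonderful day", "positive")
```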
- Send the request:
The script can then simply be run in a PowerShell session:
.\shell_request.ps1
The script sends the text to the model, receives the response, and displays it in the PowerShell together with the response time in seconds.
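For environments without PowerShell, the same request can be sent from Python. This is a sketch, not part of the repo, and the '/predict' endpoint path is an assumption — check 'webservice.py' for the actual route:

```python
import json
import time
import urllib.request

SERVICE_URL = "http://localhost:5000/predict"  # assumed route; verify in webservice.py

def make_request(text, sentiment, url=SERVICE_URL):
    # Encode the same JSON body that shell_request.ps1 sends.
    body = json.dumps({"text": text, "sentiment": sentiment}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )

def send_request(text, sentiment):
    # Returns the decoded response and the response time in seconds,
    # mirroring what the PowerShell script displays.
    start = time.time()
    with urllib.request.urlopen(make_request(text, sentiment)) as resp:
        answer = json.loads(resp.read().decode("utf-8"))
    return answer, time.time() - start

# Example (requires the running container):
# answer, seconds = send_request("I love this movie", "positive")
```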
The script 'test_webservice.py' is located inside the Docker container and checks the web service against several test cases. To run them, open a shell inside the container:
docker exec -it webservice /bin/bash
The script is located in the '/usr/src/app' directory and can simply be started from the command line:
python test_webservice.py -v
The unit tests check the web service once with valid input and once with invalid input; the individual cases can be seen in the Python script.
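The kind of check such tests perform can be sketched as follows (the real cases live in 'test_webservice.py' inside the container; the validation helper below is hypothetical):

```python
import unittest

ALLOWED = {"positive", "neutral", "negative"}

def is_valid_input(payload):
    # Correct input: a dict with a non-empty "text" string and a known sentiment.
    return (isinstance(payload, dict)
            and isinstance(payload.get("text"), str)
            and payload["text"] != ""
            and payload.get("sentiment") in ALLOWED)

class InputValidationTest(unittest.TestCase):
    def test_correct_input(self):
        self.assertTrue(is_valid_input({"text": "nice", "sentiment": "positive"}))

    def test_incorrect_input(self):
        # Unknown label and missing "text" key must both be rejected.
        self.assertFalse(is_valid_input({"text": "nice", "sentiment": "angry"}))
        self.assertFalse(is_valid_input({"sentiment": "neutral"}))
```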
Possible future improvements:
- Using the GPU (CUDA) for model prediction and computation
- Caching of frequent requests
- Quantisation of the ML model parameters (lower computation time and memory usage)