
ML_Workflow

Repo overview for solving Kaggle's Tweet Sentiment Extraction Competition.

  • Trained TensorFlow RoBERTa model (from Hugging Face)
  • Deployed via a Flask web service in a Docker container
  • Unit tests

Installation

To start the project, follow these steps:

  1. Clone Repo:

    git clone https://github.com/erlbacsi/ML_Workflow.git
  2. Install dependencies:

    All dependencies are listed in the ‘requirements.txt’ file, which is used to build the Docker image

    docker build -t webservice_image .

    After that, a container can be started from this image to run the web service. On startup it executes the script ‘webservice.py’

    docker run -d -p 5000:5000 --name webservice webservice_image

    Once the container is running, the web service is active and can respond to incoming requests.

Usage

This is how the web service can be used:

  1. Prepare sentiment input

    The input is expected in JSON format. The PowerShell script ‘shell_request.ps1’ is provided for this purpose. It contains a dictionary into which any text can be inserted under the key ‘text’. In addition, the model requires a value for the key ‘sentiment’, which distinguishes three sentiment classes: ‘positive’, ‘neutral’ and ‘negative’.

  2. Send a request

    The script can then be run from a PowerShell session.

    .\shell_request.ps1

    The script sends the text to the model, receives the corresponding response and displays it in the PowerShell, together with the response time in seconds.
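For environments without PowerShell, the same request can be sketched in Python. This is only a sketch: the endpoint path ‘/predict’ is an assumption, since the actual route is defined in ‘webservice.py’ and should be checked there.

```python
import json
import urllib.request

def build_request(text, sentiment, url="http://localhost:5000/predict"):
    """Build a POST request carrying the JSON payload the model expects.

    The endpoint path '/predict' is an assumption; check webservice.py
    for the actual route. 'sentiment' must be one of 'positive',
    'neutral' or 'negative'.
    """
    payload = {"text": text, "sentiment": sentiment}
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request("I love this new phone!", "positive")
# urllib.request.urlopen(req) would send the request once the container is running.
```

Wrapping the `urlopen` call in `time.perf_counter()` readings would reproduce the response-time measurement that ‘shell_request.ps1’ prints.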

Unittests

The script ‘test_webservice.py’ is located inside the Docker container and checks the web service against several test cases. To run it, open a shell in the running container

 docker exec -it webservice /bin/bash

The script is located in the ‘/usr/src/app’ folder and can be started from the command line.

python test_webservice.py -v

The unit tests check the web service once with valid input and once with invalid input. The individual cases can be inspected in the Python script.
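The exact cases live in ‘test_webservice.py’; as an illustration only, a minimal input check of the kind such tests exercise might look like this. The helper `validate_payload` is hypothetical and not part of the repo.

```python
import unittest

VALID_SENTIMENTS = {"positive", "neutral", "negative"}

def validate_payload(payload):
    """Hypothetical helper mirroring the web service's input check."""
    if not isinstance(payload, dict):
        return False
    if not payload.get("text"):
        return False
    return payload.get("sentiment") in VALID_SENTIMENTS

class TestPayloadValidation(unittest.TestCase):
    def test_valid_input(self):
        self.assertTrue(validate_payload({"text": "great day", "sentiment": "positive"}))

    def test_invalid_input(self):
        # Unknown sentiment label and missing text must both be rejected.
        self.assertFalse(validate_payload({"text": "great day", "sentiment": "happy"}))
        self.assertFalse(validate_payload({"sentiment": "neutral"}))
```

Running such a file with `python test_webservice.py -v` (or `python -m unittest -v`) produces the verbose per-case output mentioned above.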

Latency improvements

  • Using a GPU (CUDA) for model prediction and computation
  • Caching of frequent requests
  • Quantisation of the ML model parameters (lower computation time and memory usage)