Repo for solving Kaggle's Tweet Sentiment Extraction

erlbacsi/ML_Workflow

ML_Workflow

This repository contains a solution to Kaggle's Tweet Sentiment Extraction competition:

  • Trained TensorFlow RoBERTa model (from Hugging Face)
  • Deployed via a Flask web service in a Docker container
  • Unit tests

Installation

To start the project, follow these steps:

  1. Clone Repo:

    git clone https://github.com/erlbacsi/ML_Workflow.git
  2. Install dependencies:

    All dependencies are listed in ‘requirements.txt’, which is used when building the Docker image:

    docker build -t webservice_image .

    Next, start a container from the image. The container runs the script ‘webservice.py’, which starts the web service:

    docker run -d -p 5000:5000 --name webservice webservice_image

    Once the container is running, the web service is active on port 5000 and can respond to requests.

Usage

This is how the web service can be used:

  1. Prepare sentiment input

    The input is expected in JSON format. The PowerShell script ‘shell_request.ps1’ is provided for this purpose. It contains a dictionary in which any text can be entered as the value of the key ‘text’. The model also requires a value for the key ‘sentiment’, which must be one of ‘positive’, ‘neutral’ or ‘negative’.

  2. Send a request

    Then run the script in a PowerShell session:

    .\shell_request.ps1

    The script sends the text to the model and prints the response in the PowerShell window, together with the response time in seconds.
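
For readers without PowerShell, the same request can be sketched in Python using only the standard library. This is a minimal sketch and not part of the repository: the `/predict` route, the helper names `build_payload` and `send_request`, and the use of a POST request are assumptions, since the README does not document the endpoint. Only the keys ‘text’ and ‘sentiment’ and the three allowed sentiment values come from the description above.

```python
import json
import time
import urllib.request

# The three sentiment values named in this README.
ALLOWED_SENTIMENTS = {"positive", "neutral", "negative"}

# Assumption: the route is not documented here; adjust it to match webservice.py.
SERVICE_URL = "http://localhost:5000/predict"


def build_payload(text, sentiment):
    """Encode the two documented keys, 'text' and 'sentiment', as JSON bytes."""
    if sentiment not in ALLOWED_SENTIMENTS:
        raise ValueError(f"sentiment must be one of {sorted(ALLOWED_SENTIMENTS)}")
    return json.dumps({"text": text, "sentiment": sentiment}).encode("utf-8")


def send_request(text, sentiment, url=SERVICE_URL):
    """POST the payload and return (response body, elapsed seconds)."""
    request = urllib.request.Request(
        url,
        data=build_payload(text, sentiment),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=30) as response:
        body = response.read().decode("utf-8")
    return body, time.perf_counter() - start
```

With the container running, `send_request("I really enjoyed the concert tonight", "positive")` would return the raw response body and the elapsed time, mirroring what the PowerShell script displays. The exact shape of the response depends on ‘webservice.py’ and is not assumed here.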

Unit tests

The script ‘test_webservice.py’ is located inside the Docker container and checks the web service against a few test cases. To run it, first open a shell in the running container:

docker exec -it webservice sh

The script is located in the ‘/usr/src/app’ folder and can be started from that shell:

python test_webservice.py -v

The unit tests check the web service once for correct input and once for incorrect input. The different cases can be seen in the Python script.
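
The pattern of one test for correct input and one for incorrect input can be illustrated as follows. This is a hypothetical sketch, not the content of ‘test_webservice.py’: the function `validate_request` is an invented stand-in for whatever input check the service performs, and the real tests may exercise the running web service instead.

```python
import unittest

# The three sentiment values named in this README.
ALLOWED_SENTIMENTS = {"positive", "neutral", "negative"}


def validate_request(data):
    """Hypothetical stand-in for the service's input check (not taken from the repo)."""
    if not isinstance(data, dict):
        return False
    text = data.get("text")
    sentiment = data.get("sentiment")
    has_text = isinstance(text, str) and bool(text.strip())
    has_sentiment = isinstance(sentiment, str) and sentiment in ALLOWED_SENTIMENTS
    return has_text and has_sentiment


class TestRequestValidation(unittest.TestCase):
    def test_correct_input(self):
        # Both keys present, non-empty text, known sentiment label.
        request = {"text": "What a great day", "sentiment": "positive"}
        self.assertTrue(validate_request(request))

    def test_incorrect_input(self):
        # A missing key, an unknown sentiment label and empty text are all rejected.
        self.assertFalse(validate_request({"text": "What a great day"}))
        self.assertFalse(validate_request({"text": "What a great day", "sentiment": "happy"}))
        self.assertFalse(validate_request({"text": "", "sentiment": "neutral"}))
```

Saved to a file, the sketch can be run with `python -m unittest -v <file>`, which prints one line per test case in the same way the `-v` flag does above.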

Latency improvements

  • Using a GPU (CUDA) for model prediction and computation
  • Caching responses to frequent requests
  • Quantisation of the ML model parameters (lower computation time and memory usage)
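
The caching idea from the list above can be sketched with Python's built-in `functools.lru_cache`. This is a minimal illustration under stated assumptions: `run_model` is a placeholder for the real model call, `cached_predict` is a hypothetical wrapper name, and the cache size of 1024 entries is arbitrary. None of these are taken from the repository.

```python
from functools import lru_cache


def run_model(text, sentiment):
    """Placeholder for the real (slow) model call; the logic is purely illustrative."""
    return text.strip()


@lru_cache(maxsize=1024)
def cached_predict(text, sentiment):
    """Return a stored result for repeated (text, sentiment) pairs instead of re-running the model."""
    return run_model(text, sentiment)
```

A repeated call with the same arguments is served from memory, and `cached_predict.cache_info()` reports the number of hits and misses. An in-process cache like this is lost when the container restarts and is not shared between worker processes, so a deployment with several workers would need an external cache instead.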
