**`docker_project/README.md`**

# Object Detection Service


## Background

In this project, you'll design, develop and deploy an object detection service that consists of multiple containerized microservices.

Users send images through an interactive Telegram bot (the bot you implemented in the Python project); the service detects objects in the image and sends the results back to the user.

The service consists of 3 microservices:

- `polybot`: Telegram Bot container.
- `yolo5`: Image prediction container based on the Yolo5 pre-trained deep learning model.
- `mongo`: MongoDB cluster to store data.

## Preliminaries

Create a dedicated GitHub repo for the project (or use the same GitHub repo from the previous Python project and utilize your Telegram bot implementation).

## Implementation guidelines

### The `mongo` microservice

MongoDB is a [document](https://www.mongodb.com/document-databases), [NoSQL](https://www.mongodb.com/nosql-explained/nosql-vs-sql) database that offers high-availability deployment using replica sets.
**High availability** (HA) indicates a system designed for durability and redundancy.
A **replica set** is a group of MongoDB servers, called nodes, each containing an identical copy of the data.
If one of the servers fails, the remaining nodes pick up the load while the crashed one restarts, without any data loss.

Follow the official docs to deploy a containerized MongoDB cluster on your local machine.
Please note that the mongo deployment should be configured **to persist the data stored in it**.

https://www.mongodb.com/compatibility/deploying-a-mongodb-cluster-with-docker
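
Following that guide, a minimal local deployment might look roughly like the sketch below. This is a sketch, not a complete solution: the network name, container names, host ports, volume names and the `mongo:5` image tag are all assumptions, so adapt them to your setup. Note the named volumes, which keep the data across container restarts.

```bash
docker network create mongoCluster

# three mongod nodes, each with its own named volume so the data persists
docker run -d --name mongo1 --network mongoCluster -p 27017:27017 \
  -v mongo1_data:/data/db mongo:5 mongod --replSet myReplicaSet --bind_ip localhost,mongo1
docker run -d --name mongo2 --network mongoCluster -p 27018:27017 \
  -v mongo2_data:/data/db mongo:5 mongod --replSet myReplicaSet --bind_ip localhost,mongo2
docker run -d --name mongo3 --network mongoCluster -p 27019:27017 \
  -v mongo3_data:/data/db mongo:5 mongod --replSet myReplicaSet --bind_ip localhost,mongo3

# initiate the replica set from one of the nodes
docker exec -it mongo1 mongosh --eval "rs.initiate({
  _id: 'myReplicaSet',
  members: [
    {_id: 0, host: 'mongo1'},
    {_id: 1, host: 'mongo2'},
    {_id: 2, host: 'mongo3'}
  ]
})"
```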

Got an HA mongo deployment? Great, let's move on...

### The `yolo5` microservice

[Yolo5](https://github.com/ultralytics/yolov5) is a state-of-the-art object detection AI model, known for its high-accuracy object detection in images and videos.
You'll work with a lightweight model that can detect [80 object classes](https://github.com/ultralytics/yolov5/blob/master/data/coco128.yaml) while running on your old, poor CPU machine.

The service files are under the `docker_project/yolo5` directory. Copy these files to your repo.

#### Develop the app

The `yolo5/app.py` app is a Flask-based web server with a single endpoint, `/predict`, which can be used to detect objects in images.

To use this endpoint, you don't send the image directly in the HTTP request. Instead, you attach a query parameter called `imgName` to the URL (e.g. `localhost:8081/predict?imgName=street.jpeg`), which represents an image name stored in an **S3 bucket**.
The service downloads this image from the S3 bucket and detects objects in it.

Take a look at the code and complete the `# TODO`s. Feel free to change/add any functionality as you wish!
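
For example, the S3 download step might end up looking roughly like the sketch below. The helper name and local path are hypothetical; `BUCKET_NAME` is the env var the app already reads (see the top of `app.py`).

```python
import os
import boto3


def download_from_s3(img_name: str) -> str:
    """Download `img_name` from the images bucket and return the local file path."""
    bucket = os.environ['BUCKET_NAME']
    local_path = f'/tmp/{img_name}'            # assumption: any writable local path works
    boto3.client('s3').download_file(bucket, img_name, local_path)
    return local_path
```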

#### Build and run the app

The `yolo5` app can run only as a Docker container. This is because the app depends on many files that don't exist on your local machine, but do exist in the [`ultralytics/yolov5`](https://hub.docker.com/r/ultralytics/yolov5) base image.

Take a look at the provided `Dockerfile`; it's already implemented for you, no need to touch it.

If you run the container on your local machine, you may need to **mount** (as a volume) the AWS credentials file on your local machine (`$HOME/.aws/credentials`) to allow the container to communicate with S3.

**Note: Never build a docker image with AWS credentials stored in it! Never commit AWS credentials in your source code! Never!**
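
For example, a local build-and-run might look like this sketch. The image tag and bucket name are placeholders, and the in-container path assumes the container runs as `root`; the credentials file is only mounted at runtime, never copied into the image.

```bash
# build the image (no credentials baked in!)
docker build -t yolo5 .

# run it, mounting the local AWS credentials file read-only
docker run --rm -p 8081:8081 \
  -v "$HOME/.aws/credentials:/root/.aws/credentials:ro" \
  -e BUCKET_NAME=<your-images-bucket> \
  yolo5
```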

Once the image is built and running successfully, you can communicate with it directly:

```bash
curl -X POST "localhost:8081/predict?imgName=street.jpeg"
```

For example, here is an image and the corresponding results summary:

<img src="../.img/street.jpeg" width="60%">

```json
{
  "prediction_id": "9a95126c-f222-4c34-ada0-8686709f6432",
  "original_img_path": "data/images/street.jpeg",
  "predicted_img_path": "static/data/9a95126c-f222-4c34-ada0-8686709f6432/street.jpeg",
  "labels": [
    {
      "class": "person",
      "cx": 0.0770833,
      "cy": 0.673675,
      "height": 0.0603291,
      "width": 0.0145833
    },
    {
      "class": "traffic light",
      "cx": 0.134375,
      "cy": 0.577697,
      "height": 0.0329068,
      "width": 0.0104167
    },
    {
      "class": "potted plant",
      "cx": 0.984375,
      "cy": 0.778793,
      "height": 0.095064,
      "width": 0.03125
    },
    {
      "class": "stop sign",
      "cx": 0.159896,
      "cy": 0.481718,
      "height": 0.0859232,
      "width": 0.053125
    },
    {
      "class": "car",
      "cx": 0.130208,
      "cy": 0.734918,
      "height": 0.201097,
      "width": 0.108333
    },
    {
      "class": "bus",
      "cx": 0.285417,
      "cy": 0.675503,
      "height": 0.140768,
      "width": 0.0729167
    }
  ],
  "time": 1692016473.2343626
}
```

The model detected a _person_, _traffic light_, _potted plant_, _stop sign_, _car_, and a _bus_. Try it yourself with different images.

### The `polybot` microservice

You can either integrate your bot implementation from the previous Python project, or use the code sample given to you under the `docker_project/polybot` directory.

If you use the code sample, make sure you have a Telegram bot token and that you know how to expose your bot using `ngrok` when running it locally.

In the sample code, in `bot.py`, you'll find the class `ObjectDetectionBot` with a `handle_message()` method that handles incoming messages from end-users.
When a user sends an image to the bot, you have to upload the image to S3 and perform an HTTP request to the `yolo5` service to predict the objects in it.

Complete the `# TODO`s in `bot.py` to achieve this goal (or implement equivalent steps if you use your own bot implementation).
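
For reference, the upload-and-predict flow might look roughly like the sketch below. This is a hedged sketch, not the required implementation: the helper name is hypothetical, `yolo5:8081` assumes the containers share a Docker network where the service is reachable by that name, and `BUCKET_NAME` is assumed to be passed to the bot as an env var.

```python
import os
import boto3
import requests


def predict_objects(photo_path: str) -> str:
    """Upload a local photo to S3, ask the yolo5 service for a prediction,
    and return a short text summary for the Telegram user."""
    img_name = os.path.basename(photo_path)

    # upload the photo to S3 so the yolo5 container can fetch it by name
    boto3.client('s3').upload_file(photo_path, os.environ['BUCKET_NAME'], img_name)

    # trigger the prediction; yolo5 downloads the image from S3 on its side
    resp = requests.post('http://yolo5:8081/predict', params={'imgName': img_name})
    resp.raise_for_status()

    # count the detected objects per class and format a reply
    counts = {}
    for label in resp.json().get('labels', []):
        counts[label['class']] = counts.get(label['class'], 0) + 1
    return '\n'.join(f'{cls}: {count}' for cls, count in counts.items())
```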

Here is an end-to-end example of how it may look when all your microservices are running. Feel free to send the results to the user in any other form.

<img src="../.img/polysample.jpg" width="30%">

## Deploy the service on a single EC2 instance as a Docker Compose project

Create a Docker Compose project in the `docker-compose.yaml` file to provision the service (all 3 microservices) with a single command (`docker compose up`).
Deploy the compose project on a single EC2 instance located in a public subnet (a minimal compose sketch is given after the notes below).

Deployment notes:

- Don't configure your compose file to build the images. Instead, push the `yolo5` and `polybot` images to DockerHub or an [ECR](https://docs.aws.amazon.com/AmazonECR/latest/userguide/getting-started-console.html) repo and use these images.
- Attach an IAM role with the relevant permissions (e.g. read/write access to S3) to the instance. Don't manage AWS credentials yourself, and never hard-code AWS credentials in the `docker-compose.yaml` file.
- Don't hard-code your Telegram token in the compose file; this is sensitive data. [Read here](https://docs.docker.com/compose/use-secrets/) how to provide this data to your compose project in a safe way.
- Use `snyk` to scan your images and fix any HIGH and CRITICAL security vulnerabilities.
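
Here is a minimal sketch of how the compose file might be organized. It is an assumption-laden starting point, not a complete solution: image names and the secret file are placeholders, the single `mongo` service shown here should become your full replica set, and since compose secrets are mounted as files under `/run/secrets/`, the bot code needs a small change to read the token from a file instead of the `TELEGRAM_TOKEN` env var.

```yaml
services:
  mongo:
    image: mongo:5
    command: mongod --replSet myReplicaSet --bind_ip_all
    volumes:
      - mongo_data:/data/db        # persist the data

  yolo5:
    image: <your-dockerhub-user>/yolo5:latest
    environment:
      - BUCKET_NAME=<your-images-bucket>
    depends_on:
      - mongo

  polybot:
    image: <your-dockerhub-user>/polybot:latest
    ports:
      - "8443:8443"
    environment:
      - TELEGRAM_APP_URL=<your-public-url>
    secrets:
      - telegram_token             # available in the container at /run/secrets/telegram_token
    depends_on:
      - yolo5

secrets:
  telegram_token:
    file: ./telegram_token.secret  # keep this file out of version control

volumes:
  mongo_data:
```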

#### Exposing the bot to the Telegram servers

You can expose the polybot to the Telegram servers using Ngrok, as done in the previous exercise (install and launch ngrok on the EC2 instance).

Alternatively, you can use the instance's **public IP address** as the registered bot app URL with the Telegram servers.
This requires some code changes in `polybot/app.py`.

Since the IP address may change, you should retrieve the public IP dynamically when the app launches. You can get the instance's public IP **from within** the instance by:

```python
import requests

# reference https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html
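# note: if the instance allows IMDSv2 only, first request a session token (PUT to
# http://169.254.169.254/latest/api/token) and send it in the 'X-aws-ec2-metadata-token' header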
TELEGRAM_APP_URL = requests.get('http://169.254.169.254/latest/meta-data/public-ipv4').text
```

In addition, your Flask web server should serve HTTPS requests (Telegram doesn't accept insecure HTTP communication).
For that, you should generate a **self-signed certificate** and use it both when running Flask and when setting the webhook in Telegram.
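
One possible way to generate the certificate (a sketch: the file names and subject fields are assumptions, and the `CN` must match the public IP or domain you register with Telegram):

```bash
openssl req -newkey rsa:2048 -sha256 -nodes -x509 -days 365 \
  -keyout polybot.key -out polybot.pem \
  -subj "/C=US/ST=NA/L=NA/O=polybot/CN=<your-instance-public-ip>"
```

You can then serve HTTPS with something like `app.run(host='0.0.0.0', port=8443, ssl_context=('polybot.pem', 'polybot.key'))`, and pass the public certificate to `set_webhook()` (e.g. `certificate=open('polybot.pem', 'r')`) so Telegram accepts the self-signed cert.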

Here is a simple working example:
https://github.com/eternnoir/pyTelegramBotAPI/blob/master/examples/webhook_examples/webhook_flask_echo_bot.py


## Submission

You have to present your work to the course staff in a **15-minute demo**. Your presentation will be evaluated according to the list below, in order of priority:

1. Showcasing a live, working demo of your work. Both locally and in the cloud.
2. Demonstrating deep understanding of the system.
3. Applying best practices and clean work.
4. Successful integration of a new feature, idea, or extension. Be creative!

## Good luck

**`docker_project/polybot/app.py`**

```python
import flask
from flask import request
import os
from bot import ObjectDetectionBot

app = flask.Flask(__name__)

TELEGRAM_TOKEN = os.environ['TELEGRAM_TOKEN']
TELEGRAM_APP_URL = os.environ['TELEGRAM_APP_URL']


@app.route('/', methods=['GET'])
def index():
    return 'Ok'


@app.route(f'/{TELEGRAM_TOKEN}/', methods=['POST'])
def webhook():
    req = request.get_json()
    bot.handle_message(req['message'])
    return 'Ok'


if __name__ == "__main__":
    bot = ObjectDetectionBot(TELEGRAM_TOKEN, TELEGRAM_APP_URL)

    app.run(host='0.0.0.0', port=8443)
```

**`docker_project/polybot/bot.py`**

```python
import telebot
from loguru import logger
import os
import time
from telebot.types import InputFile


class Bot:

    def __init__(self, token, telegram_chat_url):
        # create a new instance of the TeleBot class.
        # all communication with Telegram servers is done using self.telegram_bot_client
        self.telegram_bot_client = telebot.TeleBot(token)

        # remove any existing webhooks configured in Telegram servers
        self.telegram_bot_client.remove_webhook()
        time.sleep(0.5)

        # set the webhook URL
        self.telegram_bot_client.set_webhook(url=f'{telegram_chat_url}/{token}/', timeout=60)

        logger.info(f'Telegram Bot information\n\n{self.telegram_bot_client.get_me()}')

    def send_text(self, chat_id, text):
        self.telegram_bot_client.send_message(chat_id, text)

    def send_text_with_quote(self, chat_id, text, quoted_msg_id):
        self.telegram_bot_client.send_message(chat_id, text, reply_to_message_id=quoted_msg_id)

    def is_current_msg_photo(self, msg):
        return 'photo' in msg

    def download_user_photo(self, msg):
        """
        Downloads the photo sent to the Bot into the `photos` directory (created if it doesn't exist)
        :return: the local file path of the downloaded photo
        """
        if not self.is_current_msg_photo(msg):
            raise RuntimeError("Message content of type 'photo' expected")

        file_info = self.telegram_bot_client.get_file(msg['photo'][-1]['file_id'])
        data = self.telegram_bot_client.download_file(file_info.file_path)
        folder_name = file_info.file_path.split('/')[0]

        if not os.path.exists(folder_name):
            os.makedirs(folder_name)

        with open(file_info.file_path, 'wb') as photo:
            photo.write(data)

        return file_info.file_path

    def send_photo(self, chat_id, img_path):
        if not os.path.exists(img_path):
            raise RuntimeError("Image path doesn't exist")

        self.telegram_bot_client.send_photo(
            chat_id,
            InputFile(img_path)
        )

    def handle_message(self, msg):
        """Bot Main message handler"""
        logger.info(f'Incoming message: {msg}')
        self.send_text(msg['chat']['id'], f'Your original message: {msg["text"]}')


class ObjectDetectionBot(Bot):
    def handle_message(self, msg):
        logger.info(f'Incoming message: {msg}')

        if self.is_current_msg_photo(msg):
            photo_path = self.download_user_photo(msg)

            # TODO upload the photo to S3
            # TODO send a request to the `yolo5` service for prediction
            # TODO send results to the Telegram end-user
```

**`docker_project/polybot/requirements.txt`**

```
pyTelegramBotAPI>=4.12.0
loguru>=0.7.0
requests>=2.31.0
flask>=2.3.2
matplotlib
```

**`docker_project/yolo5/Dockerfile`**

```dockerfile
FROM ultralytics/yolov5:latest-cpu
WORKDIR /usr/src/app
RUN pip install --upgrade pip
COPY requirements.txt .
RUN pip install -r requirements.txt
RUN curl -L https://github.com/ultralytics/yolov5/releases/download/v6.1/yolov5s.pt -o yolov5s.pt

COPY . .

CMD ["python3", "app.py"]
```

**`docker_project/yolo5/app.py`**

```python
import time
from pathlib import Path
from flask import Flask, request
from detect import run
import uuid
import yaml
from loguru import logger
import os

images_bucket = os.environ['BUCKET_NAME']

with open("data/coco128.yaml", "r") as stream:
names = yaml.safe_load(stream)['names']

app = Flask(__name__)

@app.route('/predict', methods=['POST'])
def predict():
    # Generates a UUID for this current prediction HTTP request. This id can be used as a reference in logs to identify and track individual prediction requests.
    prediction_id = str(uuid.uuid4())

    logger.info(f'prediction: {prediction_id}. start processing')

    # Receives a URL parameter representing the image to download from S3
    img_name = request.args.get('imgName')

    # TODO download img_name from S3, store the local image path in original_img_path
    #  The bucket name should be provided as an env var BUCKET_NAME.
    original_img_path = ...

    logger.info(f'prediction: {prediction_id}/{original_img_path}. Download img completed')

    # Predicts the objects in the image
    run(
        weights='yolov5s.pt',
        data='data/coco128.yaml',
        source=original_img_path,
        project='static/data',
        name=prediction_id,
        save_txt=True
    )

    logger.info(f'prediction: {prediction_id}/{original_img_path}. done')

    # This is the path for the predicted image with labels
    # The predicted image typically includes bounding boxes drawn around the detected objects, along with class labels and possibly confidence scores.
    predicted_img_path = Path(f'static/data/{prediction_id}/{original_img_path}')

    # TODO Uploads the predicted image (predicted_img_path) to S3 (be careful not to override the original image).

    # Parse prediction labels and create a summary
    pred_summary_path = Path(f'static/data/{prediction_id}/labels/{original_img_path.split(".")[0]}.txt')
    if pred_summary_path.exists():
        with open(pred_summary_path) as f:
            labels = f.read().splitlines()
            labels = [line.split(' ') for line in labels]
            labels = [{
                'class': names[int(l[0])],
                'cx': float(l[1]),
                'cy': float(l[2]),
                'width': float(l[3]),
                'height': float(l[4]),
            } for l in labels]

        logger.info(f'prediction: {prediction_id}/{original_img_path}. prediction summary:\n\n{labels}')

        prediction_summary = {
            'prediction_id': prediction_id,
            'original_img_path': original_img_path,
            'predicted_img_path': str(predicted_img_path),  # Path objects are not JSON serializable
            'labels': labels,
            'time': time.time()
        }

        # TODO store the prediction_summary in MongoDB

        return prediction_summary
    else:
        return f'prediction: {prediction_id}/{original_img_path}. prediction result not found', 404


if __name__ == "__main__":
    app.run(host='0.0.0.0', port=8081)
```