From 1552ea13cc034cc1a3f9ec687b285bac33ab998b Mon Sep 17 00:00:00 2001 From: Sethupathi Asokan Date: Sun, 18 Aug 2024 20:12:34 +0530 Subject: [PATCH 01/12] Update README.md --- README.md | 331 +++++------------------------------------------------- 1 file changed, 27 insertions(+), 304 deletions(-) diff --git a/README.md b/README.md index 4921494..3864d58 100644 --- a/README.md +++ b/README.md @@ -1,315 +1,38 @@ -# AI Deploy - Tutorial - Deploy Whisper Speech Recognition Model +#README +Voice to text from major languages supported by whisper model, this applicationn will transcibe the uploaded or recored audio to its original language, translate to Englis and and summarise in English. -> [!primary] -> -> AI Deploy is covered by **[OVHcloud Public Cloud Special Conditions](https://storage.gra.cloud.ovh.net/v1/AUTH_325716a587c64897acbef9a4a4726e38/contracts/d2a208c-Conditions_particulieres_OVH_Stack-WE-9.0.pdf)**. -> +**Supported Languages (by whispher model)** +1. Indian Languages: + _Hindi, Kannada, Marathi and Tamil_ -## Introduction +2. Other languages: + _Afrikaans, Arabic, Armenian, Azerbaijani, Belarusian, Bosnian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hebrew, Hungarian, Icelandic, Indonesian, Italian, Japanese, Kazakh, Korean, Latvian, Lithuanian, Macedonian, Malay, Maori, Nepali, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tagalog, Thai, Turkish, Ukrainian, Urdu, Vietnamese, and Welsh._ -[OpenAI's Whisper](https://openai.com/research/whisper) large-v3 is the latest iteration of the open-source Whisper Automatic Speech Recognition (ASR) model. -Released on November 6, 2023, Whisper large-v3 has gained recognition in various benchmarks as the top-performing automatic speech recognition (ASR) model. This is why it represents the state of the art in the ASR field. More generally, Whisper has become a powerful tool for transcribing speech in approximately 100 languages, including non-english speech transcription and translation into English. -![Overview](images/speech-to-text.png){.thumbnail} +**Supported audio format** +_m4a, mp3, webm, mp4, mpga, wav, mpeg_ -## Objective +**Pre-requisite:** +python 3.8.1 or above +Ollam + > curl -fsSL https://ollama.com/install.sh | sh +Llama 3.1 model + > ollama run llama3.1 -In this tutorial, we will guide you through the process of deploying the Whisper large-v3 model in a simple application on AI Deploy for production use. +**Setup** +Clone this github repository +> git clone -Unlike a locally hosted application, using AI Deploy offers extremely fast inference since it is deployed on powerful resources (GPUs), and other features like sharing capabilities. You will also learn how to build and use a custom Docker image as part of the deployment process. 
+Create python virtual environment +> python3 -m venv lingo +> source lingo/bin/activate -## Requirements +Install dependencies +> pip install -r requirements.txton -To build and deploy your Whisper app, you need: +Execute the appllilcation +> python3 app.py -- Access to the [OVHcloud Control Panel](https://www.ovh.com/auth/?action=gotomanager&from=https://www.ovh.co.uk/&ovhSubsidiary=GB) -- An AI Deploy Project created inside a [Public Cloud project](https://www.ovhcloud.com/en-gb/public-cloud/) in your OVHcloud account -- A [user for AI Deploy](/pages/public_cloud/ai_machine_learning/gi_01_manage_users) -- [The OVHcloud AI CLI](/pages/public_cloud/ai_machine_learning/cli_10_howto_install_cli) installed on your local computer -- [Docker](https://www.docker.com/get-started) installed on your local computer, **or** access to a Debian Docker Instance, which is available on the [Public Cloud](https://www.ovh.com/manager/public-cloud/) - -## Instructions - -You are going to follow different steps to deploy our Whisper application: - -- [Whisper Model Sslection](#step-1-whisper-model-selection) -- [Download Whisper Model in the Object Storage](#step-2-download-whisper-model-in-the-object-storage) -- [Whisper app development](#step-3-whisper-app-development) -- [Whisper app deployment](#step-4-whisper-app-deployment) - -### Step 1: Whisper Model selection - -While there are various implementations of the Whisper model (Distil-Whisper, WhisperX, faster-whisper, Whisper JAX, ...), we will use the [original Whisper implementation](https://github.com/openai/whisper) for this tutorial. Feel free to explore other implementations based on your needs (memory constraints and desired inference speed (live transcription or post-transcription)). - -Original Whisper implementation comes in five distinct model sizes (`tiny`, `base`, `small`, `medium`, and `large`), each designed with specific use cases in mind. Each size has an English-only counterpart (`tiny.en` for `tiny` size for example), which demonstrates optimized performance for English applications, outperforming the multilingual variant. - -Memory requirements and relative inference speeds are key considerations when selecting your Whisper model. The tiny and base models, requiring approximately 1 GB of VRAM, offer speeds up to 32x and 16x, respectively, compared to the large model. As the models increase in size, the required VRAM and relative speed change accordingly. - -In this tutorial, we will deploy the latest version of the `large` Whisper model in order to obtain the best possible transcription quality, whatever the language spoken. However, you can easily change to another model. - -### Step 2: Download Whisper Model in the Object Storage - -Once you have selected your model, it is time to download it for use in inference. - -This can be done within the deployed application, by installing the `open-whisper` library and executing the following python code that downloads the Whisper model: - -```bash -pip install -U openai-whisper -``` - -```python -import whisper -model_path = "whisper_model" -model_id = 'large-v3' - -# Download model -model = whisper.load_model(model_id, download_root=model_path) -``` - -As AI Deploy is based on Docker images, it is advisable to avoid directly downloading the model within the Python code and the Docker image for better manageability and flexibility. Moreover, you should not store the Whisper model directly in the Docker image because it will significantly increase the size of the Docker image. 
- -To do things more efficiently, it is better to save the model in a remote storage, like the [OVHcloud Object Storage](https://www.ovhcloud.com/en-gb/public-cloud/object-storage/). This storage will be linked to the AI Deploy app, which will allow the use of the Whisper model within the app. This way, you can easily access the model and make updates without messing with the Docker container itself. - -#### Create an Object Storage bucket for you Whisper model - -You can create your Object Storage bucket using either the UI (OVHcloud Control Panel) or the `ovhai` CLI, which can be downloaded [here](/pages/public_cloud/ai_machine_learning/cli_10_howto_install_cli). - -> [!tabs] -> **Using the Control Panel (UI)** ->> ->> If you do not feel comfortable with commands, this method may be more intuitive. ->> ->> First, go to the `Public Cloud`{.action} section of the [OVHcloud Control Panel](https://www.ovh.com/auth/?action=gotomanager&from=https://www.ovh.co.uk/&ovhSubsidiary=GB). ->> ->> Then, select the `Object Storage`{.action} section (in the Storage category) and create a new object container by clicking `Storage`{.action} > `Object Storage`{.action} > `Create an object container`{.action}. ->> ->> ![image](images/new-object-container.png){.thumbnail} ->> ->> You can create the bucket that will store your Whisper model. Select the container *type* and the *datastore_alias* that match your needs, and name it as you wish. *`GRA` alias and `whisper-model` name will be used in this tutorial.* ->> -> **Using ovhai CLI** ->> ->> To follow this part, make sure you have installed the [ovhai CLI](https://cli.bhs.ai.cloud.ovh.net/) on your computer or on an instance. ->> ->> As in the Control Panel, you will have to specify the `datastore_alias` and the `name` of your bucket. Create your Object Storage bucket as follows: ->> ->> ```bash ->> ovhai bucket create ->> ``` ->> ->> You can access the full alias list by running: ->> ->> ```bash ->> ovhai datastore list ->> ``` ->> ->> To avoid encountering latencies, a good practice is to use the same alias as the one on which you are going to deploy your AI Solutions. ->> ->> *`GRA` alias and `whisper-model` will be used in this tutorial.* ->> ->> For your information, the previous command is applicable to both Swift and S3 buckets. However, it's important to note that for S3 usage, a proper configuration is necessary. If S3 is not configured yet and you wish to use it, please read the [S3 compliance guide](/pages/public_cloud/ai_machine_learning/gi_08_s3_compliance). - -#### Download whisper in the created bucket - -To download the model, we will use AI Training. The created job will be based on an official OVHcloud Docker image `ovhcom/ai-training-pytorch` which is compliant with the product. - -Using 5 CPUs will be enough to download the model, ensuring sufficient memory resources are available. - -Two volumes will be associated with this job: - -- The first volume will contain the `ai-training-examples` [GitHub repository](https://github.com/ovh/ai-training-examples/), containing the Python script used to download the Whisper model. For security, this volume is configured with Read-Only (RO) permission, as we just need access to the Python script, located at `ai-training-examples/apps/streamlit/whisper/download_model.py`. -- The second volume corresponds to the created bucket, where we will store the Whisper model once downloaded. Therefore, this bucket needs to be with Read-Write (RW) permission, since we will write in it. 
Make sure to name it correctly, and specify the alias you used during bucket creation. - -To launch this job, we will use a `pip install` command, to install the necessary libraries (from the `ai-training-examples/apps/streamlit/whisper/requirements.txt` file) and, subsequently, we will execute the python script for Whisper download. - -These steps can be summed up in the following command, which contains all the points mentioned above: - -```bash -ovhai job run ovhcom/ai-training-pytorch \ - --cpu 5 \ - --volume https://github.com/ovh/ai-training-examples.git:/workspace/github_repo:RO \ - --volume whisper-model@GRA/:/workspace/whisper-model:RW \ - -- bash -c 'pip install -r /workspace/github_repo/apps/streamlit/whisper/requirements.txt && python /workspace/github_repo/apps/streamlit/whisper/download_model.py large-v3 /workspace/whisper-model' -``` - -> [!warning] -> **Warning** -> In the last line of the command above, you can see that the `download_model.py` file takes two arguments. -> -> The first one indicates which model you want to download (here `large-v3`). You can change it to one of the followings: `tiny.en`, `tiny`, `base.en`, `base`, `small.en`, `small`, `medium.en`, `medium`, `large-v1`, `large-v2`, `large-v3`, `large`. -> -> The second one indicates where the model will be saved. **It must be the same place** as where you mounted the first volume. This will allow the model to be backed up in the bucket, and not in the job's ephemeral storage. Here, we use the `/workspace/whisper-model` path. -> -> Moreover, you may have to **change the bucket name and datastore_alias** of the last `--volume` parameter, based on the name you gave to it and the alias where you created it. - -The job will then be launched. It will take a few minutes for the two volumes to be added to the job, the environment to be installed and the Whisper model to be downloaded and synchronized with your bucket (`FINALZING` status). - -If you have configured your volumes correctly with the right permissions, and given the right paths to the python script, then you should see your Whisper model in your bucket. This can be checked on the Control Panel, or with the CLI with the following command that will list all the objects of your bucket: - -```bash -`ovhai bucket object list @` -``` - -*Following the example given in this tutorial, we will use: - -```bash -`ovhai bucket object list whisper-model@GRA` -``` - -> [!primary] -> **Why is nothing returned when I run the command?** -> -> If nothing is returned when you list your bucket objects, do not hesitate to check your job logs by running `ovhai job logs ` to see if the model has been downloaded well. -> -> If an exception occured during the model download, it is probably because you do not have enough memory in your AI Training job. To be sure, check the URL monitoring to see if the total memory has been reached. -> -> If the model has been successfully downloaded, but you don't see anything when you list your bucket objects, it may be because the model's output path differs from your attached bucket path. They need to match for the model to be saved in the correct location. Otherwise, the model might not have been saved in the right place. - -### Step 3: Whisper app development - -Now that the Whisper model is in a bucket, you can easily attach it to an AI Deploy app, using the same method as you just did with AI Training. 
- -But in order to interact with the model, we are going to build a very simple application, using [Streamlit](https://streamlit.io/), which will allow a user to upload an audio file and then transcribe it. - -#### Building a Docker image - -As for AI Training, you will need to use a Docker image when using AI Deploy. However you can't use the `ovhcom/ai-training-pytorch` Docker image this time. Indeed, using the Whisper model requires some system packages to be installed, such as `ffmpeg`, which is not the case in the previous image. This is why you need to build a new Docker image, dedicated to your AI project. - -To do this, you will need [Docker](https://www.docker.com/get-started), either installed directly on your computer, or using a Debian Docker Instance, available on the [Public Cloud](https://www.ovh.com/manager/public-cloud/). - -The Dockerfile to use is already provided. Clone the [ai-training-examples GitHub repository](https://github.com/ovh/ai-training-examples/): - -```console -git clone https://github.com/ovh/ai-training-examples.git -``` - -Then run `cd ai-training-examples/apps/streamlit/whisper/` to move to the Whisper app folder. If you run `ls` to list the existing files, you should see several ones, including the following: - -- `app.py`: The python code of the Whisper application. -- `requirements.txt`: Contains the libraries needed by the Whisper application (streamlit, openai-whisper, ...). -- `Dockerfile`: Contains all the commands you could call on the command line to run your application (installing `requirements.txt`, running the python script, ...). - -Then, launch the following command to build your application image. The created image will be named `whisper_app`. You can change it if you wish, but make sure you keep the same identifier throughout the next commands. - -```console -docker build . -t whisper_app:latest -``` - -> [!primary] -> **Command explanation** -> -> - The dot `.` argument indicates that our build context (place of the **Dockerfile** and other needed files) is the current directory. -> -> - The `-t` argument allows us to choose the identifier to give to our image. Usually image identifiers are composed of a **name** and a **version tag** `:`. For this example we chose **whisper_app:latest**. - -During the build process, Docker reads the instructions from the Dockerfile and executes them step by step. For your information, here are the steps of this Dockerfile: - -```dockerfile -# ЁЯР│ Base image (official small Python image) -FROM python:3.10-slim - -# Install missing system packages (ffmpeg is needed for Whisper and is not installed in the small Python image) -RUN apt-get update && \ - apt-get install -y ffmpeg libsndfile1-dev - -# ЁЯС▒ Set the working directory inside the container -WORKDIR /workspace - -# ЁЯРН Copy all folder files (requirements.txt, python code, ...) into the /workspace folder -ADD . /workspace - -# ЁЯУЪ Install the Python dependencies -RUN pip install -r requirements.txt - -# ЁЯФС Give correct access rights to the OVHcloud user -RUN chown -R 42420:42420 /workspace -ENV HOME=/workspace - -# ЁЯМР Set default values for 'model_id' & 'model_path' variables. 
Will be changeable using --env parameter when launching AI Deploy app -ENV MODEL_ID="large-v3" -ENV MODEL_PATH="/workspace/whisper-model" - -# ЁЯЪА Define the command to run the Streamlit application when the container is launched -CMD [ "streamlit" , "run" , "/workspace/app.py", "--server.address=0.0.0.0", "$MODEL_ID", "$MODEL_PATH"] -``` - -#### Push the image into a registry - -Once your image is built, you will need to tag it and push it to a registry. Several registries can be used (OVHcloud Managed Private Registry, Docker Hub, GitHub packages, ...). In this tutorial, we will use the **OVHcloud shared registry**. - -> [!warning] -> **Warning** -> The shared registry should only be used for testing purposes. Please consider creating and attaching your own registry. More information about this can be found [here](/pages/public_cloud/ai_machine_learning/gi_07_manage_registry). The images pushed to this registry are for AI Tools workloads only, and will not be accessible for external uses. - -You can find the address of your shared registry by launching this command: - -```console -ovhai registry list -``` - -Log in on your shared registry with your usual OpenStack credentials: - -```console -docker login -u -p -``` - -Tag the compiled image and push it into your shared registry: - -```console -docker tag whisper_app:latest /whisper_app:latest -docker push /whisper_app:latest -``` - -### Step 4: Whisper app deployment - -Once your image has been pushed, it can be used to deploy new AI solutions. - -Run the following command to deploy your Whisper speech-to-text application by running your customised Docker image: - -```console -ovhai app run /whisper_app:latest \ - --name whisper_app \ - --gpu 1 \ - --default-http-port 8501 \ - --volume whisper-model@GRA/:/workspace/whisper-model:RO \ - --env MODEL_ID="large-v3" \ - --env MODEL_PATH="/workspace/whisper-model" -``` - -> [!primary] -> **Parameters explanation** -> -> - `/whisper_app:latest` is the image on which the app is based. -> -> - `--name whisper_app` is an optional argument that allows you to give your app a custom name, making it easier to manage all your apps. -> -> - `--default-http-port 8501` indicates that the port to reach on the app URL is `8501` (Default Streamlit port). -> -> - `--gpu 1` indicates that we request 1 GPU for that app. -> -> - `--volume` allows you to specify what volume you want to add to your app. As mentioned, we add the **Whisper** bucket we created, in `RO` mode. This means that we will only be able to read the data from these volumes and not modify them. This will enable the model to be used. As before, make sure to use the same bucket name and alias as when you created the bucket. -> -> `--env` parameter allows to set environment variables within the Docker container, that will be accessed in the Python scripts. Two values are specified `MODEL_ID` & `MODEL_PATH`. This will allow the code to load the model from the right place, depending on the one you have chosen (`MODEL_ID`=`"large-v3"`), and the path where you have mounted your Whisper model volume (`MODEL_PATH`=`"/workspace/whisper-model"`). -> -> - Consider adding the `--unsecure-http` attribute if you want your application to be reachable without any authentication. - -Once your application is in `RUNNING` status, you can transcribe audio files using the powerful Whisper model! 
- -![Overview](images/whisper-app-overview.png){.thumbnail} - -## Go further - -This tutorial has walked you through deploying the openAI Whisper model implementation for speech-to-text transcription in a Streamlit app. - -To continue your project, you can add many more features, such as the specification of the spoken language, or options for translating the transcript and many more! - -You could also combine your transcripts with an LLM. Learn how to [Deploy LLaMA 2 on AI Deploy](/pages/public_cloud/ai_machine_learning/deploy_tuto_15_streamlit_chatbot_llama_v2). - -If you need training or technical assistance to implement our solutions, contact your sales representative or click on [this link](https://www.ovhcloud.com/en-gb/professional-services/) to get a quote and ask our Professional Services experts for a custom analysis of your project. - -## Feedback - -Please send us your questions, feedback and suggestions to improve the service: - -- On the OVHcloud [Discord server](https://discord.com/invite/vXVurFfwe9) +It will execute and prints the url in the output console, copy the url and paste in the brwoser + From e986514b4d3741ee67072d336a39278aad342c05 Mon Sep 17 00:00:00 2001 From: Sethupathi Asokan Date: Sun, 18 Aug 2024 20:13:27 +0530 Subject: [PATCH 02/12] Update README.md Updated the format --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 3864d58..6395940 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -#README +# README Voice to text from major languages supported by whisper model, this applicationn will transcibe the uploaded or recored audio to its original language, translate to Englis and and summarise in English. **Supported Languages (by whispher model)** From 80780455fea03190e48baf2c47888b2f96369450 Mon Sep 17 00:00:00 2001 From: Sethupathi Asokan Date: Sun, 18 Aug 2024 20:50:15 +0530 Subject: [PATCH 03/12] Updated the virtual environment steps --- README.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 6395940..baf48db 100644 --- a/README.md +++ b/README.md @@ -25,7 +25,8 @@ Clone this github repository > git clone Create python virtual environment -> python3 -m venv lingo +> python3 -m lingo +Activate the virtual environment > source lingo/bin/activate Install dependencies From f60d7e78fd5dba4f45bfba9a8b28e32cb7b5144c Mon Sep 17 00:00:00 2001 From: Sethupathi Asokan Date: Sun, 18 Aug 2024 20:52:35 +0530 Subject: [PATCH 04/12] Fixed typo --- README.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index baf48db..7eace59 100644 --- a/README.md +++ b/README.md @@ -17,6 +17,7 @@ _m4a, mp3, webm, mp4, mpga, wav, mpeg_ python 3.8.1 or above Ollam > curl -fsSL https://ollama.com/install.sh | sh + Llama 3.1 model > ollama run llama3.1 @@ -26,13 +27,14 @@ Clone this github repository Create python virtual environment > python3 -m lingo + Activate the virtual environment > source lingo/bin/activate Install dependencies -> pip install -r requirements.txton +> pip install -r requirements.txt -Execute the appllilcation +Execute the applilcation > python3 app.py It will execute and prints the url in the output console, copy the url and paste in the brwoser From 8013dad210e840a3e2b28510ef413c43f7b49ec9 Mon Sep 17 00:00:00 2001 From: Sethupathi Asokan Date: Sun, 18 Aug 2024 21:03:09 +0530 Subject: [PATCH 05/12] updated the format --- README.md | 26 +++++++++++++++++++------- 1 file changed, 19 insertions(+), 7 deletions(-) 
diff --git a/README.md b/README.md index 7eace59..c5b0734 100644 --- a/README.md +++ b/README.md @@ -11,31 +11,43 @@ Voice to text from major languages supported by whisper model, this applicationn **Supported audio format** + _m4a, mp3, webm, mp4, mpga, wav, mpeg_ **Pre-requisite:** + python 3.8.1 or above + + sudo apt-get update + sudo apt-get install python3.8.1 + Ollam - > curl -fsSL https://ollama.com/install.sh | sh + + curl -fsSL https://ollama.com/install.sh | sh Llama 3.1 model - > ollama run llama3.1 + + ollama run llama3.1 **Setup** Clone this github repository -> git clone + git clone Create python virtual environment -> python3 -m lingo + + python3 -m lingo Activate the virtual environment -> source lingo/bin/activate + + source lingo/bin/activate Install dependencies -> pip install -r requirements.txt + + pip install -r requirements.txt Execute the applilcation -> python3 app.py + + python3 app.py It will execute and prints the url in the output console, copy the url and paste in the brwoser From 6b21e54854357dcbdfd062dc4a0b1a4ea4179016 Mon Sep 17 00:00:00 2001 From: Sethupathi Asokan Date: Sun, 18 Aug 2024 21:16:16 +0530 Subject: [PATCH 06/12] fixed syntax for virtual environment --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index c5b0734..436314e 100644 --- a/README.md +++ b/README.md @@ -35,7 +35,7 @@ Clone this github repository Create python virtual environment - python3 -m lingo + python3 -m lingo . Activate the virtual environment From ba3ffd02fe0619ef4442c57409ff26dec441adcf Mon Sep 17 00:00:00 2001 From: Sethupathi Asokan Date: Mon, 19 Aug 2024 21:44:27 +0530 Subject: [PATCH 07/12] Revamped the whole project, now you can record or upload a audio file --- transcribe_audio.py | 5 ----- translate_text.py | 17 ----------------- 2 files changed, 22 deletions(-) delete mode 100644 transcribe_audio.py delete mode 100644 translate_text.py diff --git a/transcribe_audio.py b/transcribe_audio.py deleted file mode 100644 index beb3623..0000000 --- a/transcribe_audio.py +++ /dev/null @@ -1,5 +0,0 @@ -def transcribe_audio(model, audio_data): - options = dict(beam_size=5, best_of=5) - transcribe_options = dict(task="transcribe", **options) - transcript = model.transcribe(audio_data, **transcribe_options) - return transcript["text"] diff --git a/translate_text.py b/translate_text.py deleted file mode 100644 index 5c2d5b5..0000000 --- a/translate_text.py +++ /dev/null @@ -1,17 +0,0 @@ -#import openai -from config import openai_api_key - -#openai.api_key = openai_api_key - -def translate_text(text): - prompt = f"Translate the following text from given audio language to English:\n\n{text}" - response = openai.chat.completions.create( - model="gpt-3.5-turbo", - messages=[ - {"role": "system", "content": "You are a helpful assistant that translates text from Indian languages to English."}, - {"role": "user", "content": prompt} - ], - max_tokens=150 - ) - translated_text = response.choices[0].message.content - return translated_text From 127e1ee5f6b14aa0d5b6d674d7d71c9bea7c4cfb Mon Sep 17 00:00:00 2001 From: Sethupathi Asokan Date: Tue, 20 Aug 2024 15:18:00 +0530 Subject: [PATCH 08/12] Revamped the whole project, now you can record or upload a audio file --- app.py | 183 ++++++++++++++++++++++++---------------------- download_model.py | 4 +- load_model.py | 6 +- requirements.txt | 7 +- 4 files changed, 103 insertions(+), 97 deletions(-) diff --git a/app.py b/app.py index 3d86a1b..26ccb3a 100644 --- a/app.py +++ b/app.py @@ -1,95 
+1,104 @@ -import streamlit as st -st.set_page_config(layout="wide") from dotenv import load_dotenv -import os -import numpy as np -import librosa -import io -#import openai -import whisper +import whisper +import gradio as gr +import ollama + # Import configurations and functions from modules from config import openai_api_key, model_id, model_path from load_model import load_model -from transcribe_audio import transcribe_audio -from extract_entities import extract_entities -from translate_text import translate_text # Assuming this is where you translate text +#from extract_entities import extract_entities # Load environment variables load_dotenv() -# Set OpenAI API key -#openai.api_key = openai_api_key - -# Initialize session state variables -if "transcription_text" not in st.session_state: - st.session_state.transcription_text = "" -if "summary" not in st.session_state: - st.session_state.summary = "" -if "detailed_transcription" not in st.session_state: - st.session_state.detailed_transcription = "" -if "show_detailed" not in st.session_state: - st.session_state.show_detailed = False - -# Main function to run the Streamlit app -def main(): - st.markdown("
<h1>Speech to Text App</h1>
", unsafe_allow_html=True) - - # Load the Whisper model - st.write(model_id) - model = load_model(model_id, model_path) - - # File uploader for audio files - st.write("Upload an audio file:") - audio_file = st.file_uploader("Select an audio", type=["mp3", "wav"]) - - audio_data = None - - if audio_file: - # Process uploaded audio file - audio_bytes = audio_file.read() - st.audio(audio_bytes) - audio_file = io.BytesIO(audio_bytes) - try: - audio_data, _ = librosa.load(audio_file, sr=18000) - except Exception as e: - st.error(f"Error loading audio file: {e}") - result = model.transcribe(audio_data) - #result = model.transcribe(audio_data) - st.session_state_detailed_transcription = result.text - # Perform transcription and other tasks on button click - #if audio is not None and st.button("Transcribe"): - #with st.spinner("Transcribing audio..."): - #try: - #st.session_state.transcription_text = transcribe_audio(model, audio_data) - - #with st.spinner("Summarizing..."): - #summary, detailed_transcription = extract_entities(st.session_state.transcription_text) - #st.session_state.summary = summary - #st.session_state.detailed_transcription = st.session_state.transcription_text - #st.session_state.show_detailed = False - #st.rerun() - #except Exception as e: - #st.error(f"Error during transcription or entity extraction: {e}") - - # display summary - #if st.session_state.summary: - #st.write("**Summary:**") - #translated_summary = translate_text(st.session_state.summary) - #st.markdown(translated_summary.replace("\n", " \n")) - - # Button - if st.session_state.summary and st.button("View detailed transcription"): - st.session_state.show_detailed = True - st.rerun() - - # display detailed transcription - if st.session_state.show_detailed: - st.write("Detailed view:") - st.write("**Original language:**") - st.markdown(st.session_state.detailed_transcription.replace("\n", " \n")) - st.write("**Translated to English:**") - translated_detailed = translate_text(st.session_state.detailed_transcription) - st.markdown(translated_detailed.replace("\n", " \n")) - -if __name__ == "__main__": - main() +#Load whisher model +model = load_model(model_id, model_path) + +#transcripe the audio to its original language +def transcribe(audio): + result = model.transcribe(audio) + transcription = result["text"] + #translation = translate_with_whisper(audio) + print("Transcribe: "+transcription) + translation = translate_with_ollama(transcription) + print("Translation: "+translation) + summary = summarize(translation) + return [transcription, translation, summary] + +#translate the audio file to English language using whisper model +def translate_with_whisper(audio): + options = dict(beam_size=5, best_of=5) + translate_options = dict(task="translate", **options) + result = model.transcribe(audio,**translate_options) + return result["text"] + +#translate the text from transciption to English language +def translate_with_ollama(text): + #Uncomment any of the below line to test the translation if don't have an audio file + #text = "рдореБрд▓рд╛рдЦрддрдХрд╛рд░: рдЖрдореНрд╣реА рдЬреЗрдиреАрд╢реА рдмреЛрд▓рдд рдЖрд╣реЛрдд. рддрд░ рддреБрдордЪрд╛ рд╕рдзреНрдпрд╛рдЪрд╛ рд╡реНрдпрд╡рд╕рд╛рдп рдХрд╛рдп рдЖрд╣реЗ?\nрдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдореА рдПрдХ рд╕рд╣рд╛рдпреНрдпрдХ рджрд┐рдЧреНрджрд░реНрд╢рдХ рдЖрд╣реЗ [???] 
рдореБрд│рд╛рдд [???].\nрдореБрд▓рд╛рдЦрддрдХрд░реНрддрд╛: рдЖрдгрд┐ рдореНрд╣рдгреВрди рддреБрдореНрд╣реА рдпрд╛ рд╡рд┐рд╢рд┐рд╖реНрдЯ рдкрджрд╛рд╡рд░ рдХрд┐рддреА рдХрд╛рд│ рдЖрд╣рд╛рдд?\nрдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдЖрдореНрд╣реА рдЬреЗрдиреАрд╢реА рдмреЛрд▓рдд рдЖрд╣реЛрдд. . рддрд░ рддреБрдордЪрд╛ рд╕рдзреНрдпрд╛рдЪрд╛ рд╡реНрдпрд╡рд╕рд╛рдп рдХрд╛рдп рдЖрд╣реЗ?\nрдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдореА рдПрдХ рд╕рд╣рд╛рдпреНрдпрдХ рджрд┐рдЧреНрджрд░реНрд╢рдХ рдЖрд╣реЗ [???] рдореБрд│рд╛рдд [???].\nрдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдЖрдгрд┐ рдореНрд╣рдгреВрди рддреБрдореНрд╣реА рдпрд╛ рд╡рд┐рд╢рд┐рд╖реНрдЯ рдкрджрд╛рд╡рд░ рдХрд┐рддреА рдХрд╛рд│ рдЖрд╣рд╛рдд?\nрдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдореА рдпрд╛ рд╡рд┐рд╢рд┐рд╖реНрдЯ рдкрджрд╛рд╡рд░ рдЖрд╣реЗ. рд╕реНрдерд┐рддреА - рек рд╡рд░реНрд╖реЗ рдЭрд╛рд▓реА рдЖрд╣реЗрдд.\nрдореБрд▓рд╛рдЦрддрдХрд╛рд░: рддреБрдореНрд╣реА рдЖрдЬрдкрд░реНрдпрдВрддрдЪреНрдпрд╛ рддреБрдордЪреНрдпрд╛ рдХрд╛рдорд╛рдЪреЗ рдХрд┐рдВрд╡рд╛ рдХрд░рд┐рдЕрд░рдЪреНрдпрд╛ рдЗрддрд┐рд╣рд╛рд╕рд╛рдЪреЗ рдереЛрдбрдХреНрдпрд╛рдд рд╡рд░реНрдгрди рдХрд░реВ рд╢рдХрддрд╛ рдХрд╛?\nрдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдорд╛рдЭреА рдХрд╛рд░рдХреАрд░реНрдж рдЕрд╢реА рдЖрд╣реЗ рдХреА, рд╕реБрд░реБрд╡рд╛рддреАрдЪреНрдпрд╛ рдХрд╛рд│рд╛рдд рддреА рд╕реНрдкреЗрд╢рд▓рд╛рдпрдЭреЗрд╢рдирдЪреНрдпрд╛ рд╡реЗрдЧрд╡реЗрдЧрд│реНрдпрд╛ рдХреНрд╖реЗрддреНрд░рд╛рдд рдЕрдиреЗрдХ рдХреНрд▓рд┐рдирд┐рдХрд▓ рднреВрдорд┐рдХрд╛рдВрд╕рд╣ рд╕реБрд░реВ рдЭрд╛рд▓реА. рдирд░реНрд╕рд┐рдВрдЧ [???] рдЙрджреНрдпреЛрдЧ рдЖрдгрд┐ [???] рдкреНрд▓реЕрд╕реНрдЯрд┐рдХ [???] рд╕рдВрдкреВрд░реНрдг рднрд┐рдиреНрди рдкреНрд░рдХрд╛рд░ рдЖрдгрд┐ рдирдВрддрд░ рдореА рдкреБрдирд░реНрд╡рд╕рди рдордзреНрдпреЗ рд╣рд▓рд╡рд┐рд▓реЗ. рдореА 15 рд╡рд░реНрд╖реЗ [???] рд╣реЛрддреЛ рдЖрдгрд┐ рдирдВрддрд░ рдореБрд│рд╛рдд рдореА 26 рд╡рд░реНрд╖рд╛рдВрдЪрд╛ рд╣реЛрддреЛ рддреЗрд╡реНрд╣рд╛ рд╡реНрдпрд╡рд╕реНрдерд╛рдкрдирд╛рдЪреНрдпрд╛ рднреВрдорд┐рдХреЗрдд рдЧреЗрд▓реЛ рдЖрдгрд┐ рдЧреЗрд▓реА 27 рд╡рд░реНрд╖реЗ рд╡реНрдпрд╡рд╕реНрдерд╛рдкрдирд╛рдЪреНрдпрд╛ рднреВрдорд┐рдХреЗрдд рдХрд╛рдо рдХрд░рдд рд╣реЛрддреЛ рдЖрдгрд┐ рд╣реЛрдп, рд╕реБрдорд╛рд░реЗ 20 рд╡рд░реНрд╖реЗ рдореА рд╡реНрдпрд╡рд╕реНрдерд╛рдкрдирд╛рдд рдХрд╛рдо рдХрд░рдд рдЖрд╣реЗ рдЖрдгрд┐ рдореБрд│рд╛рдд рддреЗ рдХреЗрд▓реЗ рдкреБрдирд░реНрд╡рд╕рди рд╕рдореБрдкрджреЗрд╢рдирд╛рддреАрд▓ рдПрдХ рдкреБрдирд░реНрд╡рд╕рди рдкрджрд╡реА, рдЬреА рдорд▓рд╛ рд╡рд╛рдЯрддреЗ [???] рдЦреЗрд│рд▓реА, рддреНрдпрд╛рдореБрд│реЗ [???] рдирд░реНрд╕рд┐рдВрдЧрдЪреНрдпрд╛ рдкрд╛рд░рдВрдкрд╛рд░рд┐рдХ рдХреНрд▓рд┐рдирд┐рдХрд▓ рднреВрдорд┐рдХреЗрдкрд╛рд╕реВрди рдХрд╛рд╣реА рдмрд╛рдмрддреАрдд рднрд┐рдиреНрди рд╡рд┐рд▓рдВрдм.\nрдореБрд▓рд╛рдЦрддрдХрд╛рд░: рд╣реЛрдп, рдорд▓рд╛ рдкрд╣рд┐рд▓реА рдЧреЛрд╖реНрдЯ рд╡рд╛рдЯрддреЗ [???] рд╡реНрдпрд╛рдкрдХрдкрдгреЗ рд╕рд╛рдВрдЧрд╛рдпрдЪреЗ рддрд░, рддреБрдореНрд╣реА рдирд░реНрд╕рд┐рдВрдЧрдордзреНрдпреЗ рдЬрд╛рдгреНрдпрд╛рдЪреЗ рдХрд╛ рдирд┐рд╡рдбрд▓реЗ?\nрдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдХрд╛рд░рдг рдорд▓рд╛ рдкреНрд░рд╢рд┐рдХреНрд╖рдг рдЖрдгрд┐ рдиреЛрдХрд░реАрд╕рд╛рдареА рдПрдХрд╛рдЪ рд╡реЗрд│реА рдкреИрд╕реЗ рджрд┐рд▓реЗ рдЧреЗрд▓реЗ. 70 рдЪреНрдпрд╛ рджрд╢рдХрд╛рдд, рдиреЛрдХрд▒реНрдпрд╛ рдЖрдгрд┐ рдиреЛрдХрд░реАрдордзреНрдпреЗ рдЧреЛрд╖реНрдЯреА рдЦреВрдк рд╡реЗрдЧрд│реНрдпрд╛ рд╣реЛрддреНрдпрд╛... 
рдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдореА рдпрд╛ рд╡рд┐рд╢рд┐рд╖реНрдЯ рд╕реНрдерд┐рддреАрдд рдЖрд╣реЗ - рддреНрдпрд╛рд▓рд╛ 4 рд╡рд░реНрд╖реЗ рдЭрд╛рд▓реА рдЖрд╣реЗрдд.]n рдореБрд▓рд╛рдЦрддрдХрд╛рд░: рддреБрдореНрд╣реА рддреБрдордЪреНрдпрд╛ рдХрд╛рдорд╛рдЪреЗ рдХрд┐рдВрд╡рд╛ рдХрд░рд┐рдЕрд░рдЪреНрдпрд╛ рдЗрддрд┐рд╣рд╛рд╕рд╛рдЪреЗ рдереЛрдбрдХреНрдпрд╛рдд рд╡рд░реНрдгрди рдХрд░реВ рд╢рдХрддрд╛ рдХрд╛? рдЖрдЬрдкрд░реНрдпрдВрдд?\nрдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдорд╛рдЭреА рдХрд╛рд░рдХреАрд░реНрдж рдЕрд╢реА рдЖрд╣реЗ рдХреА, рд╕реБрд░реБрд╡рд╛рддреАрд▓рд╛ рдирд░реНрд╕рд┐рдВрдЧ [???] рдЙрджреНрдпреЛрдЧ рдЖрдгрд┐ [???] рдкреНрд▓реЕрд╕реНрдЯрд┐рдХ [???] рддреЗ рд╕рдВрдкреВрд░реНрдг рднрд┐рдиреНрди рдХреНрд╖реЗрддреНрд░рд╛рдд рд╡рд┐рд╢реЗрд╖реАрдХрд░рдгрд╛рдЪреНрдпрд╛ рд╡рд┐рд╡рд┐рдз рдХреНрд╖реЗрддреНрд░рд╛рдВрдордзреНрдпреЗ рдХреНрд▓рд┐рдирд┐рдХрд▓ рднреВрдорд┐рдХрд╛рдВрдкрд╛рд╕реВрди рд╕реБрд░реБрд╡рд╛рдд рдЭрд╛рд▓реА. рд╡рд┐рд╡рд┐рдзрддрд╛ рдЖрдгрд┐ рдирдВрддрд░ рдореА рдкреБрдирд░реНрд╡рд╕рди рдордзреНрдпреЗ рд╣рд▓рд╡рд┐рд▓реЗ. рдореА 15 рд╡рд░реНрд╖реЗ [???] рд╣реЛрддреЛ рдЖрдгрд┐ рдирдВрддрд░ рдореБрд│рд╛рдд рдореА 26 рд╡рд░реНрд╖рд╛рдВрдЪрд╛ рд╣реЛрддреЛ рддреЗрд╡реНрд╣рд╛ рд╡реНрдпрд╡рд╕реНрдерд╛рдкрдирд╛рдЪреНрдпрд╛ рднреВрдорд┐рдХреЗрдд рдЧреЗрд▓реЛ рдЖрдгрд┐ рдЧреЗрд▓реА 27 рд╡рд░реНрд╖реЗ рд╡реНрдпрд╡рд╕реНрдерд╛рдкрдирд╛рдЪреНрдпрд╛ рднреВрдорд┐рдХреЗрдд рдХрд╛рдо рдХрд░рдд рд╣реЛрддреЛ рдЖрдгрд┐ рд╣реЛрдп, рд╕реБрдорд╛рд░реЗ 20 рд╡рд░реНрд╖реЗ рдореА рд╡реНрдпрд╡рд╕реНрдерд╛рдкрдирд╛рдд рдХрд╛рдо рдХрд░рдд рдЖрд╣реЗ рдЖрдгрд┐ рдореБрд│рд╛рдд рддреЗ рдХреЗрд▓реЗ рдкреБрдирд░реНрд╡рд╕рди рд╕рдореБрдкрджреЗрд╢рдирд╛рддреАрд▓ рдПрдХ рдкреБрдирд░реНрд╡рд╕рди рдкрджрд╡реА, рдЬреА рдорд▓рд╛ рд╡рд╛рдЯрддреЗ [???] рдЦреЗрд│рд▓реА, рддреНрдпрд╛рдореБрд│реЗ [???] рдирд░реНрд╕рд┐рдВрдЧрдЪреНрдпрд╛ рдкрд╛рд░рдВрдкрд╛рд░рд┐рдХ рдХреНрд▓рд┐рдирд┐рдХрд▓ рднреВрдорд┐рдХреЗрдкрд╛рд╕реВрди рдХрд╛рд╣реА рдмрд╛рдмрддреАрдд рднрд┐рдиреНрди рд╡рд┐рд▓рдВрдм.\nрдореБрд▓рд╛рдЦрддрдХрд╛рд░: рд╣реЛрдп, рдорд▓рд╛ рдкрд╣рд┐рд▓реА рдЧреЛрд╖реНрдЯ рд╡рд╛рдЯрддреЗ [???] рд╡реНрдпрд╛рдкрдХрдкрдгреЗ рд╕рд╛рдВрдЧрд╛рдпрдЪреЗ рддрд░, рддреБрдореНрд╣реА рдирд░реНрд╕рд┐рдВрдЧрдордзреНрдпреЗ рдЬрд╛рдгреНрдпрд╛рдЪреЗ рдХрд╛ рдирд┐рд╡рдбрд▓реЗ?\nрдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдХрд╛рд░рдг рдорд▓рд╛ рдкреНрд░рд╢рд┐рдХреНрд╖рдг рдЖрдгрд┐ рдиреЛрдХрд░реАрд╕рд╛рдареА рдПрдХрд╛рдЪ рд╡реЗрд│реА рдкреИрд╕реЗ рджрд┐рд▓реЗ рдЧреЗрд▓реЗ. 70 рдЪреНрдпрд╛ рджрд╢рдХрд╛рдд, рдиреЛрдХрд▒реНрдпрд╛ рдЖрдгрд┐ рд░реЛрдЬрдЧрд╛рд░рд╛рдордзреНрдпреЗ рдЧреЛрд╖реНрдЯреА рдЦреВрдк рд╡реЗрдЧрд│реНрдпрд╛ рд╣реЛрддреНрдпрд╛..." + #text="рдирд╛рдЧрдкреВрд░:рдорд╣рд╛рд╡рд┐рдХрд╛рд╕ рдЖрдШрд╛рдбреАрдЪрд╛ рдореБрдВрдмрдИрдд рдореЛрдард╛ рдореЗрд│рд╛рд╡рд╛ рдЭрд╛рд▓рд╛. рдпрд╛ рдореЗрд│рд╛рд╡реНрдпрд╛рдд рдореБрдЦреНрдпрдордВрддреНрд░рд┐рдкрджрд╛рдЪрд╛ рдЙрдореЗрджрд╡рд╛рд░ рдХреЛрдг? рд╣рд╛ рд╡рд┐рд╖рдп рдЬрд╛рд╕реНрдд рдЧрд╛рдЬрд▓рд╛. рдЙрджреНрдзрд╡ рдард╛рдХрд░реЗрдВрдиреА рдореБрдЦреНрдпрдордВрддреНрд░рд┐рдкрджрд╛рдЪрд╛ рдЙрдореЗрджрд╡рд╛рд░ рдХреЛрдг рд╣реЗ рдЬрд╛рд╣реАрд░ рдХрд░рд╛, рдореА рдкрд╛рдареАрдВрдмрд╛ рджреЗрддреЛ рдЕрд╕реЗ рд╡рдХреНрддрд╡реНрдп рдХреЗрд▓реЗ. рдкрдг рддреНрдпрд╛рд▓рд╛ рд░рд╛рд╖реНрдЯреНрд░рд╡рд╛рджреА рдХрд╛рдБрдЧреНрд░реЗрд╕рдЪреЗ рдЕрдзреНрдпрдХреНрд╖ рд╢рд░рдж рдкрд╡рд╛рд░ рдпрд╛рдВрдиреА рдкреНрд░рддрд┐рд╕рд╛рдж рджрд┐рд▓рд╛ рдирд╛рд╣реА. рддреНрдпрд╛рдВрдирд╛ рдпрд╛рд╡рд░ рдмреЛрд▓рдгреЗрдЪ рдЯрд╛рд│рд▓реЗ. 
рддрд░ рдХрд╛рдБрдЧреНрд░реЗрд╕ рдкреНрд░рджреЗрд╢рд╛рдзреНрдпрдХреНрд╖ рдирд╛рдирд╛ рдкрдЯреЛрд▓реЗрдВрдиреА рдорд╛рддреНрд░ рдЖрдзреА рдирд┐рд╡рдбрдгреВрдХ рдЬрд┐рдВрдХреВ рдирдВрддрд░ рдореБрдЦреНрдпрдордВрддреНрд░реА рдард░рд╡реВ рдЕрд╢реА рднреВрдорд┐рдХрд╛ рдорд╛рдВрдбрд▓реА. рдпрд╛рд╡рд░ рдЖрддрд╛ рд╕рдВрдЬрдп рд░рд╛рдКрдд рдпрд╛рдВрдиреА рдкреНрд░рддрд┐рдХреНрд░рд┐рдпрд╛ рджрд┐рд▓реА рдЖрд╣реЗ. рдорд▓рд╛ рдореБрдЦреНрдпрдордВрддреНрд░реА рд╡реНрд╣рд╛рдпрдЪреЗ рдЖрд╣реЗ рдЕрд╕реЗ рдЙрджреНрдзрд╡ рдард╛рдХрд░реЗ рдХрдзреАрдЪ рдмреЛрд▓рд▓реЗ рдирд╛рд╣реАрдд рдЕрд╕рдВ рд░рд╛рдКрдд рдпрд╛рдВрдиреА рд╕реНрдкрд╖реНрдЯ рдХреЗрд▓реЗ рдЖрд╣реЗ. рдорд╛рддреНрд░ рддреНрдпрд╛ рдкреБрдвреЗ рдЬреЗ рд╡рдХреНрддрд╡реНрдп рддреНрдпрд╛рдВрдиреА рдХреЗрд▓рдВ рддреНрдпрд╛рдореБрд│реЗ рдорд╡рд┐рдЖрдордзреНрдпреЗ рдкреБрдиреНрд╣рд╛ рдПрдХрджрд╛ рдард┐рдгрдЧреА рдкрдбрдгреНрдпрд╛рдЪреА рд╢рдХреНрдпрддрд╛ рдЖрд╣реЗ. " + print("In side translate: "+text) + response = ollama.generate(model= "llama3.1", prompt = "Translate the following text to Engolish:"+text+"\n SUMMARY:\n") + translation = response["response"] + return translation + + +#Using Ollama and llama3.1 modle, summarize the English translation +def summarize(text): + #Uncomment the following line to test the summarization if don't the audio file + #text = "Interviewer: We are talking with Jenny. So what is your current occupation?\nInterviewee: I am an assistant director [???] basically [???].\nInterviewer: And so how long you have been in this particular position?\nInterviewer: We are talking with Jenny. So what is your current occupation?\nInterviewee: I am an assistant director [???] basically [???].\nInterviewer: And so how long you have been in this particular position?\nInterviewee: I am in this particular position - it has been 4 years.\nInterviewer: Are you able to briefly describe your work or career history to date?\nInterviewee: My career has been, it initially started with a lot of clinical roles in very different areas of specialization within the nursing [???] industry and [???] plastics [???] to whole different variety and then I moved into rehabs. I was [???] for 15 years and then basically moved into management role when I was 26 and had been working in the management role for the last 27 years and yeah, about 20 years I have been working in management, and basically did a rehab degree in rehab counseling, which I think played a [???], so [???] sort of varying delay in some respects from the traditional clinical role of nursing.\nInterviewer: Yeah, sort of I guess first thing [???] broadly speaking, why did you choose to go into nursing?\nInterviewee: Because I was paid to train and employed at the same time. In the 70s, things were very different in [???] of jobs and employment...nterviewee: I am in this particular position - it has been 4 years.]nInterviewer: Are you able to briefly describe your work or career history to date?\nInterviewee: My career has been, it initially started with a lot of clinical roles in very different areas of specialization within the nursing [???] industry and [???] plastics [???] to whole different variety and then I moved into rehabs. I was [???] for 15 years and then basically moved into management role when I was 26 and had been working in the management role for the last 27 years and yeah, about 20 years I have been working in management, and basically did a rehab degree in rehab counseling, which I think played a [???], so [???] 
sort of varying delay in some respects from the traditional clinical role of nursing.\nInterviewer: Yeah, sort of I guess first thing [???] broadly speaking, why did you choose to go into nursing?\nInterviewee: Because I was paid to train and employed at the same time. In the 70s, things were very different in [???] of jobs and employment..." + #response = ollama.generate(model='llama3.1', prompt="Write a concise summary of the text that cover the key points of the textspacing_size=gr.themes.sizes.spacing_sm, radius_size=gr.themes.sizes.radius_none.\n SUMMARY: \N"+text) + print("In side summrisation: "+text) + response = ollama.generate(model= "llama3.1", prompt = "summarize the following text:"+text+"\n SUMMARY:\n") + summary = response["response"] + return summary + + +#UI with tabs, +theme = gr.themes.Glass(spacing_size="lg", radius_size="lg",primary_hue="blue", font=["Optima","Candara"]) +with gr.Blocks(theme) as block: + #Tab for recording the audio and upload it for transription, translation and summarization + with gr.Tab("Record"): + with gr.Row(): + + inp_audio = gr.Audio( + label="Input Video", + type="filepath", + sources = ["microphone"], + elem_classes=["primary"] + ) + with gr.Row(): + out_transcribe = gr.TextArea(label="Transcipt") + out_translate = gr.TextArea(label="Translate") + with gr.Row(): + out_summary = gr.TextArea(label="Call Summary") + with gr.Row(): + submit_btn = gr.Button("Submit") + + submit_btn.click(transcribe, inputs=[inp_audio], outputs=[out_transcribe,out_translate, out_summary]) + + #Tab for uploading the audio file for transription, translation and summarization + with gr.Tab("Upload"): + with gr.Row(): + + inp_audio_file = gr.File( + label="Upload Audio File", + type="filepath", + file_types=["m4a","mp3","webm","mp4","mpga","wav","mpeg"], + ) + with gr.Row(): + out_transcribe = gr.TextArea(label="Transcipt") + out_translate = gr.TextArea(label="Translate") + with gr.Row(): + out_summary = gr.TextArea(label="Call Summary") + with gr.Row(): + submit_btn = gr.Button("Submit") + + + + submit_btn.click(transcribe, inputs=[inp_audio_file], outputs=[out_transcribe,out_translate, out_summary]) + + + + +block.launch(debug = True) diff --git a/download_model.py b/download_model.py index e7ab869..2398a50 100644 --- a/download_model.py +++ b/download_model.py @@ -2,8 +2,8 @@ # # Check if two command-line arguments are provided if len(sys.argv) !=3: - print("Usage: python main.py ") - print("Example: python main.py large-v3 /workspace/whisper-model/") + print("Usage: python download_model.py ") + print("Example: python download_model.py large-v3 /workspace/whisper-model/") sys.exit(1) # Check if the model path ends with '/' diff --git a/load_model.py b/load_model.py index f2a3417..74b93ac 100644 --- a/load_model.py +++ b/load_model.py @@ -1,15 +1,13 @@ import torch import whisper -import numpy as np -import streamlit as st -@st.cache(allow_output_mutation=True) +#load the whisper model from net if it isn't stored locally def load_model(model_id, model_path): + #check GPU is avaialbe device = "cuda" if torch.cuda.is_available() else "cpu" model = whisper.load_model(model_id, device=device, download_root=model_path) print( f"Model will be run on {device}\n" f"Model is {'multilingual' if model.is_multilingual else 'English-only'} " - f"and has {sum(np.prod(p.shape) for p in model.parameters()):,} parameters." 
) return model diff --git a/requirements.txt b/requirements.txt index d2a2ab1..680842c 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,5 +1,4 @@ openai-whisper -streamlit -librosa -pytest -pytube \ No newline at end of file +ollama +gradio +load_dotenv From 86fa42385f0bbba2e28750a8fe5d83593b459f1a Mon Sep 17 00:00:00 2001 From: Sethupathi Asokan Date: Tue, 20 Aug 2024 17:00:45 +0530 Subject: [PATCH 09/12] Corrected the venv instruction, added ollama links for Mac and windows --- README.md | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 436314e..941bd92 100644 --- a/README.md +++ b/README.md @@ -15,15 +15,19 @@ Voice to text from major languages supported by whisper model, this applicationn _m4a, mp3, webm, mp4, mpga, wav, mpeg_ **Pre-requisite:** - +Note: Following instructions a for linux, python 3.8.1 or above sudo apt-get update sudo apt-get install python3.8.1 Ollam - +For Linux: curl -fsSL https://ollama.com/install.sh | sh +For Mac: + https://ollama.com/download/Ollama-darwin.zip +For Windows + https://ollama.com/download/OllamaSetup.exe Llama 3.1 model @@ -35,7 +39,7 @@ Clone this github repository Create python virtual environment - python3 -m lingo . + python3 -m venv lingo . Activate the virtual environment @@ -49,5 +53,5 @@ Execute the applilcation python3 app.py -It will execute and prints the url in the output console, copy the url and paste in the brwoser +It will execute and prints the url, http://localhost:77434, in the output console, copy the url and paste in the brwoser From 9c14d3928da742c68934c4ed1c425aff9d484db0 Mon Sep 17 00:00:00 2001 From: Sethupathi Asokan Date: Tue, 20 Aug 2024 17:07:05 +0530 Subject: [PATCH 10/12] Added prerequisite for ffmpeg --- README.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/README.md b/README.md index 941bd92..444bf01 100644 --- a/README.md +++ b/README.md @@ -21,6 +21,9 @@ python 3.8.1 or above sudo apt-get update sudo apt-get install python3.8.1 +ffmpeg + sudo apt update && sudo apt install ffmpeg + Ollam For Linux: curl -fsSL https://ollama.com/install.sh | sh From 7b5e0d76f5216840b54aee511866aef5ab4622d8 Mon Sep 17 00:00:00 2001 From: Sethupathi Asokan Date: Tue, 20 Aug 2024 17:09:00 +0530 Subject: [PATCH 11/12] Updated the format --- README.md | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 444bf01..7130e8a 100644 --- a/README.md +++ b/README.md @@ -22,14 +22,21 @@ python 3.8.1 or above sudo apt-get install python3.8.1 ffmpeg + sudo apt update && sudo apt install ffmpeg Ollam + For Linux: + curl -fsSL https://ollama.com/install.sh | sh + For Mac: + https://ollama.com/download/Ollama-darwin.zip -For Windows + +For Windows: + https://ollama.com/download/OllamaSetup.exe Llama 3.1 model From 43ca4896820d4282fa430654e76e7e6271350a6e Mon Sep 17 00:00:00 2001 From: Sethupathi Asokan Date: Tue, 20 Aug 2024 17:22:04 +0530 Subject: [PATCH 12/12] removed print statements and corrected typo --- app.py | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/app.py b/app.py index 26ccb3a..3624f34 100644 --- a/app.py +++ b/app.py @@ -19,9 +19,7 @@ def transcribe(audio): result = model.transcribe(audio) transcription = result["text"] #translation = translate_with_whisper(audio) - print("Transcribe: "+transcription) translation = translate_with_ollama(transcription) - print("Translation: "+translation) summary = summarize(translation) return [transcription, translation, summary] @@ -37,8 +35,7 @@ def 
translate_with_ollama(text): #Uncomment any of the below line to test the translation if don't have an audio file #text = "рдореБрд▓рд╛рдЦрддрдХрд╛рд░: рдЖрдореНрд╣реА рдЬреЗрдиреАрд╢реА рдмреЛрд▓рдд рдЖрд╣реЛрдд. рддрд░ рддреБрдордЪрд╛ рд╕рдзреНрдпрд╛рдЪрд╛ рд╡реНрдпрд╡рд╕рд╛рдп рдХрд╛рдп рдЖрд╣реЗ?\nрдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдореА рдПрдХ рд╕рд╣рд╛рдпреНрдпрдХ рджрд┐рдЧреНрджрд░реНрд╢рдХ рдЖрд╣реЗ [???] рдореБрд│рд╛рдд [???].\nрдореБрд▓рд╛рдЦрддрдХрд░реНрддрд╛: рдЖрдгрд┐ рдореНрд╣рдгреВрди рддреБрдореНрд╣реА рдпрд╛ рд╡рд┐рд╢рд┐рд╖реНрдЯ рдкрджрд╛рд╡рд░ рдХрд┐рддреА рдХрд╛рд│ рдЖрд╣рд╛рдд?\nрдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдЖрдореНрд╣реА рдЬреЗрдиреАрд╢реА рдмреЛрд▓рдд рдЖрд╣реЛрдд. . рддрд░ рддреБрдордЪрд╛ рд╕рдзреНрдпрд╛рдЪрд╛ рд╡реНрдпрд╡рд╕рд╛рдп рдХрд╛рдп рдЖрд╣реЗ?\nрдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдореА рдПрдХ рд╕рд╣рд╛рдпреНрдпрдХ рджрд┐рдЧреНрджрд░реНрд╢рдХ рдЖрд╣реЗ [???] рдореБрд│рд╛рдд [???].\nрдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдЖрдгрд┐ рдореНрд╣рдгреВрди рддреБрдореНрд╣реА рдпрд╛ рд╡рд┐рд╢рд┐рд╖реНрдЯ рдкрджрд╛рд╡рд░ рдХрд┐рддреА рдХрд╛рд│ рдЖрд╣рд╛рдд?\nрдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдореА рдпрд╛ рд╡рд┐рд╢рд┐рд╖реНрдЯ рдкрджрд╛рд╡рд░ рдЖрд╣реЗ. рд╕реНрдерд┐рддреА - рек рд╡рд░реНрд╖реЗ рдЭрд╛рд▓реА рдЖрд╣реЗрдд.\nрдореБрд▓рд╛рдЦрддрдХрд╛рд░: рддреБрдореНрд╣реА рдЖрдЬрдкрд░реНрдпрдВрддрдЪреНрдпрд╛ рддреБрдордЪреНрдпрд╛ рдХрд╛рдорд╛рдЪреЗ рдХрд┐рдВрд╡рд╛ рдХрд░рд┐рдЕрд░рдЪреНрдпрд╛ рдЗрддрд┐рд╣рд╛рд╕рд╛рдЪреЗ рдереЛрдбрдХреНрдпрд╛рдд рд╡рд░реНрдгрди рдХрд░реВ рд╢рдХрддрд╛ рдХрд╛?\nрдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдорд╛рдЭреА рдХрд╛рд░рдХреАрд░реНрдж рдЕрд╢реА рдЖрд╣реЗ рдХреА, рд╕реБрд░реБрд╡рд╛рддреАрдЪреНрдпрд╛ рдХрд╛рд│рд╛рдд рддреА рд╕реНрдкреЗрд╢рд▓рд╛рдпрдЭреЗрд╢рдирдЪреНрдпрд╛ рд╡реЗрдЧрд╡реЗрдЧрд│реНрдпрд╛ рдХреНрд╖реЗрддреНрд░рд╛рдд рдЕрдиреЗрдХ рдХреНрд▓рд┐рдирд┐рдХрд▓ рднреВрдорд┐рдХрд╛рдВрд╕рд╣ рд╕реБрд░реВ рдЭрд╛рд▓реА. рдирд░реНрд╕рд┐рдВрдЧ [???] рдЙрджреНрдпреЛрдЧ рдЖрдгрд┐ [???] рдкреНрд▓реЕрд╕реНрдЯрд┐рдХ [???] рд╕рдВрдкреВрд░реНрдг рднрд┐рдиреНрди рдкреНрд░рдХрд╛рд░ рдЖрдгрд┐ рдирдВрддрд░ рдореА рдкреБрдирд░реНрд╡рд╕рди рдордзреНрдпреЗ рд╣рд▓рд╡рд┐рд▓реЗ. рдореА 15 рд╡рд░реНрд╖реЗ [???] рд╣реЛрддреЛ рдЖрдгрд┐ рдирдВрддрд░ рдореБрд│рд╛рдд рдореА 26 рд╡рд░реНрд╖рд╛рдВрдЪрд╛ рд╣реЛрддреЛ рддреЗрд╡реНрд╣рд╛ рд╡реНрдпрд╡рд╕реНрдерд╛рдкрдирд╛рдЪреНрдпрд╛ рднреВрдорд┐рдХреЗрдд рдЧреЗрд▓реЛ рдЖрдгрд┐ рдЧреЗрд▓реА 27 рд╡рд░реНрд╖реЗ рд╡реНрдпрд╡рд╕реНрдерд╛рдкрдирд╛рдЪреНрдпрд╛ рднреВрдорд┐рдХреЗрдд рдХрд╛рдо рдХрд░рдд рд╣реЛрддреЛ рдЖрдгрд┐ рд╣реЛрдп, рд╕реБрдорд╛рд░реЗ 20 рд╡рд░реНрд╖реЗ рдореА рд╡реНрдпрд╡рд╕реНрдерд╛рдкрдирд╛рдд рдХрд╛рдо рдХрд░рдд рдЖрд╣реЗ рдЖрдгрд┐ рдореБрд│рд╛рдд рддреЗ рдХреЗрд▓реЗ рдкреБрдирд░реНрд╡рд╕рди рд╕рдореБрдкрджреЗрд╢рдирд╛рддреАрд▓ рдПрдХ рдкреБрдирд░реНрд╡рд╕рди рдкрджрд╡реА, рдЬреА рдорд▓рд╛ рд╡рд╛рдЯрддреЗ [???] рдЦреЗрд│рд▓реА, рддреНрдпрд╛рдореБрд│реЗ [???] рдирд░реНрд╕рд┐рдВрдЧрдЪреНрдпрд╛ рдкрд╛рд░рдВрдкрд╛рд░рд┐рдХ рдХреНрд▓рд┐рдирд┐рдХрд▓ рднреВрдорд┐рдХреЗрдкрд╛рд╕реВрди рдХрд╛рд╣реА рдмрд╛рдмрддреАрдд рднрд┐рдиреНрди рд╡рд┐рд▓рдВрдм.\nрдореБрд▓рд╛рдЦрддрдХрд╛рд░: рд╣реЛрдп, рдорд▓рд╛ рдкрд╣рд┐рд▓реА рдЧреЛрд╖реНрдЯ рд╡рд╛рдЯрддреЗ [???] 
рд╡реНрдпрд╛рдкрдХрдкрдгреЗ рд╕рд╛рдВрдЧрд╛рдпрдЪреЗ рддрд░, рддреБрдореНрд╣реА рдирд░реНрд╕рд┐рдВрдЧрдордзреНрдпреЗ рдЬрд╛рдгреНрдпрд╛рдЪреЗ рдХрд╛ рдирд┐рд╡рдбрд▓реЗ?\nрдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдХрд╛рд░рдг рдорд▓рд╛ рдкреНрд░рд╢рд┐рдХреНрд╖рдг рдЖрдгрд┐ рдиреЛрдХрд░реАрд╕рд╛рдареА рдПрдХрд╛рдЪ рд╡реЗрд│реА рдкреИрд╕реЗ рджрд┐рд▓реЗ рдЧреЗрд▓реЗ. 70 рдЪреНрдпрд╛ рджрд╢рдХрд╛рдд, рдиреЛрдХрд▒реНрдпрд╛ рдЖрдгрд┐ рдиреЛрдХрд░реАрдордзреНрдпреЗ рдЧреЛрд╖реНрдЯреА рдЦреВрдк рд╡реЗрдЧрд│реНрдпрд╛ рд╣реЛрддреНрдпрд╛... рдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдореА рдпрд╛ рд╡рд┐рд╢рд┐рд╖реНрдЯ рд╕реНрдерд┐рддреАрдд рдЖрд╣реЗ - рддреНрдпрд╛рд▓рд╛ 4 рд╡рд░реНрд╖реЗ рдЭрд╛рд▓реА рдЖрд╣реЗрдд.]n рдореБрд▓рд╛рдЦрддрдХрд╛рд░: рддреБрдореНрд╣реА рддреБрдордЪреНрдпрд╛ рдХрд╛рдорд╛рдЪреЗ рдХрд┐рдВрд╡рд╛ рдХрд░рд┐рдЕрд░рдЪреНрдпрд╛ рдЗрддрд┐рд╣рд╛рд╕рд╛рдЪреЗ рдереЛрдбрдХреНрдпрд╛рдд рд╡рд░реНрдгрди рдХрд░реВ рд╢рдХрддрд╛ рдХрд╛? рдЖрдЬрдкрд░реНрдпрдВрдд?\nрдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдорд╛рдЭреА рдХрд╛рд░рдХреАрд░реНрдж рдЕрд╢реА рдЖрд╣реЗ рдХреА, рд╕реБрд░реБрд╡рд╛рддреАрд▓рд╛ рдирд░реНрд╕рд┐рдВрдЧ [???] рдЙрджреНрдпреЛрдЧ рдЖрдгрд┐ [???] рдкреНрд▓реЕрд╕реНрдЯрд┐рдХ [???] рддреЗ рд╕рдВрдкреВрд░реНрдг рднрд┐рдиреНрди рдХреНрд╖реЗрддреНрд░рд╛рдд рд╡рд┐рд╢реЗрд╖реАрдХрд░рдгрд╛рдЪреНрдпрд╛ рд╡рд┐рд╡рд┐рдз рдХреНрд╖реЗрддреНрд░рд╛рдВрдордзреНрдпреЗ рдХреНрд▓рд┐рдирд┐рдХрд▓ рднреВрдорд┐рдХрд╛рдВрдкрд╛рд╕реВрди рд╕реБрд░реБрд╡рд╛рдд рдЭрд╛рд▓реА. рд╡рд┐рд╡рд┐рдзрддрд╛ рдЖрдгрд┐ рдирдВрддрд░ рдореА рдкреБрдирд░реНрд╡рд╕рди рдордзреНрдпреЗ рд╣рд▓рд╡рд┐рд▓реЗ. рдореА 15 рд╡рд░реНрд╖реЗ [???] рд╣реЛрддреЛ рдЖрдгрд┐ рдирдВрддрд░ рдореБрд│рд╛рдд рдореА 26 рд╡рд░реНрд╖рд╛рдВрдЪрд╛ рд╣реЛрддреЛ рддреЗрд╡реНрд╣рд╛ рд╡реНрдпрд╡рд╕реНрдерд╛рдкрдирд╛рдЪреНрдпрд╛ рднреВрдорд┐рдХреЗрдд рдЧреЗрд▓реЛ рдЖрдгрд┐ рдЧреЗрд▓реА 27 рд╡рд░реНрд╖реЗ рд╡реНрдпрд╡рд╕реНрдерд╛рдкрдирд╛рдЪреНрдпрд╛ рднреВрдорд┐рдХреЗрдд рдХрд╛рдо рдХрд░рдд рд╣реЛрддреЛ рдЖрдгрд┐ рд╣реЛрдп, рд╕реБрдорд╛рд░реЗ 20 рд╡рд░реНрд╖реЗ рдореА рд╡реНрдпрд╡рд╕реНрдерд╛рдкрдирд╛рдд рдХрд╛рдо рдХрд░рдд рдЖрд╣реЗ рдЖрдгрд┐ рдореБрд│рд╛рдд рддреЗ рдХреЗрд▓реЗ рдкреБрдирд░реНрд╡рд╕рди рд╕рдореБрдкрджреЗрд╢рдирд╛рддреАрд▓ рдПрдХ рдкреБрдирд░реНрд╡рд╕рди рдкрджрд╡реА, рдЬреА рдорд▓рд╛ рд╡рд╛рдЯрддреЗ [???] рдЦреЗрд│рд▓реА, рддреНрдпрд╛рдореБрд│реЗ [???] рдирд░реНрд╕рд┐рдВрдЧрдЪреНрдпрд╛ рдкрд╛рд░рдВрдкрд╛рд░рд┐рдХ рдХреНрд▓рд┐рдирд┐рдХрд▓ рднреВрдорд┐рдХреЗрдкрд╛рд╕реВрди рдХрд╛рд╣реА рдмрд╛рдмрддреАрдд рднрд┐рдиреНрди рд╡рд┐рд▓рдВрдм.\nрдореБрд▓рд╛рдЦрддрдХрд╛рд░: рд╣реЛрдп, рдорд▓рд╛ рдкрд╣рд┐рд▓реА рдЧреЛрд╖реНрдЯ рд╡рд╛рдЯрддреЗ [???] рд╡реНрдпрд╛рдкрдХрдкрдгреЗ рд╕рд╛рдВрдЧрд╛рдпрдЪреЗ рддрд░, рддреБрдореНрд╣реА рдирд░реНрд╕рд┐рдВрдЧрдордзреНрдпреЗ рдЬрд╛рдгреНрдпрд╛рдЪреЗ рдХрд╛ рдирд┐рд╡рдбрд▓реЗ?\nрдореБрд▓рд╛рдЦрдд рдШреЗрдгрд╛рд░рд╛: рдХрд╛рд░рдг рдорд▓рд╛ рдкреНрд░рд╢рд┐рдХреНрд╖рдг рдЖрдгрд┐ рдиреЛрдХрд░реАрд╕рд╛рдареА рдПрдХрд╛рдЪ рд╡реЗрд│реА рдкреИрд╕реЗ рджрд┐рд▓реЗ рдЧреЗрд▓реЗ. 70 рдЪреНрдпрд╛ рджрд╢рдХрд╛рдд, рдиреЛрдХрд▒реНрдпрд╛ рдЖрдгрд┐ рд░реЛрдЬрдЧрд╛рд░рд╛рдордзреНрдпреЗ рдЧреЛрд╖реНрдЯреА рдЦреВрдк рд╡реЗрдЧрд│реНрдпрд╛ рд╣реЛрддреНрдпрд╛..." #text="рдирд╛рдЧрдкреВрд░:рдорд╣рд╛рд╡рд┐рдХрд╛рд╕ рдЖрдШрд╛рдбреАрдЪрд╛ рдореБрдВрдмрдИрдд рдореЛрдард╛ рдореЗрд│рд╛рд╡рд╛ рдЭрд╛рд▓рд╛. рдпрд╛ рдореЗрд│рд╛рд╡реНрдпрд╛рдд рдореБрдЦреНрдпрдордВрддреНрд░рд┐рдкрджрд╛рдЪрд╛ рдЙрдореЗрджрд╡рд╛рд░ рдХреЛрдг? рд╣рд╛ рд╡рд┐рд╖рдп рдЬрд╛рд╕реНрдд рдЧрд╛рдЬрд▓рд╛. 
рдЙрджреНрдзрд╡ рдард╛рдХрд░реЗрдВрдиреА рдореБрдЦреНрдпрдордВрддреНрд░рд┐рдкрджрд╛рдЪрд╛ рдЙрдореЗрджрд╡рд╛рд░ рдХреЛрдг рд╣реЗ рдЬрд╛рд╣реАрд░ рдХрд░рд╛, рдореА рдкрд╛рдареАрдВрдмрд╛ рджреЗрддреЛ рдЕрд╕реЗ рд╡рдХреНрддрд╡реНрдп рдХреЗрд▓реЗ. рдкрдг рддреНрдпрд╛рд▓рд╛ рд░рд╛рд╖реНрдЯреНрд░рд╡рд╛рджреА рдХрд╛рдБрдЧреНрд░реЗрд╕рдЪреЗ рдЕрдзреНрдпрдХреНрд╖ рд╢рд░рдж рдкрд╡рд╛рд░ рдпрд╛рдВрдиреА рдкреНрд░рддрд┐рд╕рд╛рдж рджрд┐рд▓рд╛ рдирд╛рд╣реА. рддреНрдпрд╛рдВрдирд╛ рдпрд╛рд╡рд░ рдмреЛрд▓рдгреЗрдЪ рдЯрд╛рд│рд▓реЗ. рддрд░ рдХрд╛рдБрдЧреНрд░реЗрд╕ рдкреНрд░рджреЗрд╢рд╛рдзреНрдпрдХреНрд╖ рдирд╛рдирд╛ рдкрдЯреЛрд▓реЗрдВрдиреА рдорд╛рддреНрд░ рдЖрдзреА рдирд┐рд╡рдбрдгреВрдХ рдЬрд┐рдВрдХреВ рдирдВрддрд░ рдореБрдЦреНрдпрдордВрддреНрд░реА рдард░рд╡реВ рдЕрд╢реА рднреВрдорд┐рдХрд╛ рдорд╛рдВрдбрд▓реА. рдпрд╛рд╡рд░ рдЖрддрд╛ рд╕рдВрдЬрдп рд░рд╛рдКрдд рдпрд╛рдВрдиреА рдкреНрд░рддрд┐рдХреНрд░рд┐рдпрд╛ рджрд┐рд▓реА рдЖрд╣реЗ. рдорд▓рд╛ рдореБрдЦреНрдпрдордВрддреНрд░реА рд╡реНрд╣рд╛рдпрдЪреЗ рдЖрд╣реЗ рдЕрд╕реЗ рдЙрджреНрдзрд╡ рдард╛рдХрд░реЗ рдХрдзреАрдЪ рдмреЛрд▓рд▓реЗ рдирд╛рд╣реАрдд рдЕрд╕рдВ рд░рд╛рдКрдд рдпрд╛рдВрдиреА рд╕реНрдкрд╖реНрдЯ рдХреЗрд▓реЗ рдЖрд╣реЗ. рдорд╛рддреНрд░ рддреНрдпрд╛ рдкреБрдвреЗ рдЬреЗ рд╡рдХреНрддрд╡реНрдп рддреНрдпрд╛рдВрдиреА рдХреЗрд▓рдВ рддреНрдпрд╛рдореБрд│реЗ рдорд╡рд┐рдЖрдордзреНрдпреЗ рдкреБрдиреНрд╣рд╛ рдПрдХрджрд╛ рдард┐рдгрдЧреА рдкрдбрдгреНрдпрд╛рдЪреА рд╢рдХреНрдпрддрд╛ рдЖрд╣реЗ. " - print("In side translate: "+text) - response = ollama.generate(model= "llama3.1", prompt = "Translate the following text to Engolish:"+text+"\n SUMMARY:\n") + response = ollama.generate(model= "llama3.1", prompt = "Translate the following text to English:"+text+"\n SUMMARY:\n") translation = response["response"] return translation @@ -48,7 +45,6 @@ def summarize(text): #Uncomment the following line to test the summarization if don't the audio file #text = "Interviewer: We are talking with Jenny. So what is your current occupation?\nInterviewee: I am an assistant director [???] basically [???].\nInterviewer: And so how long you have been in this particular position?\nInterviewer: We are talking with Jenny. So what is your current occupation?\nInterviewee: I am an assistant director [???] basically [???].\nInterviewer: And so how long you have been in this particular position?\nInterviewee: I am in this particular position - it has been 4 years.\nInterviewer: Are you able to briefly describe your work or career history to date?\nInterviewee: My career has been, it initially started with a lot of clinical roles in very different areas of specialization within the nursing [???] industry and [???] plastics [???] to whole different variety and then I moved into rehabs. I was [???] for 15 years and then basically moved into management role when I was 26 and had been working in the management role for the last 27 years and yeah, about 20 years I have been working in management, and basically did a rehab degree in rehab counseling, which I think played a [???], so [???] sort of varying delay in some respects from the traditional clinical role of nursing.\nInterviewer: Yeah, sort of I guess first thing [???] broadly speaking, why did you choose to go into nursing?\nInterviewee: Because I was paid to train and employed at the same time. In the 70s, things were very different in [???] 
of jobs and employment...nterviewee: I am in this particular position - it has been 4 years.]nInterviewer: Are you able to briefly describe your work or career history to date?\nInterviewee: My career has been, it initially started with a lot of clinical roles in very different areas of specialization within the nursing [???] industry and [???] plastics [???] to whole different variety and then I moved into rehabs. I was [???] for 15 years and then basically moved into management role when I was 26 and had been working in the management role for the last 27 years and yeah, about 20 years I have been working in management, and basically did a rehab degree in rehab counseling, which I think played a [???], so [???] sort of varying delay in some respects from the traditional clinical role of nursing.\nInterviewer: Yeah, sort of I guess first thing [???] broadly speaking, why did you choose to go into nursing?\nInterviewee: Because I was paid to train and employed at the same time. In the 70s, things were very different in [???] of jobs and employment..." #response = ollama.generate(model='llama3.1', prompt="Write a concise summary of the text that cover the key points of the textspacing_size=gr.themes.sizes.spacing_sm, radius_size=gr.themes.sizes.radius_none.\n SUMMARY: \N"+text) - print("In side summrisation: "+text) response = ollama.generate(model= "llama3.1", prompt = "summarize the following text:"+text+"\n SUMMARY:\n") summary = response["response"] return summary
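
A note on the `config` module: `app.py` imports `openai_api_key`, `model_id`, and `model_path` from `config`, but no `config.py` appears anywhere in this patch series. Below is a minimal sketch of what such a module might look like, assuming the values are read from environment variables loaded via `python-dotenv`; the variable names and default values are assumptions for illustration, not taken from the patches.

```python
# config.py (hypothetical sketch, not part of the patch series)
# app.py imports openai_api_key, model_id and model_path from this module.
import os

from dotenv import load_dotenv

# Read variables from a local .env file if one exists.
load_dotenv()

# Only needed if the OpenAI-based translation path is re-enabled;
# the final app.py translates and summarizes with Ollama instead.
openai_api_key = os.getenv("OPENAI_API_KEY", "")

# Whisper model name and the directory used as its download/cache root,
# passed by load_model.py to whisper.load_model(..., download_root=model_path).
model_id = os.getenv("MODEL_ID", "large-v3")
model_path = os.getenv("MODEL_PATH", "./whisper-model")
```

With a module along these lines in place, `python3 app.py` can resolve the model location from the environment rather than hard-coded paths.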