
Faster GPU execution #7

Open

Andie-Squirrel opened this issue Mar 4, 2023 · 0 comments
Not necessarily an issue, but I didn't know where else to post this (I'm still new to GitHub conventions).

After prompting ChatGPT, I got this code, which cut the processing time considerably:

from transformers import pipeline, set_seed
import torch
from flask import Flask, request, render_template, redirect


app = Flask(__name__)

# Set the secret key for the session
app.secret_key = 'your-secret-key'

MODEL_NAME = "facebook/opt-125m" 

# Initialize the chat history
history = ["Human: Can you tell me the weather forecast for tomorrow?\nBot: Try checking a weather app like a normal person.\nHuman: Can you help me find a good restaurant in the area\nBot: Try asking someone with a functioning sense of taste.\n"]
# Use the first available GPU, falling back to CPU on machines without CUDA
generator = pipeline('text-generation', model=MODEL_NAME, do_sample=True, device=0 if torch.cuda.is_available() else -1)


# Define the chatbot logic
def chatbot_response(input_text, history):
    # Concatenate the input text and history list
    input_text = "\n".join(history) + "\nHuman: " + input_text + " Bot: "
    set_seed(32)  # fixed seed keeps the sampling reproducible between requests
    # max_length counts prompt + generated tokens, so a long chat history shrinks the reply budget
    response_text = generator(input_text, max_length=1024, num_beams=1, num_return_sequences=1)[0]['generated_text']
    # Extract the bot's response from the generated text
    response_text = response_text.split("Bot:")[-1]
    # Cut off any "Human:" or "human:" parts from the response
    response_text = response_text.split("Human:")[0]
    response_text = response_text.split("human:")[0]
    return response_text


@app.route('/', methods=['GET', 'POST'])
def index():
    global history  # use the module-level chat history
    if request.method == 'POST':
        input_text = request.form['input_text']
        response_text = chatbot_response(input_text, history)
        # Append the input and response to the chat history
        history.append(f"Human: {input_text}")
        history.append(f"Bot: {response_text}")
    else:
        input_text = ''
        response_text = ''
    # Render the template with the updated chat history
    return render_template('index.html', input_text=input_text, response_text=response_text, history=history)


@app.route('/reset', methods=['POST'])
def reset():
    global history  # rebind the module-level chat history
    history = ["Bot: Hello, how can I help you today? I am a chatbot designed to assist with a variety of tasks and answer questions. You can ask me about anything from general knowledge to specific topics, and I will do my best to provide a helpful and accurate response. Please go ahead and ask me your first question.\n"]
    # Redirect to the chat page
    return redirect('/')


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5001)

This way it makes better use of the GPU instead of running the model on the CPU out of system RAM.

The only difference I see is the device=0 argument in the pipeline(...) call (line 16), which loads the model onto the first GPU instead of leaving it on the CPU.
