diff --git a/backend/promtfoo/README.md b/backend/promtfoo/README.md new file mode 100644 index 00000000..cf3b4fda --- /dev/null +++ b/backend/promtfoo/README.md @@ -0,0 +1,21 @@ +# Promptfoo + +Promptfoo is a CLI and library for evaluating and red-teaming LLM apps. + +See https://www.promptfoo.dev/docs/intro/ + +## Setup + +### Install Promptfoo +Install promptfoo by running `npm install -g promptfoo` + +### Activate Python venv +Promptfoo must be run in a Python virtual environment as Python is used to load the jinja prompt templates. +See [Running Locally](../README.md) + +## Run Promptfoo +Promptfoo configuration (e.g. LLM model) can be set in `promptfooconfig.yaml` + +* Use `promptfoo eval` to run all promptfoo tests. +* Use `promptfoo eval -c generate_message_suggestions_config.yaml` to run a specific test suite. +* Use `promptfoo view` to view the results in the browser. diff --git a/backend/src/prompts/generate_message_suggestions_config.yaml b/backend/promtfoo/generate_message_suggestions_config.yaml similarity index 95% rename from backend/src/prompts/generate_message_suggestions_config.yaml rename to backend/promtfoo/generate_message_suggestions_config.yaml index 87933278..cfb16d52 100644 --- a/backend/src/prompts/generate_message_suggestions_config.yaml +++ b/backend/promtfoo/generate_message_suggestions_config.yaml @@ -5,7 +5,7 @@ providers: config: temperature: 0 -prompts: file://get_generate_message_suggestions_prompt.py:get_prompt +prompts: file://prompt_foo_runner.py:generate_message_suggestions tests: - description: "test the output has the correct format and content when there is no chat history " diff --git a/backend/src/prompts/get_generate_message_suggestions_prompt.py b/backend/promtfoo/prompt_foo_runner.py similarity index 50% rename from backend/src/prompts/get_generate_message_suggestions_prompt.py rename to backend/promtfoo/prompt_foo_runner.py index 1f7a178c..1763c366 100644 --- a/backend/src/prompts/get_generate_message_suggestions_prompt.py
+++ b/backend/promtfoo/prompt_foo_runner.py @@ -1,9 +1,16 @@ -from prompting import PromptEngine +import sys +import os +sys.path.append("../") +from dotenv import load_dotenv, find_dotenv # noqa: E402 +from src.prompts.prompting import PromptEngine # noqa: E402 +load_dotenv(find_dotenv()) + +OPENAI_API_KEY = os.getenv("OPENAI_API_KEY") engine = PromptEngine() -def get_prompt(context): +def generate_message_suggestions(context): chat_history = context["vars"]["chatHistory"] system_prompt = engine.load_prompt("generate_message_suggestions", chat_history=chat_history) diff --git a/backend/src/prompts/promptfooconfig.yaml b/backend/promtfoo/promptfooconfig.yaml similarity index 100% rename from backend/src/prompts/promptfooconfig.yaml rename to backend/promtfoo/promptfooconfig.yaml diff --git a/backend/requirements.txt b/backend/requirements.txt index cd69944d..0ff63e12 100644 --- a/backend/requirements.txt +++ b/backend/requirements.txt @@ -5,9 +5,6 @@ pycodestyle==2.11.1 python-dotenv==1.0.1 neo4j==5.18.0 ruff==0.3.5 -pytest==8.1.1 -pytest-mock==3.14.0 -pytest-asyncio==0.23.7 jinja2==3.1.3 websockets==12.0 azure-core==1.30.1 @@ -26,3 +23,7 @@ pypdf==4.3.1 hiredis==3.0.0 redis==5.0.8 +# tests +pytest==8.1.1 +pytest-mock==3.14.0 +pytest-asyncio==0.23.7 diff --git a/backend/src/prompts/README.md b/backend/src/prompts/README.md deleted file mode 100644 index 53c18ae1..00000000 --- a/backend/src/prompts/README.md +++ /dev/null @@ -1,10 +0,0 @@ -To get started, set your OPENAI_API_KEY environment variable, or other required keys for the providers you selected. - -install promptfoo by running `npx install promptfoo` - -Next, edit promptfooconfig.yaml. - -Then run `promptfoo eval` to run promptfooconfig.yaml -ro run a specific file for example `generate_message_suggestions_config.yaml` run `promptfoo eval -c generate_message_suggestions_config.yaml` - -Afterwards, you can view the results by running `promptfoo view`