Fs-119 Using file search tool for report (#56)
dianaPrahoveanu-SL authored Dec 19, 2024
1 parent 72d8eaa commit d66d59f
Showing 18 changed files with 241 additions and 399 deletions.
Binary file not shown.
198 changes: 4 additions & 194 deletions backend/promptfoo/report_agent_config.yaml
@@ -1,7 +1,7 @@
description: "Test Report Agent Prompts"

providers:
- id: mistral:mistral-large-latest
- id: openai:gpt-4o-mini
config:
temperature: 0

@@ -12,102 +12,7 @@ tests:
vars:
user_prompt_template: "create-report-user-prompt"
system_prompt_template: "create-report-system-prompt"
user_prompt_args:
document_text: "Published September 2024 Carbon Reduction Plan
Supplier name: Amazon Web Services EU SARL (UK Branch) (“AWS UK”)
Publication date: September 30, 2024
Commitment to Achieving Net Zero
AWS UK, as part of Amazon.com, Inc. (“Amazon”), is committed to achieving net -zero
emissions by 2040. In 2019, Amazon co -founded The Climate Pledge, a public commitment
to innovate, use our scale for good and go faster to address the urgency of the climate crisis
to reach net -zero carbon across the entire organization by 2040. Since committing to the
Pledge, we’ve changed how we conduct our business and the running of our operations, and
we’ve increased funding and implementation of new technologies and services that
decarbonize and help preserve the natural world, alon gside the ambitious goals outlined in
The Climate Pledge. We’re fully committed to our goals and our work to build a better planet.
Baseline Emissions Footprint
Base Year emissions are a record of the greenhouse gases that have been produced in the
past an d are the reference point against which emissions reduction can be measured.
Baseline Year: 2020
Additional Details relating to the Baseline Emissions calculations:
AWS UK utilized January 1, 2020 to December 31, 2020 as the baseline year for emissions
reporting under this Carbon Reduction Plan. Our plan includes emissions data from relevant
affiliate companies helping to provide AWS UK’s services to our customers. We ’ve included both
location -based and market -based method Scope 2 emissions in the following tables. AWS UK
benefits from contractual arrangements entered into by our affiliate(s) for renewable electricity
and/or renewable attributes that are reflected in t he market -based data set. More information
about our corporate carbon footprint and methodology can be found on our website .
Our baseline year does not include Scope 1 emissions. In 2022 we updated our methodology
and Scope 1 emissions are now included in total emissions for AWS UK
Published September 2024 Baseline year emissions:
EMISSIONS TOTAL (tCO 2e)
Scope 1 0
Scope 2 61,346 – Location -based method
2,813 – Market -based method
Scope 3 (Included
Sources) 3,770
Total Emissions 65,116 – Location -based method
6,583 – Market -based method
Current Emissions Reporting
Reporting Year: 202 3 (January 1, 202 3 to December 31, 202 3)
EMISSIONS TOTAL (tCO 2e)
Scope 1 2,23 3
Scope 2 126,755 – Location -based method
0 – Market -based method
Scope 3 (Included
Sources) 13,188
Total Emissions 142,17 6 – Location -based method
15,42 1 – Market -based method
Published September 2024 Emissions Reduction Targets
In 2019, we set an ambitious goal to match 100% of the electricity we use with renewable
energy by 2030. This goal includes all data centres , logistics facilities, physical stores, and
corporate offices, as well as on -site charg ing points and our financially integrated subsidiaries.
We are proud to have achieved this goal in 2023, seven years early, with 100% of the electricity
consum ed by Amazon matched with renewable energy sources.
Amazon continue s to be transparent and share our progress to reach net -zero carbon in our
annual Sustainability Report , which also includes details on how we measure carbon .
Carbon Reduction Projects
Completed Carbon Reduction Initiatives
Amazon continues to take actions across our operations to drive carbon reduction around the
world, including in the UK. As of January 202 4, Amazon’s renewable energy portfolio includes
243 wind and solar farms and 2 70 rooftop solar projects, totalling 513 projects and 28
gigawatts of renewable energy capacity. This includes several utility -scale renewable energy
projects located within the UK:
•In 2019, Amazon announced our first power purchase agreement in the UK, located in
Kintyre Peninsula, Scotland. The “Amazon Wind Farm Scotland – Beinn an Tuirc 3”
began o perating in 2021, providing 50 megawatts (MW) of new renewable capacity to
the electricity grid with expected generation of 168,000 megawatt hours (MWh) of
clean energy annually. That’s enough to power 46,000 UK homes every year.
•In December 2020, Amazon a nnounced a two -phase renewable energy project located
in South Lanarkshire, Scotland, the Kennoxhead wind farm. Kennoxhead will be the
largest single -site onshore wind project in the UK, enabled through corporate
procurement. Once fully operational, Kenno xhead will produce 129 MW of renewable
capacity and is expected to generate 439,000 MWh of clean energy annually. Phase 1
(60 MW) began operating in 2022, and Phase 2 (69 MW) will begin operations in 2024 .
•In 2022, Amazon announced its first project in Nor thern Ireland, a 16 MW onshore
windfarm in Co Antrim.
•In 2022, Amazon also announced a new 473 MW offshore wind farm, Moray West,
located off the coast of Scotland . Amazon expects completion of Moray West in 2024.
This is Amazon’s largest project in Scotland and the largest corporate renewable
energy deal announced by any company in the UK to date.
•In 2023, Amazon announced a new 47 MW solar farm, Warl ey located in Essex.
This project is expected to be operational in 2024.
Published September 2024 Declaration and Sign Off
This Carbon Reduction Plan has been completed in accordance with PPN 06/21 and
associated guidance and reporting standard for Carbon Reduction Plans.
Emiss ions have been reported and recorded in accordance with the published reporting
standard for Carbon Reduction Plans and the GHG Reporting Protocol corporate standard1
and uses the appropri ate Government emission conversion factors for greenhouse gas
company reporting2.
Scope 1 and Scope 2 emissions have been reported in accordance with S ECR requirements,
and the required subset of Scope 3 emissions have been reported in accordance with the
published reporting standard for Carbon Reduction Plans and the Corporate Value Chain
(Scope 3) Standard3.
This Carbon Reduction Plan has been reviewed and signed off by the board of directors (or
equivalent management body)."
file_attachment: "../library/AstraZeneca-Sustainability-Report-2023.pdf"
assert:
- type: contains-all
value:
@@ -122,106 +27,11 @@ equivalent management body)."
vars:
user_prompt_template: "find-company-name-from-file-user-prompt"
system_prompt_template: "find-company-name-from-file-system-prompt"
user_prompt_args:
file_content: "Published September 2024 Carbon Reduction Plan
Supplier name: Amazon Web Services EU SARL (UK Branch) (“AWS UK”)
Publication date: September 30, 2024
Commitment to Achieving Net Zero
AWS UK, as part of Amazon.com, Inc. (“Amazon”), is committed to achieving net -zero
emissions by 2040. In 2019, Amazon co -founded The Climate Pledge, a public commitment
to innovate, use our scale for good and go faster to address the urgency of the climate crisis
to reach net -zero carbon across the entire organization by 2040. Since committing to the
Pledge, we’ve changed how we conduct our business and the running of our operations, and
we’ve increased funding and implementation of new technologies and services that
decarbonize and help preserve the natural world, alon gside the ambitious goals outlined in
The Climate Pledge. We’re fully committed to our goals and our work to build a better planet.
Baseline Emissions Footprint
Base Year emissions are a record of the greenhouse gases that have been produced in the
past an d are the reference point against which emissions reduction can be measured.
Baseline Year: 2020
Additional Details relating to the Baseline Emissions calculations:
AWS UK utilized January 1, 2020 to December 31, 2020 as the baseline year for emissions
reporting under this Carbon Reduction Plan. Our plan includes emissions data from relevant
affiliate companies helping to provide AWS UK’s services to our customers. We ’ve included both
location -based and market -based method Scope 2 emissions in the following tables. AWS UK
benefits from contractual arrangements entered into by our affiliate(s) for renewable electricity
and/or renewable attributes that are reflected in t he market -based data set. More information
about our corporate carbon footprint and methodology can be found on our website .
Our baseline year does not include Scope 1 emissions. In 2022 we updated our methodology
and Scope 1 emissions are now included in total emissions for AWS UK
Published September 2024 Baseline year emissions:
EMISSIONS TOTAL (tCO 2e)
Scope 1 0
Scope 2 61,346 – Location -based method
2,813 – Market -based method
Scope 3 (Included
Sources) 3,770
Total Emissions 65,116 – Location -based method
6,583 – Market -based method
Current Emissions Reporting
Reporting Year: 202 3 (January 1, 202 3 to December 31, 202 3)
EMISSIONS TOTAL (tCO 2e)
Scope 1 2,23 3
Scope 2 126,755 – Location -based method
0 – Market -based method
Scope 3 (Included
Sources) 13,188
Total Emissions 142,17 6 – Location -based method
15,42 1 – Market -based method
Published September 2024 Emissions Reduction Targets
In 2019, we set an ambitious goal to match 100% of the electricity we use with renewable
energy by 2030. This goal includes all data centres , logistics facilities, physical stores, and
corporate offices, as well as on -site charg ing points and our financially integrated subsidiaries.
We are proud to have achieved this goal in 2023, seven years early, with 100% of the electricity
consum ed by Amazon matched with renewable energy sources.
Amazon continue s to be transparent and share our progress to reach net -zero carbon in our
annual Sustainability Report , which also includes details on how we measure carbon .
Carbon Reduction Projects
Completed Carbon Reduction Initiatives
Amazon continues to take actions across our operations to drive carbon reduction around the
world, including in the UK. As of January 202 4, Amazon’s renewable energy portfolio includes
243 wind and solar farms and 2 70 rooftop solar projects, totalling 513 projects and 28
gigawatts of renewable energy capacity. This includes several utility -scale renewable energy
projects located within the UK:
•In 2019, Amazon announced our first power purchase agreement in the UK, located in
Kintyre Peninsula, Scotland. The “Amazon Wind Farm Scotland – Beinn an Tuirc 3”
began o perating in 2021, providing 50 megawatts (MW) of new renewable capacity to
the electricity grid with expected generation of 168,000 megawatt hours (MWh) of
clean energy annually. That’s enough to power 46,000 UK homes every year.
•In December 2020, Amazon a nnounced a two -phase renewable energy project located
in South Lanarkshire, Scotland, the Kennoxhead wind farm. Kennoxhead will be the
largest single -site onshore wind project in the UK, enabled through corporate
procurement. Once fully operational, Kenno xhead will produce 129 MW of renewable
capacity and is expected to generate 439,000 MWh of clean energy annually. Phase 1
(60 MW) began operating in 2022, and Phase 2 (69 MW) will begin operations in 2024 .
•In 2022, Amazon announced its first project in Nor thern Ireland, a 16 MW onshore
windfarm in Co Antrim.
•In 2022, Amazon also announced a new 473 MW offshore wind farm, Moray West,
located off the coast of Scotland . Amazon expects completion of Moray West in 2024.
This is Amazon’s largest project in Scotland and the largest corporate renewable
energy deal announced by any company in the UK to date.
•In 2023, Amazon announced a new 47 MW solar farm, Warl ey located in Essex.
This project is expected to be operational in 2024.
Published September 2024 Declaration and Sign Off
This Carbon Reduction Plan has been completed in accordance with PPN 06/21 and
associated guidance and reporting standard for Carbon Reduction Plans.
Emiss ions have been reported and recorded in accordance with the published reporting
standard for Carbon Reduction Plans and the GHG Reporting Protocol corporate standard1
and uses the appropri ate Government emission conversion factors for greenhouse gas
company reporting2.
Scope 1 and Scope 2 emissions have been reported in accordance with S ECR requirements,
and the required subset of Scope 3 emissions have been reported in accordance with the
published reporting standard for Carbon Reduction Plans and the Corporate Value Chain
(Scope 3) Standard3.
This Carbon Reduction Plan has been reviewed and signed off by the board of directors (or
equivalent management body)."
file_attachment: "../library/AstraZeneca-Sustainability-Report-2023.pdf"
assert:
- type: is-json
value:
required: ["company_name"]
type: object
- type: javascript
value: JSON.parse(output).company_name === "Amazon"
value: JSON.parse(output).company_name === "AstraZeneca"
27 changes: 11 additions & 16 deletions backend/src/agents/report_agent.py
@@ -1,6 +1,7 @@
import json
import logging

from src.llm.llm import LLMFile
from src.agents import Agent
from src.prompts import PromptEngine

@@ -9,25 +10,19 @@


class ReportAgent(Agent):
async def create_report(self, file_content: str, materiality_topics: dict[str, str]) -> str:
user_prompt = engine.load_prompt(
"create-report-user-prompt",
document_text=file_content,
materiality_topics=materiality_topics
async def create_report(self, file: LLMFile, materiality_topics: dict[str, str]) -> str:
return await self.llm.chat_with_file(
self.model,
system_prompt=engine.load_prompt("create-report-system-prompt"),
user_prompt=engine.load_prompt("create-report-user-prompt", materiality_topics=materiality_topics),
files=[file],
)

system_prompt = engine.load_prompt("create-report-system-prompt")

return await self.llm.chat(self.model, system_prompt=system_prompt, user_prompt=user_prompt)

async def get_company_name(self, file_content: str) -> str:
response = await self.llm.chat(
async def get_company_name(self, file: LLMFile) -> str:
response = await self.llm.chat_with_file(
self.model,
system_prompt=engine.load_prompt("find-company-name-from-file-system-prompt"),
user_prompt=engine.load_prompt(
"find-company-name-from-file-user-prompt",
file_content=file_content
),
return_json=True
user_prompt=engine.load_prompt("find-company-name-from-file-user-prompt"),
files=[file],
)
return json.loads(response)["company_name"]
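
Review note: the change above moves both agent methods from inlining document text in the prompt to passing the file itself via `chat_with_file(..., files=[file])`. The calling pattern can be exercised without a real provider. The sketch below is illustrative only: `StubLLM` and the standalone `get_company_name` helper are hypothetical, and the `LLMFile` dataclass is an assumed stand-in that borrows only the `file_name` and `file` field names visible in this commit; the real class in `src.llm.llm` may differ.

```python
import asyncio
import json
from dataclasses import dataclass


@dataclass
class LLMFile:
    # Assumed shape: only the two field names are taken from the diff.
    file_name: str
    file: bytes


class StubLLM:
    """Hypothetical test double; it never calls a real model provider."""

    def __init__(self, canned_response: str):
        self.canned_response = canned_response
        self.seen_files: list[LLMFile] = []

    async def chat_with_file(
        self, model: str, system_prompt: str, user_prompt: str, files: list[LLMFile]
    ) -> str:
        # Record the attachments so a test can check the file was forwarded.
        self.seen_files.extend(files)
        return self.canned_response


async def get_company_name(llm: StubLLM, file: LLMFile) -> str:
    # Mirrors the shape of ReportAgent.get_company_name in the diff:
    # the file travels as an attachment, not as prompt text.
    response = await llm.chat_with_file(
        "stub-model",
        system_prompt="system prompt placeholder",
        user_prompt="user prompt placeholder",
        files=[file],
    )
    return json.loads(response)["company_name"]


stub = StubLLM('{"company_name": "AstraZeneca"}')
upload = LLMFile(file_name="report.pdf", file=b"%PDF-1.7 placeholder bytes")
company_name = asyncio.run(get_company_name(stub, upload))
```

Because the prompt no longer carries the document body, a test of this kind only needs to confirm that the file object reaches the LLM call and that the JSON response is parsed.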
10 changes: 6 additions & 4 deletions backend/src/api/app.py
@@ -8,7 +8,7 @@
from src.utils.scratchpad import ScratchPadMiddleware
from src.session.chat_response import get_session_chat_response_ids
from src.chat_storage_service import clear_chat_messages, get_chat_message
from src.directors.report_director import report_on_file_upload
from src.directors.report_director import create_report_from_file
from src.session.file_uploads import clear_session_file_uploads, get_report
from src.session.redis_session_middleware import reset_session
from src.utils import Config, test_connection
@@ -129,27 +129,29 @@ async def suggestions():
async def report(file: UploadFile):
logger.info(f"upload file type={file.content_type} name={file.filename} size={file.size}")
try:
processed_upload = await report_on_file_upload(file)
processed_upload = await create_report_from_file(file)
return JSONResponse(status_code=200, content=processed_upload)
except HTTPException as he:
raise he
except Exception as e:
logger.exception(e)
return JSONResponse(status_code=500, content=file_upload_failed_response)


@app.get("/report/{id}")
def download_report(id: str):
logger.info(f"Get report download called for id: {id}")
try:
final_result = get_report(id)
if final_result is None:
return JSONResponse(status_code=404, content=f"Message with id {id} not found")
headers = {'Content-Disposition': 'attachment; filename="report.md"'}
return Response(final_result.get("report"), headers=headers, media_type='text/markdown')
headers = {"Content-Disposition": 'attachment; filename="report.md"'}
return Response(final_result.get("report"), headers=headers, media_type="text/markdown")
except Exception as e:
logger.exception(e)
return JSONResponse(status_code=500, content=report_get_upload_failed_response)


@app.get("/uploadfile")
async def fetch_file(id: str):
logger.info(f"fetch uploaded file id={id} ")
43 changes: 27 additions & 16 deletions backend/src/directors/report_director.py
@@ -1,38 +1,49 @@
from fastapi import UploadFile
import sys
import uuid
from fastapi import UploadFile, HTTPException

from src.session.file_uploads import FileUploadReport, store_report
from src.utils.file_utils import handle_file_upload
from src.llm.llm import LLMFile
from src.session.file_uploads import ReportResponse, store_report
from src.agents import get_report_agent, get_materiality_agent

MAX_FILE_SIZE = 10 * 1024 * 1024

async def report_on_file_upload(upload: UploadFile) -> FileUploadReport:

file = handle_file_upload(upload)
async def create_report_from_file(upload: UploadFile) -> ReportResponse:
file_stream = await upload.read()
if upload.filename is None or upload.filename == "":
raise HTTPException(status_code=400, detail="Filename missing from file upload")

file_size = sys.getsizeof(file_stream)

if file_size > MAX_FILE_SIZE:
raise HTTPException(status_code=413, detail=f"File upload must be less than {MAX_FILE_SIZE} bytes")

file = LLMFile(file_name=upload.filename, file=file_stream)
file_id = str(uuid.uuid4())

report_agent = get_report_agent()

company_name = await report_agent.get_company_name(file["content"])
company_name = await report_agent.get_company_name(file)

topics = await get_materiality_agent().list_material_topics(company_name)

report = await get_report_agent().create_report(file["content"], topics)
report = await report_agent.create_report(file, topics)

report_upload = FileUploadReport(
filename=file["filename"],
id=file["uploadId"],
report_response = ReportResponse(
filename=file.file_name,
id=file_id,
report=report,
answer=create_report_chat_message(file["filename"], company_name, topics)
answer=create_report_chat_message(file.file_name, company_name, topics),
)

store_report(report_upload)
store_report(report_response)

return report_upload
return report_response


def create_report_chat_message(file_name: str, company_name: str, topics: dict[str, str]) -> str:
topics_with_markdown = [
f"{key}\n{value}" for key, value in topics.items()
]
topics_with_markdown = [f"{key}\n{value}" for key, value in topics.items()]
return f"""Your report for {file_name} is ready to view.
The following materiality topics were identified for {company_name} which the report focuses on:
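
Review note: `create_report_from_file` measures the upload with `sys.getsizeof(file_stream)`. In CPython that call reports the in-memory size of the `bytes` object, including per-object overhead, not the payload length, so a file of exactly `MAX_FILE_SIZE` bytes would be reported as larger than the limit. A minimal sketch of the difference, assuming CPython; `payload_size` and `exceeds_limit` are hypothetical helper names and are not part of this commit:

```python
import sys

MAX_FILE_SIZE = 10 * 1024 * 1024  # same limit as in the diff


def payload_size(data: bytes) -> int:
    # len() on bytes returns the number of payload bytes.
    return len(data)


def exceeds_limit(data: bytes, limit: int = MAX_FILE_SIZE) -> bool:
    # Hypothetical helper: compare the payload length against the limit.
    return payload_size(data) > limit


exactly_at_limit = bytes(MAX_FILE_SIZE)
# Difference between the object's memory footprint and its payload length.
overhead = sys.getsizeof(exactly_at_limit) - len(exactly_at_limit)
```

The overhead is small relative to 10 MiB, so the practical effect is limited to uploads within a few dozen bytes of the limit; whether that matters here is a judgment call for the authors.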