Add meta data for indexing (Azure-Samples#633)
1 parent a16c510, commit b1a5bd9
Showing 1 changed file with 44 additions and 28 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,25 @@ | ||
--- | ||
name: Chat with your Data Solution Accelerator | ||
description: Chat with your data using OpenAI and AI Search. | ||
page_type: sample | ||
languages: | ||
- python | ||
- typescript | ||
- bicep | ||
- azdeveloper | ||
products: | ||
- azure-openai | ||
- azure-ai-search | ||
- azure-app-service | ||
- azure | ||
urlFragment: chat-with-your-data-solution-accelerator | ||
--- | ||
# Chat with your data - Solution accelerator | ||
|
||
[**USER STORY**](#user-story) | [**DEPLOY**](#Deploy) | [**SUPPORTING DOCUMENTATION**](#supporting-documentation) | [**CUSTOMER TRUTH**](#customer-truth)\ | ||
\ | ||
\ | ||
![User Story](/media/userStory.png) | ||
## User story | ||
Welcome to the *Chat with your data* Solution accelerator repository! The *Chat with your data* Solution accelerator is a powerful tool that combines the capabilities of Azure AI Search and Large Language Models (LLMs) to create a conversational search experience. This solution accelerator uses an Azure OpenAI GPT model and an Azure AI Search index generated from your data, which is integrated into a web application to provide a natural language interface, including speech-to-text functionality, for search queries. Users can drag and drop files or point to storage, and the accelerator takes care of the technical setup needed to transform documents. Users can create the web app in their own subscription with security and authentication. | ||
|
||
|
@@ -19,7 +35,7 @@ This repository provides a template for setting up the solution accelerator, alo | |
* Easy prompt configuration | ||
* Multiple chunking strategies | ||
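
To make the idea of a chunking strategy concrete, here is a minimal sketch of two common approaches: fixed-size chunks with overlap, and paragraph-based chunks. It is an illustration only and is not taken from this repository's ingestion code; the function names, parameters, and default sizes are assumptions.

```python
def chunk_fixed(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that share `overlap` characters.

    Hypothetical helper for illustration; not the accelerator's implementation.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def chunk_by_paragraph(text: str) -> list[str]:
    """Split text on blank lines, dropping empty paragraphs."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from either side, at the cost of indexing some text twice. Paragraph-based chunks follow the author's own structure but vary in size.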
|
||
### When should you use this repo? | ||
|
||
If you need to customize your scenario beyond what [Azure OpenAI on your data](https://learn.microsoft.com/azure/ai-services/openai/concepts/use-your-data) offers out-of-the-box, use this repository. | ||
|
||
|
@@ -29,7 +45,7 @@ The accelerator presented here provides several options, for example: | |
* An admin site for ingesting/inspecting/configuring your dataset on the fly | ||
* Running a Retrieval Augmented Generation (RAG) solution locally | ||
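
For readers new to the RAG pattern mentioned above, the following toy sketch shows its two steps: retrieve the most relevant passages, then build a prompt that grounds the model in them. It makes no Azure calls. The word-overlap scoring stands in for a real retriever such as Azure AI Search, and every name and sample sentence here is a made-up assumption for illustration.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by the number of words shared with the question (toy retriever)."""
    question_words = tokenize(question)
    scored = [(len(question_words & tokenize(doc)), doc) for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:top_k] if score > 0]


def build_prompt(question: str, sources: list[str]) -> str:
    """Assemble a grounded prompt that lists the retrieved sources before the question."""
    context = "\n".join(f"[doc{i + 1}] {s}" for i, s in enumerate(sources))
    return (
        "Answer using only the sources below and cite them.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )


# Illustrative sample passages (invented for this sketch).
docs = [
    "The emerging markets fund targets long-term growth.",
    "Office hours are nine to five.",
    "Risks of the emerging markets fund include currency volatility.",
]
question = "What are the risks of the emerging markets fund?"
prompt = build_prompt(question, retrieve(question, docs))
```

The resulting `prompt` string would then be sent to a chat model. In a real deployment the retriever, prompt wording, and citation format are all configurable choices.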
|
||
*Have you seen the [ChatGPT + Enterprise data with Azure OpenAI and AI Search demo](https://github.com/Azure-Samples/azure-search-openai-demo)? If you would like to experiment (play with prompts, understand different implementation approaches to the RAG pattern, see how different features interact with the RAG pattern, and choose the best options for your RAG deployments), take a look at that repo. | ||
|
||
Here is a comparison table with a few features offered by Azure, an available GitHub demo sample and this repo, that can provide guidance when you need to decide which one to use: | ||
|
||
|
@@ -40,9 +56,9 @@ Here is a comparison table with a few features offered by Azure, an available Gi | |
|["Chat with your data" Solution Accelerator](https://aka.ms/ChatWithYourDataSolutionAccelerator) - (This repo) | Azure sample | End-to-end baseline RAG pattern sample that uses Azure AI Search as a retriever. | This sample should be used by Developers when the RAG pattern implementations provided by Azure are not able to satisfy business requirements. This sample provides a means to customize the solution. Developers must add their own code to meet requirements, and adapt with best practices according to individual company policies. | | ||
|[ChatGPT + Enterprise data with Azure OpenAI and AI Search demo](https://github.com/Azure-Samples/azure-search-openai-demo) | Azure sample | RAG pattern demo that uses Azure AI Search as a retriever. | Developers who would like to use or present an end-to-end demonstration of the RAG pattern should use this sample. This includes the ability to deploy and test different retrieval modes, and prompts to support business use cases. | | ||
|
||
### Key features | ||
- **Private LLM access on your data**: Get all the benefits of ChatGPT on your private, unstructured data. | ||
- **Single application access to your full data set**: Minimize the endpoints required to access internal company knowledge bases. | ||
- **Natural language interaction with your unstructured data**: Use natural language to quickly find the answers you need and ask follow-up queries to get the supplemental details. | ||
- **Easy access to source documentation when querying**: Review referenced documents in the same chat window for additional context. | ||
- **Data upload**: Batch upload documents | ||
|
@@ -63,18 +79,18 @@ Out-of-the-box, you can upload the following file types: | |
* DOCX | ||
|
||
### Target end users | ||
Company personnel (employees, executives) looking to research against internal unstructured company data would leverage this accelerator using natural language to find what they need quickly. | ||
|
||
This accelerator also works across industries and roles and would be suitable for any employee who would like to get quick answers with a ChatGPT experience against their internal unstructured company data. | ||
|
||
Tech administrators can use this accelerator to give their colleagues easy access to internal unstructured company data. Admins can customize the system configurator to tailor responses for the intended audience. | ||
|
||
### Industry scenario | ||
The sample data illustrates how this accelerator could be used in the financial services industry (FSI). | ||
|
||
In this scenario, a financial advisor is preparing for a meeting with a potential client who has expressed interest in Woodgrove Investments’ Emerging Markets Funds. The advisor prepares for the meeting by refreshing their understanding of the emerging markets fund's overall goals and the associated risks. | ||
|
||
Now that the financial advisor is more informed about Woodgrove’s Emerging Markets Funds, they're better equipped to respond to questions about this fund from their client. | ||
|
||
Note: Some of the sample data included with this accelerator was generated using AI and is for illustrative purposes only. | ||
|
||
|
@@ -86,7 +102,7 @@ Many users are used to the convenience of speech-to-text functionality in their | |
![Web - Get responses using natural language](/media/web-nlu.png)Get responses using natural language | ||
|
||
### [Teams extension](./docs/TEAMS_EXTENSION.md) | ||
By bringing the Chat with your data experience into Teams, users can stay within their current workflow and get the answers they need without switching platforms. Rather than building the Chat with your data accelerator within Teams from scratch, the same underlying backend used for the web application is leveraged within Teams. | ||
|
||
Learn more about deploying the Teams extension [here](./docs/TEAMS_EXTENSION.md). | ||
|
||
|
@@ -95,7 +111,7 @@ Learn more about deploying the Teams extension [here](./docs/TEAMS_EXTENSION.md) | |
\ | ||
![One-click Deploy](/media/oneClickDeploy.png) | ||
## Deploy | ||
### Pre-requisites | ||
- Azure subscription - [Create one for free](https://azure.microsoft.com/free/) with owner access. | ||
- Approval to use Azure OpenAI services with your Azure subscription. To apply for approval, see [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/overview#how-do-i-get-access-to-azure-openai). | ||
- [Enable custom Teams apps and turn on custom app uploading](https://learn.microsoft.com/en-us/microsoftteams/platform/concepts/build-and-test/prepare-your-o365-tenant#enable-custom-teams-apps-and-turn-on-custom-app-uploading) (optional: Teams extension only) | ||
|
@@ -133,28 +149,28 @@ There are two choices; the "Deploy to Azure" offers a one click deployment where | |
|
||
The demo, which uses containers pre-built from the main branch, is available by clicking this button: | ||
|
||
[![Deploy to Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure-Samples%2Fchat-with-your-data-solution-accelerator%2Fmain%2Finfra%2Fmain.json) | ||
|
||
**Note**: The default configuration deploys an OpenAI Model "gpt-35-turbo" with version 0613. However, not all | ||
locations support this version. If you're deploying to a location that doesn't support version 0613, you'll need to | ||
switch to a lower version. To find out which versions are supported in different regions, visit the | ||
[GPT-35 Turbo Model Availability](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#gpt-35-turbo-model-availability) page. | ||
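
The fallback described in the note can be sketched as a small helper that walks a preference-ordered list of versions and returns the first one the target region supports. This is a hypothetical illustration, not part of the deployment templates. The function name is an assumption, and the supported-version sets in the usage lines are placeholders: look up the real values on the availability page linked above.

```python
from typing import Optional


def pick_model_version(preferred_order: list[str], supported_in_region: set[str]) -> Optional[str]:
    """Return the first version in `preferred_order` that the region supports, or None.

    The caller supplies both arguments; nothing here queries Azure.
    """
    for version in preferred_order:
        if version in supported_in_region:
            return version
    return None


# Placeholder region data for illustration only.
print(pick_model_version(["0613", "0301"], {"0613", "0301"}))  # prints 0613
print(pick_model_version(["0613", "0301"], {"0301"}))          # prints 0301
```

Passing an explicit preference order avoids guessing how version strings compare with one another.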
|
||
### Testing the deployment | ||
1. Navigate to the admin site, where you can upload documents. It will be located at: | ||
|
||
`https://web-{RESOURCE_TOKEN}-admin.azurewebsites.net/` | ||
|
||
Here, `{RESOURCE_TOKEN}` is a value uniquely generated during deployment from a combination of your subscription and the name of the resource group. Then select **Ingest Data** and add your data. You can find sample data in the `/data` directory. | ||
|
||
![A screenshot of the admin site.](./media/admin-site.png) | ||
|
||
|
||
2. Navigate to the web app to start chatting on top of your data. The web app can be found at: | ||
|
||
`https://web-{RESOURCE_TOKEN}.azurewebsites.net/` | ||
|
||
|
||
![A screenshot of the chat app.](./media/web-unstructureddata.png) | ||
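
As a convenience, the two URL patterns from the steps above can be filled in with a small helper once you know your resource token. This is a minimal sketch: the function name is an assumption, and the token in the usage line is a made-up placeholder, not a real deployment.

```python
def deployment_urls(resource_token: str) -> dict[str, str]:
    """Build the admin site and web app URLs from the patterns given in this README."""
    return {
        "admin": f"https://web-{resource_token}-admin.azurewebsites.net/",
        "web": f"https://web-{resource_token}.azurewebsites.net/",
    }


# "abc123" is a placeholder token for illustration.
urls = deployment_urls("abc123")
print(urls["admin"])  # prints https://web-abc123-admin.azurewebsites.net/
print(urls["web"])    # prints https://web-abc123.azurewebsites.net/
```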
|
||
\ | ||
|
@@ -167,17 +183,17 @@ switch to a lower version. To find out which versions are supported in different | |
|
||
Only upload data that can be accessed by any user of the application. Anyone who uses the application should also have clearance for any data that is uploaded to the application. | ||
|
||
**Depth of responses** | ||
|
||
The more limited the data set, the broader the questions should be. If the data in the repo is limited, the depth of information in the LLM response you can get with follow-up questions may be limited. For more depth in your response, increase the data available for the LLM to access. | ||
|
||
**Response consistency** | ||
|
||
Consider tuning the configuration of prompts to the level of precision required. The more precision desired, the harder it may be to generate a response. | ||
|
||
**Numerical queries** | ||
|
||
The accelerator is optimized to summarize unstructured data, such as PDFs or text files. The GPT-3.5 Turbo model used by the accelerator is not currently optimized to handle queries about specific numerical data. A GPT-4 model may be better able to handle numerical queries. | ||
|
||
**Use your own judgement** | ||
|
||
|
@@ -244,14 +260,14 @@ The data set under the /data folder is licensed under the [CDLA-Permissive-2 Lic | |
Customer stories coming soon. For early access, contact: [email protected] | ||
|
||
## Disclaimers | ||
This Software requires the use of third-party components which are governed by separate proprietary or open-source licenses as identified below, and you must comply with the terms of each applicable license in order to use the Software. You acknowledge and agree that this license does not grant you a license or other right to use any such third-party proprietary or open-source components. | ||
|
||
To the extent that the Software includes components or code used in or derived from Microsoft products or services, including without limitation Microsoft Azure Services (collectively, “Microsoft Products and Services”), you must also comply with the Product Terms applicable to such Microsoft Products and Services. You acknowledge and agree that the license governing the Software does not grant you a license or other right to use Microsoft Products and Services. Nothing in the license or this ReadMe file will serve to supersede, amend, terminate or modify any terms in the Product Terms for any Microsoft Products and Services. | ||
|
||
You must also comply with all domestic and international export laws and regulations that apply to the Software, which include restrictions on destinations, end users, and end use. For further information on export restrictions, visit https://aka.ms/exporting. | ||
|
||
You acknowledge that the Software and Microsoft Products and Services (1) are not designed, intended or made available as a medical device(s), and (2) are not designed or intended to be a substitute for professional medical advice, diagnosis, treatment, or judgment and should not be used to replace or as a substitute for professional medical advice, diagnosis, treatment, or judgment. Customer is solely responsible for displaying and/or obtaining appropriate consents, warnings, disclaimers, and acknowledgements to end users of Customer’s implementation of the Online Services. | ||
|
||
You acknowledge the Software is not subject to SOC 1 and SOC 2 compliance audits. No Microsoft technology, nor any of its component technologies, including the Software, is intended or made available as a substitute for the professional advice, opinion, or judgement of a certified financial services professional. Do not use the Software to replace, substitute, or provide professional financial advice or judgment. | ||
|
||
BY ACCESSING OR USING THE SOFTWARE, YOU ACKNOWLEDGE THAT THE SOFTWARE IS NOT DESIGNED OR INTENDED TO SUPPORT ANY USE IN WHICH A SERVICE INTERRUPTION, DEFECT, ERROR, OR OTHER FAILURE OF THE SOFTWARE COULD RESULT IN THE DEATH OR SERIOUS BODILY INJURY OF ANY PERSON OR IN PHYSICAL OR ENVIRONMENTAL DAMAGE (COLLECTIVELY, “HIGH-RISK USE”), AND THAT YOU WILL ENSURE THAT, IN THE EVENT OF ANY INTERRUPTION, DEFECT, ERROR, OR OTHER FAILURE OF THE SOFTWARE, THE SAFETY OF PEOPLE, PROPERTY, AND THE ENVIRONMENT ARE NOT REDUCED BELOW A LEVEL THAT IS REASONABLY, APPROPRIATE, AND LEGAL, WHETHER IN GENERAL OR IN A SPECIFIC INDUSTRY. BY ACCESSING THE SOFTWARE, YOU FURTHER ACKNOWLEDGE THAT YOUR HIGH-RISK USE OF THE SOFTWARE IS AT YOUR OWN RISK. | ||